1. Handa S, Isaacs KK, Wall JT, Larger A, Burns S, Koval LE, Baron-Furuyama K, Elonen CM, Lyons D, Dionisio KL, Horton MB, Phillips KA. The Chemical and Products Database v4.0, an updated resource supporting chemical exposure evaluations. Sci Data 2025; 12:950. [PMID: 40481042 PMCID: PMC12144101 DOI: 10.1038/s41597-025-05240-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Accepted: 05/20/2025] [Indexed: 06/11/2025] [Open Access]
Abstract
Since the initial release of the Chemical and Products Database (CPDat) in 2018, the United States Environmental Protection Agency has added a considerable amount of chemical exposure-related information to the database and has expanded its schema to accommodate new types of data. This data descriptor provides information regarding the structure and types of data contained within CPDat (both existing and new), new controlled vocabularies implemented to harmonize terminology across the different data types, application of a rigorous data curation and quality assurance tracking system, and various methods of accessing CPDat.
Affiliation(s)
- Sakshi Handa
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- Kristin K Isaacs
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- Jonathan T Wall
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- Allison Larger
  - General Dynamics Information Technology, Falls Church, Virginia, USA
- Scott Burns
  - Oak Ridge Associated Universities, Oak Ridge, Tennessee, USA
  - U.S. Environmental Protection Agency, Office of Research and Development, Center of Public Health and Environmental Assessment, Washington, District of Columbia, USA
- Lauren E Koval
  - Oak Ridge Associated Universities, Oak Ridge, Tennessee, USA
- Kenta Baron-Furuyama
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- Colleen M Elonen
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- David Lyons
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
- Kathie L Dionisio
  - U.S. Environmental Protection Agency, Office of Research and Development, Research Triangle Park, North Carolina, USA
- M Beth Horton
  - Oak Ridge Associated Universities, Oak Ridge, Tennessee, USA
- Katherine A Phillips
  - U.S. Environmental Protection Agency, Office of Research and Development, Center for Computational Toxicology and Exposure, Research Triangle Park, North Carolina, USA
2. Pradeep P, Seifert S, Singh AV, Laux P, Pirow R. Applicability of In Silico New Approach Methods for the Risk Assessment of Tattoo Ink Ingredients. Environmental and Molecular Mutagenesis 2025; 66:199-209. [PMID: 40388032 PMCID: PMC12087733 DOI: 10.1002/em.70010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2024] [Revised: 03/11/2025] [Accepted: 03/26/2025] [Indexed: 05/20/2025]
Abstract
Tattoo inks contain several substances, including organic and inorganic pigments, additives, and solvents, which may pose a health risk not only to the tattooed skin but also to other parts of the human body due to intradermal exposure. Substances in tattoo inks are regulated by entry 75 in Annex XVII of REACH Regulation (EC) No. 1907/2006. However, despite these legal requirements, a well-defined criterion for the safety assessment of tattoo inks remains lacking. In this context, the 2021 BfR opinion titled "Tattoo inks: minimum requirements and test methods" proposed a comprehensive risk assessment of pigments using in vitro/in chemico data in accordance with OECD Guidelines and CLP Regulations. In the absence of experimental data, new approach methodologies (NAMs) may be used for data-gap filling. Therefore, this work evaluates the applicability of in silico NAMs for data-gap filling for a list of tattoo ink ingredients identified by the Joint Research Centre (JRC) and BfR for genotoxicity assessment. Experimental in vitro genotoxicity data were acquired from the International Uniform Chemical Information Database (IUCLID), which makes non-confidential REACH Study Results publicly accessible. The specific aims of this analysis were the evaluation of in silico genotoxicity predictions from publicly available QSAR tools and structural alerts, the development and validation of new QSAR models specific to tattoo ink ingredients, and the application of in silico models for categorization and prioritization of data-poor ingredients for further screening. Based on the workflow developed in this study, 4 high priority, 18 medium priority, and 2 low priority substances were identified for further assessment.
Affiliation(s)
- Prachi Pradeep
  - German Federal Institute for Risk Assessment (BfR), Department of Chemical and Product Safety, Berlin, Germany
- Stefanie Seifert
  - German Federal Institute for Risk Assessment (BfR), Department of Chemical and Product Safety, Berlin, Germany
- Ajay Vikram Singh
  - German Federal Institute for Risk Assessment (BfR), Department of Chemical and Product Safety, Berlin, Germany
- Peter Laux
  - German Federal Institute for Risk Assessment (BfR), Department of Chemical and Product Safety, Berlin, Germany
- Ralph Pirow
  - German Federal Institute for Risk Assessment (BfR), Department of Chemical and Product Safety, Berlin, Germany
3. Guo W, Liu J, Dong F, Hong H. Unlocking the potential of AI: Machine learning and deep learning models for predicting carcinogenicity of chemicals. Journal of Environmental Science and Health. Part C, Toxicology and Carcinogenesis 2024; 43:23-50. [PMID: 39228157 DOI: 10.1080/26896583.2024.2396731] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/05/2024]
Abstract
The escalating apprehension surrounding the carcinogenic potential of chemicals emphasizes the imperative need for efficient methods of assessing carcinogenicity. Conventional experimental approaches such as in vitro and in vivo assays, albeit effective, suffer from being costly and time-consuming. In response to this challenge, new alternative methodologies, notably machine learning and deep learning techniques, have attracted attention for their potential in developing carcinogenicity prediction models. This article reviews the progress in predicting carcinogenicity using various machine learning and deep learning algorithms. A comparative analysis on these developed models reveals that support vector machine, random forest, and ensemble learning are commonly preferred for their robustness and effectiveness in predicting chemical carcinogenicity. Conversely, models based on deep learning algorithms, such as feedforward neural network, convolutional neural network, graph convolutional neural network, capsule neural network, and hybrid neural networks, exhibit promising capabilities but are limited by the size of available carcinogenicity datasets. This review provides a comprehensive analysis of current machine learning and deep learning models for carcinogenicity prediction, underscoring the importance of high-quality and large datasets. These observations are anticipated to catalyze future advancements in developing effective and generalizable machine learning and deep learning models for predicting chemical carcinogenicity.
Affiliation(s)
- Wenjing Guo
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Jie Liu
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Fan Dong
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
- Huixiao Hong
  - National Center for Toxicological Research (NCTR), U.S. Food & Drug Administration (FDA), Jefferson, AR
4. Kataria A, Srivastava A, Singh DD, Haque S, Han I, Yadav DK. Systematic computational strategies for identifying protein targets and lead discovery. RSC Med Chem 2024; 15:2254-2269. [PMID: 39026640 PMCID: PMC11253860 DOI: 10.1039/d4md00223g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2024] [Accepted: 05/10/2024] [Indexed: 07/20/2024] [Open Access]
Abstract
Computational algorithms and tools have retrenched the drug discovery and development timeline. The applicability of computational approaches has gained immense relevance owing to the dramatic surge in the structural information of biomacromolecules and their heteromolecular complexes. Computational methods are now extensively used in identifying new protein targets, druggability assessment, pharmacophore mapping, molecular docking, the virtual screening of lead molecules, bioactivity prediction, molecular dynamics of protein-ligand complexes, affinity prediction, and designing better ligands. Herein, we provide an overview of salient components of recently reported computational drug-discovery workflows that include algorithms, tools, and databases for protein target identification and optimized ligand selection.
Affiliation(s)
- Arti Kataria
  - Laboratory of Bacteriology, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Hamilton, MT 59840, USA
- Ankit Srivastava
  - Laboratory of Neurological Infections and Immunity, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases (NIAID), National Institutes of Health (NIH), Hamilton, MT 59840, USA
- Desh Deepak Singh
  - Amity Institute of Biotechnology, Amity University Rajasthan, Jaipur, India
- Shafiul Haque
  - Research and Scientific Studies Unit, College of Nursing and Health Sciences, Jazan University, Jazan-45142, Saudi Arabia
- Ihn Han
  - Plasma Bioscience Research Center, Applied Plasma Medicine Center, Department of Electrical & Biological Physics, Kwangwoon University, Seoul 01897, Republic of Korea, +82 32 820 4948
- Dharmendra Kumar Yadav
  - Department of Biologics, College of Pharmacy, Gachon University, Hambakmoeiro 191, Yeonsu-gu, Incheon 21924, Republic of Korea
5. Mansouri K, Moreira-Filho JT, Lowe CN, Charest N, Martin T, Tkachenko V, Judson R, Conway M, Kleinstreuer NC, Williams AJ. Free and open-source QSAR-ready workflow for automated standardization of chemical structures in support of QSAR modeling. J Cheminform 2024; 16:19. [PMID: 38378618 PMCID: PMC10880251 DOI: 10.1186/s13321-024-00814-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/29/2023] [Accepted: 02/10/2024] [Indexed: 02/22/2024] [Open Access]
Abstract
The rapid increase of publicly available chemical structures and associated experimental data presents a valuable opportunity to build robust QSAR models for applications in different fields. However, the common concern is the quality of both the chemical structure information and associated experimental data. This is especially true when those data are collected from multiple sources as chemical substance mappings can contain many duplicate structures and molecular inconsistencies. Such issues can impact the resulting molecular descriptors and their mappings to experimental data and, subsequently, the quality of the derived models in terms of accuracy, repeatability, and reliability. Herein we describe the development of an automated workflow to standardize chemical structures according to a set of standard rules and generate two and/or three-dimensional "QSAR-ready" forms prior to the calculation of molecular descriptors. The workflow was designed in the KNIME workflow environment and consists of three high-level steps. First, a structure encoding is read, and then the resulting in-memory representation is cross-referenced with any existing identifiers for consistency. Finally, the structure is standardized using a series of operations including desalting, stripping of stereochemistry (for two-dimensional structures), standardization of tautomers and nitro groups, valence correction, neutralization when possible, and then removal of duplicates. This workflow was initially developed to support collaborative modeling QSAR projects to ensure consistency of the results from the different participants. It was then updated and generalized for other modeling applications. This included modification of the "QSAR-ready" workflow to generate "MS-ready structures" to support the generation of substance mappings and searches for software applications related to non-targeted analysis mass spectrometry. 
Both QSAR and MS-ready workflows are freely available in KNIME, via standalone versions on GitHub, and as docker container resources for the scientific community. Scientific contribution: This work pioneers an automated workflow in KNIME, systematically standardizing chemical structures to ensure their readiness for QSAR modeling and broader scientific applications. By addressing data quality concerns through desalting, stereochemistry stripping, and normalization, it optimizes molecular descriptors' accuracy and reliability. The freely available resources in KNIME, GitHub, and docker containers democratize access, benefiting collaborative research and advancing diverse modeling endeavors in chemistry and mass spectrometry.
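A few of the standardization steps named in this abstract (desalting, stereochemistry stripping, duplicate removal) can be illustrated at the string level. The following is only a toy sketch operating on raw SMILES text, not the KNIME workflow itself; the function names and example records are hypothetical, and the remaining steps (tautomer and nitro-group standardization, valence correction, neutralization) would need a real cheminformatics toolkit.

```python
def largest_fragment(smiles: str) -> str:
    """Crude desalting: keep the longest dot-separated SMILES fragment."""
    return max(smiles.split("."), key=len)


def strip_stereo(smiles: str) -> str:
    """Drop SMILES stereo markers (@, /, \\) to obtain a flat 2D form."""
    return "".join(ch for ch in smiles if ch not in "@/\\")


def qsar_ready(smiles_list):
    """Apply both steps, then remove duplicates while preserving order.

    Duplicates are detected by plain string identity, which is an
    assumption of this sketch; real tools compare canonical structures.
    """
    seen, out = set(), []
    for smi in smiles_list:
        std = strip_stereo(largest_fragment(smi))
        if std not in seen:
            seen.add(std)
            out.append(std)
    return out


# Hypothetical input: two alanine records (one as an HCl salt) and sodium acetate.
records = ["C[C@H](N)C(=O)O.Cl", "C[C@@H](N)C(=O)O", "CC(=O)[O-].[Na+]"]
result = qsar_ready(records)
print(result)
```

Because no neutralization is attempted here, the acetate fragment keeps its charge; the abstract lists neutralization as a separate step of the actual workflow.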
Affiliation(s)
- Kamel Mansouri
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- José T Moreira-Filho
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Charles N Lowe
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Nathaniel Charest
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Todd Martin
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Richard Judson
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
- Mike Conway
  - National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Nicole C Kleinstreuer
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27709, USA
- Antony J Williams
  - Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, NC, 27711, USA
6. Richard AM. Paths to cheminformatics: Q&A with Ann M. Richard. J Cheminform 2023; 15:93. [PMID: 37798636 PMCID: PMC10557182 DOI: 10.1186/s13321-023-00749-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/07/2023] [Open Access]
Affiliation(s)
- Ann M Richard
  - The U.S. Environmental Protection Agency, Durham, NC, USA
7. Rojas C, Ballabio D, Consonni V, Suárez-Estrella D, Todeschini R. Classification-based machine learning approaches to predict the taste of molecules: A review. Food Res Int 2023; 171:113036. [PMID: 37330849 DOI: 10.1016/j.foodres.2023.113036] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 02/14/2023] [Revised: 05/02/2023] [Accepted: 05/22/2023] [Indexed: 06/19/2023]
Abstract
The capacity to discriminate safe from dangerous compounds has played an important role in the evolution of species, including human beings. Highly evolved senses such as taste receptors allow humans to navigate and survive in the environment through information that arrives at the brain through electrical pulses. Specifically, taste receptors provide multiple bits of information about the substances that are introduced orally. These substances could be pleasant or not according to the taste responses that they trigger. Tastes have been classified into basic (sweet, bitter, umami, sour and salty) or non-basic (astringent, chilling, cooling, heating, pungent), while some compounds are considered as multitastes, taste modifiers or tasteless. Classification-based machine learning approaches are useful tools to develop predictive mathematical relationships that predict the taste class of new molecules based on their chemical structure. This work reviews the history of multicriteria quantitative structure-taste relationship modelling, starting from the first ligand-based (LB) classifier proposed in 1980 by Lemont B. Kier and concluding with the most recent studies published in 2022.
Affiliation(s)
- Cristian Rojas
  - Grupo de Investigación en Quimiometría y QSAR, Facultad de Ciencia y Tecnología, Universidad del Azuay, Av. 24 de Mayo 7-77 y Hernán Malo, Cuenca 010107, Ecuador
- Davide Ballabio
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1-20126, Milano, Italy
- Viviana Consonni
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1-20126, Milano, Italy
- Diego Suárez-Estrella
  - Grupo de Investigación en Quimiometría y QSAR, Facultad de Ciencia y Tecnología, Universidad del Azuay, Av. 24 de Mayo 7-77 y Hernán Malo, Cuenca 010107, Ecuador
- Roberto Todeschini
  - Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, P.za della Scienza 1-20126, Milano, Italy
8. Buckley TJ, Egeghy PP, Isaacs K, Richard AM, Ring C, Sayre RR, Sobus JR, Thomas RS, Ulrich EM, Wambaugh JF, Williams AJ. Cutting-edge computational chemical exposure research at the U.S. Environmental Protection Agency. Environment International 2023; 178:108097. [PMID: 37478680 PMCID: PMC10588682 DOI: 10.1016/j.envint.2023.108097] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 03/21/2023] [Revised: 06/05/2023] [Accepted: 07/12/2023] [Indexed: 07/23/2023]
Abstract
Exposure science is evolving from its traditional "after the fact" and "one chemical at a time" approach to forecasting chemical exposures rapidly enough to keep pace with the constantly expanding landscape of chemicals and exposures. In this article, we provide an overview of the approaches, accomplishments, and plans for advancing computational exposure science within the U.S. Environmental Protection Agency's Office of Research and Development (EPA/ORD). First, to characterize the universe of chemicals in commerce and the environment, a carefully curated, web-accessible chemical resource has been created. This DSSTox database unambiguously identifies >1.2 million unique substances reflecting potential environmental and human exposures and includes computationally accessible links to each compound's corresponding data resources. Next, EPA is developing, applying, and evaluating predictive exposure models. These models increasingly rely on data, computational tools like quantitative structure activity relationship (QSAR) models, and machine learning/artificial intelligence to provide timely and efficient prediction of chemical exposure (and associated uncertainty) for thousands of chemicals at a time. Integral to this modeling effort, EPA is developing data resources across the exposure continuum that includes application of high-resolution mass spectrometry (HRMS) non-targeted analysis (NTA) methods providing measurement capability at scale with the number of chemicals in commerce. These research efforts are integrated and well-tailored to support population exposure assessment to prioritize chemicals for exposure as a critical input to risk management. In addition, the exposure forecasts will allow a wide variety of stakeholders to explore sustainable initiatives like green chemistry to achieve economic, social, and environmental prosperity and protection of future generations.
Affiliation(s)
- Timothy J Buckley
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Peter P Egeghy
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Kristin Isaacs
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Ann M Richard
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Caroline Ring
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Risa R Sayre
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Jon R Sobus
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Russell S Thomas
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Elin M Ulrich
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- John F Wambaugh
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
- Antony J Williams
  - U.S. Environmental Protection Agency, Office of Research & Development, Center for Computational Toxicology & Exposure (CCTE), 109 TW Alexander Drive, Research Triangle Park, NC 27711, United States
9. Exploring Deep Learning for Metalloporphyrins: Databases, Molecular Representations, and Model Architectures. Catalysts 2022. [DOI: 10.3390/catal12111485] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022] [Open Access]
Abstract
Metalloporphyrins have been studied as biomimetic catalysts for more than 120 years and have accumulated a large amount of data, which provides a solid foundation for deep learning to discover chemical trends and structure–function relationships. In this study, key components of deep learning of metalloporphyrins, including databases, molecular representations, and model architectures, were systematically investigated. A protocol to construct canonical SMILES for metalloporphyrins was proposed, which was then used to represent the two-dimensional structures of over 10,000 metalloporphyrins in an existing computational database. Subsequently, several state-of-the-art chemical deep learning models, including graph neural network-based models and natural language processing-based models, were employed to predict the energy gaps of metalloporphyrins. Two models showed satisfactory predictive performance (R2 0.94) with canonical SMILES as the only source of structural information. In addition, an unsupervised visualization algorithm was used to interpret the molecular features learned by the deep learning models.
10. Yesiltepe Y, Govind N, Metz TO, Renslow RS. An initial investigation of accuracy required for the identification of small molecules in complex samples using quantum chemical calculated NMR chemical shifts. J Cheminform 2022; 14:64. [PMID: 36138446 PMCID: PMC9499888 DOI: 10.1186/s13321-022-00587-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/03/2020] [Accepted: 02/06/2022] [Indexed: 11/24/2022] [Open Access]
Abstract
The majority of primary and secondary metabolites in nature have yet to be identified, representing a major challenge for metabolomics studies that currently require reference libraries from analyses of authentic compounds. Using currently available analytical methods, complete chemical characterization of metabolomes is infeasible for both technical and economic reasons. For example, unambiguous identification of metabolites is limited by the availability of authentic chemical standards, which, for the majority of molecules, do not exist. Computationally predicted or calculated data are a viable solution to expand the currently limited metabolite reference libraries, if such methods are shown to be sufficiently accurate. For example, determining nuclear magnetic resonance (NMR) spectroscopy spectra in silico has shown promise in the identification and delineation of metabolite structures. Many researchers have been taking advantage of density functional theory (DFT), a computationally inexpensive yet reputable method for the prediction of carbon and proton NMR spectra of metabolites. However, such methods are expected to have some error in predicted 13C and 1H NMR spectra with respect to experimentally measured values. This leads us to the question-what accuracy is required in predicted 13C and 1H NMR chemical shifts for confident metabolite identification? Using the set of 11,716 small molecules found in the Human Metabolome Database (HMDB), we simulated both experimental and theoretical NMR chemical shift databases. We investigated the level of accuracy required for identification of metabolites in simulated pure and impure samples by matching predicted chemical shifts to experimental data. We found 90% or more of molecules in simulated pure samples can be successfully identified when errors of 1H and 13C chemical shifts in water are below 0.6 and 7.1 ppm, respectively, and below 0.5 and 4.6 ppm in chloroform solvation, respectively. 
In simulated complex mixtures, as the complexity of the mixture increased, greater accuracy of the calculated chemical shifts was required, as expected. However, if the number of molecules in the mixture is known, e.g., when NMR is combined with MS and sample complexity is low, the likelihood of confident molecular identification increased by 90%.
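The idea of identifying a molecule by matching predicted chemical shifts to measured ones can be sketched as a nearest-match search with an error tolerance. The abstract does not describe the authors' matching procedure, so the RMSD scoring, the function names, and the shift values below are all illustrative assumptions; only the 0.6 ppm figure echoes the 1H threshold quoted for water.

```python
import math


def rmsd(predicted, observed):
    """Root-mean-square deviation between two equal-length shift lists (ppm)."""
    total = sum((p - o) ** 2 for p, o in zip(predicted, observed))
    return math.sqrt(total / len(observed))


def identify(observed, library, tolerance_ppm):
    """Return the best-matching library entry, or None if none is within tolerance.

    Shifts are sorted before comparison, an assumption that sidesteps
    peak assignment in this toy example.
    """
    best_name, best_err = None, float("inf")
    for name, shifts in library.items():
        if len(shifts) != len(observed):
            continue  # only compare candidates with the same number of peaks
        err = rmsd(sorted(shifts), sorted(observed))
        if err < best_err:
            best_name, best_err = name, err
    return best_name if best_err <= tolerance_ppm else None


# Hypothetical predicted 1H shifts for two made-up library entries.
library = {"mol_A": [1.20, 3.60], "mol_B": [2.10, 7.30]}

match = identify([1.25, 3.55], library, tolerance_ppm=0.6)
no_match = identify([5.00, 9.00], library, tolerance_ppm=0.6)
print(match, no_match)
```

Tightening `tolerance_ppm` mimics the requirement for more accurate calculated shifts: fewer observed spectra are accepted as confident identifications.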
Affiliation(s)
- Yasemin Yesiltepe
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, WA, USA
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA
- Niranjan Govind
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA
- Thomas O Metz
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, WA, USA
- Ryan S Renslow
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, WA, USA
  - Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, WA, USA
11. Kumar Konidala K, Bommu U, Pabbaraju N. Integration of in silico methods to determine endocrine-disrupting tobacco pollutants binding potency with steroidogenic genes: comprehensive QSAR modeling and ensemble docking strategies. Environmental Science and Pollution Research International 2022; 29:65806-65825. [PMID: 35501431 DOI: 10.1007/s11356-022-20443-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2021] [Accepted: 04/21/2022] [Indexed: 06/14/2023]
Abstract
A myriad of tobacco-associated chemicals may affect the developmental/reproductive axis and cause endocrine disruption. Most of them disrupt the biotransformation of cholesterol in mitochondria by interfering with steroidogenic pathway genes, leading to adverse effects on steroid biosynthesis; however, studies are scarce. Quantitative structure-activity relationship (QSAR) modeling and comparative docking strategies were used to understand the structural features of the dataset compounds that influence developmental/reproductive toxicity and estrogen and androgen receptor-binding abilities, and to predict the binding levels of toxicants with the active sites of steroidogenic acute regulatory protein (StAR) and cholesterol side-chain cleavage enzyme (CYP11A1). The developed QSAR models showed good robustness and predictive ability, as determined from the applicability domain and from clustering and classification of the chemicals with self-organizing maps. Accordingly, a large number of polycyclic aromatic hydrocarbons (PAHs) and a limited number of other chemicals, including N-nitrosamines and nicotine, were identified as potential developmental/reproductive toxicants as well as estrogen and androgen receptor binders. In the docking analysis, hydrogen bonding, nonpolar, atomic π-stacking, and π-cation interactions were found between PAHs (bay and fjord structural pockets) and functional hotspot residues of StAR and CYP11A1, which strengthened the subtle structural changes at domains. These interactions govern barrier effects to cholesterol binding and/or lock cholesterol so as to complicate its ejection from the Ω1 loop of StAR, and further mitigate steroid biosynthesis from cholesterol by CYP11A1; they are therefore presumed to act through block-cluster mechanisms. These outcomes are promising for estimating the developmental/reproductive toxicity and endocrine-disruption activities of other environmental pollutants, and could be useful for further assessment to discover the binding mechanisms of PAHs with other steroidogenesis pathway genes.
Affiliation(s)
- Umadevi Bommu
  - Department of Zoology, Sri Venkateswara University, Tirupati, 517502 AP, India
- Neeraja Pabbaraju
  - Department of Zoology, Sri Venkateswara University, Tirupati, 517502 AP, India

12
Hasan MR, Alsaiari AA, Fakhurji BZ, Molla MHR, Asseri AH, Sumon MAA, Park MN, Ahammad F, Kim B. Application of Mathematical Modeling and Computational Tools in the Modern Drug Design and Development Process. Molecules 2022; 27:4169. [PMID: 35807415 PMCID: PMC9268380 DOI: 10.3390/molecules27134169] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 05/31/2022] [Revised: 06/22/2022] [Accepted: 06/27/2022] [Indexed: 01/18/2023] Open
Abstract
The conventional drug discovery approach is an expensive and time-consuming process, but its limitations have been overcome with the help of mathematical modeling and computational drug design approaches. Previously, finding a small molecular candidate as a drug against a disease was very costly and required a long time to screen a compound against a specific target. The development of novel targets and small molecular candidates against different diseases including emerging and reemerging diseases remains a major concern and necessitates the development of novel therapeutic targets as well as drug candidates as early as possible. In this regard, computational and mathematical modeling approaches for drug development are advantageous due to their fast predictive ability and cost-effectiveness. Computer-aided drug design (CADD) techniques utilize different computer programs as well as mathematical formulas to comprehend the interaction of a target and drugs. Traditional methods to determine small-molecule candidates as a drug have several limitations, but CADD utilizes novel methods that require little time and accurately predict a compound against a specific disease with minimal cost. Therefore, this review aims to provide a brief insight into the mathematical modeling and computational approaches for identifying a novel target and small molecular candidates for curing a specific disease. The comprehensive review mainly focuses on biological target prediction, structure-based and ligand-based drug design methods, molecular docking, virtual screening, pharmacophore modeling, quantitative structure-activity relationship (QSAR) models, molecular dynamics simulation, and MM-GBSA/MM-PBSA approaches along with valuable database resources and tools for identifying novel targets and therapeutics against a disease. This review may help researchers by opening the road to the development of effective drugs and preventative measures against a disease as early as possible.
Affiliation(s)
- Md Rifat Hasan
  - Department of Mathematics, Faculty of Science, King Abdul-Aziz University, Jeddah 21589, Saudi Arabia
  - Department of Applied Mathematics, Faculty of Science, Noakhali Science and Technology University, Noakhali 3814, Bangladesh
- Ahad Amer Alsaiari
  - College of Applied Medical Science, Clinical Laboratories Science Department, Taif University, Taif 21944, Saudi Arabia
- Burhan Zain Fakhurji
  - iGene Medical Training and Molecular Research Center, Jeddah 21589, Saudi Arabia
- Amer H. Asseri
  - Biochemistry Department, Faculty of Science, King Abdul-Aziz University, Jeddah 21589, Saudi Arabia
  - Centre for Artificial Intelligence in Precision Medicines, King Abdul-Aziz University, Jeddah 21589, Saudi Arabia
- Md Afsar Ahmed Sumon
  - Department of Marine Biology, Faculty of Marine Sciences, King Abdul-Aziz University, Jeddah 21589, Saudi Arabia
- Moon Nyeo Park
  - College of Korean Medicine, Kyung Hee University, Hoigidong, Dongdaemungu, Seoul 02453, Korea
- Foysal Ahammad
  - Department of Biological Sciences, Faculty of Science, King Abdul-Aziz University, Jeddah 21589, Saudi Arabia
- Bonglee Kim
  - College of Korean Medicine, Kyung Hee University, Hoigidong, Dongdaemungu, Seoul 02453, Korea

13
Wang F, Allen D, Tian S, Oler E, Gautam V, Greiner R, Metz TO, Wishart DS. CFM-ID 4.0 - a web server for accurate MS-based metabolite identification. Nucleic Acids Res 2022; 50:W165-W174. [PMID: 35610037 PMCID: PMC9252813 DOI: 10.1093/nar/gkac383] [Citation(s) in RCA: 41] [Impact Index Per Article: 13.7] [Received: 03/25/2022] [Revised: 04/14/2022] [Accepted: 05/17/2022] [Indexed: 01/31/2023] Open
Abstract
The CFM-ID 4.0 web server (https://cfmid.wishartlab.com) is an online tool for predicting, annotating and interpreting tandem mass (MS/MS) spectra of small molecules. It is specifically designed to assist researchers pursuing studies in metabolomics, exposomics and analytical chemistry. More specifically, CFM-ID 4.0 supports the: 1) prediction of electrospray ionization quadrupole time-of-flight tandem mass spectra (ESI-QTOF-MS/MS) for small molecules over multiple collision energies (10 eV, 20 eV, and 40 eV); 2) annotation of ESI-QTOF-MS/MS spectra given the structure of the compound; and 3) identification of a small molecule that generated a given ESI-QTOF-MS/MS spectrum at one or more collision energies. The CFM-ID 4.0 web server makes use of a substantially improved MS fragmentation algorithm, a much larger database of experimental and in silico predicted MS/MS spectra and improved scoring methods to offer more accurate MS/MS spectral prediction and MS/MS-based compound identification. Compared to earlier versions of CFM-ID, this new version has an MS/MS spectral prediction performance that is ∼22% better and a compound identification accuracy that is ∼35% better on a standard (CASMI 2016) testing dataset. CFM-ID 4.0 also features a neutral loss function that allows users to identify similar or substituent compounds where no match can be found using CFM-ID’s regular MS/MS-to-compound identification utility. Finally, the CFM-ID 4.0 web server now offers a much more refined user interface that is easier to use, supports molecular formula identification (from MS/MS data), provides more interactively viewable data (including proposed fragment ion structures) and displays MS mirror plots for comparing predicted with observed MS/MS spectra. These improvements should make CFM-ID 4.0 much more useful to the community and should make small molecule identification much easier, faster, and more accurate.
Affiliation(s)
- Fei Wang
  - Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
- Dana Allen
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Siyang Tian
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Eponine Oler
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Vasuk Gautam
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- Russell Greiner
  - Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
  - Alberta Machine Intelligence Institute, University of Alberta, Edmonton, AB, T6G 2E8, Canada
- Thomas O Metz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA
- David S Wishart
  - Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
  - Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
  - Department of Laboratory Medicine and Pathology, University of Alberta, Edmonton, AB, T6G 2B7, Canada
  - Faculty of Pharmacy and Pharmaceutical Sciences, University of Alberta, Edmonton, AB, T6G 2H7, Canada
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, WA 99352, USA

14
Ciallella HL, Russo DP, Sharma S, Li Y, Sloter E, Sweet L, Huang H, Zhu H. Predicting Prenatal Developmental Toxicity Based On the Combination of Chemical Structures and Biological Data. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2022; 56:5984-5998. [PMID: 35451820 PMCID: PMC9191745 DOI: 10.1021/acs.est.2c01040] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Indexed: 05/05/2023]
Abstract
For hazard identification, classification, and labeling purposes, animal testing guidelines are required by law to evaluate the developmental toxicity potential of new and existing chemical products. However, guideline developmental toxicity studies are costly, time-consuming, and require many laboratory animals. Computational modeling has emerged as a promising, animal-sparing, and cost-effective method for evaluating the developmental toxicity potential of chemicals, such as endocrine disruptors, without the use of animals. We aimed to develop a predictive and explainable computational model for developmental toxicants. To this end, a comprehensive dataset of 1244 chemicals with developmental toxicity classifications was curated from public repositories and literature sources. Data from 2140 toxicological high-throughput screening assays were extracted from PubChem and the ToxCast program for this dataset and combined with information about 834 chemical fragments to group assays based on their chemical-mechanistic relationships. This effort revealed two assay clusters containing 83 and 76 assays, respectively, with high positive predictive rates for developmental toxicants identified with animal testing guidelines (PPV = 72.4 and 77.3% during cross-validation). These two assay clusters can be used as developmental toxicity models and were applied to predict new chemicals for external validation. This study provides a new strategy for constructing alternative chemical developmental toxicity evaluations that can be replicated for other toxicity modeling studies.
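The positive predictive value (PPV) quoted in this abstract is the fraction of chemicals flagged as positive that are true developmental toxicants. A minimal Python sketch of that definition follows; the function name and the counts are hypothetical, chosen only for illustration, and do not come from the study.

```python
def positive_predictive_value(true_pos: int, false_pos: int) -> float:
    """PPV = TP / (TP + FP): share of predicted positives that are correct."""
    predicted_pos = true_pos + false_pos
    if predicted_pos == 0:
        raise ValueError("PPV is undefined when nothing is predicted positive")
    return true_pos / predicted_pos


# Hypothetical counts for illustration only (not data from the paper).
example_ppv = positive_predictive_value(true_pos=30, false_pos=10)
print(f"PPV = {example_ppv:.1%}")
```

A high PPV means that a positive call from an assay cluster is rarely a false alarm; it says nothing about how many true toxicants the cluster misses.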
Affiliation(s)
- Heather L. Ciallella
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
- Daniel P. Russo
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, 08102, USA
- Swati Sharma
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
- Yafan Li
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Eddie Sloter
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Len Sweet
  - The Lubrizol Corporation, Wickliffe, OH, 44092, USA
- Heng Huang
  - Department of Electrical and Computer Engineering, University of Pittsburgh, Pittsburgh, PA, 15261, USA
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University, Camden, NJ, 08103, USA
  - Department of Chemistry, Rutgers University, Camden, NJ, 08102, USA

15
Chang X, Tan YM, Allen DG, Bell S, Brown PC, Browning L, Ceger P, Gearhart J, Hakkinen PJ, Kabadi SV, Kleinstreuer NC, Lumen A, Matheson J, Paini A, Pangburn HA, Petersen EJ, Reinke EN, Ribeiro AJS, Sipes N, Sweeney LM, Wambaugh JF, Wange R, Wetmore BA, Mumtaz M. IVIVE: Facilitating the Use of In Vitro Toxicity Data in Risk Assessment and Decision Making. TOXICS 2022; 10:232. [PMID: 35622645 PMCID: PMC9143724 DOI: 10.3390/toxics10050232] [Citation(s) in RCA: 59] [Impact Index Per Article: 19.7] [Received: 03/25/2022] [Accepted: 04/24/2022] [Indexed: 02/04/2023]
Abstract
During the past few decades, the science of toxicology has been undergoing a transformation from observational to predictive science. New approach methodologies (NAMs), including in vitro assays, in silico models, read-across, and in vitro to in vivo extrapolation (IVIVE), are being developed to reduce, refine, or replace whole animal testing, encouraging the judicious use of time and resources. Some of these methods have advanced past the exploratory research stage and are beginning to gain acceptance for the risk assessment of chemicals. A review of the recent literature reveals a burst of IVIVE publications over the past decade. In this review, we propose operational definitions for IVIVE, present literature examples for several common toxicity endpoints, and highlight their implications in decision-making processes across various federal agencies, as well as international organizations, including those in the European Union (EU). The current challenges and future needs are also summarized for IVIVE. In addition to refining and reducing the number of animals in traditional toxicity testing protocols and being used for prioritizing chemical testing, the goal to use IVIVE to facilitate the replacement of animal models can be achieved through their continued evolution and development, including a strategic plan to qualify IVIVE methods for regulatory acceptance.
Affiliation(s)
- Xiaoqing Chang
  - Inotiv-RTP, 601 Keystone Park Drive, Suite 200, Morrisville, NC 27560, USA
- Yu-Mei Tan
  - U.S. Environmental Protection Agency, Office of Pesticide Programs, 109 T.W. Alexander Drive, Durham, NC 27709, USA
- David G. Allen
  - Inotiv-RTP, 601 Keystone Park Drive, Suite 200, Morrisville, NC 27560, USA
- Shannon Bell
  - Inotiv-RTP, 601 Keystone Park Drive, Suite 200, Morrisville, NC 27560, USA
- Paul C. Brown
  - U.S. Food and Drug Administration, Center for Drug Evaluation and Research, 10903 New Hampshire Avenue, Silver Spring, MD 20903, USA
- Lauren Browning
  - Inotiv-RTP, 601 Keystone Park Drive, Suite 200, Morrisville, NC 27560, USA
- Patricia Ceger
  - Inotiv-RTP, 601 Keystone Park Drive, Suite 200, Morrisville, NC 27560, USA
- Jeffery Gearhart
  - The Henry M. Jackson Foundation, Air Force Research Laboratory, 711 Human Performance Wing, Wright-Patterson Air Force Base, OH 45433, USA
- Pertti J. Hakkinen
  - National Library of Medicine, National Center for Biotechnology Information, 8600 Rockville Pike, Bethesda, MD 20894, USA
- Shruti V. Kabadi
  - U.S. Food and Drug Administration, Center for Food Safety and Applied Nutrition, Office of Food Additive Safety, 5001 Campus Drive, HFS-275, College Park, MD 20740, USA
- Nicole C. Kleinstreuer
  - National Institute of Environmental Health Sciences, National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, P.O. Box 12233, Research Triangle Park, NC 27709, USA
- Annie Lumen
  - U.S. Food and Drug Administration, National Center for Toxicological Research, 3900 NCTR Road, Jefferson, AR 72079, USA
- Joanna Matheson
  - U.S. Consumer Product Safety Commission, Division of Toxicology and Risk Assessment, 5 Research Place, Rockville, MD 20850, USA
- Alicia Paini
  - European Commission, Joint Research Centre (JRC), 21027 Ispra, Italy
- Heather A. Pangburn
  - Air Force Research Laboratory, 711 Human Performance Wing, 2729 R Street, Area B, Building 837, Wright-Patterson Air Force Base, OH 45433, USA
- Elijah J. Petersen
  - U.S. Department of Commerce, National Institute of Standards and Technology, 100 Bureau Drive, Gaithersburg, MD 20899, USA
- Emily N. Reinke
  - U.S. Army Public Health Center, 8252 Blackhawk Rd., Aberdeen Proving Ground, MD 21010, USA
- Alexandre J. S. Ribeiro
  - U.S. Food and Drug Administration, Center for Drug Evaluation and Research, 10903 New Hampshire Avenue, Silver Spring, MD 20903, USA
- Nisha Sipes
  - U.S. Environmental Protection Agency, Center for Computational Toxicology and Exposure, 109 TW Alexander Dr., Research Triangle Park, NC 27711, USA
- Lisa M. Sweeney
  - UES, Inc., 4401 Dayton-Xenia Road, Beavercreek, OH 45432, Assigned to Air Force Research Laboratory, 711 Human Performance Wing, Wright-Patterson Air Force Base, OH 45433, USA
- John F. Wambaugh
  - U.S. Environmental Protection Agency, Center for Computational Toxicology and Exposure, 109 TW Alexander Dr., Research Triangle Park, NC 27711, USA
- Ronald Wange
  - U.S. Food and Drug Administration, Center for Drug Evaluation and Research, 10903 New Hampshire Avenue, Silver Spring, MD 20903, USA
- Barbara A. Wetmore
  - U.S. Environmental Protection Agency, Center for Computational Toxicology and Exposure, 109 TW Alexander Dr., Research Triangle Park, NC 27711, USA
- Moiz Mumtaz
  - Agency for Toxic Substances and Disease Registry, Office of the Associate Director for Science, 1600 Clifton Road, S102-2, Atlanta, GA 30333, USA

16
The 2021 update of the EPA's adverse outcome pathway database. Sci Data 2021; 8:169. [PMID: 34253739 PMCID: PMC8275694 DOI: 10.1038/s41597-021-00962-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 10/08/2020] [Accepted: 05/13/2021] [Indexed: 11/22/2022] Open
Abstract
The EPA developed the Adverse Outcome Pathway Database (AOP-DB) to better characterize adverse outcomes of toxicological interest that are relevant to human health and the environment. Here we present the most recent version of the EPA Adverse Outcome Pathway Database (AOP-DB), version 2. AOP-DB v.2 introduces several substantial updates, which include automated data pulls from the AOP-Wiki 2.0, the integration of tissue-gene network data, and human AOP-gene data by population, semantic mapping and SPARQL endpoint creation, in addition to the presentation of the first publicly available AOP-DB web user interface. Potential users of the data may investigate specific molecular targets of an AOP, the relation of those gene/protein targets to other AOPs, cross-species, pathway, or disease-AOP relationships, or frequencies of AOP-related functional variants in particular populations, for example. Version updates described herein help inform new testable hypotheses about the etiology and mechanisms underlying adverse outcomes of environmental and toxicological concern.
Measurement(s): adverse outcome pathway • gene interactions • Orthologous Gene • chemical gene interactions • molecular pathway • disease gene associations • SNP
Technology Type(s): digital curation
Machine-accessible metadata file describing the reported data: 10.6084/m9.figshare.14737557

17
Borges R, Colby SM, Das S, Edison AS, Fiehn O, Kind T, Lee J, Merrill AT, Merz KM, Metz TO, Nunez JR, Tantillo DJ, Wang LP, Wang S, Renslow RS. Quantum Chemistry Calculations for Metabolomics. Chem Rev 2021; 121:5633-5670. [PMID: 33979149 PMCID: PMC8161423 DOI: 10.1021/acs.chemrev.0c00901] [Citation(s) in RCA: 47] [Impact Index Per Article: 11.8] [Received: 08/24/2020] [Indexed: 02/07/2023]
Abstract
A primary goal of metabolomics studies is to fully characterize the small-molecule composition of complex biological and environmental samples. However, despite advances in analytical technologies over the past two decades, the majority of small molecules in complex samples are not readily identifiable due to the immense structural and chemical diversity present within the metabolome. Current gold-standard identification methods rely on reference libraries built using authentic chemical materials ("standards"), which are not available for most molecules. Computational quantum chemistry methods, which can be used to calculate chemical properties that are then measured by analytical platforms, offer an alternative route for building reference libraries, i.e., in silico libraries for "standards-free" identification. In this review, we cover the major roadblocks currently facing metabolomics and discuss applications where quantum chemistry calculations offer a solution. Several successful examples for nuclear magnetic resonance spectroscopy, ion mobility spectrometry, infrared spectroscopy, and mass spectrometry methods are reviewed. Finally, we consider current best practices, sources of error, and provide an outlook for quantum chemistry calculations in metabolomics studies. We expect this review will inspire researchers in the field of small-molecule identification to accelerate adoption of in silico methods for generation of reference libraries and to add quantum chemistry calculations as another tool at their disposal to characterize complex samples.
Affiliation(s)
- Ricardo M. Borges
  - Walter Mors Institute of Research on Natural Products, Federal University of Rio de Janeiro, Rio de Janeiro 21941-901, Brazil
- Sean M. Colby
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Susanta Das
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Arthur S. Edison
  - Departments of Genetics and Biochemistry and Molecular Biology, Complex Carbohydrate Research Center and Institute of Bioinformatics, University of Georgia, Athens, Georgia 30602, United States
- Oliver Fiehn
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
- Tobias Kind
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
- Jesi Lee
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Amy T. Merrill
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Thomas O. Metz
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Jamie R. Nunez
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Dean J. Tantillo
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Lee-Ping Wang
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Shunyang Wang
  - West Coast Metabolomics Center for Compound Identification, UC Davis Genome Center, University of California, Davis, California 95616, United States
  - Department of Chemistry, University of California, Davis, California 95616, United States
- Ryan S. Renslow
  - Biological Science Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States

18
Pradeep P, Judson R, DeMarini DM, Keshava N, Martin TM, Dean J, Gibbons CF, Simha A, Warren SH, Gwinn MR, Patlewicz G. Evaluation of Existing QSAR Models and Structural Alerts and Development of New Ensemble Models for Genotoxicity Using a Newly Compiled Experimental Dataset. Comput Toxicol 2021; 18. [PMID: 34504984 DOI: 10.1016/j.comtox.2021.100167] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 11/19/2022]
Abstract
Regulatory agencies world-wide face the challenge of performing risk-based prioritization of thousands of substances in commerce. In this study, a major effort was undertaken to compile a large genotoxicity dataset (54,805 records for 9299 substances) from several public sources (e.g., TOXNET, COSMOS, eChemPortal). The names and outcomes of the different assays were harmonized, and assays were annotated by type: gene mutation in Salmonella bacteria (Ames assay) and chromosome mutation (clastogenicity) in vitro or in vivo (chromosome aberration, micronucleus, and mouse lymphoma Tk +/- assays). This dataset was then evaluated to assess genotoxic potential using a categorization scheme, whereby a substance was considered genotoxic if it was positive in at least one Ames or clastogen study. The categorization dataset comprised 8442 chemicals, of which 2728 chemicals were genotoxic, 5585 were not and 129 were inconclusive. QSAR models (TEST and VEGA) and the OECD Toolbox structural alerts/profilers (e.g., OASIS DNA alerts for Ames and chromosomal aberrations) were used to make in silico predictions of genotoxicity potential. The performance of the individual QSAR tools and structural alerts resulted in balanced accuracies of 57-73%. A Naïve Bayes consensus model was developed using combinations of QSAR models and structural alert predictions. The 'best' consensus model selected had a balanced accuracy of 81.2%, a sensitivity of 87.24% and a specificity of 75.20%. This in silico scheme offers promise as a first step in ranking thousands of substances as part of a prioritization approach for genotoxicity.
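The categorization rule and the reported consensus-model metrics can be illustrated with a short Python sketch. The function names and the list-of-calls input are hypothetical, and the treatment of substances without any positive call is an assumption, since the abstract does not say how "not genotoxic" and "inconclusive" were separated. The arithmetic uses only the figures quoted in the abstract.

```python
from typing import Iterable


def categorize(assay_calls: Iterable[str]) -> str:
    """Apply the 'positive in at least one Ames or clastogen study' rule.

    Assumption (not stated in the abstract): with no positive call, a
    substance is 'non-genotoxic' if it has at least one negative call,
    and 'inconclusive' otherwise.
    """
    calls = list(assay_calls)
    if any(call == "positive" for call in calls):
        return "genotoxic"
    if any(call == "negative" for call in calls):
        return "non-genotoxic"
    return "inconclusive"


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Balanced accuracy is the mean of sensitivity and specificity."""
    return (sensitivity + specificity) / 2.0


# Category counts quoted in the abstract: genotoxic, not genotoxic, inconclusive.
total = 2728 + 5585 + 129  # sums to the 8442 chemicals reported

# Sensitivity and specificity (in %) quoted for the selected consensus model.
ba = balanced_accuracy(87.24, 75.20)  # about 81.22, consistent with the reported 81.2%
print(total, round(ba, 2))
```

The check shows that the three category counts add up to the stated dataset size and that the reported balanced accuracy is the mean of the reported sensitivity and specificity.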
Affiliation(s)
- Prachi Pradeep
  - Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee, USA
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Richard Judson
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- David M DeMarini
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Nagalakshmi Keshava
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Cincinnati, Ohio, USA
- Todd M Martin
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Cincinnati, Ohio, USA
- Jeffry Dean
  - Center for Public Health and Environmental Assessment, U.S. Environmental Protection Agency, Cincinnati, Ohio, USA
- Catherine F Gibbons
  - Center for Public Health and Environmental Assessment, U.S. Environmental Protection Agency, Washington, District of Columbia, USA
- Anita Simha
  - ORAU, contractor to U.S. Environmental Protection Agency through the National Student Services Contract, Research Triangle Park, North Carolina, USA
- Sarah H Warren
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Maureen R Gwinn
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Grace Patlewicz
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA

19
Mansouri K, Karmaus AL, Fitzpatrick J, Patlewicz G, Pradeep P, Alberga D, Alepee N, Allen TE, Allen D, Alves VM, Andrade CH, Auernhammer TR, Ballabio D, Bell S, Benfenati E, Bhattacharya S, Bastos JV, Boyd S, Brown J, Capuzzi SJ, Chushak Y, Ciallella H, Clark AM, Consonni V, Daga PR, Ekins S, Farag S, Fedorov M, Fourches D, Gadaleta D, Gao F, Gearhart JM, Goh G, Goodman JM, Grisoni F, Grulke CM, Hartung T, Hirn M, Karpov P, Korotcov A, Lavado GJ, Lawless M, Li X, Luechtefeld T, Lunghini F, Mangiatordi GF, Marcou G, Marsh D, Martin T, Mauri A, Muratov EN, Myatt GJ, Nguyen DT, Nicolotti O, Note R, Pande P, Parks AK, Peryea T, Polash AH, Rallo R, Roncaglioni A, Rowlands C, Ruiz P, Russo DP, Sayed A, Sayre R, Sheils T, Siegel C, Silva AC, Simeonov A, Sosnin S, Southall N, Strickland J, Tang Y, Teppen B, Tetko IV, Thomas D, Tkachenko V, Todeschini R, Toma C, Tripodi I, Trisciuzzi D, Tropsha A, Varnek A, Vukovic K, Wang Z, Wang L, Waters KM, Wedlake AJ, Wijeyesakere SJ, Wilson D, Xiao Z, Yang H, Zahoranszky-Kohalmi G, Zakharov AV, Zhang FF, Zhang Z, Zhao T, Zhu H, Zorn KM, Casey W, Kleinstreuer NC. CATMoS: Collaborative Acute Toxicity Modeling Suite. ENVIRONMENTAL HEALTH PERSPECTIVES 2021; 129:47013. [PMID: 33929906 PMCID: PMC8086800 DOI: 10.1289/ehp8495] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Received: 10/19/2020] [Revised: 03/10/2021] [Accepted: 03/19/2021] [Indexed: 05/02/2023]
Abstract
BACKGROUND: Humans are exposed to tens of thousands of chemical substances that need to be assessed for their potential toxicity. Acute systemic toxicity testing serves as the basis for regulatory hazard classification, labeling, and risk management. However, it is cost- and time-prohibitive to evaluate all new and existing chemicals using traditional rodent acute toxicity tests. In silico models built using existing data facilitate rapid acute toxicity predictions without using animals. OBJECTIVES: The U.S. Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) Acute Toxicity Workgroup organized an international collaboration to develop in silico models for predicting acute oral toxicity based on five different end points: Lethal Dose 50 (LD50) value, U.S. Environmental Protection Agency hazard (four) categories, Globally Harmonized System for Classification and Labeling hazard (five) categories, very toxic chemicals (LD50 ≤ 50 mg/kg), and nontoxic chemicals (LD50 > 2,000 mg/kg). METHODS: An acute oral toxicity data inventory for 11,992 chemicals was compiled, split into training and evaluation sets, and made available to 35 participating international research groups that submitted a total of 139 predictive models. Predictions that fell within the applicability domains of the submitted models were evaluated using external validation sets. These were then combined into consensus models to leverage strengths of individual approaches. RESULTS: The resulting consensus predictions, which leverage the collective strengths of each individual model, form the Collaborative Acute Toxicity Modeling Suite (CATMoS). CATMoS demonstrated high performance in terms of accuracy and robustness when compared with in vivo results. DISCUSSION: CATMoS is being evaluated by regulatory agencies for its utility and applicability as a potential replacement for in vivo rat acute oral toxicity studies.
CATMoS predictions for more than 800,000 chemicals have been made available via the National Toxicology Program's Integrated Chemical Environment tools and data sets (ice.ntp.niehs.nih.gov). The models are also implemented in a free, standalone, open-source tool, OPERA, which allows predictions of new and untested chemicals to be made. https://doi.org/10.1289/EHP8495.
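To make the two binary end points and the idea of a consensus over in-domain predictions concrete, here is a minimal sketch. The two thresholds (50 and 2,000 mg/kg) come from the abstract above; the function names and the unweighted averaging rule are illustrative assumptions and are not the CATMoS consensus method, which the abstract does not specify.

```python
from statistics import mean

# Thresholds stated in the abstract (oral LD50, mg/kg).
VERY_TOXIC_MAX_MG_KG = 50.0    # very toxic: LD50 <= 50 mg/kg
NONTOXIC_MIN_MG_KG = 2000.0    # nontoxic:   LD50 > 2,000 mg/kg


def binary_endpoints(ld50_mg_kg):
    """Map an LD50 value to the two binary end points named in the abstract."""
    return {
        "very_toxic": ld50_mg_kg <= VERY_TOXIC_MAX_MG_KG,
        "nontoxic": ld50_mg_kg > NONTOXIC_MIN_MG_KG,
    }


def consensus_log_ld50(predictions):
    """Hypothetical consensus: average the log10(LD50) predictions of the
    models whose applicability domain covers the chemical.

    `predictions` is a list of (log10_ld50, in_domain) pairs. Returns None
    when no model is in domain. The unweighted mean is an assumption made
    for illustration only.
    """
    in_domain = [value for value, ok in predictions if ok]
    if not in_domain:
        return None
    return mean(in_domain)
```

For example, `consensus_log_ld50([(2.0, True), (3.0, True), (9.0, False)])` ignores the out-of-domain prediction and averages the remaining two.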
Affiliation(s)
- Kamel Mansouri
  - Integrated Laboratory Systems, LLC, Morrisville, North Carolina, USA
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Research Triangle Park, North Carolina, USA
- Agnes L. Karmaus
  - Integrated Laboratory Systems, LLC, Morrisville, North Carolina, USA
- Grace Patlewicz
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Prachi Pradeep
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
  - Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Domenico Alberga
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Timothy E.H. Allen
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Dave Allen
  - Integrated Laboratory Systems, LLC, Morrisville, North Carolina, USA
- Vinicius M. Alves
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
  - Laboratory for Molecular Modeling and Design, Faculty of Pharmacy, Federal University of Goiás, Goiania, Brazil
- Carolina H. Andrade
  - Laboratory for Molecular Modeling and Design, Faculty of Pharmacy, Federal University of Goiás, Goiania, Brazil
- Davide Ballabio
  - Milano Chemometrics & QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Shannon Bell
  - Integrated Laboratory Systems, LLC, Morrisville, North Carolina, USA
- Emilio Benfenati
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Sudin Bhattacharya
  - Institute for Quantitative Health Science and Engineering, Department of Biomedical Engineering, Michigan State University, East Lansing, Michigan, USA
- Joyce V. Bastos
  - Laboratory for Molecular Modeling and Design, Faculty of Pharmacy, Federal University of Goiás, Goiania, Brazil
- Stephen Boyd
  - Department of Plant, Soil, and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- J.B. Brown
  - Kyoto University Graduate School of Medicine, Kyoto, Japan
- Stephen J. Capuzzi
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
- Yaroslav Chushak
  - Aeromedical Research Department, Force Health Protection, USAFSAM, Dayton, Ohio, USA
  - Henry M Jackson Foundation for the Advancement of Military Medicine, Dayton, Ohio, USA
- Heather Ciallella
  - Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Alex M. Clark
  - Collaborations Pharmaceuticals, Inc., Raleigh, North Carolina, USA
- Viviana Consonni
  - Milano Chemometrics & QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Sean Ekins
  - Collaborations Pharmaceuticals, Inc., Raleigh, North Carolina, USA
- Sherif Farag
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
- Maxim Fedorov
  - Skoltech, Skolkovo Institute of Science and Technology, Moscow, Russia
- Denis Fourches
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina, USA
  - Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
- Domenico Gadaleta
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Feng Gao
  - Department of Plant, Soil, and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- Jeffery M. Gearhart
  - Aeromedical Research Department, Force Health Protection, USAFSAM, Dayton, Ohio, USA
  - Henry M Jackson Foundation for the Advancement of Military Medicine, Dayton, Ohio, USA
- Garett Goh
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Jonathan M. Goodman
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Francesca Grisoni
  - Milano Chemometrics & QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Christopher M. Grulke
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Matthew Hirn
  - Department of Computational Mathematics, Science & Engineering, Department of Mathematics, Michigan State University, East Lansing, Michigan, USA
- Pavel Karpov
  - Institute of Structural Biology, Helmholtz Zentrum München (GmbH), Neuherberg, Germany
- Giovanna J. Lavado
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Xinhao Li
  - Department of Chemistry, North Carolina State University, Raleigh, North Carolina, USA
- Filippo Lunghini
  - Laboratoire de Chemoinformatique, URM7140, Université de Strasbourg, Strasbourg, France
- Giuseppe F. Mangiatordi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Gilles Marcou
  - Laboratoire de Chemoinformatique, URM7140, Université de Strasbourg, Strasbourg, France
- Dan Marsh
  - Underwriters Laboratories, Northbrook, Illinois, USA
- Todd Martin
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Cincinnati, Ohio, USA
- Eugene N. Muratov
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
  - Laboratory for Molecular Modeling and Design, Faculty of Pharmacy, Federal University of Goiás, Goiania, Brazil
- Dac-Trung Nguyen
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Orazio Nicolotti
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Reine Note
  - L’Oréal Research & Innovation, Aulnay-sous-Bois, France
- Paritosh Pande
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Tyler Peryea
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Robert Rallo
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Alessandra Roncaglioni
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Patricia Ruiz
  - Office of Innovation and Analytics, Agency for Toxic Substances and Disease Registry, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
- Daniel P. Russo
  - Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Ahmed Sayed
  - Rosettastein Consulting UG, Freising, Germany
- Risa Sayre
  - Center for Computational Toxicology and Exposure, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
  - Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina, USA
- Timothy Sheils
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Charles Siegel
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Arthur C. Silva
  - Laboratory for Molecular Modeling and Design, Faculty of Pharmacy, Federal University of Goiás, Goiania, Brazil
- Anton Simeonov
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Sergey Sosnin
  - Skoltech, Skolkovo Institute of Science and Technology, Moscow, Russia
- Noel Southall
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Judy Strickland
  - Integrated Laboratory Systems, LLC, Morrisville, North Carolina, USA
- Yun Tang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Brian Teppen
  - Department of Plant, Soil, and Microbial Sciences, Michigan State University, East Lansing, Michigan, USA
- Igor V. Tetko
  - Institute of Structural Biology, Helmholtz Zentrum München (GmbH), Neuherberg, Germany
  - BIGCHEM GmbH, Unterschleissheim, Germany
- Dennis Thomas
  - Pacific Northwest National Laboratory, Richland, Washington, USA
- Roberto Todeschini
  - Milano Chemometrics & QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
- Cosimo Toma
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Ignacio Tripodi
  - Computer Science/Interdisciplinary Quantitative Biology, University of Colorado, Boulder, Colorado, USA
- Daniela Trisciuzzi
  - Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari “Aldo Moro”, Bari, Italy
- Alexander Tropsha
  - Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina, USA
- Alexandre Varnek
  - Laboratoire de Chemoinformatique, URM7140, Université de Strasbourg, Strasbourg, France
- Kristijan Vukovic
  - Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Milan, Italy
- Zhongyu Wang
  - School of Environmental Sciences and Technology, Dalian University of Technology; Dalian, Liaoning, China
- Liguo Wang
  - School of Environmental Sciences and Technology, Dalian University of Technology; Dalian, Liaoning, China
- Andrew J. Wedlake
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, UK
- Dan Wilson
  - The Dow Chemical Company, Midland, Michigan, USA
- Zijun Xiao
  - School of Environmental Sciences and Technology, Dalian University of Technology; Dalian, Liaoning, China
- Hongbin Yang
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science and Technology, Shanghai, China
- Gergely Zahoranszky-Kohalmi
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Alexey V. Zakharov
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Zhen Zhang
  - Dow Agrosciences, Indianapolis, Indiana, USA
- Tongan Zhao
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
- Hao Zhu
  - Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Warren Casey
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Research Triangle Park, North Carolina, USA
- Nicole C. Kleinstreuer
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, Research Triangle Park, North Carolina, USA
|
20
|
Islamaj R, Leaman R, Kim S, Kwon D, Wei CH, Comeau DC, Peng Y, Cissel D, Coss C, Fisher C, Guzman R, Kochar PG, Koppel S, Trinh D, Sekiya K, Ward J, Whitman D, Schmidt S, Lu Z. NLM-Chem, a new resource for chemical entity recognition in PubMed full text literature. Sci Data 2021; 8:91. [PMID: 33767203 PMCID: PMC7994842 DOI: 10.1038/s41597-021-00875-1] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Received: 11/26/2020] [Accepted: 01/19/2021] [Indexed: 11/13/2022] Open
Abstract
Automatically identifying chemical and drug names in scientific publications advances information access for this important class of entities in a variety of biomedical disciplines by enabling improved retrieval and linkage to related concepts. While current methods for tagging chemical entities were developed for the article title and abstract, their performance in the full article text is substantially lower. However, the full text frequently contains more detailed chemical information, such as the properties of chemical compounds, their biological effects and interactions with diseases, genes and other chemicals. We therefore present the NLM-Chem corpus, a full-text resource to support the development and evaluation of automated chemical entity taggers. The NLM-Chem corpus consists of 150 full-text articles, doubly annotated by ten expert NLM indexers, with ~5000 unique chemical name annotations, mapped to ~2000 MeSH identifiers. We also describe a substantially improved chemical entity tagger, with automated annotations for all of PubMed and PMC freely accessible through the PubTator web-based interface and API. The NLM-Chem corpus is freely available.
Affiliation(s)
- Rezarta Islamaj
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Robert Leaman
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Sun Kim
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Dongseop Kwon
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Chih-Hsuan Wei
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Donald C Comeau
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Yifan Peng
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- David Cissel
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Cathleen Coss
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Carol Fisher
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Rob Guzman
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Preeti Gokal Kochar
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Stella Koppel
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Dorothy Trinh
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Keiko Sekiya
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Janice Ward
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Deborah Whitman
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Susan Schmidt
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Zhiyong Lu
  - National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
|
21
|
Krettler CA, Thallinger GG. A map of mass spectrometry-based in silico fragmentation prediction and compound identification in metabolomics. Brief Bioinform 2021; 22:6184408. [PMID: 33758925 DOI: 10.1093/bib/bbab073] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 11/26/2020] [Revised: 01/29/2021] [Accepted: 02/12/2021] [Indexed: 12/27/2022] Open
Abstract
Metabolomics, the comprehensive study of the metabolome, and lipidomics, the large-scale study of pathways and networks of cellular lipids, are major driving forces in enabling personalized medicine. Complicated and error-prone data analysis remains a bottleneck, however, especially for identifying novel metabolites. Comparing experimental mass spectra to curated databases containing reference spectra has been the gold standard for identification of compounds, but constructing such databases is a costly and time-demanding task. Many software applications try to circumvent this process by utilizing cutting-edge advances in computational methods, including quantum chemistry and machine learning, to simulate mass spectra by performing theoretical, so-called in silico fragmentations of compounds. Other solutions concentrate directly on experimental spectra and try to identify structural properties by investigating reoccurring patterns and the relationships between them. The considerable progress made in the field allows recent approaches to provide valuable clues to expedite annotation of experimental mass spectra. This review sheds light on individual strengths and weaknesses of these tools and attempts to evaluate them, especially in view of lipidomics, when considering complex mixtures found in biological samples as well as mass spectrometer inter-instrument variability.
Affiliation(s)
- Christoph A Krettler
  - Institute of Biomedical Informatics, Graz University of Technology, Stremayrgasse 16/I, 8010, Graz, Austria
  - Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010, Graz, Austria
- Gerhard G Thallinger
  - Institute of Biomedical Informatics, Graz University of Technology, Stremayrgasse 16/I, 8010, Graz, Austria
  - Omics Center Graz, BioTechMed-Graz, Stiftingtalstrasse 24, 8010, Graz, Austria
|
22
|
Schultz KJ, Colby SM, Yesiltepe Y, Nuñez JR, McGrady MY, Renslow RS. Application and assessment of deep learning for the generation of potential NMDA receptor antagonists. Phys Chem Chem Phys 2021; 23:1197-1214. [PMID: 33355332 DOI: 10.1039/d0cp03620j] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Indexed: 12/19/2022]
Abstract
Uncompetitive antagonists of the N-methyl-D-aspartate receptor (NMDAR) have demonstrated therapeutic benefit in the treatment of neurological diseases such as Parkinson's and Alzheimer's, but some also cause dissociative effects that have led to the synthesis of illicit drugs. The ability to generate NMDAR antagonists in silico is therefore desirable both for developing new medications and for preempting and identifying new designer drugs. Recently, generative deep learning models have been applied to de novo drug design as a means to expand the amount of chemical space that can be explored for potential drug-like compounds. In this study, we assess the application of a generative model to the NMDAR to achieve two primary objectives: (i) the creation and release of a comprehensive library of experimentally validated NMDAR phencyclidine (PCP) site antagonists to assist the drug discovery community and (ii) an analysis of both the advantages conferred by applying such generative artificial intelligence models to drug design and the current limitations of the approach. We apply, and provide source code for, a variety of ligand- and structure-based assessment techniques used in standard drug discovery analyses to the deep learning-generated compounds. We present twelve candidate antagonists that are not available in existing chemical databases to provide an example of what this type of workflow can achieve, though synthesis and experimental validation of these compounds are still required.
Affiliation(s)
- Sean M Colby
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Jamie R Nuñez
  - Pacific Northwest National Laboratory, Richland, WA, USA
- Ryan S Renslow
  - Pacific Northwest National Laboratory, Richland, WA, USA
|
23
|
Jain S, Siramshetty VB, Alves VM, Muratov EN, Kleinstreuer N, Tropsha A, Nicklaus MC, Simeonov A, Zakharov AV. Large-Scale Modeling of Multispecies Acute Toxicity End Points Using Consensus of Multitask Deep Learning Methods. J Chem Inf Model 2021; 61:653-663. [PMID: 33533614 DOI: 10.1021/acs.jcim.0c01164] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Indexed: 01/01/2023]
Abstract
Computational methods to predict molecular properties regarding safety and toxicology represent alternative approaches to expedite drug development, screen environmental chemicals, and thus significantly reduce associated time and costs. There is a strong need and interest in the development of computational methods that yield reliable predictions of toxicity, and many approaches, including the recently introduced deep neural networks, have been leveraged towards this goal. Herein, we report on the collection, curation, and integration of data from the public data sets that were the source of the ChemIDplus database for systemic acute toxicity. These efforts generated the largest publicly available such data set comprising > 80,000 compounds measured against a total of 59 acute systemic toxicity end points. This data was used for developing multiple single- and multitask models utilizing random forest, deep neural networks, convolutional, and graph convolutional neural network approaches. For the first time, we also reported the consensus models based on different multitask approaches. To the best of our knowledge, prediction models for 36 of the 59 end points have never been published before. Furthermore, our results demonstrated a significantly better performance of the consensus model obtained from three multitask learning approaches that particularly predicted the 29 smaller tasks (less than 300 compounds) better than other models developed in the study. The curated data set and the developed models have been made publicly available at https://github.com/ncats/ld50-multitask, https://predictor.ncats.io/, and https://cactus.nci.nih.gov/download/acute-toxicity-db (data set only) to support regulatory and research applications.
Affiliation(s)
- Sankalp Jain
  - National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Vishal B Siramshetty
  - National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Vinicius M Alves
  - UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Eugene N Muratov
  - UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Nicole Kleinstreuer
  - Division of Intramural Research, Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, 111 T.W. Alexander Drive, Durham, North Carolina 27709, United States
  - National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, 111 T.W. Alexander Drive, Durham, North Carolina 27709, United States
- Alexander Tropsha
  - UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina 27599, United States
- Marc C Nicklaus
  - Computer-Aided Drug Design (CADD) Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, DHHS, NCI-Frederick, 376 Boyles Street, Frederick, Maryland 21702, United States
- Anton Simeonov
  - National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
- Alexey V Zakharov
  - National Center for Advancing Translational Sciences (NCATS), National Institutes of Health, 9800 Medical Center Drive, Rockville, Maryland 20850, United States
|
24
|
Schultz KJ, Colby SM, Lin VS, Wright AT, Renslow RS. Ligand- and Structure-Based Analysis of Deep Learning-Generated Potential α2a Adrenoceptor Agonists. J Chem Inf Model 2021; 61:481-492. [PMID: 33404240 DOI: 10.1021/acs.jcim.0c01019] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 11/29/2022]
Abstract
The α2a adrenoceptor is a medically relevant subtype of the G protein-coupled receptor family. Unfortunately, high-throughput techniques aimed at producing novel drug leads for this receptor have been largely unsuccessful because of the complex pharmacology of adrenergic receptors. As such, cutting-edge in silico ligand- and structure-based assessment and de novo deep learning methods are well positioned to provide new insights into protein-ligand interactions and potential active compounds. In this work, we (i) collect a dataset of α2a adrenoceptor agonists and provide it as a resource for the drug design community; (ii) use the dataset as a basis to generate candidate-active structures via deep learning; and (iii) apply computational ligand- and structure-based analysis techniques to gain new insights into α2a adrenoceptor agonists and assess the quality of the computer-generated compounds. We further describe how such assessment techniques can be applied to putative chemical probes with a case study involving proposed medetomidine-based probes.
Affiliation(s)
- Katherine J Schultz
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Sean M Colby
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Vivian S Lin
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Aaron T Wright
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99163, United States
- Ryan S Renslow
  - Biological Sciences Division, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
  - The Gene and Linda Voiland School of Chemical Engineering and Bioengineering, Washington State University, Pullman, Washington 99163, United States
|
25
|
Borrel A, Mansouri K, Nolte S, Saddler T, Conway M, Schmitt C, Kleinstreuer NC. InterPred: a webtool to predict chemical autofluorescence and luminescence interference. Nucleic Acids Res 2020; 48:W586-W590. [PMID: 32421835 DOI: 10.1093/nar/gkaa378] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/09/2020] [Revised: 04/10/2020] [Accepted: 04/29/2020] [Indexed: 02/07/2023] Open
Abstract
High-throughput screening (HTS) research programs for drug development or chemical hazard assessment are designed to screen thousands of molecules across hundreds of biological targets or pathways. Most HTS platforms use fluorescence and luminescence technologies, which represent more than 70% of the assays in the US Tox21 research consortium. These technologies are subject to interfering signals, largely explained by chemicals interacting with the light spectrum. This phenomenon produces false positives in up to 5-10% of results, depending on the chemical library used. Here, we present the InterPred webserver (version 1.0), a platform to predict such interference chemicals based on the first large-scale chemical screening effort to directly characterize chemical-assay interference, using assays in the Tox21 portfolio specifically designed to measure autofluorescence and luciferase inhibition. InterPred combines 17 quantitative structure-activity relationship (QSAR) models built using optimized machine learning techniques and allows users to predict the probability that a new chemical will interfere with different combinations of cellular and technology conditions. InterPred models have been applied to the entire Distributed Structure-Searchable Toxicity (DSSTox) Database (∼800,000 chemicals). The InterPred webserver is available at https://sandbox.ntp.niehs.nih.gov/interferences/.
|
26
|
Pradeep P, Patlewicz G, Pearce R, Wambaugh J, Wetmore B, Judson R. Using Chemical Structure Information to Develop Predictive Models for In Vitro Toxicokinetic Parameters to Inform High-throughput Risk-assessment. Comput Toxicol 2020; 16:100136. [PMID: 34124416 DOI: 10.1016/j.comtox.2020.100136] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Indexed: 12/25/2022]
Abstract
Two toxicokinetic (TK) parameters, the fraction of the chemical unbound to plasma proteins and metabolic clearance, are critical for relating exposure and internal dose when building in vitro-based risk assessment models. However, experimental toxicokinetic studies have been carried out for only a limited number of chemicals of environmental interest (~1000 chemicals with TK data, relative to tens of thousands of chemicals of interest). This work comprised: (i) evaluation of the utility of chemical structure information to predict TK parameters in silico; (ii) development of cluster-based read-across and quantitative structure-activity relationship models of fraction unbound or fub (regression) and intrinsic clearance or Clint (classification and regression) using a dataset of 1487 chemicals; (iii) utilization of predicted TK parameters to estimate uncertainty in steady-state plasma concentration (Css); and (iv) subsequent in vitro-in vivo extrapolation analyses to derive a bioactivity-exposure ratio (BER) plot to compare human oral equivalent doses and exposure predictions, using androgen and estrogen receptor activity data for 233 chemicals as an example dataset. The results demonstrate that fub is structurally more predictable than Clint. The model with the highest observed performance for fub had an external test set RMSE/σ = 0.62 and R² = 0.61; for Clint classification, an external test set accuracy of 65.9%; and for Clint regression, an external test set RMSE/σ = 0.90 and R² = 0.20. This relatively low performance is in part due to the large uncertainty in the underlying Clint data. We show that Css is relatively insensitive to uncertainty in Clint. The models were benchmarked against the ADMET Predictor software. Finally, the BER analysis allowed identification of 14 out of 136 chemicals for further risk assessment, demonstrating the utility of these models in aiding risk-based chemical prioritization.
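The two regression metrics quoted above can be computed as in the sketch below. The abstract does not state which definitions were used, so this assumes RMSE is normalized by the population standard deviation σ of the observed values and that R² is the coefficient of determination, 1 - SS_res/SS_tot. Under those assumptions (RMSE/σ)² = 1 - R², and the reported pairs (0.62 with 0.61, 0.90 with 0.20) are at least roughly consistent with that identity. The function name is hypothetical.

```python
import math


def rmse_over_sigma_and_r2(observed, predicted):
    """Return (RMSE/sigma, R^2) for paired observed and predicted values.

    Assumptions (not confirmed by the source): sigma is the population
    standard deviation of the observed values, and R^2 is the coefficient
    of determination, 1 - SS_res / SS_tot.
    """
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    rmse = math.sqrt(ss_res / n)
    sigma = math.sqrt(ss_tot / n)
    return rmse / sigma, 1.0 - ss_res / ss_tot
```

A model that always predicts the mean of the observed values gives RMSE/σ = 1 and R² = 0, which is why an RMSE/σ of 0.90 indicates only a small improvement over that baseline.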
Affiliation(s)
- Prachi Pradeep
- Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee.,Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Grace Patlewicz
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Robert Pearce
- Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee.,Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - John Wambaugh
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Barbara Wetmore
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Richard Judson
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| |
|
27
|
Pradeep P, Friedman KP, Judson R. Structure-based QSAR Models to Predict Repeat Dose Toxicity Points of Departure. Comput Toxicol 2020; 16. [PMID: 34017928 DOI: 10.1016/j.comtox.2020.100139] [Citation(s) in RCA: 17] [Impact Index Per Article: 3.4] [Indexed: 10/23/2022]
Abstract
Human health risk assessment for environmental chemical exposure is limited by a vast majority of chemicals with little or no experimental in vivo toxicity data. Data gap filling techniques, such as quantitative structure activity relationship (QSAR) models based on chemical structure information, can predict hazard in the absence of experimental data. Risk assessment requires identification of a quantitative point-of-departure (POD) value, the point on the dose-response curve that marks the beginning of a low-dose extrapolation. This study presents two sets of QSAR models to predict POD values (PODQSAR) for repeat dose toxicity. For training and validation, a publicly available in vivo toxicity dataset for 3592 chemicals was compiled using the U.S. Environmental Protection Agency's Toxicity Value database (ToxValDB). The first set of QSAR models predict point-estimates of POD values (PODQSAR) using structural and physicochemical descriptors for repeat dose study types and species combinations. A random forest QSAR model using study type and species as descriptors showed the best performance, with an external test set root mean square error (RMSE) of 0.71 log10-mg/kg/day and coefficient of determination (R2) of 0.53. The second set of QSAR models predict the 95% confidence intervals for PODQSAR using a constructed POD distribution with a mean equal to the median POD value and a standard deviation of 0.5 log10-mg/kg/day, based on previously published typical study-to-study variability that may lead to uncertainty in model predictions. Bootstrap resampling of the pre-generated POD distribution was used to derive point-estimates and 95% confidence intervals for each POD prediction. 
Enrichment analysis to evaluate the accuracy of PODQSAR showed that 80% of the 5% most potent chemicals were found in the top 20% of the most potent chemical predictions, suggesting that the repeat dose POD QSAR models presented here may help inform screening level human health risk assessments in the absence of other data.
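The enrichment check described above can be sketched as below. This is a hedged illustration rather than the authors' procedure: it assumes that a lower POD means a more potent chemical, and the function name, the rounding of the cut-offs, and the toy data are hypothetical.

```python
def potency_enrichment(observed_pod, predicted_pod, true_frac=0.05, pred_frac=0.20):
    """Fraction of the most potent observed chemicals that are also flagged
    among the most potent predictions.

    Assumption: lower POD (log10 mg/kg/day) means higher potency.
    """
    n = len(observed_pod)
    n_true = max(1, round(n * true_frac))   # size of the "truly most potent" set
    n_pred = max(1, round(n * pred_frac))   # size of the "predicted most potent" set
    by_observed = sorted(range(n), key=lambda i: observed_pod[i])
    by_predicted = sorted(range(n), key=lambda i: predicted_pod[i])
    most_potent = set(by_observed[:n_true])
    flagged = set(by_predicted[:n_pred])
    return len(most_potent & flagged) / n_true


# Toy data: 100 chemicals whose predictions are perfect except that the
# single most potent chemical is badly over-predicted.
observed = [float(i) for i in range(100)]
predicted = list(observed)
predicted[0], predicted[50] = predicted[50], predicted[0]
print(potency_enrichment(observed, predicted))  # 4 of the 5 most potent are recovered
```

A value of 0.8 from this function corresponds to the kind of statement made in the abstract, where 80% of the 5% most potent chemicals fall within the top 20% of predictions.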
Affiliation(s)
- Prachi Pradeep
- Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee.,Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Katie Paul Friedman
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| | - Richard Judson
- Center for Computational Toxicology and Exposure, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
| |
|
28
|
Wu F, Zhou Y, Li L, Shen X, Chen G, Wang X, Liang X, Tan M, Huang Z. Computational Approaches in Preclinical Studies on Drug Discovery and Development. Front Chem 2020; 8:726. [PMID: 33062633 PMCID: PMC7517894 DOI: 10.3389/fchem.2020.00726] [Citation(s) in RCA: 153] [Impact Index Per Article: 30.6] [Received: 03/29/2020] [Accepted: 07/14/2020] [Indexed: 12/11/2022] Open
Abstract
Because undesirable pharmacokinetics and toxicity are significant reasons for the failure of drug development in the costly late stage, it has been widely recognized that drug ADMET properties should be considered as early as possible to reduce failure rates in the clinical phase of drug discovery. Concurrently, drug recalls have become increasingly common in recent years, prompting pharmaceutical companies to increase attention toward the safety evaluation of preclinical drugs. In vitro and in vivo drug evaluation techniques are currently more mature in preclinical applications, but these technologies are costly. In recent years, with the rapid development of computer science, in silico technology has been widely used to evaluate the relevant properties of drugs in the preclinical stage and has produced many software programs and in silico models, further promoting the study of ADMET in vitro. In this review, we first introduce the two ADMET prediction categories (molecular modeling and data modeling). Then, we perform a systematic classification and description of the databases and software commonly used for ADMET prediction. We focus on some widely studied ADMT properties as well as PBPK simulation, and we list some applications that are related to the prediction categories and web tools. Finally, we discuss challenges and limitations in the preclinical area and propose some suggestions and prospects for the future.
Affiliation(s)
- Fengxu Wu
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory of Pesticide & Chemical Biology, Ministry of Education, College of Chemistry, Central China Normal University, Wuhan, China
| | - Yuquan Zhou
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan, China
| | - Langhui Li
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan, China
| | - Xianhuan Shen
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan, China
| | - Ganying Chen
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan, China
| | - Xiaoqing Wang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan, China
| | - Xianyang Liang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- The Second School of Clinical Medicine, Guangdong Medical University, Dongguan, China
| | - Mengyuan Tan
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan, China
| | - Zunnan Huang
- Key Laboratory of Big Data Mining and Precision Drug Design of Guangdong Medical University, Research Platform Service Management Center, Dongguan, China
- Key Laboratory for Research and Development of Natural Drugs of Guangdong Province, School of Pharmacy, Guangdong Medical University, Dongguan, China
- Marine Biomedical Research Institute of Guangdong Zhanjiang, Zhanjiang, China
| |
|
29
|
Kowalewski J, Ray A. Predicting novel drugs for SARS-CoV-2 using machine learning from a >10 million chemical space. Heliyon 2020; 6:e04639. [PMID: 32802980 PMCID: PMC7409807 DOI: 10.1016/j.heliyon.2020.e04639] [Citation(s) in RCA: 42] [Impact Index Per Article: 8.4] [Received: 05/15/2020] [Revised: 06/16/2020] [Accepted: 08/03/2020] [Indexed: 12/19/2022] Open
Abstract
There is an urgent need for the identification of effective therapeutics for COVID-19 and we have developed a machine learning drug discovery pipeline to identify several drug candidates. First, we collect assay data for 65 target human proteins known to interact with the SARS-CoV-2 proteins, including the ACE2 receptor. Next, we train machine learning models to predict inhibitory activity and use them to screen FDA registered chemicals and approved drugs (~100,000) and ~14 million purchasable chemicals. We filter predictions according to estimated mammalian toxicity and vapor pressure. Prospective volatile candidates are proposed as novel inhaled therapeutics since the nasal cavity and respiratory tracts are early bottlenecks for infection. We also identify candidates that act across multiple targets as promising for future analyses. We anticipate that this theoretical study can accelerate testing of two categories of therapeutics: repurposed drugs suited for short-term approval, and novel efficacious drugs suitable for a long-term follow up.
Affiliation(s)
- Joel Kowalewski
- Interdepartmental Neuroscience Program, University of California, Riverside, CA 92521, USA
| | - Anandasankar Ray
- Interdepartmental Neuroscience Program, University of California, Riverside, CA 92521, USA
- Department of Molecular, Cell and Systems Biology, University of California, Riverside, CA 92521, USA
| |
|
30
|
Grashow R, Bessonneau V, Gerona RR, Wang A, Trowbridge J, Lin T, Buren H, Rudel RA, Morello-Frosch R. Integrating Exposure Knowledge and Serum Suspect Screening as a New Approach to Biomonitoring: An Application in Firefighters and Office Workers. ENVIRONMENTAL SCIENCE & TECHNOLOGY 2020; 54:4344-4355. [PMID: 31971370 PMCID: PMC7182169 DOI: 10.1021/acs.est.9b04579] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 09/17/2019] [Revised: 01/15/2020] [Accepted: 01/23/2020] [Indexed: 05/18/2023]
Abstract
Firefighters (FF) are exposed to recognized and probable carcinogens, yet there are few studies of chemical exposures and associated health concerns in women FFs, such as breast cancer. Biomonitoring often requires a priori selection of compounds to be measured, and so, it may not detect relevant, lesser known, exposures. The Women FFs Biomonitoring Collaborative (WFBC) created a biological sample archive and conducted a general suspect screen (GSS) to address this data gap. Using liquid chromatography-quadrupole time-of-flight tandem mass spectrometry, we sought to identify candidate chemicals of interest in serum samples from 83 women FFs and 79 women office workers (OW) in San Francisco. We identified chemical peaks by matching accurate mass from serum samples against a custom chemical database of 722 slightly polar phenolic and acidic compounds, including many of relevance to firefighting or breast cancer etiology. We then selected tentatively identified chemicals for confirmation based on the following criteria: (1) detection frequency or peak area differences between OW and FF; (2) evidence of mammary carcinogenicity, estrogenicity, or genotoxicity; and (3) not currently measured in large biomonitoring studies. We detected 620 chemicals that matched 300 molecular formulas in the WFBC database, including phthalate metabolites, phosphate flame-retardant metabolites, phenols, pesticides, nitro and nitroso compounds, and per- and polyfluoroalkyl substances. Of the 20 suspect chemicals selected for validation, 8 were confirmed (including two alkylphenols, ethyl paraben, BPF, PFOSAA, benzophenone-3, benzyl p-hydroxybenzoate, and triphenyl phosphate) by running a matrix spike of the reference standards and using m/z, retention time, and the confirmation of at least two fragment ions as criteria for matching. GSS provides a powerful high-throughput approach to identify and prioritize novel chemicals for biomonitoring and health studies.
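The accurate-mass matching step can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's workflow: it assumes negative-mode [M-H]- ions and a 10 ppm tolerance, neither of which is given in the abstract, and the function names and the three-compound suspect list are hypothetical stand-ins for the 722-compound database.

```python
import re

# Monoisotopic masses (Da) of the elements used in this sketch.
MONOISOTOPIC = {"C": 12.0, "H": 1.00782503, "N": 14.00307401,
                "O": 15.99491462, "P": 30.97376200}
PROTON = 1.00727646  # mass of a proton (Da)

# Hypothetical three-compound stand-in for the suspect database.
SUSPECTS = {
    "ethyl paraben": "C9H10O3",
    "bisphenol F": "C13H12O2",
    "benzophenone-3": "C14H12O3",
}


def monoisotopic_mass(formula):
    """Sum element masses for a simple formula such as 'C9H10O3'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += MONOISOTOPIC[element] * (int(count) if count else 1)
    return total


def match_peaks(observed_mz, database, ppm_tol=10.0):
    """Return (m/z, name, formula) for every suspect whose assumed [M-H]-
    ion lies within ppm_tol of an observed peak."""
    matches = []
    for mz in observed_mz:
        for name, formula in database.items():
            expected = monoisotopic_mass(formula) - PROTON  # assumed [M-H]- ion
            ppm_error = (mz - expected) / expected * 1e6
            if abs(ppm_error) <= ppm_tol:
                matches.append((mz, name, formula))
    return matches


for hit in match_peaks([165.0557, 199.0765, 300.0000], SUSPECTS):
    print(hit)
```

Accurate mass alone identifies a molecular formula, not a structure, which is consistent with the abstract reporting 620 tentative chemicals against only 300 formulas and with the later confirmation step using retention time and fragment ions.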
Affiliation(s)
- Rachel Grashow
- Silent Spring Institute, Newton, Massachusetts 02460, United States
| | | | - Roy R. Gerona
- Clinical Toxicology and Environmental Biomonitoring Lab, Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, United States
| | - Aolin Wang
- Program on Reproductive Health and the Environment, Department of Obstetrics, Gynecology and Reproductive Sciences & Bakar Computational Health Sciences Institute, University of California San Francisco, San Francisco, California 94143, United States
| | - Jessica Trowbridge
- School of Public Health, University of California Berkeley, Berkeley, California 94720, United States
| | - Thomas Lin
- Clinical Toxicology and Environmental Biomonitoring Lab, Department of Obstetrics, Gynecology and Reproductive Sciences, University of California San Francisco, San Francisco, California 94143, United States
| | - Heather Buren
- United Fire Service Women, San Francisco, California 94143, United States
| | - Ruthann A. Rudel
- Silent Spring Institute, Newton, Massachusetts 02460, United States
- Phone: 617-332-4288 (R.A.R.)
| | - Rachel Morello-Frosch
- School of Public Health, University of California Berkeley, Berkeley, California 94720, United States
- Department of Environmental Science, Policy and Management, University of California Berkeley, Berkeley, California 94720, United States
- Phone: 510-643-6358 (R.M.-F.)
| |
|
31
|
Zhang D, Gong L, Ding S, Tian Y, Jia C, Liu D, Han M, Cheng X, Sun D, Cai P, Tian Y, Yuan L, Tu W, Chen J, Wu A, Hu QN. FRCD: A comprehensive food risk component database with molecular scaffold, chemical diversity, toxicity, and biodegradability analysis. Food Chem 2020; 318:126470. [PMID: 32120139 DOI: 10.1016/j.foodchem.2020.126470] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 11/19/2019] [Revised: 02/21/2020] [Accepted: 02/22/2020] [Indexed: 12/26/2022]
Abstract
The presence of natural toxins, pesticide residues, and illegal additives in food products has been associated with a range of potential health hazards. However, no systematic database exists that comprehensively includes and integrates all research information on these compounds, and valuable information remains scattered across numerous databases and extensive literature reports. Thus, using natural language processing technology, we curated 12,018 food risk components from 152,737 literature reports, 12 authoritative databases, and numerous related regulatory documents. Data on molecular structures, physicochemical properties, chemical taxonomy, absorption, distribution, metabolism, excretion, toxicity properties, and physiological targets within the human body were integrated to afford the comprehensive food risk component database (FRCD, http://www.rxnfinder.org/frcd/). We also analyzed the molecular scaffold and chemical diversity, in addition to evaluating the toxicity and biodegradability of the food risk components. The FRCD could be considered a highly promising tool for future food safety studies.
Affiliation(s)
- Dachuan Zhang
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Linlin Gong
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Shaozhen Ding
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Ye Tian
- CAS Key Laboratory of Nutrition, Metabolism and Food Safety, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, PR China.
| | - Cancan Jia
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Dongliang Liu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Mengying Han
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Xingxiang Cheng
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Dandan Sun
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| | - Pengli Cai
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China; Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin 300308, PR China.
| | - Yu Tian
- School of Biology and Pharmaceutical Engineering, Wuhan Polytechnic University, Wuhan, Hubei 430023, PR China.
| | - Le Yuan
- Department of Biology and Biological Engineering, Chalmers University of Technology, Kemivägen 10, SE412 96 Gothenburg, Sweden.
| | - Weizhong Tu
- Wuhan LifeSynther Science and Technology Co. Limited, Wuhan 430070, PR China
| | - Junni Chen
- Wuhan LifeSynther Science and Technology Co. Limited, Wuhan 430070, PR China
| | - Aibo Wu
- CAS Key Laboratory of Nutrition, Metabolism and Food Safety, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, PR China.
| | - Qian-Nan Hu
- CAS Key Laboratory of Computational Biology, CAS-MPG Partner Institute for Computational Biology, Shanghai Institute of Nutrition and Health, Shanghai Institutes for Biological Sciences, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200333, PR China.
| |
|
32
|
Probst D, Reymond JL. Visualization of very large high-dimensional data sets as minimum spanning trees. J Cheminform 2020; 12:12. [PMID: 33431043 PMCID: PMC7015965 DOI: 10.1186/s13321-020-0416-x] [Citation(s) in RCA: 147] [Impact Index Per Article: 29.4] [Received: 01/09/2020] [Accepted: 02/04/2020] [Indexed: 01/10/2023] Open
Abstract
The chemical sciences are producing an unprecedented amount of large, high-dimensional data sets containing chemical structures and associated properties. However, there are currently no algorithms to visualize such data while preserving both global and local features with a sufficient level of detail to allow for human inspection and interpretation. Here, we propose a solution to this problem with a new data visualization method, TMAP, capable of representing data sets of up to millions of data points and arbitrary high dimensionality as a two-dimensional tree (http://tmap.gdb.tools). Visualizations based on TMAP are better suited than t-SNE or UMAP for the exploration and interpretation of large data sets due to their tree-like nature, increased local and global neighborhood and structure preservation, and the transparency of the methods the algorithm is based on. We apply TMAP to the most used chemistry data sets including databases of molecules such as ChEMBL, FDB17, the Natural Products Atlas, DSSTox, as well as to the MoleculeNet benchmark collection of data sets. We also show its broad applicability with further examples from biology, particle physics, and literature.
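The minimum spanning tree idea named in the title can be illustrated with a small sketch. This is not the TMAP implementation, whose internals the abstract does not describe; it is a plain Prim's algorithm over Jaccard distances, and the function names and the toy fingerprints (sets of "on" bit indices) are hypothetical.

```python
def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of fingerprint bits."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def minimum_spanning_tree(items):
    """Prim's algorithm on the complete Jaccard-distance graph.

    Returns a list of (parent_index, child_index, distance) edges.
    This is O(n^2) and only suitable for small illustrative inputs.
    """
    n = len(items)
    if n == 0:
        return []
    # best[j] = (shortest known distance from the tree to j, tree node giving it)
    best = {j: (jaccard_distance(items[0], items[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        j = min(best, key=lambda k: best[k][0])   # closest node outside the tree
        dist, parent = best.pop(j)
        edges.append((parent, j, dist))
        for k in best:                            # relax distances via the new node
            d = jaccard_distance(items[j], items[k])
            if d < best[k][0]:
                best[k] = (d, j)
    return edges


# Two pairs of similar toy fingerprints; the pairs share no bits.
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8, 10}]
for edge in minimum_spanning_tree(fingerprints):
    print(edge)
```

In the resulting tree each similar pair is joined by a short edge and a single long edge links the two pairs, which is the kind of local and global structure a tree layout can expose.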
Affiliation(s)
- Daniel Probst
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland.
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Bern, Freiestrasse 3, 3012, Bern, Switzerland.
| |
|
33
|
Mansouri K, Kleinstreuer N, Abdelaziz AM, Alberga D, Alves VM, Andersson PL, Andrade CH, Bai F, Balabin I, Ballabio D, Benfenati E, Bhhatarai B, Boyer S, Chen J, Consonni V, Farag S, Fourches D, García-Sosa AT, Gramatica P, Grisoni F, Grulke CM, Hong H, Horvath D, Hu X, Huang R, Jeliazkova N, Li J, Li X, Liu H, Manganelli S, Mangiatordi GF, Maran U, Marcou G, Martin T, Muratov E, Nguyen DT, Nicolotti O, Nikolov NG, Norinder U, Papa E, Petitjean M, Piir G, Pogodin P, Poroikov V, Qiao X, Richard AM, Roncaglioni A, Ruiz P, Rupakheti C, Sakkiah S, Sangion A, Schramm KW, Selvaraj C, Shah I, Sild S, Sun L, Taboureau O, Tang Y, Tetko IV, Todeschini R, Tong W, Trisciuzzi D, Tropsha A, Van Den Driessche G, Varnek A, Wang Z, Wedebye EB, Williams AJ, Xie H, Zakharov AV, Zheng Z, Judson RS. CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity. ENVIRONMENTAL HEALTH PERSPECTIVES 2020; 128:27002. [PMID: 32074470 DOI: 10.23645/epacomptox.5176876] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 05/24/2023]
Abstract
BACKGROUND Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling. OBJECTIVES In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) efforts, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP). METHODS The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays. RESULTS The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided averaged predictive accuracy of approximately 80% for the evaluation set. 
DISCUSSION The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ∼875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
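Combining several models' calls into a consensus can be sketched as below. This is an assumption-laden illustration, not the CoMPARA scheme: the abstract does not say how individual models were weighted, so the sketch uses an unweighted majority vote, treats `None` as "no prediction" (for example, a chemical outside a model's applicability domain), and calls ties active. The function name is hypothetical.

```python
def consensus_call(predictions):
    """Unweighted majority vote over binary model calls.

    predictions: iterable of 1 (active), 0 (inactive), or None (no prediction).
    Returns (call, concordance), where concordance is the fraction of voting
    models that agree with the consensus call. Ties are called active, which
    is an arbitrary choice made for this sketch.
    """
    votes = [p for p in predictions if p is not None]
    if not votes:
        return None, 0.0
    active_fraction = sum(votes) / len(votes)
    call = 1 if active_fraction >= 0.5 else 0
    concordance = max(active_fraction, 1.0 - active_fraction)
    return call, concordance


# Four hypothetical models; one abstains for this chemical.
print(consensus_call([1, 1, 0, None]))
```

Reporting concordance alongside the call gives a rough indication of how much the contributing models agree, which can be useful when deciding which predictions to trust for screening.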
Affiliation(s)
- Kamel Mansouri
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- ScitoVation LLC, Research Triangle Park, North Carolina, USA
- Integrated Laboratory Systems, Inc., Morrisville, North Carolina, USA
| | - Nicole Kleinstreuer
- National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
| | - Ahmed M Abdelaziz
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Domenico Alberga
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Vinicius M Alves
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | | | - Carolina H Andrade
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
| | - Fang Bai
- School of Pharmacy, Lanzhou University, China
| | - Ilya Balabin
- Information Systems & Global Solutions (IS&GS), Lockheed Martin, USA
| | - Davide Ballabio
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Emilio Benfenati
- Istituto di Ricerche Farmacologiche "Mario Negri", IRCCS, Milan, Italy
| | - Barun Bhhatarai
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Scott Boyer
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Jingwen Chen
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Viviana Consonni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Sherif Farag
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Denis Fourches
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | | | - Paola Gramatica
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Francesca Grisoni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Chris M Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Dragos Horvath
- Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Xin Hu
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | | | - Jiazhong Li
- School of Pharmacy, Lanzhou University, China
| | - Xuehua Li
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | | | - Serena Manganelli
- Istituto di Ricerche Farmacologiche "Mario Negri", IRCCS, Milan, Italy
| | | | - Uko Maran
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Gilles Marcou
- Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Todd Martin
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
| | - Eugene Muratov
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Dac-Trung Nguyen
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Orazio Nicolotti
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Nikolai G Nikolov
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Ulf Norinder
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Ester Papa
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Michel Petitjean
- Computational Modeling of Protein-Ligand Interactions (CMPLI)-INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
| | - Geven Piir
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Pavel Pogodin
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Vladimir Poroikov
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Xianliang Qiao
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Ann M Richard
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | | | - Patricia Ruiz
- Computational Toxicology and Methods Development Laboratory, Division of Toxicology and Human Health Sciences, Agency for Toxic Substances and Disease Registry, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Chetan Rupakheti
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, USA
| | - Sugunadevi Sakkiah
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Alessandro Sangion
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Karl-Werner Schramm
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Chandrabose Selvaraj
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Imran Shah
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Sulev Sild
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Lixia Sun
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Olivier Taboureau
- Computational Modeling of Protein-Ligand Interactions (CMPLI)-INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
| | - Yun Tang
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Igor V Tetko
- BIGCHEM GmbH, Neuherberg, Germany
- Helmholtz Zentrum Muenchen - German Research Center for Environmental Health (GmbH), Neuherberg, Germany
| | - Roberto Todeschini
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | | | - Alexander Tropsha
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - George Van Den Driessche
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique-UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Zhongyu Wang
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Eva B Wedebye
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Antony J Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Hongbin Xie
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Alexey V Zakharov
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ziye Zheng
- Chemistry Department, Umeå University, Umeå, Sweden
| | - Richard S Judson
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| |
|
34
|
Mansouri K, Kleinstreuer N, Abdelaziz AM, Alberga D, Alves VM, Andersson PL, Andrade CH, Bai F, Balabin I, Ballabio D, Benfenati E, Bhhatarai B, Boyer S, Chen J, Consonni V, Farag S, Fourches D, García-Sosa AT, Gramatica P, Grisoni F, Grulke CM, Hong H, Horvath D, Hu X, Huang R, Jeliazkova N, Li J, Li X, Liu H, Manganelli S, Mangiatordi GF, Maran U, Marcou G, Martin T, Muratov E, Nguyen DT, Nicolotti O, Nikolov NG, Norinder U, Papa E, Petitjean M, Piir G, Pogodin P, Poroikov V, Qiao X, Richard AM, Roncaglioni A, Ruiz P, Rupakheti C, Sakkiah S, Sangion A, Schramm KW, Selvaraj C, Shah I, Sild S, Sun L, Taboureau O, Tang Y, Tetko IV, Todeschini R, Tong W, Trisciuzzi D, Tropsha A, Van Den Driessche G, Varnek A, Wang Z, Wedebye EB, Williams AJ, Xie H, Zakharov AV, Zheng Z, Judson RS. CoMPARA: Collaborative Modeling Project for Androgen Receptor Activity. Environmental Health Perspectives 2020; 128:27002. [PMID: 32074470 PMCID: PMC7064318 DOI: 10.1289/ehp5580] [Citation(s) in RCA: 114] [Impact Index Per Article: 22.8] [Received: 05/06/2019] [Revised: 11/27/2019] [Accepted: 12/05/2019] [Indexed: 05/04/2023]
Abstract
BACKGROUND: Endocrine disrupting chemicals (EDCs) are xenobiotics that mimic the interaction of natural hormones and alter synthesis, transport, or metabolic pathways. The prospect of EDCs causing adverse health effects in humans and wildlife has led to the development of scientific and regulatory approaches for evaluating bioactivity. This need is being addressed using high-throughput screening (HTS) in vitro approaches and computational modeling. OBJECTIVES: In support of the Endocrine Disruptor Screening Program, the U.S. Environmental Protection Agency (EPA) led two worldwide consortiums to virtually screen chemicals for their potential estrogenic and androgenic activities. Here, we describe the Collaborative Modeling Project for Androgen Receptor Activity (CoMPARA) effort, which follows the steps of the Collaborative Estrogen Receptor Activity Prediction Project (CERAPP). METHODS: The CoMPARA list of screened chemicals built on CERAPP's list of 32,464 chemicals to include additional chemicals of interest, as well as simulated ToxCast™ metabolites, totaling 55,450 chemical structures. Computational toxicology scientists from 25 international groups contributed 91 predictive models for binding, agonist, and antagonist activity predictions. Models were underpinned by a common training set of 1,746 chemicals compiled from a combined data set of 11 ToxCast™/Tox21 HTS in vitro assays. RESULTS: The resulting models were evaluated using curated literature data extracted from different sources. To overcome the limitations of single-model approaches, CoMPARA predictions were combined into consensus models that provided averaged predictive accuracy of approximately 80% for the evaluation set.
DISCUSSION: The strengths and limitations of the consensus predictions were discussed with example chemicals; then, the models were implemented into the free and open-source OPERA application to enable screening of new chemicals with a defined applicability domain and accuracy assessment. This implementation was used to screen the entire EPA DSSTox database of ~875,000 chemicals, and their predicted AR activities have been made available on the EPA CompTox Chemicals dashboard and National Toxicology Program's Integrated Chemical Environment. https://doi.org/10.1289/EHP5580.
Affiliation(s)
- Kamel Mansouri
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
- ScitoVation LLC, Research Triangle Park, North Carolina, USA
- Integrated Laboratory Systems, Inc., Morrisville, North Carolina, USA
| | - Nicole Kleinstreuer
- National Toxicology Program Interagency Center for the Evaluation of Alternative Toxicological Methods (NICEATM), National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, USA
| | - Ahmed M. Abdelaziz
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Domenico Alberga
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Vinicius M. Alves
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | | | - Carolina H. Andrade
- Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goiás, Goiânia, Brazil
| | - Fang Bai
- School of Pharmacy, Lanzhou University, China
| | - Ilya Balabin
- Information Systems & Global Solutions (IS&GS), Lockheed Martin, USA
| | - Davide Ballabio
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Emilio Benfenati
- Istituto di Ricerche Farmacologiche “Mario Negri”, IRCCS, Milan, Italy
| | - Barun Bhhatarai
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Scott Boyer
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Jingwen Chen
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Viviana Consonni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Sherif Farag
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Denis Fourches
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | | | - Paola Gramatica
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Francesca Grisoni
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Chris M. Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Huixiao Hong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Dragos Horvath
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Xin Hu
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | | | - Jiazhong Li
- School of Pharmacy, Lanzhou University, China
| | - Xuehua Li
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | | | - Serena Manganelli
- Istituto di Ricerche Farmacologiche “Mario Negri”, IRCCS, Milan, Italy
| | | | - Uko Maran
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Gilles Marcou
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Todd Martin
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
| | - Eugene Muratov
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Dac-Trung Nguyen
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Orazio Nicolotti
- Department of Pharmacy-Drug Sciences, University of Bari, Bari, Italy
| | - Nikolai G. Nikolov
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Ulf Norinder
- Swedish Toxicology Sciences Research Center, Karolinska Institutet, Södertälje, Sweden
| | - Ester Papa
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Michel Petitjean
- Computational Modeling of Protein-Ligand Interactions (CMPLI)–INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
| | - Geven Piir
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Pavel Pogodin
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Vladimir Poroikov
- Institute of Biomedical Chemistry IBMC, 10 Building 8, Pogodinskaya st., Moscow 119121, Russia
| | - Xianliang Qiao
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Ann M. Richard
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | | | - Patricia Ruiz
- Computational Toxicology and Methods Development Laboratory, Division of Toxicology and Human Health Sciences, Agency for Toxic Substances and Disease Registry, Centers for Disease Control and Prevention, Atlanta, Georgia, USA
| | - Chetan Rupakheti
- National Risk Management Research Laboratory, U.S. EPA, Cincinnati, Ohio, USA
- Department of Biochemistry and Molecular Biophysics, University of Chicago, Chicago, Illinois, USA
| | - Sugunadevi Sakkiah
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Alessandro Sangion
- QSAR Research Unit in Environmental Chemistry and Ecotoxicology, Department of Theoretical and Applied Sciences, University of Insubria, Varese, Italy
| | - Karl-Werner Schramm
- Technische Universität München, Wissenschaftszentrum Weihenstephan für Ernährung, Landnutzung und Umwelt, Department für Biowissenschaftliche Grundlagen, Weihenstephaner Steig 23, 85350 Freising, Germany
| | - Chandrabose Selvaraj
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | - Imran Shah
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Sulev Sild
- Institute of Chemistry, University of Tartu, Tartu, Estonia
| | - Lixia Sun
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Olivier Taboureau
- Computational Modeling of Protein-Ligand Interactions (CMPLI)–INSERM UMR 8251, INSERM ERL U1133, Functional and Adaptative Biology (BFA), Universite de Paris, Paris, France
| | - Yun Tang
- Department of Pharmaceutical Sciences, School of Pharmacy, East China University of Science and Technology, Shanghai, China
| | - Igor V. Tetko
- BIGCHEM GmbH, Neuherberg, Germany
- Helmholtz Zentrum Muenchen – German Research Center for Environmental Health (GmbH), Neuherberg, Germany
| | - Roberto Todeschini
- Milano Chemometrics and QSAR Research Group, Department of Earth and Environmental Sciences, University of Milano-Bicocca, Milan, Italy
| | - Weida Tong
- Division of Bioinformatics and Biostatistics, National Center for Toxicology Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| | | | - Alexander Tropsha
- Laboratory for Molecular Modeling, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - George Van Den Driessche
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, North Carolina, USA
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique—UMR7140, University of Strasbourg/CNRS, Strasbourg, France
| | - Zhongyu Wang
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Eva B. Wedebye
- Division of Risk Assessment and Nutrition, National Food Institute, Technical University of Denmark, Copenhagen, Denmark
| | - Antony J. Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| | - Hongbin Xie
- School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Alexey V. Zakharov
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, USA
| | - Ziye Zheng
- Chemistry Department, Umeå University, Umeå, Sweden
| | - Richard S. Judson
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency (U.S. EPA), Research Triangle Park, North Carolina, USA
| |
|
35
|
Alberga D, Trisciuzzi D, Mansouri K, Mangiatordi GF, Nicolotti O. Prediction of Acute Oral Systemic Toxicity Using a Multifingerprint Similarity Approach. Toxicol Sci 2020; 167:484-495. [PMID: 30371864 DOI: 10.1093/toxsci/kfy255] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Indexed: 11/12/2022]
Abstract
The implementation of nonanimal approaches is of particular importance to regulatory agencies for the prediction of potential hazards associated with acute exposures to chemicals. This work was carried out in the framework of an international modeling initiative organized by the Acute Toxicity Workgroup (ATWG) of the Interagency Coordinating Committee on the Validation of Alternative Methods (ICCVAM) with the participation of 32 international groups across government, industry, and academia. Our contribution was to develop a multifingerprint similarity approach for predicting five toxicology endpoints related to acute oral systemic toxicity: the median lethal dose (LD50) point prediction, the "nontoxic" (LD50 > 2000 mg/kg) and "very toxic" (LD50 < 50 mg/kg) binary classifications, and the multiclass categorization of chemicals based on the United States Environmental Protection Agency and Globally Harmonized System of Classification and Labeling of Chemicals schemes. Provided by the ICCVAM's ATWG, the training set used to develop the models consisted of 8944 chemicals having high-quality rat acute oral lethality data. The proposed approach integrates the results of a similarity search based on 19 different fingerprint definitions to return a consensus prediction value. Moreover, the algorithm described here is tailored to handle so-called toxicity cliffs, flagging cases where a large gap in LD50 values exists despite high structural similarity for a given molecular pair. An external validation set made available by ICCVAM and consisting of 2896 chemicals was employed to further evaluate the selected models. This work returned high-accuracy predictions based on the evaluations conducted by ICCVAM's ATWG.
Affiliation(s)
- Domenico Alberga
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
| | - Daniela Trisciuzzi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
| | - Kamel Mansouri
- ScitoVation LLC, Research Triangle Park, North Carolina 27709; Integrated Laboratory Systems, Morrisville, NC 27560
| | - Giuseppe Felice Mangiatordi
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy; Istituto Tumori IRCCS Giovanni Paolo II, 70124 Bari, Italy
| | - Orazio Nicolotti
- Dipartimento di Farmacia-Scienze del Farmaco, Università degli Studi di Bari "Aldo Moro," I-70126 Bari, Italy
| |
|
36
|
Sheffield TY, Judson RS. Ensemble QSAR Modeling to Predict Multispecies Fish Toxicity Lethal Concentrations and Points of Departure. Environmental Science & Technology 2019; 53:12793-12802. [PMID: 31560848 PMCID: PMC7047609 DOI: 10.1021/acs.est.9b03957] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.8] [Indexed: 06/10/2023]
Abstract
QSAR modeling can be used to aid testing prioritization of the thousands of chemical substances for which no ecological toxicity data are available. We drew on the U.S. Environmental Protection Agency's ECOTOX database with additional data from ECHA to build a large data set containing in vivo test data on fish for thousands of chemical substances. This was used to create QSAR models to predict two types of end points: acute LC50 (median lethal concentration) and points of departure similar to the NOEC (no observed effect concentration) for any duration (named the "LC50" and "NOEC" models, respectively). These models used study covariates, such as species and exposure route, as features to facilitate the simultaneous use of varied data types. A novel method of substituting taxonomy groups for species dummy variables was introduced to maximize generalizability to different species. A stacked ensemble of three machine learning methods (random forest, gradient boosted trees, and support vector regression) was implemented to best make use of a large data set with many descriptors. The LC50 and NOEC models predicted end points within 1 order of magnitude 81% and 76% of the time, respectively, and had RMSEs of roughly 0.83 and 0.98 log10(mg/L), respectively. Benchmarks against the existing TEST and ECOSAR tools suggest improved prediction accuracy.
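The two accuracy measures quoted in this abstract (RMSE in log10(mg/L) and the share of predictions within one order of magnitude) can be illustrated with a minimal sketch. It assumes observed and predicted values are both already expressed in log10(mg/L); the function name and the sample values are hypothetical and are not taken from the paper.

```python
import math

def log10_error_summary(observed, predicted):
    """Return (rmse, fraction_within_one_order) for paired log10(mg/L) values."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    errors = [p - o for o, p in zip(observed, predicted)]
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    # An absolute error of at most 1 in log10 units means agreement within a factor of 10.
    within = sum(1 for e in errors if abs(e) <= 1.0) / len(errors)
    return rmse, within

# Hypothetical values for illustration only.
obs = [0.0, 1.0, 2.0, -1.0]
pred = [0.5, 1.0, 3.5, -1.5]
rmse, within = log10_error_summary(obs, pred)
```

Working in log10 units keeps the two measures directly comparable: an RMSE below 1 corresponds to a typical error of less than one order of magnitude.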
Affiliation(s)
- Thomas Y. Sheffield
- U.S. Department of Energy, Oak Ridge Institute for Science and Education, Oak Ridge, TN, 37830, USA
| | - Richard S. Judson
- U.S. Environmental Protection Agency, National Center for Computational Toxicology, Research Triangle Park, NC, 27709, USA
| |
|
37
|
Grulke CM, Williams AJ, Thillanadarajah I, Richard AM. EPA's DSSTox database: History of development of a curated chemistry resource supporting computational toxicology research. Computational Toxicology 2019; 12:100096. [PMID: 33426407 PMCID: PMC7787967 DOI: 10.1016/j.comtox.2019.100096] [Citation(s) in RCA: 107] [Impact Index Per Article: 17.8] [Indexed: 12/13/2022]
Abstract
The US Environmental Protection Agency's (EPA) Distributed Structure-Searchable Toxicity (DSSTox) database, launched publicly in 2004, currently exceeds 875 K substances spanning hundreds of lists of interest to EPA and environmental researchers. From its inception, DSSTox has focused curation efforts on resolving chemical identifier errors and conflicts in the public domain towards the goal of assigning accurate chemical structures to data and lists of importance to the environmental research and regulatory community. Accurate structure-data associations, in turn, are necessary inputs to structure-based predictive models supporting hazard and risk assessments. In 2014, the legacy, manually curated DSSTox_V1 content was migrated to a MySQL data model, with modern cheminformatics tools supporting both manual and automated curation processes to increase efficiencies. This was followed by sequential auto-loads of filtered portions of three public datasets: EPA's Substance Registry Services (SRS), the National Library of Medicine's ChemID, and PubChem. This process was constrained by a key requirement of uniquely mapped identifiers (i.e., CAS RN, name and structure) for each substance, rejecting content where any two identifiers were conflicted either within or across datasets. This rejected content highlighted the degree of conflicting, inaccurate substance-structure ID mappings in the public domain, ranging from 12% (within EPA SRS) to 49% (across ChemID and PubChem). Substances successfully added to DSSTox from each auto-load were assigned to one of five qc_levels, conveying curator confidence in each dataset. This process enabled a significant expansion of DSSTox content to provide better coverage of the chemical landscape of interest to environmental scientists, while retaining focus on the accuracy of substance-structure-data associations. 
Currently, DSSTox serves as the core foundation of EPA's CompTox Chemicals Dashboard [https://comptox.epa.gov/dashboard], which provides public access to DSSTox content in support of a broad range of modeling and research activities within EPA and, increasingly, across the field of computational toxicology.
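The unique-mapping requirement described in this abstract (rejecting content where CAS RN, name, and structure do not map one-to-one) can be sketched as follows. This is a simplified illustration under stated assumptions, not the DSSTox loading code: the record layout, field names, function name, and sample records are all hypothetical.

```python
from collections import defaultdict

FIELDS = ("casrn", "name", "structure")  # hypothetical field names

def split_by_identifier_conflict(records):
    """Split records into (accepted, rejected).

    A record is rejected when any of its identifiers also appears in a
    record whose identifier triple differs, i.e. the mapping is not unique.
    """
    # For each (field, value) pair, collect the distinct identifier triples seen.
    seen = defaultdict(set)
    for rec in records:
        triple = tuple(rec[f] for f in FIELDS)
        for f in FIELDS:
            seen[(f, rec[f])].add(triple)

    accepted, rejected = [], []
    for rec in records:
        conflicted = any(len(seen[(f, rec[f])]) > 1 for f in FIELDS)
        (rejected if conflicted else accepted).append(rec)
    return accepted, rejected

# Hypothetical records: the two "benzene" rows share a CAS RN and name
# but disagree on structure, so both are rejected.
records = [
    {"casrn": "50-00-0", "name": "formaldehyde", "structure": "C=O"},
    {"casrn": "71-43-2", "name": "benzene", "structure": "c1ccccc1"},
    {"casrn": "71-43-2", "name": "benzene", "structure": "C1CCCCC1"},
]
accepted, rejected = split_by_identifier_conflict(records)
```

Rejecting every record involved in a conflict, rather than guessing which one is correct, mirrors the abstract's emphasis on accuracy over coverage; the rejected share is what the abstract reports as 12% to 49% depending on the source datasets.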
Affiliation(s)
- Christopher M Grulke
- National Center for Computational Toxicology, Office of Research & Development, US Environmental Protection Agency, Mail Drop D143-02, Research Triangle Park, NC 27711, USA
| | - Antony J Williams
- National Center for Computational Toxicology, Office of Research & Development, US Environmental Protection Agency, Mail Drop D143-02, Research Triangle Park, NC 27711, USA
| | - Inthirany Thillanadarajah
- Senior Environmental Employment Program, US Environmental Protection Agency, Research Triangle Park, NC 27711, USA
| | - Ann M Richard
- National Center for Computational Toxicology, Office of Research & Development, US Environmental Protection Agency, Mail Drop D143-02, Research Triangle Park, NC 27711, USA
| |
|
38
|
Colby SM, Nuñez JR, Hodas NO, Corley CD, Renslow RR. Deep Learning to Generate in Silico Chemical Property Libraries and Candidate Molecules for Small Molecule Identification in Complex Samples. Anal Chem 2019; 92:1720-1729. [DOI: 10.1021/acs.analchem.9b02348] [Citation(s) in RCA: 60] [Impact Index Per Article: 10.0] [Indexed: 12/17/2022]
Affiliation(s)
- Sean M. Colby
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Jamie R. Nuñez
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Nathan O. Hodas
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Courtney D. Corley
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Ryan R. Renslow
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| |
|
39
|
Hasselgren C, Ahlberg E, Akahori Y, Amberg A, Anger LT, Atienzar F, Auerbach S, Beilke L, Bellion P, Benigni R, Bercu J, Booth ED, Bower D, Brigo A, Cammerer Z, Cronin MTD, Crooks I, Cross KP, Custer L, Dobo K, Doktorova T, Faulkner D, Ford KA, Fortin MC, Frericks M, Gad-McDonald SE, Gellatly N, Gerets H, Gervais V, Glowienke S, Van Gompel J, Harvey JS, Hillegass J, Honma M, Hsieh JH, Hsu CW, Barton-Maclaren TS, Johnson C, Jolly R, Jones D, Kemper R, Kenyon MO, Kruhlak NL, Kulkarni SA, Kümmerer K, Leavitt P, Masten S, Miller S, Moudgal C, Muster W, Paulino A, Lo Piparo E, Powley M, Quigley DP, Reddy MV, Richarz AN, Schilter B, Snyder RD, Stavitskaya L, Stidl R, Szabo DT, Teasdale A, Tice RR, Trejo-Martin A, Vuorinen A, Wall BA, Watts P, White AT, Wichard J, Witt KL, Woolley A, Woolley D, Zwickl C, Myatt GJ. Genetic toxicology in silico protocol. Regul Toxicol Pharmacol 2019; 107:104403. [PMID: 31195068 PMCID: PMC7485926 DOI: 10.1016/j.yrtph.2019.104403] [Citation(s) in RCA: 44] [Impact Index Per Article: 7.3] [Received: 02/26/2019] [Revised: 05/20/2019] [Accepted: 06/05/2019] [Indexed: 01/23/2023]
Abstract
In silico toxicology (IST) approaches can rapidly assess chemical hazard, and usage of such methods is increasing in all applications, especially for regulatory submissions such as assessing chemicals under REACH and under the ICH M7 guideline for drug impurities. There are a number of obstacles to performing an IST assessment, including uncertainty in how such an assessment and associated expert review should be performed or what is fit for purpose, as well as a lack of confidence that the results will be accepted by colleagues, collaborators and regulatory authorities. To address this, a project to develop a series of IST protocols for different hazard endpoints has been initiated, and this paper describes the genetic toxicity in silico (GIST) protocol. The protocol outlines a hazard assessment framework including key effects/mechanisms and their relationships to endpoints such as gene mutation and clastogenicity. IST models and data that support the assessment of these effects/mechanisms are reviewed, along with defined approaches for combining the information and evaluating the confidence in the assessment. This protocol has been developed through a consortium of toxicologists, computational scientists, and regulatory scientists across several industries to support the implementation and acceptance of in silico approaches.
Affiliation(s)
| | - Ernst Ahlberg
- Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden
| | - Yumi Akahori
- Chemicals Evaluation and Research Institute, 1-4-25 Kouraku, Bunkyo-ku, Tokyo, 112-0004, Japan
| | - Alexander Amberg
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926, Frankfurt am Main, Germany
| | - Lennart T Anger
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926, Frankfurt am Main, Germany
| | - Franck Atienzar
- UCB BioPharma SPRL, Chemin du Foriest, B-1420 Braine-l'Alleud, Belgium
| | - Scott Auerbach
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC, 27709, USA
| | - Lisa Beilke
- Toxicology Solutions Inc., San Diego, CA, USA
| | | | | | - Joel Bercu
- Gilead Sciences, 333 Lakeside Drive, Foster City, CA, USA
| | - Ewan D Booth
- Syngenta, Product Safety Department, Jealott's Hill International Research Centre, Bracknell, Berkshire, RG42 6EY, UK
| | - Dave Bower
- Leadscope, Inc, 1393 Dublin Rd, Columbus, OH, 43215, USA
| | - Alessandro Brigo
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, 4070, Basel, Switzerland
| | - Zoryana Cammerer
- Janssen Research & Development, 1400 McKean Road, Spring House, PA, 19477, USA
| | - Mark T D Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, L3 3AF, UK
| | - Ian Crooks
- British American Tobacco, Research and Development, Regents Park Road, Southampton, Hampshire, SO15 8TL, UK
| | - Kevin P Cross
- Leadscope, Inc, 1393 Dublin Rd, Columbus, OH, 43215, USA
| | - Laura Custer
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ, 08903, USA
| | - Krista Dobo
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT, 06340, USA
| | - Tatyana Doktorova
- Douglas Connect GmbH, Technology Park Basel, Hochbergerstrasse 60C, CH-4057, Basel / Basel-Stadt, Switzerland
| | - David Faulkner
- Lawrence Berkeley National Laboratory, One Cyclotron Road, MS 70A-1161A, Berkeley, CA, 94720, USA
| | - Kevin A Ford
- Global Blood Therapeutics, 171 Oyster Point Boulevard, South San Francisco, CA, 94080, USA
| | - Marie C Fortin
- Jazz Pharmaceuticals, Inc., 200 Princeton South Corporate Center, Suite 180, Ewing, NJ, 08628, USA; Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, 170 Frelinghuysen Rd, Piscataway, NJ, 08855, USA
| | | | | | - Nichola Gellatly
- National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs), Gibbs Building, 215 Euston Road, London, NW1 2BE, UK
| | - Helga Gerets
- UCB BioPharma SPRL, Chemin du Foriest, B-1420, Braine-l'Alleud, Belgium
| | | | - Susanne Glowienke
- Novartis Pharma AG, Pre-Clinical Safety, Werk Klybeck, CH, 4057, Basel, Switzerland
| | - Jacky Van Gompel
- Janssen Pharmaceutical Companies of Johnson & Johnson, 2340, Beerse, Belgium
| | - James S Harvey
- GlaxoSmithKline Pre-Clinical Development, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Jedd Hillegass
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ, 08903, USA
| | - Masamitsu Honma
- Division of Genetics and Mutagenesis, National Institute of Health Sciences, Kanagawa, 210-9501, Japan
| | - Jui-Hua Hsieh
- Kelly Government Solutions, Research Triangle Park, NC, 27709, USA
| | - Chia-Wen Hsu
- FDA Center for Drug Evaluation and Research, Silver Spring, MD, USA
| | | | | | - Robert Jolly
- Toxicology Division, Eli Lilly and Company, Indianapolis, IN, USA
| | - David Jones
- Medicines and Healthcare Products Regulatory Agency, 10 South Colonnade, Canary Wharf, London, E14 4PU, UK
| | - Ray Kemper
- Vertex Pharmaceuticals Inc., Predictive and Investigative Safety Assessment, 50 Northern Ave, Boston, MA, USA
| | - Michelle O Kenyon
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT, 06340, USA
| | - Naomi L Kruhlak
- FDA Center for Drug Evaluation and Research, Silver Spring, MD, USA
| | - Sunil A Kulkarni
- Existing Substances Risk Assessment Bureau, Health Canada, Ottawa, ON, K1A 0K9, Canada
| | - Klaus Kümmerer
- Institute for Sustainable and Environmental Chemistry, Leuphana University Lüneburg, Scharnhorststraße 1/C13.311b, 21335, Lüneburg, Germany
| | - Penny Leavitt
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ, 08903, USA
| | - Scott Masten
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC, 27709, USA
| | - Scott Miller
- Leadscope, Inc, 1393 Dublin Rd, Columbus, OH, 43215, USA
| | | | - Wolfgang Muster
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd, Grenzacherstrasse 124, 4070, Basel, Switzerland
| | | | | | - Mark Powley
- Merck Research Laboratories, West Point, PA, 19486, USA
| | | | | | | | | | - Ronald D Snyder
- RDS Consulting Services, 2936 Wooded Vista Ct, Mason, OH, 45040, USA
| | | | | | | | | | | | | | | | - Brian A Wall
- Colgate-Palmolive Company, Piscataway, NJ, 08854, USA
| | - Pete Watts
- Bibra, Cantium House, Railway Approach, Wallington, Surrey, SM6 0DZ, UK
| | - Angela T White
- GlaxoSmithKline Pre-Clinical Development, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Joerg Wichard
- Bayer AG, Pharmaceuticals Division, Investigational Toxicology, Muellerstr. 178, D-13353, Berlin, Germany
| | - Kristine L Witt
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC, 27709, USA
| | - Adam Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
| | - David Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
| | - Craig Zwickl
- Transendix LLC, 1407 Moores Manor, Indianapolis, IN, 46229, USA
| | - Glenn J Myatt
- Leadscope, Inc, 1393 Dublin Rd, Columbus, OH, 43215, USA
|
40
|
Gadaleta D, Vuković K, Toma C, Lavado GJ, Karmaus AL, Mansouri K, Kleinstreuer NC, Benfenati E, Roncaglioni A. SAR and QSAR modeling of a large collection of LD50 rat acute oral toxicity data. J Cheminform 2019; 11:58. [PMID: 33430989 PMCID: PMC6717335 DOI: 10.1186/s13321-019-0383-2] [Citation(s) in RCA: 64] [Impact Index Per Article: 10.7] [Received: 06/05/2019] [Accepted: 08/13/2019] [Indexed: 11/10/2022] Open
Abstract
The median lethal dose for rodent oral acute toxicity (LD50) is a standard piece of information required to categorize chemicals in terms of the potential hazard posed to human health after acute exposure. The exclusive use of in vivo testing is limited by the time and costs required for performing experiments and by the need to sacrifice a number of animals. (Quantitative) structure-activity relationships [(Q)SAR] proved a valid alternative to reduce and assist in vivo assays for assessing acute toxicological hazard. In the framework of a new international collaborative project, the NTP Interagency Center for the Evaluation of Alternative Toxicological Methods and the U.S. Environmental Protection Agency's National Center for Computational Toxicology compiled a large database of rat acute oral LD50 data, with the aim of supporting the development of new computational models for predicting five regulatory relevant acute toxicity endpoints. In this article, a series of regression and classification computational models were developed by employing different statistical and knowledge-based methodologies. External validation was performed to demonstrate the real-life predictability of models. Integrated modeling was then applied to improve performance of single models. Statistical results confirmed the relevance of developed models in regulatory frameworks, and confirmed the effectiveness of integrated modeling. The best integrated strategies reached RMSEs lower than 0.50 and the best classification models reached balanced accuracies over 0.70 for multi-class and over 0.80 for binary endpoints. Computed predictions will be hosted on the EPA's Chemistry Dashboard and made freely available to the scientific community.
Affiliation(s)
- Domenico Gadaleta
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy.
| | - Kristijan Vuković
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
| | - Cosimo Toma
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
- Institute for Risk Assessment Sciences, Utrecht University, PO Box 80177, 3508 TD, Utrecht, The Netherlands
| | - Giovanna J Lavado
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
| | - Agnes L Karmaus
- Integrated Laboratory Systems, Research Triangle Park, NC, 27560, USA
| | - Kamel Mansouri
- Integrated Laboratory Systems, Research Triangle Park, NC, 27560, USA
| | - Nicole C Kleinstreuer
- NTP Interagency Center for the Evaluation of Alternative Toxicological Methods, National Institute of Environmental Health Sciences, Research Triangle Park, NC, 27560, USA
| | - Emilio Benfenati
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
| | - Alessandra Roncaglioni
- Laboratory of Environmental Chemistry and Toxicology, Department of Environmental Health Sciences, Istituto di Ricerche Farmacologiche Mario Negri IRCCS, Via Mario Negri 2, 20156, Milan, Italy
|
41
|
Banerjee P, Eckert AO, Schrey AK, Preissner R. ProTox-II: a webserver for the prediction of toxicity of chemicals. Nucleic Acids Res 2018; 46:W257-W263. [PMID: 29718510 PMCID: PMC6031011 DOI: 10.1093/nar/gky318] [Citation(s) in RCA: 1420] [Impact Index Per Article: 236.7] [Received: 02/05/2018] [Accepted: 04/26/2018] [Indexed: 01/06/2023] Open
Abstract
Advancement in the field of computational research has made it possible for the in silico methods to offer significant benefits to both regulatory needs and requirements for risk assessments, and pharmaceutical industry to assess the safety profile of a chemical. Here, we present ProTox-II that incorporates molecular similarity, pharmacophores, fragment propensities and machine-learning models for the prediction of various toxicity endpoints; such as acute toxicity, hepatotoxicity, cytotoxicity, carcinogenicity, mutagenicity, immunotoxicity, adverse outcomes pathways (Tox21) and toxicity targets. The predictive models are built on data from both in vitro assays (e.g. Tox21 assays, Ames bacterial mutation assays, hepG2 cytotoxicity assays, Immunotoxicity assays) and in vivo cases (e.g. carcinogenicity, hepatotoxicity). The models have been validated on independent external sets and have shown strong performance. ProTox-II provides a freely available webserver for in silico toxicity prediction for toxicologists, regulatory agencies, computational and medicinal chemists, and all users without login at http://tox.charite.de/protox_II. The webserver takes a two-dimensional chemical structure as an input and reports the possible toxicity profile of the chemical for 33 models with confidence scores, and an overall toxicity radar chart along with three most similar compounds with known acute toxicity.
Affiliation(s)
- Priyanka Banerjee
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité - University Medicine Berlin, 10115 Berlin, Germany
| | - Andreas O Eckert
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité - University Medicine Berlin, 10115 Berlin, Germany
| | - Anna K Schrey
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité - University Medicine Berlin, 10115 Berlin, Germany
| | - Robert Preissner
- Structural Bioinformatics Group, Institute for Physiology & ECRC, Charité - University Medicine Berlin, 10115 Berlin, Germany; BB3R - Berlin Brandenburg 3R Graduate School, Freie Universität Berlin, Berlin, Germany
|
42
|
McEachran AD, Balabin I, Cathey T, Transue TR, Al-Ghoul H, Grulke C, Sobus JR, Williams AJ. Linking in silico MS/MS spectra with chemistry data to improve identification of unknowns. Sci Data 2019; 6:141. [PMID: 31375670 PMCID: PMC6677792 DOI: 10.1038/s41597-019-0145-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 03/04/2019] [Accepted: 07/01/2019] [Indexed: 12/21/2022] Open
Abstract
Confident identification of unknown chemicals in high resolution mass spectrometry (HRMS) screening studies requires cohesive workflows and complementary data, tools, and software. Chemistry databases, screening libraries, and chemical metadata have become fixtures in identification workflows. To increase confidence in compound identifications, the use of structural fragmentation data collected via tandem mass spectrometry (MS/MS or MS2) is vital. However, the availability of empirically collected MS/MS data for identification of unknowns is limited. Researchers have therefore turned to in silico generation of MS/MS data for use in HRMS-based screening studies. This paper describes the generation en masse of predicted MS/MS spectra for the entirety of the US EPA's DSSTox database using competitive fragmentation modelling and a freely available open source tool, CFM-ID. The generated dataset comprises predicted MS/MS spectra for ~700,000 structures, and mappings between predicted spectra, structures, associated substances, and chemical metadata. Together, these resources facilitate improved compound identifications in HRMS screening studies. These data are accessible via an SQL database, a comma-separated export file (.csv), and EPA's CompTox Chemicals Dashboard.
Affiliation(s)
- Andrew D McEachran
- Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, United States Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA; National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Ilya Balabin
- CSRA Inc., 109 T.W. Alexander Drive, Research Triangle Park, Durham, NC, 27711, USA
| | - Tommy Cathey
- GDIT, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Thomas R Transue
- GDIT, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Hussein Al-Ghoul
- Oak Ridge Associated Universities (ORAU), 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Chris Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Jon R Sobus
- National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA
| | - Antony J Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, Durham, NC, 27711, USA.
|
43
|
Ntie-Kang F. Mechanistic role of plant-based bitter principles and bitterness prediction for natural product studies II: prediction tools and case studies. Physical Sciences Reviews 2019. [DOI: 10.1515/psr-2019-0007] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/15/2022]
Abstract
The first part of this chapter provides an overview of computer-based tools (algorithms, web servers, and software) for the prediction of bitterness in compounds. These tools all implement machine learning (ML) methods and are all freely accessible. For each tool, a brief description of the implemented method is provided, along with the training sets and the benchmarking results. In the second part, an attempt has been made to explain at the mechanistic level why some medicinal plants are bitter and how plants use bitter natural compounds, obtained through the biosynthetic process as important ingredients for adapting to the environment. A further exploration is made on the role of bitter natural products in the defense mechanism of plants against insect pest, herbivores, and other invaders. Case studies have focused on alkaloids, terpenoids, cyanogenic glucosides and phenolic derivatives.
|
44
|
Tuwani R, Wadhwa S, Bagler G. BitterSweet: Building machine learning models for predicting the bitter and sweet taste of small molecules. Sci Rep 2019; 9:7155. [PMID: 31073241 PMCID: PMC6509165 DOI: 10.1038/s41598-019-43664-y] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Received: 10/01/2018] [Accepted: 04/12/2019] [Indexed: 01/29/2023] Open
Abstract
The dichotomy of sweet and bitter tastes is a salient evolutionary feature of human gustatory system with an innate attraction to sweet taste and aversion to bitterness. A better understanding of molecular correlates of bitter-sweet taste gradient is crucial for identification of natural as well as synthetic compounds of desirable taste on this axis. While previous studies have advanced our understanding of the molecular basis of bitter-sweet taste and contributed models for their identification, there is ample scope to enhance these models by meticulous compilation of bitter-sweet molecules and utilization of a wide spectrum of molecular descriptors. Towards these goals, our study provides a structured compilation of bitter, sweet and tasteless molecules and state-of-the-art machine learning models for bitter-sweet taste prediction (BitterSweet). We compare different sets of molecular descriptors for their predictive performance and further identify important features as well as feature blocks. The utility of BitterSweet models is demonstrated by taste prediction on large specialized chemical sets such as FlavorDB, FooDB, SuperSweet, Super Natural II, DSSTox, and DrugBank. To facilitate future research in this direction, we make all datasets and BitterSweet models publicly available, and present an end-to-end software for bitter-sweet taste prediction based on freely available chemical descriptors.
Affiliation(s)
- Rudraksh Tuwani
- Complex Systems Laboratory, Center for Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi), New Delhi, India
| | - Somin Wadhwa
- Complex Systems Laboratory, Center for Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi), New Delhi, India
| | - Ganesh Bagler
- Complex Systems Laboratory, Center for Computational Biology, Indraprastha Institute of Information Technology (IIIT-Delhi), New Delhi, India.
|
45
|
Colby SM, Thomas DG, Nuñez JR, Baxter DJ, Glaesemann KR, Brown JM, Pirrung MA, Govind N, Teeguarden JG, Metz TO, Renslow RS. ISiCLE: A Quantum Chemistry Pipeline for Establishing in Silico Collision Cross Section Libraries. Anal Chem 2019; 91:4346-4356. [PMID: 30741529 PMCID: PMC6526953 DOI: 10.1021/acs.analchem.8b04567] [Citation(s) in RCA: 80] [Impact Index Per Article: 13.3] [Indexed: 12/11/2022]
Abstract
High-throughput, comprehensive, and confident identifications of metabolites and other chemicals in biological and environmental samples will revolutionize our understanding of the role these chemically diverse molecules play in biological systems. Despite recent technological advances, metabolomics studies still result in the detection of a disproportionate number of features that cannot be confidently assigned to a chemical structure. This inadequacy is driven by the single most significant limitation in metabolomics, the reliance on reference libraries constructed by analysis of authentic reference materials with limited commercial availability. To this end, we have developed the in silico chemical library engine (ISiCLE), a high-performance computing-friendly cheminformatics workflow for generating libraries of chemical properties. In the instantiation described here, we predict probable three-dimensional molecular conformers (i.e., conformational isomers) using chemical identifiers as input, from which collision cross sections (CCS) are derived. The approach employs first-principles simulation, distinguished by the use of molecular dynamics, quantum chemistry, and ion mobility calculations, to generate structures and chemical property libraries, all without training data. Importantly, optimization of ISiCLE included a refactoring of the popular MOBCAL code for trajectory-based mobility calculations, improving its computational efficiency by over 2 orders of magnitude. Calculated CCS values were validated against 1983 experimentally measured CCS values and compared to previously reported CCS calculation approaches. Average calculated CCS error for the validation set is 3.2% using standard parameters, outperforming other density functional theory (DFT)-based methods and machine learning methods (e.g., MetCCS). 
An online database is introduced for sharing both calculated and experimental CCS values (metabolomics.pnnl.gov), initially including a CCS library with over 1 million entries. Finally, three successful applications of molecule characterization using calculated CCS are described, including providing evidence for the presence of an environmental degradation product, the separation of molecular isomers, and an initial characterization of complex blinded mixtures of exposure chemicals. This work represents a method to address the limitations of small molecule identification and offers an alternative to generating chemical identification libraries experimentally by analyzing authentic reference materials. All code is available at github.com/pnnl.
Affiliation(s)
- Sean M. Colby
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Dennis G. Thomas
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Jamie R. Nuñez
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Douglas J. Baxter
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Kurt R. Glaesemann
- Communications and Information Technology Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Joseph M. Brown
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Meg A. Pirrung
- National Security Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Niranjan Govind
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Justin G. Teeguarden
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
- Department of Environmental and Molecular Toxicology, Oregon State University, Corvallis, Oregon 97331, United States
| | - Thomas O. Metz
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Ryan S. Renslow
- Earth and Biological Sciences Directorate, Pacific Northwest National Laboratory, Richland, Washington 99352, United States
|
46
|
Sosnin S, Karlov D, Tetko IV, Fedorov MV. Comparative Study of Multitask Toxicity Modeling on a Broad Chemical Space. J Chem Inf Model 2019; 59:1062-1072. [PMID: 30589269 DOI: 10.1021/acs.jcim.8b00685] [Citation(s) in RCA: 58] [Impact Index Per Article: 9.7] [Indexed: 02/07/2023]
Abstract
Acute toxicity is one of the most challenging properties to predict purely with computational methods due to its direct relationship to biological interactions. Moreover, toxicity can be represented by different end points: it can be measured for different species using different types of administration, etc., and it is questionable whether knowledge transfer between end points is possible. We performed a comparative study of multitask toxicity prediction for a broad chemical space using different descriptors and modeling algorithms and applied multitask learning to a large toxicity data set extracted from the Registry of Toxic Effects of Chemical Substances (RTECS). We demonstrated that multitask modeling provides significant improvement over single-output models and other machine learning methods. Our research reveals that multitask learning can be very useful for improving the quality of acute toxicity modeling and raises a discussion about the usage of multitask approaches for regulation purposes. Our MultiTox models are freely available on the OCHEM platform (ochem.eu/multitox) under a CC-BY-NC license.
Affiliation(s)
- Sergey Sosnin
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
| | - Dmitry Karlov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
| | - Igor V Tetko
- Helmholtz Zentrum München-Research Center for Environmental Health (GmbH), Institute of Structural Biology and BIGCHEM GmbH, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Maxim V Fedorov
- Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia; University of Strathclyde, Department of Physics, John Anderson Building, 107 Rottenrow East, Glasgow G4 0NG, U.K.
|
47
|
Ring CL, Arnot JA, Bennett DH, Egeghy PP, Fantke P, Huang L, Isaacs KK, Jolliet O, Phillips KA, Price PS, Shin HM, Westgate JN, Setzer RW, Wambaugh JF. Consensus Modeling of Median Chemical Intake for the U.S. Population Based on Predictions of Exposure Pathways. Environmental Science & Technology 2019; 53:719-732. [PMID: 30516957 PMCID: PMC6690061 DOI: 10.1021/acs.est.8b04056] [Citation(s) in RCA: 87] [Impact Index Per Article: 14.5] [Indexed: 05/19/2023]
Abstract
Prioritizing the potential risk posed to human health by chemicals requires tools that can estimate exposure from limited information. In this study, chemical structure and physicochemical properties were used to predict the probability that a chemical might be associated with any of four exposure pathways leading from sources to the general population: consumer (near-field), dietary, far-field industrial, and far-field pesticide. The balanced accuracies of these source-based exposure pathway models range from 73 to 81%, with the error rate for identifying positive chemicals ranging from 17 to 36%. We then used exposure pathways to organize predictions from 13 different exposure models as well as other predictors of human intake rates. We created a consensus meta-model using the Systematic Empirical Evaluation of Models framework in which the predictors of exposure were combined by pathway and weighted according to predictive ability for chemical intake rates inferred from human biomonitoring data for 114 chemicals. The consensus model yields an R² of ∼0.8. We extrapolate to predict relevant pathway(s), median intake rate, and credible interval for 479,926 chemicals, mostly with minimal exposure information. This approach identifies 1,880 chemicals for which the median population intake rates may exceed 0.1 mg/kg bodyweight/day, while there is 95% confidence that the median intake rate is below 1 μg/kg bodyweight/day for 474,572 compounds.
Affiliation(s)
- Caroline L. Ring
- National Center for Computational Toxicology, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
- Oak Ridge Institute for Science and Education, Oak Ridge, Tennessee 37831
| | - Jon A. Arnot
- ARC Arnot Research and Consulting, 36 Sproat Ave. Toronto, ON, Canada, M4M 1W4
- Department of Physical & Environmental Sciences, University of Toronto Scarborough 1265 Military Trail, Toronto, ON, Canada, M1C 1A4
- Department of Pharmacology and Toxicology, University of Toronto, 1 King’s College Cir, Toronto, ON, Canada, M5S 1A8
| | - Deborah H. Bennett
- Department of Public Health Sciences, University of California, Davis, California, 95616
| | - Peter P. Egeghy
- National Exposure Research Laboratory, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
| | - Peter Fantke
- Quantitative Sustainability Assessment Division, Department of Management Engineering, Technical University of Denmark, 2800 Kgs. Lyngby, Denmark
| | - Lei Huang
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109
| | - Kristin K. Isaacs
- National Exposure Research Laboratory, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
| | - Olivier Jolliet
- Department of Environmental Health Sciences, School of Public Health, University of Michigan, Ann Arbor, Michigan 48109
| | - Katherine A. Phillips
- National Exposure Research Laboratory, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
| | - Paul S. Price
- National Exposure Research Laboratory, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
| | - Hyeong-Moo Shin
- Department of Earth and Environmental Sciences, University of Texas, Arlington, Texas, 76019
| | - John N. Westgate
- ARC Arnot Research and Consulting, 36 Sproat Ave. Toronto, ON, Canada, M4M 1W4
| | - R. Woodrow Setzer
- National Center for Computational Toxicology, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
| | - John F. Wambaugh
- National Center for Computational Toxicology, Office of Research and Development, United States Environmental Protection Agency, Research Triangle Park, North Carolina 27711
- Corresponding Author: John F. Wambaugh, 109 T.W. Alexander Dr, NC 27711, USA, Phone: (919) 541-7641
|
48
|
Nicolas CI, Mansouri K, Phillips KA, Grulke CM, Richard AM, Williams AJ, Rabinowitz J, Isaacs KK, Yau A, Wambaugh JF. Rapid experimental measurements of physicochemical properties to inform models and testing. The Science of the Total Environment 2018; 636:901-909. [PMID: 29729507 PMCID: PMC6214190 DOI: 10.1016/j.scitotenv.2018.04.266] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 02/20/2018] [Revised: 04/19/2018] [Accepted: 04/20/2018] [Indexed: 04/14/2023]
Abstract
The structures and physicochemical properties of chemicals are important for determining their potential toxicological effects, toxicokinetics, and route(s) of exposure. These data are needed to prioritize the risk for thousands of environmental chemicals, but experimental values are often lacking. In an attempt to efficiently fill data gaps in physicochemical property information, we generated new data for 200 structurally diverse compounds, which were rigorously selected from the USEPA ToxCast chemical library, and whose structures are available within the Distributed Structure-Searchable Toxicity Database (DSSTox). This pilot study evaluated rapid experimental methods to determine five physicochemical properties, including the log of the octanol:water partition coefficient (known as log(Kow) or logP), vapor pressure, water solubility, Henry's law constant, and the acid dissociation constant (pKa). For most compounds, experiments were successful for at least one property; log(Kow) yielded the largest return (176 values). It was determined that 77 ToxPrint structural features were enriched in chemicals with at least one measurement failure, indicating which features may have played a role in rapid method failures. To gauge consistency with traditional measurement methods, the new measurements were compared with previous measurements (where available). Since quantitative structure-activity/property relationship (QSAR/QSPR) models are used to fill gaps in physicochemical property information, 5 suites of QSPRs were evaluated for their predictive ability and chemical coverage or applicability domain of new experimental measurements. 
The ability to have accurate measurements of these properties will facilitate better exposure predictions in two ways: 1) direct input of these experimental measurements into exposure models; and 2) construction of QSPRs with a wider applicability domain, as their predicted physicochemical values can be used to parameterize exposure models in the absence of experimental data.
Affiliation(s)
- Chantel I Nicolas
- ScitoVation, LLC 6 Davis Drive, Durham, NC 27703, USA; National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA; Oak Ridge Institute for Science and Education, Oak Ridge, TN 37831, USA
| | - Kamel Mansouri
- ScitoVation, LLC 6 Davis Drive, Durham, NC 27703, USA; National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA; Oak Ridge Institute for Science and Education, Oak Ridge, TN 37831, USA
| | - Katherine A Phillips
- National Exposure Research Laboratory, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - Christopher M Grulke
- National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - Ann M Richard
- National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - Antony J Williams
- National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - James Rabinowitz
- National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - Kristin K Isaacs
- National Exposure Research Laboratory, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA
| | - Alice Yau
- Southwest Research Institute, San Antonio, TX 78238, USA
| | - John F Wambaugh
- National Center for Computational Toxicology, Office of Research and Development, US EPA, Research Triangle Park, NC 27711, USA.
| |
|
49
|
McEachran AD, Mansouri K, Grulke C, Schymanski EL, Ruttkies C, Williams AJ. "MS-Ready" structures for non-targeted high-resolution mass spectrometry screening studies. J Cheminform 2018; 10:45. [PMID: 30167882 PMCID: PMC6117229 DOI: 10.1186/s13321-018-0299-2] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 05/16/2018] [Accepted: 08/21/2018] [Indexed: 02/05/2023] Open
Abstract
Chemical database searching has become a fixture in many non-targeted identification workflows based on high-resolution mass spectrometry (HRMS). However, the form of a chemical structure observed in HRMS does not always match the form stored in a database (e.g., the neutral form versus a salt; one component of a mixture rather than the mixture form used in a consumer product). Linking the form of a structure observed via HRMS to its related form(s) within a database will enable the return of all relevant variants of a structure, as well as the related metadata, in a single query. A Konstanz Information Miner (KNIME) workflow has been developed to produce structural representations observed using HRMS ("MS-Ready structures") and link them to those stored in a database. These MS-Ready structures, and associated mappings to the full chemical representations, are surfaced via the US EPA's Chemistry Dashboard (https://comptox.epa.gov/dashboard/). This article describes the workflow for the generation and linking of ~700,000 MS-Ready structures (derived from ~760,000 original structures) as well as download, search and export capabilities to serve structure identification using HRMS. The importance of this form of structural representation for HRMS is demonstrated with several examples, including integration with the in silico fragmentation software application MetFrag. The structures, search, download and export functionality are all available through the CompTox Chemistry Dashboard, while the MetFrag implementation can be viewed at https://msbi.ipb-halle.de/MetFragBeta/.
Affiliation(s)
- Andrew D. McEachran
- Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
| | - Kamel Mansouri
- Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
- Present Address: Integrated Laboratory Systems, Inc., 601 Keystone Dr., Morrisville, NC 27650 USA
| | - Chris Grulke
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
| | - Emma L. Schymanski
- Luxembourg Centre for Systems Biomedicine (LCSB), University of Luxembourg, 6, avenue du Swing, 4367 Belvaux, Luxembourg
| | - Christoph Ruttkies
- Department of Stress and Development Biology, Leibniz Institute of Plant Biochemistry (IPB), Weinberg 3, 06120 Halle (Saale), Germany
| | - Antony J. Williams
- National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Mail Drop D143-02, 109 T.W. Alexander Dr., Research Triangle Park, NC 27711 USA
| |
|
50
|
McEachran AD, Hedgespeth ML, Newton SR, McMahen R, Strynar M, Shea D, Nichols EG. Comparison of emerging contaminants in receiving waters downstream of a conventional wastewater treatment plant and a forest-water reuse system. Environmental Science and Pollution Research International 2018; 25:12451-12463. [PMID: 29460251 PMCID: PMC6739829 DOI: 10.1007/s11356-018-1505-5] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 11/03/2017] [Accepted: 02/06/2018] [Indexed: 05/22/2023]
Abstract
Forest-water reuse (FWR) systems treat municipal, industrial, and agricultural wastewaters via land application to forest soils. Previous studies have shown that both large-scale conventional wastewater treatment plants (WWTPs) and FWR systems do not completely remove many contaminants of emerging concern (CECs) before release of treated wastewater. To better characterize CECs and potential for increased implementation of FWR systems, FWR systems need to be directly compared to conventional WWTPs. In this study, both a quantitative, targeted analysis and a nontargeted analysis were utilized to better understand how CECs release to waterways from an FWR system compared to a conventional treatment system. Quantitatively, greater concentrations and total mass load of CECs were exhibited downstream of the conventional WWTP compared to the FWR. Average summed concentrations of 33 targeted CECs downstream of the conventional system were ~1000 ng/L and downstream of the FWR were ~30 ng/L. From a nontargeted chemical standpoint, more tentatively identified chemicals were present, and at a greater relative abundance, downstream of the conventional system as well. Frequently occurring contaminants included phthalates, pharmaceuticals, and industrial chemicals. These data indicate that FWR systems represent a sustainable wastewater treatment alternative and that emerging contaminant release to waterways was lower at an FWR system than a conventional WWTP.
Affiliation(s)
- Andrew D McEachran
- Department of Forestry and Environmental Resources, College of Natural Resources, North Carolina State University, Raleigh, NC, USA.
| | - Melanie L Hedgespeth
- Department of Forestry and Environmental Resources, College of Natural Resources, North Carolina State University, Raleigh, NC, USA
| | - Seth R Newton
- National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, Durham, NC, 27711, USA
| | - Rebecca McMahen
- Oak Ridge Institute for Science and Education (ORISE) Research Participation Program, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, Durham, NC, 27711, USA
| | - Mark Strynar
- National Exposure Research Laboratory, Office of Research and Development, U.S. Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, Durham, NC, 27711, USA
| | - Damian Shea
- Department of Biological Sciences, College of Science, North Carolina State University, Raleigh, NC, USA
| | - Elizabeth Guthrie Nichols
- Department of Forestry and Environmental Resources, College of Natural Resources, North Carolina State University, Raleigh, NC, USA
| |
|