1. Guo W, Liu J, Dong F, Song M, Li Z, Khan MKH, Patterson TA, Hong H. Review of machine learning and deep learning models for toxicity prediction. Exp Biol Med (Maywood) 2023; 248:1952-1973. [PMID: 38057999 PMCID: PMC10798180 DOI: 10.1177/15353702231209421] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 12/08/2023]
Abstract
The ever-increasing number of chemicals has raised public concerns due to their adverse effects on human health and the environment. To protect public health and the environment, it is critical to assess the toxicity of these chemicals. Traditional in vitro and in vivo toxicity assays are complicated, costly, and time-consuming and may face ethical issues. These constraints raise the need for alternative methods for assessing the toxicity of chemicals. Recently, due to the advancement of machine learning algorithms and the increase in computational power, many toxicity prediction models have been developed using various machine learning and deep learning algorithms such as support vector machine, random forest, k-nearest neighbors, ensemble learning, and deep neural network. This review summarizes the machine learning- and deep learning-based toxicity prediction models developed in recent years. Support vector machine and random forest are the most popular machine learning algorithms, and hepatotoxicity, cardiotoxicity, and carcinogenicity are the most frequently modeled toxicity endpoints in predictive toxicology. Datasets are known to affect model performance, so the quality of the datasets used to develop machine learning and deep learning toxicity prediction models is vital to the performance of those models. Different toxicity assignments for the same chemicals have been observed among datasets of the same type of toxicity, indicating that benchmark datasets are needed for developing reliable toxicity prediction models using machine learning and deep learning algorithms. This review provides insights into current machine learning models in predictive toxicology, which are expected to promote the development and application of toxicity prediction models in the future.
Affiliation(s)
- Wenjing Guo
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Jie Liu
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Fan Dong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Meng Song
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Zoe Li
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Md Kamrul Hasan Khan
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Tucker A Patterson
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA
- Huixiao Hong
  - National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR 72079, USA

2. Ruan T, Li P, Wang H, Li T, Jiang G. Identification and Prioritization of Environmental Organic Pollutants: From an Analytical and Toxicological Perspective. Chem Rev 2023; 123:10584-10640. [PMID: 37531601 DOI: 10.1021/acs.chemrev.3c00056] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Indexed: 08/04/2023]
Abstract
Exposure to environmental organic pollutants has triggered significant ecological impacts and adverse health outcomes, which have received substantial and increasing attention. The contribution of unidentified chemical components is considered the most significant knowledge gap in understanding the combined effects of pollutant mixtures. To address this issue, remarkable analytical breakthroughs have recently been made. In this review, the basic principles for recognizing environmental organic pollutants are summarized. Complementary analytical methodologies (i.e., quantitative structure-activity relationship prediction, mass spectrometric nontarget screening, and effect-directed analysis) and experimental platforms are briefly described. The stages of technique development and/or essential parts of the analytical workflow for each of the methodologies are then reviewed. Finally, plausible technique paths and applications of future nontarget screening methods, interdisciplinary techniques for achieving toxicant identification, and burgeoning strategies for risk assessment of chemical cocktails are discussed.
Affiliation(s)
- Ting Ruan
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Pengyang Li
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Haotian Wang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Tingyu Li
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
  - University of Chinese Academy of Sciences, Beijing 100049, China
- Guibin Jiang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
  - University of Chinese Academy of Sciences, Beijing 100049, China

3. Shin HK, Huang R, Chen M. In silico modeling-based new alternative methods to predict drug and herb-induced liver injury: A review. Food Chem Toxicol 2023; 179:113948. [PMID: 37460037 PMCID: PMC10640386 DOI: 10.1016/j.fct.2023.113948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2023] [Revised: 07/10/2023] [Accepted: 07/14/2023] [Indexed: 07/25/2023]
Abstract
New approach methods (NAMs) have been developed to predict a wide range of toxicities through innovative technologies. Liver injury is one of the most extensively studied endpoints due to its severity and frequency, occurring among populations that consume drugs or dietary supplements. In this review, we focus on recent developments of in silico modeling for liver injury prediction using deep learning and in vitro data based on adverse outcome pathways (AOPs). Despite these models being mainly developed using datasets generated from drug-like molecules, they were also applied to the prediction of hepatotoxicity caused by herbal products. As deep learning has achieved great success in many different fields, advanced machine learning algorithms have been actively applied to improve the accuracy of in silico models. Additionally, the development of liver AOPs, combined with big data in toxicology, has been valuable in developing in silico models with enhanced predictive performance and interpretability. Specifically, one approach involves developing structure-based models for predicting molecular initiating events of liver AOPs, while others use in vitro data with structure information as model inputs for making predictions. Even though liver injury remains a difficult endpoint to predict, advancements in machine learning algorithms and the expansion of in vitro databases with relevant biological knowledge have made a huge impact on improving in silico modeling for drug-induced liver injury prediction.
Affiliation(s)
- Hyun Kil Shin
  - Department of Predictive Toxicology, Korea Institute of Toxicology (KIT), 34114, Daejeon, Republic of Korea
- Ruili Huang
  - Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD, 20850, USA
- Minjun Chen
  - Division of Bioinformatics and Biostatistics, National Center for Toxicological Research (NCTR), U.S. Food and Drug Administration, 3900 NCTR Rd., Jefferson, AR, 72079, USA

4. Hu Y, Ren Q, Liu X, Gao L, Xiao L, Yu W. In Silico Prediction of Human Organ Toxicity via Artificial Intelligence Methods. Chem Res Toxicol 2023. [PMID: 37300507 DOI: 10.1021/acs.chemrestox.2c00411] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Unpredicted human organ level toxicity remains one of the major reasons for drug clinical failure. There is a critical need for cost-efficient strategies in the early stages of drug development for human toxicity assessment. At present, artificial intelligence methods are popularly regarded as a promising solution in chemical toxicology. Thus, we provided comprehensive in silico prediction models for eight significant human organ level toxicity end points using machine learning, deep learning, and transfer learning algorithms. In this work, our results showed that the graph-based deep learning approach was generally better than the conventional machine learning models, and good performances were observed for most of the human organ level toxicity end points in this study. In addition, we found that the transfer learning algorithm could improve model performance for skin sensitization end point using source domain of in vivo acute toxicity data and in vitro data of the Tox21 project. It can be concluded that our models can provide useful guidance for the rapid identification of the compounds with human organ level toxicity for drug discovery.
Affiliation(s)
- Yuxuan Hu
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, China
- Qiuhan Ren
  - School of Science, China Pharmaceutical University, Nanjing 211198, China
- Xintong Liu
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, China
- Liming Gao
  - School of Science, China Pharmaceutical University, Nanjing 211198, China
- Lecheng Xiao
  - School of Pharmacy, China Pharmaceutical University, Nanjing 211198, China
- Wenying Yu
  - State Key Laboratory of Natural Medicines, China Pharmaceutical University, Nanjing 210009, China

5. Chung E, Russo DP, Ciallella HL, Wang YT, Wu M, Aleksunes LM, Zhu H. Data-Driven Quantitative Structure-Activity Relationship Modeling for Human Carcinogenicity by Chronic Oral Exposure. Environ Sci Technol 2023; 57:6573-6588. [PMID: 37040559 PMCID: PMC10134506 DOI: 10.1021/acs.est.3c00648] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 01/24/2023] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 06/19/2023]
Abstract
Traditional methodologies for assessing chemical toxicity are expensive and time-consuming. Computational modeling approaches have emerged as low-cost alternatives, especially those used to develop quantitative structure-activity relationship (QSAR) models. However, conventional QSAR models have limited training data, leading to low predictivity for new compounds. We developed a data-driven modeling approach for constructing carcinogenicity-related models and used these models to identify potential new human carcinogens. To this goal, we used a probe carcinogen dataset from the US Environmental Protection Agency's Integrated Risk Information System (IRIS) to identify relevant PubChem bioassays. Responses of 25 PubChem assays were significantly relevant to carcinogenicity. Eight assays inferred carcinogenicity predictivity and were selected for QSAR model training. Using 5 machine learning algorithms and 3 types of chemical fingerprints, 15 QSAR models were developed for each PubChem assay dataset. These models showed acceptable predictivity during 5-fold cross-validation (average CCR = 0.71). Using our QSAR models, we can correctly predict and rank 342 IRIS compounds' carcinogenic potentials (PPV = 0.72). The models predicted potential new carcinogens, which were validated by a literature search. This study portends an automated technique that can be applied to prioritize potential toxicants using validated QSAR models based on extensive training sets from public data resources.
Affiliation(s)
- Elena Chung
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Daniel P. Russo
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Heather L. Ciallella
  - Department of Toxicology, Cuyahoga County Medical Examiner’s Office, 11001 Cedar Avenue, Cleveland, Ohio 44106, United States
- Yu-Tang Wang
  - Institute of Agro-Products Processing Science and Technology, Chinese Academy of Agricultural Sciences/Key Laboratory of Agro-Products Processing, Ministry of Agriculture, Beijing 100193, China
- Min Wu
  - School of Life Science and Technology, China Pharmaceutical University, No. 24, Tong Jia Xiang, Nanjing 210009, China
- Lauren M. Aleksunes
  - Department of Pharmacology and Toxicology, Rutgers University, Ernest Mario School of Pharmacy, 170 Frelinghuysen Road, Piscataway, New Jersey 08854, United States
- Hao Zhu
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States

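The Chung et al. abstract above reports 15 QSAR models per assay dataset (5 algorithms × 3 fingerprint types), scored by CCR and PPV. The sketch below only illustrates that bookkeeping. It assumes CCR means the average of sensitivity and specificity, which the abstract does not define; the algorithm and fingerprint labels are placeholders, and the confusion counts are invented for illustration rather than taken from the study.

```python
from itertools import product

# Placeholder labels: the abstract gives only the counts (5 algorithms, 3 fingerprints).
ALGORITHMS = ["algo_1", "algo_2", "algo_3", "algo_4", "algo_5"]
FINGERPRINTS = ["fp_1", "fp_2", "fp_3"]


def model_grid():
    """Return every algorithm/fingerprint pairing, one QSAR model per pair."""
    return list(product(ALGORITHMS, FINGERPRINTS))


def ccr(tp, fn, tn, fp):
    """Correct classification rate, ASSUMED to be (sensitivity + specificity) / 2."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


def ppv(tp, fp):
    """Positive predictive value: fraction of predicted positives that are true positives."""
    return tp / (tp + fp)


if __name__ == "__main__":
    # Invented confusion counts, for illustration only.
    print("models per assay:", len(model_grid()))
    print("CCR:", round(ccr(tp=40, fn=10, tn=30, fp=20), 3))
    print("PPV:", round(ppv(tp=40, fp=20), 3))
```

Averaging sensitivity and specificity keeps the score from being dominated by the majority class, which matters when active compounds are rare in an assay dataset.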
6. Xu T, Li S, Li AJ, Zhao J, Sakamuru S, Huang W, Xia M, Huang R. Identification of Potent and Selective Acetylcholinesterase/Butyrylcholinesterase Inhibitors by Virtual Screening. J Chem Inf Model 2023; 63:2321-2330. [PMID: 37011147 PMCID: PMC10688023 DOI: 10.1021/acs.jcim.3c00230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/05/2023]
Abstract
Acetylcholinesterase (AChE) and butyrylcholinesterase (BChE) play important roles in human neurodegenerative disorders such as Alzheimer's disease. In this study, machine learning methods were applied to develop quantitative structure-activity relationship models for the prediction of novel AChE and BChE inhibitors based on data from quantitative high-throughput screening assays. The models were used to virtually screen an in-house collection of ∼360K compounds. The optimal models achieved good performance with area under the receiver operating characteristic curve values ranging from 0.83 ± 0.03 to 0.87 ± 0.01 for the prediction of AChE/BChE inhibition activity and selectivity. Experimental validation showed that the best-performing models increased the assay hit rate by several folds. We identified 88 novel AChE and 126 novel BChE inhibitors, 25% (AChE) and 53% (BChE) of which showed potent inhibitory effects (IC50 < 5 μM). In addition, structure-activity relationship analysis of the BChE inhibitors revealed scaffolds for chemistry design and optimization. In conclusion, machine learning models were shown to efficiently identify potent and selective inhibitors against AChE and BChE and novel structural series for further design and development of potential therapeutics against neurodegenerative disorders.
Affiliation(s)
- Tuan Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Shuaizhang Li
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Andrew J. Li
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Jinghua Zhao
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Srilatha Sakamuru
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Wenwei Huang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Menghang Xia
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States

7. Xu T, Kabir M, Sakamuru S, Shah P, Padilha E, Ngan DK, Xia M, Xu X, Simeonov A, Huang R. Predictive Models for Human Cytochrome P450 3A7 Selective Inhibitors and Substrates. J Chem Inf Model 2023; 63:846-855. [PMID: 36719788 PMCID: PMC10664139 DOI: 10.1021/acs.jcim.2c01516] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Indexed: 02/01/2023]
Abstract
Inappropriate use of prescription drugs is potentially more harmful in fetuses/neonates than in adults. Cytochrome P450 (CYP) 3A subfamily undergoes developmental changes in expression, such as a transition from CYP3A7 to CYP3A4 shortly after birth, which provides a potential way to distinguish medication effects on fetuses/neonates and adults. The purpose of this study was to build first-in-class predictive models for both inhibitors and substrates of CYP3A7/CYP3A4 using chemical structure analysis. Three metrics were used to evaluate model performance: area under the receiver operating characteristic curve (AUC-ROC), balanced accuracy (BA), and Matthews correlation coefficient (MCC). The performance varied for each CYP3A7/CYP3A4 inhibitor/substrate model depending on the data set type, model type, rebalancing method, and specific feature set. For the active inhibitor/substrate data set, the optimal models achieved AUC-ROC values ranging from 0.77 ± 0.01 to 0.84 ± 0.01. For the selective inhibitor/substrate data set, the optimal models achieved AUC-ROC values ranging from 0.72 ± 0.02 to 0.79 ± 0.04. The predictive power of the optimal models was validated by compounds with known potencies as CYP3A7/CYP3A4 inhibitors or substrates. In addition, we identified structural features significant for CYP3A7/CYP3A4 selective or common inhibitors and substrates. In summary, the top performing models can be further applied as a tool to rapidly evaluate the safety and efficacy of new drugs separately for fetuses/neonates and adults. The significant structural features could guide the design of new therapeutic drugs as well as aid in the optimization of existing medicine for fetuses/neonates.
Affiliation(s)
- Tuan Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Md Kabir
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
  - The Graduate School of Biomedical Sciences, Departments of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York 10029, United States
- Srilatha Sakamuru
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Pranav Shah
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Elias Padilha
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Deborah K. Ngan
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Menghang Xia
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Xin Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Anton Simeonov
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States

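The Xu et al. abstract above names three evaluation metrics: AUC-ROC, balanced accuracy (BA), and Matthews correlation coefficient (MCC). As a minimal sketch of their standard textbook definitions, and not the authors' code, the functions below compute each one in plain Python. The labels, scores, and the 0.5 decision threshold are toy values chosen for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (1 = active)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def balanced_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity."""
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def auc_roc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, for illustration only.
    y_true = [1, 1, 1, 0, 0, 0]
    scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
    y_pred = [1 if s >= 0.5 else 0 for s in scores]
    counts = confusion_counts(y_true, y_pred)
    print("BA:", round(balanced_accuracy(*counts), 3))
    print("MCC:", round(mcc(*counts), 3))
    print("AUC-ROC:", round(auc_roc(y_true, scores), 3))
```

AUC-ROC is threshold-free, while BA and MCC depend on the chosen cutoff, which is one reason to report all three together.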
8. Ye L, Ngan DK, Xu T, Liu Z, Zhao J, Sakamuru S, Zhang L, Zhao T, Xia M, Simeonov A, Huang R. Prediction of drug-induced liver injury and cardiotoxicity using chemical structure and in vitro assay data. Toxicol Appl Pharmacol 2022; 454:116250. [PMID: 36150479 PMCID: PMC9561045 DOI: 10.1016/j.taap.2022.116250] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 05/26/2022] [Revised: 08/24/2022] [Accepted: 09/14/2022] [Indexed: 11/18/2022]
Abstract
Drug-induced liver injury (DILI) and cardiotoxicity (DICT) are major adverse effects triggered by many clinically important drugs. To provide an alternative to in vivo toxicity testing, the U.S. Tox21 consortium has screened a collection of ∼10K compounds, including drugs in clinical use, against >70 cell-based assays in a quantitative high-throughput screening (qHTS) format. In this study, we compiled reference compound lists for DILI and DICT and compared the potential of Tox21 assay data with chemical structure information in building prediction models for human in vivo hepatotoxicity and cardiotoxicity. Models were built with four different machine learning algorithms (e.g., Random Forest, Naïve Bayes, eXtreme Gradient Boosting, and Support Vector Machine) and model performance was evaluated by calculating the area under the receiver operating characteristic curve (AUC-ROC). Chemical structure-based models showed reasonable predictive power for DILI (best AUC-ROC = 0.75 ± 0.03) and DICT (best AUC-ROC = 0.83 ± 0.03), while Tox21 assay data alone only showed better than random performance. DILI and DICT prediction models built using a combination of assay data and chemical structure information did not have a positive impact on model performance. The suboptimal predictive performance of the assay data is likely due to insufficient coverage of an adequately predictive number of toxicity mechanisms. The Tox21 consortium is currently expanding coverage of biological response space with additional assays that probe toxicologically important targets and under-represented pathways that may improve the prediction of in vivo toxicity such as DILI and DICT.
Affiliation(s)
- Lin Ye
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Deborah K Ngan
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Tuan Xu
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Zhichao Liu
  - National Center for Toxicological Research, U.S. Food and Drug Administration (FDA), Jefferson, AR 72079, USA
- Jinghua Zhao
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Srilatha Sakamuru
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Li Zhang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Tongan Zhao
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Menghang Xia
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Anton Simeonov
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA
- Ruili Huang
  - Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, MD 20850, USA

9. Xu X, Wang C, Gui B, Yuan X, Li C, Zhao Y, Martyniuk CJ, Su L. Application of machine learning to predict the inhibitory activity of organic chemicals on thyroid stimulating hormone receptor. Environ Res 2022; 212:113175. [PMID: 35351457 DOI: 10.1016/j.envres.2022.113175] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 12/27/2021] [Revised: 03/04/2022] [Accepted: 03/22/2022] [Indexed: 06/14/2023]
Abstract
With the promotion of carbon neutrality, it is also important to synchronously promote the assessment and sustainable management of chemicals so as to protect public health. Humans and animals are possibly exposed to endocrine disruptors that have inhibitory effects on thyroid stimulating hormone receptor (TSHR). As such, it is important to identify chemicals that inhibit TSHR and to develop models to predict their inhibitory activity. In this study, 5952 compounds derived from a cyclic adenosine monophosphate (cAMP) analysis, a key signaling pathway in thyrocytes, were used to establish a binary classification model comparing methods that included random forest (RF), extreme gradient boosting (XGB), and logistic regression (LR). The prediction model based on RF showed the highest identification accuracy for revealing chemicals that may inhibit TSHR. For the RF model, recall was 0.89, balanced accuracy was 0.85, and the area under its receiver operating characteristic (ROC) curve (AUC) was 0.92, indicating that the model had very high predictive capacity. The lowest CDocker energy (CE) and CDocker interaction energy (CIE) for chemicals and TSHR were determined and were subsequently introduced into the predictive model as descriptors. A regression model, extreme gradient boosting-Regression (XGBR), was successfully established, yielding R² = 0.65, to predict inhibitory activity for active compounds. Parameters that included dissociation characteristics, molecular structure, and binding energy were all key factors in the predictive model. We demonstrate that QSAR models are useful approaches, not only for identifying chemicals that inhibit TSHR, but also for predicting inhibitory activity of active compounds.
Affiliation(s)
- Xiaotian Xu
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Chen Wang
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Bingxin Gui
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Xiangyi Yuan
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Chao Li
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Yuanhui Zhao
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China
- Christopher J Martyniuk
  - Center for Environmental and Human Toxicology, Department of Physiological Sciences, College of Veterinary Medicine, UF Genetics Institute, Interdisciplinary Program in Biomedical Sciences Neuroscience, University of Florida, Gainesville, FL, 32611, USA
- Limin Su
  - State Environmental Protection Key Laboratory of Wetland Ecology and Vegetation Restoration, School of Environment, Northeast Normal University, Changchun, 130117, PR China

10. Xu T, Xu M, Zhu W, Chen CZ, Zhang Q, Zheng W, Huang R. Efficient Identification of Anti-SARS-CoV-2 Compounds Using Chemical Structure- and Biological Activity-Based Modeling. J Med Chem 2022; 65:4590-4599. [PMID: 35275639 PMCID: PMC8936051 DOI: 10.1021/acs.jmedchem.1c01372] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 08/19/2021] [Indexed: 12/12/2022]
Abstract
Identification of anti-SARS-CoV-2 compounds through traditional high-throughput screening (HTS) assays is limited by high costs and low hit rates. To address these challenges, we developed machine learning models to identify compounds acting via inhibition of the entry of SARS-CoV-2 into human host cells or the SARS-CoV-2 3-chymotrypsin-like (3CL) protease. The optimal classification models achieved good performance with area under the receiver operating characteristic curve (AUC-ROC) values of >0.78. Experimental validation showed that the best performing models increased the assay hit rate by 2.1-fold for viral entry inhibitors and 10.4-fold for 3CL protease inhibitors compared to those of the original drug repurposing screens. Twenty-two compounds showed potent (<5 μM) antiviral activities in a SARS-CoV-2 live virus assay. In conclusion, machine learning models can be developed and used as a complementary approach to HTS to expand compound screening capacities and improve the speed and efficiency of anti-SARS-CoV-2 drug discovery.
Affiliation(s)
- Tuan Xu
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Miao Xu
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Wei Zhu
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Catherine Z Chen
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Qi Zhang
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Wei Zheng
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
- Division of Pre-clinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
|
11
|
Liu J, Guo W, Sakkiah S, Ji Z, Yavas G, Zou W, Chen M, Tong W, Patterson TA, Hong H. Machine Learning Models for Predicting Liver Toxicity. Methods Mol Biol 2022; 2425:393-415. [PMID: 35188640 DOI: 10.1007/978-1-0716-1960-5_15] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 06/14/2023]
Abstract
Liver toxicity is a major adverse drug reaction that accounts for drug failure in clinical trials and withdrawal from the market. Therefore, predicting potential liver toxicity at an early stage in drug discovery is crucial to reduce costs and the potential for drug failure. However, current in vivo animal toxicity testing is very expensive and time consuming. As an alternative approach, various machine learning models have been developed to predict potential liver toxicity in humans. This chapter reviews current advances in the development and application of machine learning models for prediction of potential liver toxicity in humans and discusses possible improvements to liver toxicity prediction.
Affiliation(s)
- Jie Liu
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wenjing Guo
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Sugunadevi Sakkiah
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Zuowei Ji
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Gokhan Yavas
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Wen Zou
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Minjun Chen
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Weida Tong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Tucker A Patterson
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
- Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, AR, USA
|
12
|
Bassan A, Alves VM, Amberg A, Anger LT, Auerbach S, Beilke L, Bender A, Cronin MT, Cross KP, Hsieh JH, Greene N, Kemper R, Kim MT, Mumtaz M, Noeske T, Pavan M, Pletz J, Russo DP, Sabnis Y, Schaefer M, Szabo DT, Valentin JP, Wichard J, Williams D, Woolley D, Zwickl C, Myatt GJ. In silico approaches in organ toxicity hazard assessment: current status and future needs in predicting liver toxicity. Comput Toxicol 2021; 20:100187. [PMID: 35340402 PMCID: PMC8955833 DOI: 10.1016/j.comtox.2021.100187] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 04/15/2023]
Abstract
Hepatotoxicity is one of the most frequently observed adverse effects resulting from exposure to a xenobiotic. For example, in pharmaceutical research and development it is one of the major reasons for drug withdrawals, clinical failures, and discontinuation of drug candidates. The development of faster and cheaper methods to assess hepatotoxicity that are both more sustainable and more informative is critically needed. The biological mechanisms and processes underpinning hepatotoxicity are summarized and experimental approaches to support the prediction of hepatotoxicity are described, including toxicokinetic considerations. The paper describes the increasingly important role of in silico approaches and highlights challenges to the adoption of these methods including the lack of a commonly agreed upon protocol for performing such an assessment and the need for in silico solutions that take dose into consideration. A proposed framework for the integration of in silico and experimental information is provided along with a case study describing how computational methods have been used to successfully respond to a regulatory question concerning non-genotoxic impurities in chemically synthesized pharmaceuticals.
Affiliation(s)
- Arianna Bassan
- Innovatune srl, Via Giulio Zanon 130/D, 35129 Padova, Italy
- Vinicius M. Alves
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
- Alexander Amberg
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926 Frankfurt am Main, Germany
- Scott Auerbach
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
- Lisa Beilke
- Toxicology Solutions Inc., San Diego, CA, USA
- Andreas Bender
- AI and Data Analytics, Clinical Pharmacology and Safety Sciences, R&D, AstraZeneca, Cambridge, UK
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW
- Mark T.D. Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, L3 3AF, UK
- Jui-Hua Hsieh
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
- Nigel Greene
- Data Science and AI, DSM, IMED Biotech Unit, AstraZeneca, Boston, USA
- Raymond Kemper
- Nuvalent, One Broadway, 14th floor, Cambridge, MA, 02142, USA
- Marlene T. Kim
- US Food and Drug Administration, Center for Drug Evaluation and Research, Silver Spring, MD, 20993, USA
- Moiz Mumtaz
- Office of the Associate Director for Science (OADS), Agency for Toxic Substances and Disease Registry, US Department of Health and Human Services, Atlanta, GA, USA
- Tobias Noeske
- Imaging and Data Analytics, Clinical Pharmacology & Safety Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Manuela Pavan
- Innovatune srl, Via Giulio Zanon 130/D, 35129 Padova, Italy
- Julia Pletz
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Liverpool, L3 3AF, UK
- Daniel P. Russo
- Department of Chemistry, Rutgers University, Camden, NJ 08102, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
- Yogesh Sabnis
- UCB Biopharma SRL, Chemin du Foriest – B-1420 Braine-l’Alleud, Belgium
- Markus Schaefer
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926 Frankfurt am Main, Germany
- Joerg Wichard
- Bayer AG, Genetic Toxicology, Müllerstr. 178, 13353 Berlin, Germany
- Dominic Williams
- Functional & Mechanistic Safety, Clinical Pharmacology & Safety Sciences, AstraZeneca, Darwin Building 310, Cambridge Science Park, Milton Rd, Cambridge CB4 0FZ, UK
- David Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
- Craig Zwickl
- Transendix LLC, 1407 Moores Manor, Indianapolis, IN 46229, USA
- Glenn J. Myatt
- Instem, 1393 Dublin Road, Columbus, OH 43215, USA
- Corresponding author. (G.J. Myatt)
|
13
|
Garcia de Lomana M, Morger A, Norinder U, Buesen R, Landsiedel R, Volkamer A, Kirchmair J, Mathea M. ChemBioSim: Enhancing Conformal Prediction of In Vivo Toxicity by Use of Predicted Bioactivities. J Chem Inf Model 2021; 61:3255-3272. [PMID: 34153183 PMCID: PMC8317154 DOI: 10.1021/acs.jcim.1c00451] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/20/2021] [Indexed: 02/07/2023]
Abstract
Computational methods such as machine learning approaches have a strong track record of success in predicting the outcomes of in vitro assays. In contrast, their ability to predict in vivo endpoints is more limited due to the high number of parameters and processes that may influence the outcome. Recent studies have shown that the combination of chemical and biological data can yield better models for in vivo endpoints. The ChemBioSim approach presented in this work aims to enhance the performance of conformal prediction models for in vivo endpoints by combining chemical information with (predicted) bioactivity assay outcomes. Three in vivo toxicological endpoints, capturing genotoxic (MNT), hepatic (DILI), and cardiological (DICC) issues, were selected for this study due to their high relevance for the registration and authorization of new compounds. Since the sparsity of available biological assay data is challenging for predictive modeling, predicted bioactivity descriptors were introduced instead. Thus, a machine learning model for each of the 373 collected biological assays was trained and applied on the compounds of the in vivo toxicity data sets. Besides the chemical descriptors (molecular fingerprints and physicochemical properties), these predicted bioactivities served as descriptors for the models of the three in vivo endpoints. For this study, a workflow based on a conformal prediction framework (a method for confidence estimation) built on random forest models was developed. Furthermore, the most relevant chemical and bioactivity descriptors for each in vivo endpoint were preselected with lasso models. The incorporation of bioactivity descriptors increased the mean F1 scores of the MNT model from 0.61 to 0.70 and for the DICC model from 0.72 to 0.82 while the mean efficiencies increased by roughly 0.10 for both endpoints. In contrast, for the DILI endpoint, no significant improvement in model performance was observed. 
Besides pure performance improvements, an analysis of the most important bioactivity features allowed detection of novel and less intuitive relationships between the predicted biological assay outcomes used as descriptors and the in vivo endpoints. This study presents how the prediction of in vivo toxicity endpoints can be improved by the incorporation of biological information (which is not necessarily captured by chemical descriptors) in an automated workflow, without adding experimental workload for the generation of bioactivity descriptors, because predicted outcomes of bioactivity assays were used. All bioactivity CP models for deriving the predicted bioactivities, as well as the in vivo toxicity CP models, can be freely downloaded from https://doi.org/10.5281/zenodo.4761225.
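The conformal prediction (CP) framework named in this abstract can be illustrated with a small class-conditional (Mondrian) inductive sketch. This is not the ChemBioSim implementation: it assumes class probabilities are already available from some underlying classifier (a random forest, for instance), uses 1 minus the probability of the class as the nonconformity score, and all function names and numbers are hypothetical.

```python
from typing import Dict, List, Sequence, Set


def calibrate(cal_probs: Sequence[Sequence[float]],
              cal_labels: Sequence[int]) -> Dict[int, List[float]]:
    """Collect nonconformity scores per true class from a calibration set."""
    scores: Dict[int, List[float]] = {}
    for probs, label in zip(cal_probs, cal_labels):
        # Nonconformity: how little probability the model gave the true class.
        scores.setdefault(label, []).append(1.0 - probs[label])
    return scores


def p_value(cal_scores: Sequence[float], test_score: float) -> float:
    """Share of calibration scores at least as nonconforming as the test score."""
    at_least = sum(1 for s in cal_scores if s >= test_score)
    return (at_least + 1) / (len(cal_scores) + 1)


def predict_set(probs: Sequence[float],
                scores: Dict[int, List[float]],
                significance: float) -> Set[int]:
    """Return every class whose p-value exceeds the significance level."""
    return {label for label, cal in scores.items()
            if p_value(cal, 1.0 - probs[label]) > significance}


if __name__ == "__main__":
    # Hypothetical calibration probabilities for a two-class endpoint.
    cal_probs = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.6, 0.4],
                 [0.15, 0.85], [0.25, 0.75], [0.35, 0.65], [0.45, 0.55]]
    cal_labels = [0, 0, 0, 0, 1, 1, 1, 1]
    scores = calibrate(cal_probs, cal_labels)
    print(predict_set([0.95, 0.05], scores, significance=0.25))
```

In this setting, efficiency is commonly measured as the fraction of compounds that receive a single-class prediction set, so a lower significance level gives wider sets and lower efficiency.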
Affiliation(s)
- Marina Garcia de Lomana
- BASF SE, Ludwigshafen am Rhein 67063, Germany
- Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna 1090, Austria
- Andrea Morger
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin 10117, Germany
- Ulf Norinder
- MTM Research Centre, School of Science and Technology, Örebro University, Örebro SE-70182, Sweden
- Andrea Volkamer
- In Silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité Universitätsmedizin Berlin, Charitéplatz 1, Berlin 10117, Germany
- Johannes Kirchmair
- Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna, Vienna 1090, Austria
|
14
|
Wang MWH, Goodman JM, Allen TEH. Machine Learning in Predictive Toxicology: Recent Applications and Future Directions for Classification Models. Chem Res Toxicol 2020; 34:217-239. [PMID: 33356168 DOI: 10.1021/acs.chemrestox.0c00316] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Indexed: 02/07/2023]
Abstract
In recent times, machine learning has become increasingly prominent in predictive toxicology as it has shifted from in vivo studies toward in silico studies. Currently, in vitro methods together with other computational methods such as quantitative structure-activity relationship modeling and absorption, distribution, metabolism, and excretion calculations are being used. An overview of machine learning and its applications in predictive toxicology is presented here, including support vector machines (SVMs), random forest (RF) and decision trees (DTs), neural networks, regression models, naïve Bayes, k-nearest neighbors, and ensemble learning. The recent successes of these machine learning methods in predictive toxicology are summarized, and a comparison of some models used in predictive toxicology is presented. In predictive toxicology, SVMs, RF, and DTs are the dominant machine learning methods due to the characteristics of the data available. Lastly, this review describes the current challenges facing the use of machine learning in predictive toxicology and offers insights into the possible areas of improvement in the field.
Affiliation(s)
- Marcus W H Wang
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Jonathan M Goodman
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- Timothy E H Allen
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom
- MRC Toxicology Unit, University of Cambridge, Hodgkin Building, Lancaster Road, Leicester LE1 7HB, United Kingdom
|
15
|
Xu T, Wu L, Xia M, Simeonov A, Huang R. Systematic Identification of Molecular Targets and Pathways Related to Human Organ Level Toxicity. Chem Res Toxicol 2020; 34:412-421. [PMID: 33251791 DOI: 10.1021/acs.chemrestox.0c00305] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/11/2022]
Abstract
The mechanisms leading to organ level toxicities are poorly understood. In this study, we applied an integrated approach to deduce the molecular targets and biological pathways involved in chemically induced toxicity for eight common human organ level toxicity end points (carcinogenicity, cardiotoxicity, developmental toxicity, hepatotoxicity, nephrotoxicity, neurotoxicity, reproductive toxicity, and skin toxicity). Integrated analysis of in vitro assay data, molecular targets and pathway annotations from the literature, and toxicity-molecular target associations derived from text mining, combined with machine learning techniques, were used to generate molecular targets for each of the organ level toxicity end points. A total of 1516 toxicity-related genes were identified and subsequently analyzed for biological pathway coverage, resulting in 206 significant pathways (p-value <0.05), ranging from 3 (e.g., developmental toxicity) to 101 (e.g., skin toxicity) for each toxicity end point. This study presents a systematic and comprehensive analysis of molecular targets and pathways related to various in vivo toxicity end points. These molecular targets and pathways could aid in understanding the biological mechanisms of toxicity and serve as a guide for the design of suitable in vitro assays for more efficient toxicity testing. In addition, these results are complementary to the existing adverse outcome pathway (AOP) framework and can be used to aid in the development of novel AOPs. Our results provide abundant testable hypotheses for further experimental validation.
Affiliation(s)
- Tuan Xu
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Leihong Wu
- National Center for Toxicological Research, U.S. Food and Drug Administration, 3900 NCTR Road, Jefferson, Arkansas 72079, United States
- Menghang Xia
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Anton Simeonov
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
- Ruili Huang
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences (NCATS), National Institutes of Health (NIH), Rockville, Maryland 20850, United States
|