1
Son A, Park J, Kim W, Yoon Y, Lee S, Ji J, Kim H. Recent Advances in Omics, Computational Models, and Advanced Screening Methods for Drug Safety and Efficacy. Toxics 2024; 12:822. [PMID: 39591001 PMCID: PMC11598288 DOI: 10.3390/toxics12110822] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/29/2024] [Revised: 11/10/2024] [Accepted: 11/14/2024] [Indexed: 11/28/2024]
Abstract
It is imperative to comprehend the mechanisms that underlie drug toxicity in order to enhance the efficacy and safety of novel therapeutic agents. The capacity to identify molecular pathways that contribute to drug-induced toxicity has been significantly enhanced by recent developments in omics technologies, such as transcriptomics, proteomics, and metabolomics. This has enabled the early identification of potential adverse effects. These insights are further enhanced by computational tools, including quantitative structure-activity relationship (QSAR) analyses and machine learning models, which accurately predict toxicity endpoints. Additionally, technologies such as physiologically based pharmacokinetic (PBPK) modeling and micro-physiological systems (MPS) provide more precise preclinical-to-clinical translation, thereby improving drug safety assessments. This review emphasizes the synergy between sophisticated screening technologies, in silico modeling, and omics data, emphasizing their roles in reducing late-stage drug development failures. Challenges persist in the integration of a variety of data types and the interpretation of intricate biological interactions, despite the progress that has been made. The development of standardized methodologies that further enhance predictive toxicology is contingent upon the ongoing collaboration between researchers, clinicians, and regulatory bodies. This collaboration ensures the development of therapeutic pharmaceuticals that are more effective and safer.
Affiliation(s)
- Ahrum Son
  - Department of Molecular Medicine, Scripps Research, San Diego, CA 92037, USA
- Jongham Park
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Woojin Kim
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Yoonki Yoon
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Sangwoon Lee
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Jaeho Ji
  - Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
- Hyunsoo Kim
  - Department of Bio-AI Convergence, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - Department of Convergent Bioscience and Informatics, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - Protein AI Design Institute, Chungnam National University, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
  - SCICS, Prove Beyond AI, 99 Daehak-ro, Yuseong-gu, Daejeon 34134, Republic of Korea
2
Shah P, Siramshetty VB, Mathé E, Xu X. Developing Robust Human Liver Microsomal Stability Prediction Models: Leveraging Inter-Species Correlation with Rat Data. Pharmaceutics 2024; 16:1257. [PMID: 39458588 PMCID: PMC11510424 DOI: 10.3390/pharmaceutics16101257] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/01/2024] [Revised: 09/03/2024] [Accepted: 09/19/2024] [Indexed: 10/28/2024]
Abstract
Objectives: Pharmacokinetic issues were the leading cause of drug attrition, accounting for approximately 40% of all cases before the turn of the century. To this end, several high-throughput in vitro assays like microsomal stability have been developed to evaluate the pharmacokinetic profiles of compounds in the early stages of drug discovery. At NCATS, a single-point rat liver microsomal (RLM) stability assay is used as a Tier I assay, while human liver microsomal (HLM) stability is employed as a Tier II assay. We experimentally screened and collected data on over 30,000 compounds for RLM stability and over 7000 compounds for HLM stability. Although HLM stability screening provides valuable insights, the increasing number of hits generated, along with the time- and resource-intensive nature of the assay, highlights the need for alternative strategies. One promising approach is leveraging in silico models trained on these experimental datasets. Methods: We describe the development of an HLM stability prediction model using our in-house HLM stability dataset. Results: Employing both classical machine learning methods and advanced techniques, such as neural networks, we achieved model accuracies exceeding 80%. Moreover, we validated our model using external test sets and found that our models are comparable to some of the best models in literature. Additionally, the strong correlation observed between our RLM and HLM data was further reinforced by the fact that our HLM model performance improved when using RLM stability predictions as an input descriptor. Conclusions: The best model along with a subset of our dataset (PubChem AID: 1963597) has been made publicly accessible on the ADME@NCATS website for the benefit of the greater drug discovery community. To the best of our knowledge, it is the largest open-source model of its kind and the first to leverage cross-species data.
3
Jia X, Wang T, Zhu H. Advancing Computational Toxicology by Interpretable Machine Learning. Environmental Science & Technology 2023; 57:17690-17706. [PMID: 37224004 PMCID: PMC10666545 DOI: 10.1021/acs.est.3c00653] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 01/25/2023] [Revised: 05/05/2023] [Accepted: 05/05/2023] [Indexed: 05/26/2023]
Abstract
Chemical toxicity evaluations for drugs, consumer products, and environmental chemicals have a critical impact on human health. Traditional animal models to evaluate chemical toxicity are expensive, time-consuming, and often fail to detect toxicants in humans. Computational toxicology is a promising alternative approach that utilizes machine learning (ML) and deep learning (DL) techniques to predict the toxicity potentials of chemicals. Although the applications of ML- and DL-based computational models in chemical toxicity predictions are attractive, many toxicity models are "black boxes" in nature and difficult to interpret by toxicologists, which hampers the chemical risk assessments using these models. The recent progress of interpretable ML (IML) in the computer science field meets this urgent need to unveil the underlying toxicity mechanisms and elucidate the domain knowledge of toxicity models. In this review, we focused on the applications of IML in computational toxicology, including toxicity feature data, model interpretation methods, use of knowledge base frameworks in IML development, and recent applications. The challenges and future directions of IML modeling in toxicology are also discussed. We hope this review can encourage efforts in developing interpretable models with new IML algorithms that can assist new chemical assessments by illustrating toxicity mechanisms in humans.
Affiliation(s)
- Xuelian Jia
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
- Tong Wang
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
- Hao Zhu
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
4
Hao Y, Romano JD, Moore JH. Knowledge graph aids comprehensive explanation of drug and chemical toxicity. CPT Pharmacometrics Syst Pharmacol 2023; 12:1072-1079. [PMID: 37475158 PMCID: PMC10431039 DOI: 10.1002/psp4.12975] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2023] [Revised: 04/04/2023] [Accepted: 04/06/2023] [Indexed: 07/22/2023]
Abstract
In computational toxicology, prediction of complex endpoints has always been challenging, as they often involve multiple distinct mechanisms. State-of-the-art models are either limited by low accuracy, or lack of interpretability due to their black-box nature. Here, we introduce AIDTox, an interpretable deep learning model which incorporates curated knowledge of chemical-gene connections, gene-pathway annotations, and pathway hierarchy. AIDTox accurately predicts cytotoxicity outcomes in HepG2 and HEK293 cells. It also provides comprehensive explanations of cytotoxicity covering multiple aspects of drug activity, including target interaction, metabolism, and elimination. In summary, AIDTox provides a computational framework for unveiling cellular mechanisms for complex toxicity endpoints.
Affiliation(s)
- Yun Hao
  - Genomics and Computational Biology (GCB) Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Joseph D. Romano
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, Pennsylvania, USA
  - Center of Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, Pennsylvania, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars‐Sinai Medical Center, Los Angeles, California, USA
5
Xu JY, Wang K, Men SH, Yang Y, Zhou Q, Yan ZG. QSAR-QSIIR-based prediction of bioconcentration factor using machine learning and preliminary application. Environment International 2023; 177:108003. [PMID: 37276762 DOI: 10.1016/j.envint.2023.108003] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/04/2023] [Revised: 05/25/2023] [Accepted: 05/29/2023] [Indexed: 06/07/2023]
Abstract
Bioconcentration factor (BCF) is one of the important parameters for developing human health ambient water quality criteria (HHAWQC) for chemical pollutants. The traditional experimental method for obtaining BCF is time-consuming and costly. Therefore, prediction of BCF by modeling has attracted much attention. QSAR (Quantitative Structure-Activity Relationship) models based on molecular descriptors are often used to predict BCF; however, in order to improve the accuracy of prediction, previous models are only applicable to a single category of substance and a single species, and cannot meet the needs of BCF prediction for pollutants lacking toxicity data. In this study, 17 optimized traditional molecular descriptors and five kinds of bioactivity descriptors were selected from more than 200 molecular descriptors and 25 kinds of biological activity descriptors. A QSAR-QSIIR (Quantitative Structure In vitro-In vivo Relationship) model suitable for multiple chemical substances and whole species was constructed using an optimized 4-MLP machine learning algorithm with the selected molecular and bioactivity descriptors. The constructed model significantly improves the prediction accuracy of BCF. The R² values of the verification set and test set are 0.8575 and 0.7924, respectively, and the difference between predicted and measured BCF is mostly less than 1.5 times. The BCF of BTEX in common Chinese aquatic products was then predicted using the constructed QSAR-QSIIR model, and the HHAWQC of BTEX in China were derived using the predicted BCF, which provides a valuable reference for the establishment of China's BTEX water quality standards.
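The two figures of merit quoted in this abstract, the coefficient of determination and the ratio between predicted and measured BCF, can be illustrated with a minimal sketch. It assumes that "less than 1.5 times" refers to a symmetric fold error, max(predicted/measured, measured/predicted); the abstract does not define the term. The function names and the toy values are hypothetical and are not taken from the study.

```python
def r_squared(measured, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


def fold_error(measured, predicted):
    """Symmetric ratio between two positive values (always >= 1).

    Assumption: this is what "differ by less than 1.5 times" means.
    """
    return max(measured / predicted, predicted / measured)


def fraction_within_fold(measured, predicted, limit=1.5):
    """Share of compounds whose fold error is below the limit."""
    hits = sum(
        1 for m, p in zip(measured, predicted) if fold_error(m, p) < limit
    )
    return hits / len(measured)


if __name__ == "__main__":
    # Made-up BCF values, for illustration only.
    measured = [10.0, 50.0, 120.0, 400.0]
    predicted = [12.0, 45.0, 200.0, 380.0]
    print(r_squared(measured, predicted))
    print(fraction_within_fold(measured, predicted))
```

Reporting the fold error alongside R² is useful because R² summarizes the whole set, while the fold error shows how far individual predictions stray from the measurements.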
Affiliation(s)
- Jia-Yun Xu
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Kun Wang
  - National Engineering Laboratory for Lake Pollution Control and Ecological Restoration, State Environment Protection Key Laboratory for Lake Pollution Control, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Shu-Hui Men
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Yang Yang
  - China Energy Longyuan Environmental Protection Co., Ltd., Beijing 100039, China
- Quan Zhou
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
- Zhen-Guang Yan
  - State Key Laboratory of Environmental Criteria and Risk Assessment, Chinese Research Academy of Environmental Sciences, Beijing 100012, China
6
Yan X, Yue T, Winkler DA, Yin Y, Zhu H, Jiang G, Yan B. Converting Nanotoxicity Data to Information Using Artificial Intelligence and Simulation. Chem Rev 2023. [PMID: 37262026 DOI: 10.1021/acs.chemrev.3c00070] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Indexed: 06/03/2023]
Abstract
Decades of nanotoxicology research have generated extensive and diverse data sets. However, data is not equal to information. The question is how to extract critical information buried in vast data streams. Here we show that artificial intelligence (AI) and molecular simulation play key roles in transforming nanotoxicity data into critical information, i.e., constructing the quantitative nanostructure (physicochemical properties)-toxicity relationships, and elucidating the toxicity-related molecular mechanisms. For AI and molecular simulation to realize their full impacts in this mission, several obstacles must be overcome. These include the paucity of high-quality nanomaterials (NMs) and standardized nanotoxicity data, the lack of model-friendly databases, the scarcity of specific and universal nanodescriptors, and the inability to simulate NMs at realistic spatial and temporal scales. This review provides a comprehensive and representative, but not exhaustive, summary of the current capability gaps and tools required to fill these formidable gaps. Specifically, we discuss the applications of AI and molecular simulation, which can address the large-scale data challenge for nanotoxicology research. The need for model-friendly nanotoxicity databases, powerful nanodescriptors, new modeling approaches, molecular mechanism analysis, and design of the next-generation NMs are also critically discussed. Finally, we provide a perspective on future trends and challenges.
Affiliation(s)
- Xiliang Yan
  - Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
- Tongtao Yue
  - Key Laboratory of Marine Environment and Ecology, Ministry of Education, Institute of Coastal Environmental Pollution Control, Ocean University of China, Qingdao 266100, China
- David A Winkler
  - Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, Victoria 3052, Australia
  - School of Pharmacy, University of Nottingham, Nottingham NG7 2QL, U.K.
  - Department of Biochemistry and Chemistry, La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria 3086, Australia
- Yongguang Yin
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Hao Zhu
  - Department of Chemistry and Biochemistry, Rowan University, Glassboro, New Jersey 08028, United States
- Guibin Jiang
  - State Key Laboratory of Environmental Chemistry and Ecotoxicology, Research Center for Eco-Environmental Sciences, Chinese Academy of Sciences, Beijing 100085, China
- Bing Yan
  - Institute of Environmental Research at the Greater Bay Area, Key Laboratory for Water Quality and Conservation of the Pearl River Delta, Ministry of Education, Guangzhou University, Guangzhou 510006, China
7
Chung E, Russo DP, Ciallella HL, Wang YT, Wu M, Aleksunes LM, Zhu H. Data-Driven Quantitative Structure-Activity Relationship Modeling for Human Carcinogenicity by Chronic Oral Exposure. Environmental Science & Technology 2023; 57:6573-6588. [PMID: 37040559 PMCID: PMC10134506 DOI: 10.1021/acs.est.3c00648] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/24/2023] [Revised: 03/28/2023] [Accepted: 03/29/2023] [Indexed: 06/19/2023]
Abstract
Traditional methodologies for assessing chemical toxicity are expensive and time-consuming. Computational modeling approaches have emerged as low-cost alternatives, especially those used to develop quantitative structure-activity relationship (QSAR) models. However, conventional QSAR models have limited training data, leading to low predictivity for new compounds. We developed a data-driven modeling approach for constructing carcinogenicity-related models and used these models to identify potential new human carcinogens. To this goal, we used a probe carcinogen dataset from the US Environmental Protection Agency's Integrated Risk Information System (IRIS) to identify relevant PubChem bioassays. Responses of 25 PubChem assays were significantly relevant to carcinogenicity. Eight assays inferred carcinogenicity predictivity and were selected for QSAR model training. Using 5 machine learning algorithms and 3 types of chemical fingerprints, 15 QSAR models were developed for each PubChem assay dataset. These models showed acceptable predictivity during 5-fold cross-validation (average CCR = 0.71). Using our QSAR models, we can correctly predict and rank 342 IRIS compounds' carcinogenic potentials (PPV = 0.72). The models predicted potential new carcinogens, which were validated by a literature search. This study portends an automated technique that can be applied to prioritize potential toxicants using validated QSAR models based on extensive training sets from public data resources.
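The two metrics reported in this abstract can be made concrete with a short sketch. It assumes that CCR (correct classification rate) is the mean of sensitivity and specificity, which is a common definition but is not spelled out in the abstract, and that PPV is the fraction of predicted positives that are true positives. The labels below are invented for illustration and are not data from the study.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def ccr(y_true, y_pred):
    """Correct classification rate, assumed to be (sensitivity + specificity) / 2."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2.0


def ppv(y_true, y_pred):
    """Positive predictive value: TP / (TP + FP)."""
    tp, _, fp, _ = confusion_counts(y_true, y_pred)
    return tp / (tp + fp)


if __name__ == "__main__":
    # Hypothetical carcinogen (1) / non-carcinogen (0) labels.
    y_true = [1, 1, 1, 0, 0, 0, 0, 1]
    y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
    print(ccr(y_true, y_pred), ppv(y_true, y_pred))
```

Under this definition CCR weights the two classes equally, so it is less sensitive to class imbalance than plain accuracy, which matters when known carcinogens are a minority of the tested compounds.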
Affiliation(s)
- Elena Chung
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Daniel P. Russo
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
- Heather L. Ciallella
  - Department of Toxicology, Cuyahoga County Medical Examiner’s Office, 11001 Cedar Avenue, Cleveland, Ohio 44106, United States
- Yu-Tang Wang
  - Institute of Agro-Products Processing Science and Technology, Chinese Academy of Agricultural Sciences/Key Laboratory of Agro-Products Processing, Ministry of Agriculture, Beijing 100193, China
- Min Wu
  - School of Life Science and Technology, China Pharmaceutical University, No. 24, Tong Jia Xiang, Nanjing 210009, China
- Lauren M. Aleksunes
  - Department of Pharmacology and Toxicology, Rutgers University, Ernest Mario School of Pharmacy, 170 Frelinghuysen Road, Piscataway, New Jersey 08854, United States
- Hao Zhu
  - Department of Chemistry and Biochemistry, Rowan University, 201 Mullica Hill Road, Glassboro, New Jersey 08028, United States
8
Sharma B, Chenthamarakshan V, Dhurandhar A, Pereira S, Hendler JA, Dordick JS, Das P. Accurate clinical toxicity prediction using multi-task deep neural nets and contrastive molecular explanations. Sci Rep 2023; 13:4908. [PMID: 36966203 PMCID: PMC10039880 DOI: 10.1038/s41598-023-31169-8] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 04/28/2022] [Accepted: 03/07/2023] [Indexed: 03/27/2023]
Abstract
Explainable machine learning for molecular toxicity prediction is a promising approach for efficient drug development and chemical safety. A predictive ML model of toxicity can reduce experimental cost and time while mitigating ethical concerns by significantly reducing animal and clinical testing. Herein, we use a deep learning framework for simultaneously modeling in vitro, in vivo, and clinical toxicity data. Two different molecular input representations are used; Morgan fingerprints and pre-trained SMILES embeddings. A multi-task deep learning model accurately predicts toxicity for all endpoints, including clinical, as indicated by the area under the Receiver Operator Characteristic curve and balanced accuracy. In particular, pre-trained molecular SMILES embeddings as input to the multi-task model improved clinical toxicity predictions compared to existing models in MoleculeNet benchmark. Additionally, our multitask approach is comprehensive in the sense that it is comparable to state-of-the-art approaches for specific endpoints in in vitro, in vivo and clinical platforms. Through both the multi-task model and transfer learning, we were able to indicate the minimal need of in vivo data for clinical toxicity predictions. To provide confidence and explain the model's predictions, we adapt a post-hoc contrastive explanation method that returns pertinent positive and negative features, which correspond well to known mutagenic and reactive toxicophores, such as unsubstituted bonded heteroatoms, aromatic amines, and Michael receptors. Furthermore, toxicophore recovery by pertinent feature analysis captures more of the in vitro (53%) and in vivo (56%), rather than of the clinical (8%), endpoints, and indeed uncovers a preference in known toxicophore data towards in vitro and in vivo experimental data. To our knowledge, this is the first contrastive explanation, using both present and absent substructures, for predictions of clinical and in vivo molecular toxicity.
Affiliation(s)
- Shiranee Pereira
  - ICARE, International Center for Alternatives in Research and Education, Chennai, India
- Payel Das
  - IBM Research, Yorktown Heights, NY, USA
9
Pirzada RH, Ahmad B, Qayyum N, Choi S. Modeling structure-activity relationships with machine learning to identify GSK3-targeted small molecules as potential COVID-19 therapeutics. Front Endocrinol (Lausanne) 2023; 14:1084327. [PMID: 36950681 PMCID: PMC10025526 DOI: 10.3389/fendo.2023.1084327] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/30/2022] [Accepted: 02/20/2023] [Indexed: 03/08/2023]
Abstract
Coronaviruses induce severe upper respiratory tract infections, which can spread to the lungs. The nucleocapsid protein (N protein) plays an important role in genome replication, transcription, and virion assembly in SARS-CoV-2, the virus causing COVID-19, and in other coronaviruses. Glycogen synthase kinase 3 (GSK3) activation phosphorylates the viral N protein. To combat COVID-19 and future coronavirus outbreaks, interference with the dependence of N protein on GSK3 may be a viable strategy. Toward this end, this study aimed to construct robust machine learning models to identify GSK3 inhibitors from Food and Drug Administration-approved and investigational drug libraries using the quantitative structure-activity relationship approach. A non-redundant dataset consisting of 495 and 3070 compounds for GSK3α and GSK3β, respectively, was acquired from the ChEMBL database. Twelve sets of molecular descriptors were used to define these inhibitors, and machine learning algorithms were selected using the LazyPredict package. Histogram-based gradient boosting and light gradient boosting machine algorithms were used to develop predictive models that were evaluated based on the root mean square error and R-squared value. Finally, the top two drugs (selinexor and ruboxistaurin) were selected for molecular dynamics simulation based on the highest predicted activity (negative log of the half-maximal inhibitory concentration, pIC50 value) to further investigate the structural stability of the protein-ligand complexes. This artificial intelligence-based virtual high-throughput screening approach is an effective strategy for accelerating drug discovery and finding novel pharmacological targets while reducing the cost and time.
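The activity measure used in this abstract, pIC50, is the negative base-10 logarithm of the half-maximal inhibitory concentration expressed in mol/L. A minimal sketch of the conversion and of ranking candidates by predicted pIC50 follows. The function names, compound names, and values are hypothetical placeholders, not predictions from the study.

```python
import math


def pic50_from_nm(ic50_nm):
    """Convert an IC50 in nanomolar to pIC50.

    pIC50 = -log10(IC50 in mol/L), and 1 nM = 1e-9 mol/L,
    so pIC50 = 9 - log10(IC50 in nM).
    """
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    return 9.0 - math.log10(ic50_nm)


def rank_by_predicted_pic50(predictions):
    """Sort (name, predicted pIC50) pairs, most potent first."""
    return sorted(predictions, key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    print(pic50_from_nm(1000.0))  # an IC50 of 1 micromolar
    # Placeholder compounds with made-up predicted values.
    candidates = [("compound_a", 5.2), ("compound_b", 7.9), ("compound_c", 6.4)]
    for name, value in rank_by_predicted_pic50(candidates):
        print(name, value)
```

Because the scale is logarithmic, a one-unit increase in pIC50 corresponds to a tenfold lower IC50, which is why regression models are usually trained on pIC50 rather than on raw concentrations.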
Affiliation(s)
- Rameez Hassan Pirzada
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
  - S&K Therapeutics, Ajou University Campus Plaza, Suwon, Republic of Korea
- Bilal Ahmad
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
- Naila Qayyum
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
- Sangdun Choi
  - Department of Molecular Science and Technology, Ajou University, Suwon, Republic of Korea
  - S&K Therapeutics, Ajou University Campus Plaza, Suwon, Republic of Korea
  - *Correspondence: Sangdun Choi,
10
Parastar H, Tauler R. Big (Bio)Chemical Data Mining Using Chemometric Methods: A Need for Chemists. Angew Chem Int Ed Engl 2022. [DOI: 10.1002/ange.201801134] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Affiliation(s)
- Hadi Parastar
  - Department of Chemistry, Sharif University of Technology, Tehran, Iran
- Roma Tauler
  - Department of Environmental Chemistry, IDAEA-CSIC, 08034 Barcelona, Spain
11
Hao Y, Romano JD, Moore JH. Knowledge-guided deep learning models of drug toxicity improve interpretation. Patterns (New York, N.Y.) 2022; 3:100565. [PMID: 36124309 PMCID: PMC9481960 DOI: 10.1016/j.patter.2022.100565] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 04/11/2022] [Revised: 05/16/2022] [Accepted: 07/12/2022] [Indexed: 12/04/2022]
Abstract
In drug development, a major reason for attrition is the lack of understanding of cellular mechanisms governing drug toxicity. The black-box nature of conventional classification models has limited their utility in identifying toxicity pathways. Here we developed DTox (deep learning for toxicology), an interpretation framework for knowledge-guided neural networks, which can predict compound response to toxicity assays and infer toxicity pathways of individual compounds. We demonstrate that DTox can achieve the same level of predictive performance as conventional models with a significant improvement in interpretability. Using DTox, we were able to rediscover mechanisms of transcription activation by three nuclear receptors, recapitulate cellular activities induced by aromatase inhibitors and pregnane X receptor (PXR) agonists, and differentiate distinctive mechanisms leading to HepG2 cytotoxicity. Virtual screening by DTox revealed that compounds with predicted cytotoxicity are at higher risk for clinical hepatic phenotypes. In summary, DTox provides a framework for deciphering cellular mechanisms of toxicity in silico.
Affiliation(s)
- Yun Hao
  - Genomics and Computational Biology (GCB) Graduate Program, University of Pennsylvania, Philadelphia, PA, USA
- Joseph D. Romano
  - Institute for Biomedical Informatics, University of Pennsylvania, Philadelphia, PA, USA
  - Center of Excellence in Environmental Toxicology, University of Pennsylvania, Philadelphia, PA, USA
- Jason H. Moore
  - Department of Computational Biomedicine, Cedars-Sinai Medical Center, Los Angeles, CA, USA
12
Zushi Y. Direct Prediction of Physicochemical Properties and Toxicities of Chemicals from Analytical Descriptors by GC-MS. Anal Chem 2022; 94:9149-9157. [PMID: 35700270 PMCID: PMC9246259 DOI: 10.1021/acs.analchem.2c01667] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
With advances in machine learning (ML) techniques, the quantitative structure–activity relationship (QSAR) approach is becoming popular for evaluating chemicals. However, the QSAR approach requires that the chemical structure of the target compound is known and that it should be convertible to molecular descriptors. These requirements lead to limitations in predicting the properties and toxicities of chemicals distributed in the environment as in the PubChem database; the structural information on only 14% of compounds is available. This study proposes a new ML-based QSAR approach that can predict the properties and toxicities of compounds using analytical descriptors of mass spectrum and retention index obtained via gas chromatography–mass spectrometry without requiring exact structural information. The model was developed based on the XGBoost ML method. The root-mean-square errors (RMSEs) for log Ko-w, log (molecular weight), melting point, boiling point, log (vapor pressure), log (water solubility), log (LD50) (rat, oral), and log (LD50) (mouse, oral) are 0.97, 0.052, 51, 23, 0.74, 1.1, 0.74, and 0.6, respectively. The model performed well on a chemical standard mixture measurement, with similar results to those of model validation. It also performed well on a measurement of contaminated oil with spectral deconvolution. These results indicate that the model is suitable for investigating unknown-structured chemicals detected in measurements. Any online user can execute the model through a web application named Detective-QSAR (http://www.mixture-platform.net/Detective_QSAR_Med_Open/). The analytical descriptor-based approach is expected to create new opportunities for the evaluation of unknown chemicals around us.
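The root-mean-square error used to report model performance in this abstract can be computed as in the sketch below. The function name and the sample values are hypothetical; the block only illustrates the metric and does not reproduce the study's model or data.

```python
import math


def rmse(observed, predicted):
    """Root-mean-square error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared) / len(squared))


if __name__ == "__main__":
    # Made-up log-scale property values, for illustration only.
    observed = [2.1, 3.4, 1.8, 4.0]
    predicted = [2.5, 3.0, 2.4, 3.7]
    print(rmse(observed, predicted))
```

When the target is a logarithm, the RMSE is an error in log units. Assuming base-10 logarithms, an RMSE of 0.74 in log (LD50) corresponds to a typical multiplicative error of about 10^0.74, or roughly 5.5-fold, in LD50 itself.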
Affiliation(s)
- Yasuyuki Zushi
  - Research Institute of Science for Safety and Sustainability, National Institute of Advanced Industrial Science and Technology, 16-1 Onogawa, Tsukuba, Ibaraki 305-8506, Japan
  - Graduate School of Science and Technology, University of Tsukuba, 1-1-1 Tennodai, Tsukuba, Ibaraki 305-8577, Japan
13
Hsieh JH. Accounting for Artifacts in High-Throughput Toxicity Assays. Methods Mol Biol 2022; 2474:155-167. [PMID: 35294764 DOI: 10.1007/978-1-0716-2213-1_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Compound activity identification is the primary goal in high throughput screening (HTS) assays. However, assay artifacts including both systematic (e.g., compound autofluorescence) and nonsystematic (e.g., noise) complicate activity interpretation. In addition, other than the traditional potency parameter, half-maximal effect concentration [EC50], additional activity parameters (e.g., point-of-departure [POD] and weighted area-under-the-curve [wAUC]) could be derived from HTS data for activity profiling. A data analysis pipeline has been developed to handle the artifacts, and to provide compound activity characterization with either binary or continuous metrics. This chapter outlines the steps in the pipeline using Tox21 estrogen receptor (ER) β-lactamase assays, including the formats to identify either agonists or antagonists, as well as the counterscreen assays for identifying artifacts as examples. The steps can be applied to other lower throughput assays with concentration-response data.
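The half-maximal effect concentration mentioned in this abstract can be illustrated with the Hill equation, a common model for concentration-response data. This is a sketch under that assumption only: the chapter's own pipeline, including its POD and wAUC definitions, is not reproduced here, and the function name and parameter values are hypothetical.

```python
def hill_response(conc, ec50, top=100.0, bottom=0.0, slope=1.0):
    """Response predicted by a Hill curve for an activating compound.

    At conc == ec50 the response is halfway between bottom and top,
    which is the defining property of the EC50.
    """
    if conc <= 0:
        return bottom  # no compound, baseline response
    return bottom + (top - bottom) / (1.0 + (ec50 / conc) ** slope)


if __name__ == "__main__":
    # Hypothetical concentrations in micromolar with an EC50 of 2 uM.
    for conc in [0.02, 0.2, 2.0, 20.0, 200.0]:
        print(conc, round(hill_response(conc, ec50=2.0), 2))
```

Fitting such a curve gives a single potency value, whereas parameters such as POD and wAUC summarize other aspects of the same concentration-response data.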
Affiliation(s)
- Jui-Hua Hsieh
  - National Institute of Environmental Health Sciences, Durham, NC, USA
|
14
|
Abstract
In this chapter, we give a brief overview of the regulatory requirements for acute systemic toxicity information in the European Union, and we review structure-based computational models that are available and potentially useful in the assessment of acute systemic toxicity. Emphasis is placed on quantitative structure-activity relationship (QSAR) models implemented by means of a range of software tools. The most recently published literature models for acute systemic toxicity are also discussed, and perspectives for future developments in this field are offered.
Affiliation(s)
- Ivanka Tsakovska
  - Institute of Biophysics and Biomedical Engineering, Bulgarian Academy of Sciences, Sofia, Bulgaria
- Antonia Diukendjieva
  - Institute of Biophysics and Biomedical Engineering, Bulgarian Academy of Sciences, Sofia, Bulgaria
- Andrew P Worth
  - European Commission, Joint Research Centre (JRC), Ispra, Italy
|
15
|
Sedykh A. CurveP Method for Rendering High-Throughput Screening Dose-Response Data into Digital Fingerprints. Methods Mol Biol 2022; 2474:147-154. [PMID: 35294763 DOI: 10.1007/978-1-0716-2213-1_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
The nature of high-throughput screening (HTS) puts certain limits on optimal test conditions for each particular sample; therefore, on top of usual data normalization, additional parsing is often needed to account for incomplete readouts or various artifacts that arise from signal interferences. CurveP is a heuristic, user-tunable curve-cleaning algorithm that attempts to find a minimum set of corrections, which would give a monotonic dose-response curve. After applying the corrections, the algorithm proceeds to calculate a set of numeric features, which can be used as a fingerprint characterizing the sample, or as a vector of independent variables (e.g., molecular descriptors in case of chemical substances testing). The resulting output can be a part of HTS data analysis or can be used as input for a broad spectrum of computational applications, such as quantitative structure-activity relationship (QSAR) modeling, computational toxicology, bioinformatics, and cheminformatics.
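The idea of cleaning a curve to monotonicity and then summarizing it as numeric features can be illustrated with a minimal sketch. This is not the CurveP algorithm: it swaps in a simple running-maximum correction, which is not guaranteed to use the minimum number of corrections, and the function names, the feature set, and the 20-unit activity threshold are all assumptions made for illustration.

```python
import math
from typing import Dict, List, Optional, Tuple


def make_monotonic(responses: List[float]) -> Tuple[List[float], int]:
    """Force a non-decreasing curve with a running maximum.

    Illustrative stand-in only; the real CurveP heuristics are not reproduced here.
    Returns the corrected responses and how many points were changed.
    """
    corrected: List[float] = []
    changed = 0
    running_max = float("-inf")
    for r in responses:
        running_max = max(running_max, r)
        if running_max != r:
            changed += 1  # this point dipped below an earlier response
        corrected.append(running_max)
    return corrected, changed


def fingerprint(concs: List[float], responses: List[float],
                threshold: float = 20.0) -> Dict[str, Optional[float]]:
    """Summarize a cleaned curve as a few numeric features (hypothetical feature set)."""
    corrected, changed = make_monotonic(responses)
    log_c = [math.log10(c) for c in concs]
    # Trapezoidal area under the corrected curve on a log10 concentration axis.
    auc = sum((log_c[i + 1] - log_c[i]) * (corrected[i] + corrected[i + 1]) / 2
              for i in range(len(concs) - 1))
    # Lowest tested concentration whose corrected response reaches the threshold.
    first_active = next((c for c, r in zip(concs, corrected) if r >= threshold), None)
    return {
        "n_corrections": changed,
        "max_response": corrected[-1],
        "auc_log10": auc,
        "first_active_conc": first_active,
    }


if __name__ == "__main__":
    # Made-up readings: the dip at the third concentration is treated as an artifact.
    print(fingerprint([0.1, 1.0, 10.0, 100.0], [0.0, 30.0, 10.0, 80.0]))
```

The resulting dictionary plays the role of the fingerprint described in the abstract; a real pipeline would expose the tunable thresholds that the abstract mentions.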
|
16
|
Hao Y, Moore JH. TargetTox: A Feature Selection Pipeline for Identifying Predictive Targets Associated with Drug Toxicity. J Chem Inf Model 2021; 61:5386-5394. [PMID: 34757743 DOI: 10.1021/acs.jcim.1c00733] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
Abstract
In silico assessment of drug toxicity is becoming a critical step in drug development. Conventional ligand-based models are limited by low accuracy and lack of interpretability. Further, they often fail to explain cellular mechanisms underlying structure-toxicity associations. We addressed these limitations by incorporating target profile as an intermediate connecting structure to toxicity. To accommodate the high-dimensional feature space, we developed a pipeline named TargetTox that can identify a subset of predictive features. We implemented TargetTox to study 569 targets and 815 adverse events. The features identified by TargetTox comprise less than 10% of the original feature space; nevertheless, they accurately predicted binding outcomes for 377 targets and toxicity outcomes for 36 adverse events. We demonstrated that predictive targets tend to be differentially expressed in the tissue of toxicity. We also rediscovered key cellular functions associated with cardiotoxicity from the predictive targets, as well as markers of skin and liver diseases. Furthermore, we found evidence supporting diagnostic and therapeutic applications of some predictive targets in hepatotoxicity and nephrotoxicity. Our findings highlighted the critical role of predictive targets in cellular mechanisms leading to toxicity. In general, our study improved the interpretability of toxicity prediction without sacrificing accuracy. Our novel pipeline may benefit future studies of high-dimensional data sets.
Affiliation(s)
- Yun Hao
  - Genomics and Computational Biology (GCB) Graduate Program, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
- Jason H Moore
  - Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania, Philadelphia, Pennsylvania 19104, United States
|
17
|
Jagiello K, Halappanavar S, Rybińska-Fryca A, Willliams A, Vogel U, Puzyn T. Transcriptomics-Based and AOP-Informed Structure-Activity Relationships to Predict Pulmonary Pathology Induced by Multiwalled Carbon Nanotubes. SMALL (WEINHEIM AN DER BERGSTRASSE, GERMANY) 2021; 17:e2003465. [PMID: 33502096 DOI: 10.1002/smll.202003465] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 06/05/2020] [Revised: 08/31/2020] [Indexed: 06/12/2023]
Abstract
This study presents a novel strategy that employs quantitative structure-activity relationship models for nanomaterials (Nano-QSAR) for predicting transcriptomic pathway level response using lung tissue inflammation, an essential key event (KE) in the existing adverse outcome pathway (AOP) for lung fibrosis, as a model response. Transcriptomic profiles of mouse lungs exposed to ten different multiwalled carbon nanotubes (MWCNTs) are analyzed using statistical and bioinformatics tools. Three pathways, "agranulocyte adhesion and diapedesis," "granulocyte adhesion and diapedesis," and "acute phase signaling," that (1) are commonly perturbed across the MWCNTs panel, (2) show dose response (Benchmark dose, BMDs), and (3) are anchored to the KEs identified in the lung fibrosis AOP, are considered in modelling. The three pathways are associated with tissue inflammation. The results show that the aspect ratio (κ) of MWCNTs is directly correlated with the pathway BMDs. The study establishes a methodology for QSAR construction based on canonical pathways and proposes a MWCNTs grouping strategy based on the κ-values of the specific pathway associated genes. Finally, the study shows how the AOP framework can help guide QSAR modelling efforts; conversely, the outcome of the QSAR modelling can aid in refining certain aspects of the AOP in question (here, lung fibrosis).
Affiliation(s)
- Karolina Jagiello
  - QSAR Lab Ltd., Aleja Grunwaldzka 190/102, Gdansk, 80-266, Poland
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, Gdansk, 80-308, Poland
- Sabina Halappanavar
  - Environmental Health Science and Research Bureau, Health Canada, Ottawa, Ontario, K1A 0K9, Canada
  - Department of Biology, University of Ottawa, Ottawa, Ontario, K1N 9A7, Canada
- Andrew Willliams
  - Environmental Health Science and Research Bureau, Health Canada, Ottawa, Ontario, K1A 0K9, Canada
- Ulla Vogel
  - The National Research Centre for the Working Environment, Copenhagen, DK-2100, Denmark
- Tomasz Puzyn
  - QSAR Lab Ltd., Aleja Grunwaldzka 190/102, Gdansk, 80-266, Poland
  - Faculty of Chemistry, University of Gdansk, Wita Stwosza 63, Gdansk, 80-308, Poland
|
18
|
Hsieh JH, Sedykh A, Mutlu E, Germolec DR, Auerbach SS, Rider CV. Harnessing In Silico, In Vitro, and In Vivo Data to Understand the Toxicity Landscape of Polycyclic Aromatic Compounds (PACs). Chem Res Toxicol 2020; 34:268-285. [PMID: 33063992 DOI: 10.1021/acs.chemrestox.0c00213] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/13/2023]
Abstract
Polycyclic aromatic compounds (PACs) are compounds with a minimum of two six-atom aromatic fused rings. PACs arise from incomplete combustion or thermal decomposition of organic matter and are ubiquitous in the environment. Within PACs, carcinogenicity is generally regarded to be the most important public health concern. However, toxicity in other systems (reproductive and developmental toxicity, immunotoxicity) has also been reported. Despite the large number of PACs identified in the environment, research attention to understand exposure and health effects of PACs has focused on a relatively limited subset, namely polycyclic aromatic hydrocarbons (PAHs), the PACs with only carbon and hydrogen atoms. To triage the rest of the vast number of PACs for more resource-intensive testing, we developed a data-driven approach to contextualize hazard characterization of PACs, by leveraging the available data from various data streams (in silico toxicity, in vitro activity, structural fingerprints, and in vivo data availability). The PACs were clustered on the basis of their in silico toxicity profiles containing predictions from 8 different categories (carcinogenicity, cardiotoxicity, developmental toxicity, genotoxicity, hepatotoxicity, neurotoxicity, reproductive toxicity, and urinary toxicity). We found that PACs with the same parent structure (e.g., fluorene) could have diverse in silico toxicity profiles. In contrast, PACs with similar substituted groups (e.g., alkylated-PAHs) or heterocyclics (e.g., N-PACs) with varying ring sizes could have similar in silico toxicity profiles, suggesting that these groups are better candidates for toxicity read-across analysis. The clusters/regions associated with certain in silico toxicity, in vitro activity, and structural fingerprints were identified. 
We found that genotoxicity/carcinogenicity (in silico toxicity) and xenobiotic homeostasis and stress response (in vitro activity), respectively, dominate the toxicity/activity variation seen in the PACs. The "hot spots" with enriched toxicity/activity in conjunction with availability of in vivo carcinogenicity data revealed regions of either data-poor (hydroxylated-PAHs) or data-rich (unsubstituted, parent PAHs) PACs. These regions offer potential targets for prioritization of further in vivo assessment and for chemical read-across efforts. The analysis results are searchable through an interactive web application (https://ntp.niehs.nih.gov/go/pacs_tableau), allowing for alternative hypothesis generation.
Affiliation(s)
- Jui-Hua Hsieh
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina 27709, United States
- Esra Mutlu
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina 27709, United States
- Dori R Germolec
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina 27709, United States
- Scott S Auerbach
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina 27709, United States
- Cynthia V Rider
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina 27709, United States
|
19
|
Benchmarking Data Sets from PubChem BioAssay Data: Current Scenario and Room for Improvement. Int J Mol Sci 2020; 21:ijms21124380. [PMID: 32575564 PMCID: PMC7352161 DOI: 10.3390/ijms21124380] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 06/02/2020] [Revised: 06/15/2020] [Accepted: 06/18/2020] [Indexed: 11/17/2022] Open
Abstract
Developing realistic data sets for evaluating virtual screening methods is a task that has been tackled by the cheminformatics community for many years. Numerous artificially constructed data collections were developed, such as DUD, DUD-E, or DEKOIS. However, they all suffer from multiple drawbacks, one of which is the absence of experimental results confirming the impotence of presumably inactive molecules, leading to possible false negatives in the ligand sets. In light of this problem, the PubChem BioAssay database, an open-access repository providing the bioactivity information of compounds that were already tested on a biological target, is now a recommended source for data set construction. Nevertheless, there exist several issues with the use of such data that need to be properly addressed. In this article, an overview of benchmarking data collections built upon experimental PubChem BioAssay input is provided, along with a thorough discussion of noteworthy issues that one must consider during the design of new ligand sets from this database. The points raised in this review are expected to guide future developments in this regard, in hopes of offering better evaluation tools for novel in silico screening procedures.
|
20
|
Muratov EN, Bajorath J, Sheridan RP, Tetko IV, Filimonov D, Poroikov V, Oprea TI, Baskin II, Varnek A, Roitberg A, Isayev O, Curtarolo S, Fourches D, Cohen Y, Aspuru-Guzik A, Winkler DA, Agrafiotis D, Cherkasov A, Tropsha A. QSAR without borders. Chem Soc Rev 2020; 49:3525-3564. [PMID: 32356548 PMCID: PMC8008490 DOI: 10.1039/d0cs00098a] [Citation(s) in RCA: 384] [Impact Index Per Article: 76.8] [Indexed: 12/15/2022]
Abstract
Prediction of chemical bioactivity and physical properties has been one of the most important applications of statistical and more recently, machine learning and artificial intelligence methods in chemical sciences. This field of research, broadly known as quantitative structure-activity relationships (QSAR) modeling, has developed many important algorithms and has found a broad range of applications in physical organic and medicinal chemistry in the past 55+ years. This Perspective summarizes recent technological advances in QSAR modeling but it also highlights the applicability of algorithms, modeling methods, and validation practices developed in QSAR to a wide range of research areas outside of traditional QSAR boundaries including synthesis planning, nanotechnology, materials science, biomaterials, and clinical informatics. As modern research methods generate rapidly increasing amounts of data, the knowledge of robust data-driven modelling methods professed within the QSAR field can become essential for scientists working both within and outside of chemical research. We hope that this contribution highlighting the generalizable components of QSAR modeling will serve to address this challenge.
Affiliation(s)
- Eugene N Muratov
  - UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
|
21
|
Baker NC, Sipes NS, Franzosa J, Belair DG, Abbott BD, Judson RS, Knudsen TB. Characterizing cleft palate toxicants using ToxCast data, chemical structure, and the biomedical literature. Birth Defects Res 2019; 112:19-39. [PMID: 31471948 DOI: 10.1002/bdr2.1581] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 03/29/2019] [Revised: 07/23/2019] [Accepted: 07/24/2019] [Indexed: 12/11/2022]
Abstract
Cleft palate has been linked to both genetic and environmental factors that perturb key events during palatal morphogenesis. As a developmental outcome, it presents a challenging, mechanistically complex endpoint for predictive modeling. A data set of 500 chemicals evaluated for their ability to induce cleft palate in animal prenatal developmental studies was compiled from Toxicity Reference Database and the biomedical literature, which included 63 cleft palate active and 437 inactive chemicals. To characterize the potential molecular targets for chemical-induced cleft palate, we mined the ToxCast high-throughput screening database for patterns and linkages in bioactivity profiles and chemical structural descriptors. ToxCast assay results were filtered for cytotoxicity and grouped by target gene activity to produce a "gene score." Following unsuccessful attempts to derive a global prediction model using structural and gene score descriptors, hierarchical clustering was applied to the set of 63 cleft palate positives to extract local structure-bioactivity clusters for follow-up study. Patterns of enrichment were confirmed on the complete data set, that is, including cleft palate inactives, and putative molecular initiating events identified. The clusters corresponded to ToxCast assays for cytochrome P450s, G-protein coupled receptors, retinoic acid receptors, the glucocorticoid receptor, and tyrosine kinases/phosphatases. These patterns and linkages were organized into preliminary decision trees and the resulting inferences were mapped to a putative adverse outcome pathway framework for cleft palate supported by literature evidence of current mechanistic understanding. This general data-driven approach offers a promising avenue for mining chemical-bioassay drivers of complex developmental endpoints where data are often limited.
Affiliation(s)
- Nisha S Sipes
  - NIEHS Division of the National Toxicology Program, Research Triangle Park, North Carolina
- Jill Franzosa
  - IOAA CSS, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- David G Belair
  - NHEERL, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- Barbara D Abbott
  - NHEERL, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- Richard S Judson
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
- Thomas B Knudsen
  - National Center for Computational Toxicology, Office of Research and Development, U.S. Environmental Protection Agency, Research Triangle Park, North Carolina
|
22
|
Hsieh JH, Smith-Roe SL, Huang R, Sedykh A, Shockley KR, Auerbach SS, Merrick BA, Xia M, Tice RR, Witt KL. Identifying Compounds with Genotoxicity Potential Using Tox21 High-Throughput Screening Assays. Chem Res Toxicol 2019; 32:1384-1401. [PMID: 31243984 DOI: 10.1021/acs.chemrestox.9b00053] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 01/01/2023]
Abstract
Genotoxicity is a critical component of a comprehensive toxicological profile. The Tox21 Program used five quantitative high-throughput screening (qHTS) assays measuring some aspect of DNA damage/repair to provide information on the genotoxic potential of over 10 000 compounds. Included were assays detecting activation of p53, increases in the DNA repair protein ATAD5, phosphorylation of H2AX, and enhanced cytotoxicity in DT40 cells deficient in DNA-repair proteins REV3 or KU70/RAD54. Each assay measures a distinct component of the DNA damage response signaling network; >70% of active compounds were detected in only one of the five assays. When qHTS results were compared with results from three standard genotoxicity assays (bacterial mutation, in vitro chromosomal aberration, and in vivo micronucleus), a maximum of 40% of known, direct-acting genotoxicants were active in one or more of the qHTS genotoxicity assays, indicating low sensitivity. This suggests that these qHTS assays cannot in their current form be used to replace traditional genotoxicity assays. However, despite the low sensitivity, ranking chemicals by potency of response in the qHTS assays revealed an enrichment for genotoxicants up to 12-fold compared with random selection, when allowing a 1% false positive rate. This finding indicates these qHTS assays can be used to prioritize chemicals for further investigation, allowing resources to focus on compounds most likely to induce genotoxic effects. To refine this prioritization process, models for predicting the genotoxicity potential of chemicals that were active in Tox21 genotoxicity assays were constructed using all Tox21 assay data, yielding a prediction accuracy up to 0.83. Data from qHTS assays related to stress-response pathway signaling (including genotoxicity) were the most informative for model construction. 
By using the results from qHTS genotoxicity assays, predictions from models based on qHTS data, and predictions from commercial bacterial mutagenicity QSAR models, we prioritized Tox21 chemicals for genotoxicity characterization.
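One way to read the reported enrichment over random selection is as an enrichment factor: the fraction of genotoxicants among the top-ranked compounds divided by their fraction in the whole set, with the ranked list cut off once the allowed false positive rate is used up. The abstract does not state the exact definition the authors applied, so the sketch below is an assumption about that calculation rather than a reproduction of it; the function name and the toy data are hypothetical.

```python
from typing import Sequence


def enrichment_at_fpr(scores: Sequence[float], is_positive: Sequence[bool],
                      max_fpr: float = 0.01) -> float:
    """Enrichment factor of the top-ranked set, stopping before the false
    positive rate would exceed max_fpr. Ties in score are not handled specially."""
    n_total = len(is_positive)
    n_pos = sum(is_positive)
    n_neg = n_total - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative compound")
    ranked = sorted(zip(scores, is_positive), key=lambda pair: pair[0], reverse=True)
    tp = fp = 0
    for _, positive in ranked:
        if positive:
            tp += 1
        elif (fp + 1) / n_neg > max_fpr:
            break  # one more false positive would exceed the allowed rate
        else:
            fp += 1
    selected = tp + fp
    if selected == 0:
        return 0.0
    precision = tp / selected      # fraction of true positives among selected compounds
    prevalence = n_pos / n_total   # fraction expected under random selection
    return precision / prevalence


if __name__ == "__main__":
    # Made-up ranking: 10 positives and 100 negatives, higher score = more potent.
    labels = [True, True, False, True, False] + [False] * 98 + [True] * 7
    scores = list(range(len(labels), 0, -1))
    print(enrichment_at_fpr(scores, labels, max_fpr=0.01))
```

With this toy ranking only one false positive is allowed, so four compounds are selected and three of them are positives, which gives an enrichment of about 8.25 over the 10-in-110 base rate.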
Affiliation(s)
- Jui-Hua Hsieh
  - Kelly Government Solutions, Research Triangle Park, North Carolina 27709, United States
- Stephanie L Smith-Roe
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Ruili Huang
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland 20850, United States
- Alexander Sedykh
  - Sciome, LLC, Research Triangle Park, North Carolina 27709, United States
- Keith R Shockley
  - Division of Intramural Research, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Scott S Auerbach
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- B Alex Merrick
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
- Menghang Xia
  - National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland 20850, United States
- Raymond R Tice
  - RTice Consulting, Hillsborough, North Carolina 27278, United States
- Kristine L Witt
  - Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina 27709, United States
|
23
|
Thomas RS, Bahadori T, Buckley TJ, Cowden J, Deisenroth C, Dionisio KL, Frithsen JB, Grulke CM, Gwinn MR, Harrill JA, Higuchi M, Houck KA, Hughes MF, Hunter ES, Isaacs KK, Judson RS, Knudsen TB, Lambert JC, Linnenbrink M, Martin TM, Newton SR, Padilla S, Patlewicz G, Paul-Friedman K, Phillips KA, Richard AM, Sams R, Shafer TJ, Setzer RW, Shah I, Simmons JE, Simmons SO, Singh A, Sobus JR, Strynar M, Swank A, Tornero-Valez R, Ulrich EM, Villeneuve DL, Wambaugh JF, Wetmore BA, Williams AJ. The Next Generation Blueprint of Computational Toxicology at the U.S. Environmental Protection Agency. Toxicol Sci 2019; 169:317-332. [PMID: 30835285 PMCID: PMC6542711 DOI: 10.1093/toxsci/kfz058] [Citation(s) in RCA: 236] [Impact Index Per Article: 39.3] [Indexed: 12/15/2022] Open
Abstract
The U.S. Environmental Protection Agency (EPA) is faced with the challenge of efficiently and credibly evaluating chemical safety often with limited or no available toxicity data. The expanding number of chemicals found in commerce and the environment, coupled with time and resource requirements for traditional toxicity testing and exposure characterization, continue to underscore the need for new approaches. In 2005, EPA charted a new course to address this challenge by embracing computational toxicology (CompTox) and investing in the technologies and capabilities to push the field forward. The return on this investment has been demonstrated through results and applications across a range of human and environmental health problems, as well as initial application to regulatory decision-making within programs such as the EPA's Endocrine Disruptor Screening Program. The CompTox initiative at EPA is more than a decade old. This manuscript presents a blueprint to guide the strategic and operational direction over the next 5 years. The primary goal is to obtain broader acceptance of the CompTox approaches for application to higher tier regulatory decisions, such as chemical assessments. To achieve this goal, the blueprint expands and refines the use of high-throughput and computational modeling approaches to transform the components in chemical risk assessment, while systematically addressing key challenges that have hindered progress. In addition, the blueprint outlines additional investments in cross-cutting efforts to characterize uncertainty and variability, develop software and information technology tools, provide outreach and training, and establish scientific confidence for application to different public health and environmental regulatory decisions.
Affiliation(s)
- Russell S. Thomas
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Tina Bahadori
  - National Center for Environmental Assessment, Office of Research and Development, US Environmental Protection Agency
- Timothy J. Buckley
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- John Cowden
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Chad Deisenroth
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Kathie L. Dionisio
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Jeffrey B. Frithsen
  - Chemical Safety for Sustainability National Research Program, Office of Research and Development, US Environmental Protection Agency
- Christopher M. Grulke
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Maureen R. Gwinn
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Joshua A. Harrill
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Mark Higuchi
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Keith A. Houck
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Michael F. Hughes
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- E. Sidney Hunter
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Kristin K. Isaacs
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Richard S. Judson
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Thomas B. Knudsen
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jason C. Lambert
  - National Center for Environmental Assessment, Office of Research and Development, US Environmental Protection Agency
- Monica Linnenbrink
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Todd M. Martin
  - National Risk Management Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Seth R. Newton
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Stephanie Padilla
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Grace Patlewicz
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Katie Paul-Friedman
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Katherine A. Phillips
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Ann M. Richard
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Reeder Sams
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Timothy J. Shafer
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- R. Woodrow Setzer
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Imran Shah
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jane E. Simmons
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Steven O. Simmons
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Amar Singh
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Jon R. Sobus
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Mark Strynar
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Adam Swank
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Rogelio Tornero-Valez
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Elin M. Ulrich
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Daniel L Villeneuve
  - National Health and Environmental Effects Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- John F. Wambaugh
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
- Barbara A. Wetmore
  - National Exposure Research Laboratory, Office of Research and Development, US Environmental Protection Agency
- Antony J. Williams
  - National Center for Computational Toxicology, Office of Research and Development, US Environmental Protection Agency
|
24
|
Allen CHG, Mervin LH, Mahmoud SY, Bender A. Leveraging heterogeneous data from GHS toxicity annotations, molecular and protein target descriptors and Tox21 assay readouts to predict and rationalise acute toxicity. J Cheminform 2019; 11:36. [PMID: 31152262 PMCID: PMC6544914 DOI: 10.1186/s13321-019-0356-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 12/03/2018] [Accepted: 05/15/2019] [Indexed: 01/06/2023] Open
Abstract
Despite the increasing knowledge in both the chemical and biological domains the assimilation and exploration of heterogeneous datasets, encoding information about the chemical, bioactivity and phenotypic properties of compounds, remains a challenge due to requirement for overlap between chemicals assayed across the spaces. Here, we have constructed a novel dataset, larger than we have used in prior work, comprising 579 acute oral toxic compounds and 1427 non-toxic compounds derived from regulatory GHS information, along with their corresponding molecular and protein target descriptors and qHTS in vitro assay readouts from the Tox21 project. We found no clear association between the results of a FAFDrugs4 toxicophore screen and the acute oral toxicity classifications for our compound set; and a screen using a subset of the ToxAlerts toxicophores was also of limited utility, with only slight enrichment toward the toxic set (odds ratio of 1.48). We then investigated to what degree toxic and non-toxic compounds could be separated in each of the spaces, to compare their potential contribution to further analyses. Using an LDA projection, we found the largest degree of separation using chemical descriptors (Cohen’s d of 1.95) and the lowest degree of separation between toxicity classes using qHTS descriptors (Cohen’s d of 0.67). To compare the predictivity of the feature spaces for the toxicity endpoint, we next trained Random Forest (RF) acute oral toxicity classifiers on either molecular, protein target and qHTS descriptors. RFs trained on molecular and protein target descriptors were most predictive, with ROC AUC values of 0.80–0.92 and 0.70–0.85, respectively, across three test sets. RFs trained on both chemical and protein target descriptors combined exhibited similar predictive performance to the single-domain models (ROC AUC of 0.80–0.91). 
Model interpretability was improved by the inclusion of protein target descriptors, which allow the identification of specific targets (e.g. Retinal dehydrogenase) with literature links to toxic modes of action (e.g. oxidative stress). The dataset compiled in this study has been made available for future application.
Affiliation(s)
- Chad H G Allen
- Department of Chemistry, Centre for Molecular Informatics, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Lewis H Mervin
- Department of Chemistry, Centre for Molecular Informatics, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Samar Y Mahmoud
- Department of Chemistry, Centre for Molecular Informatics, Lensfield Road, Cambridge, CB2 1EW, UK
| | - Andreas Bender
- Department of Chemistry, Centre for Molecular Informatics, Lensfield Road, Cambridge, CB2 1EW, UK.
| |
|
25
|
Russo DP, Strickland J, Karmaus AL, Wang W, Shende S, Hartung T, Aleksunes LM, Zhu H. Nonanimal Models for Acute Toxicity Evaluations: Applying Data-Driven Profiling and Read-Across. ENVIRONMENTAL HEALTH PERSPECTIVES 2019; 127:47001. [PMID: 30933541 PMCID: PMC6785238 DOI: 10.1289/ehp3614] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Indexed: 05/03/2023]
Abstract
BACKGROUND Low-cost, high-throughput in vitro bioassays have potential as alternatives to animal models for toxicity testing. However, incorporating in vitro bioassays into chemical toxicity evaluations such as read-across requires significant data curation and analysis based on knowledge of relevant toxicity mechanisms, lowering the enthusiasm of using the massive amount of unstructured public data. OBJECTIVE We aimed to develop a computational method to automatically extract useful bioassay data from a public repository (i.e., PubChem) and assess its ability to predict animal toxicity using a novel bioprofile-based read-across approach. METHODS A training database containing 7,385 compounds with diverse rat acute oral toxicity data was searched against PubChem to establish in vitro bioprofiles. Using a novel subspace clustering algorithm, bioassay groups that may inform on relevant toxicity mechanisms underlying acute oral toxicity were identified. These bioassay groups were used to predict animal acute oral toxicity using read-across through a cross-validation process. Finally, an external test set of over 600 new compounds was used to validate the resulting model predictivity. RESULTS Several bioassay clusters showed high predictivity for acute oral toxicity (positive prediction rates range from 62-100%) through cross-validation. After incorporating individual clusters into an ensemble model, chemical toxicants in the external test set were evaluated for putative acute toxicity (positive prediction rate equal to 76%). Additionally, chemical fragment-in vitro-in vivo relationships were identified to illustrate new animal toxicity mechanisms. CONCLUSIONS The in vitro bioassay data-driven profiling strategy developed in this study meets the urgent needs of computational toxicology in the current big data era and can be extended to develop predictive models for other complex toxicity end points. https://doi.org/10.1289/EHP3614.
Affiliation(s)
- Daniel P. Russo
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
| | - Judy Strickland
- Integrated Laboratory Systems (ILS), Research Triangle Park, North Carolina, USA
| | - Agnes L. Karmaus
- Integrated Laboratory Systems (ILS), Research Triangle Park, North Carolina, USA
| | - Wenyi Wang
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
| | - Sunil Shende
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Department of Computer Science, Rutgers University, Camden, New Jersey, USA
| | - Thomas Hartung
- Johns Hopkins Bloomberg School of Public Health, Center for Alternatives to Animal Testing (CAAT), Baltimore, Maryland, USA
- University of Konstanz, CAAT-Europe, Konstanz, Germany
| | - Lauren M. Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, New Jersey, USA
| | - Hao Zhu
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Department of Chemistry, Rutgers University, Camden, New Jersey, USA
| |
|
26
|
Tang W, Chen J, Wang Z, Xie H, Hong H. Deep learning for predicting toxicity of chemicals: a mini review. JOURNAL OF ENVIRONMENTAL SCIENCE AND HEALTH. PART C, ENVIRONMENTAL CARCINOGENESIS & ECOTOXICOLOGY REVIEWS 2019; 36:252-271. [PMID: 30821199 DOI: 10.1080/10590501.2018.1537563] [Citation(s) in RCA: 40] [Impact Index Per Article: 6.7] [Indexed: 06/09/2023]
Abstract
Humans and wildlife inhabit a world with a panoply of natural and synthetic chemicals. Alarmingly, only a limited number of chemicals have undergone comprehensive toxicological evaluation due to limitations of traditional toxicity testing. High-throughput screening assays provide a higher-speed alternative for conventional toxicity testing. Advancement of high-throughput bioassay technology has greatly increased chemical toxicity data volumes in the past decade, pushing toxicology research into a "big data" era. However, traditional data analysis methods fail to effectively process large data volumes, presenting both a challenge and an opportunity for toxicologists. Deep learning, a machine learning method leveraging deep neural networks (DNNs), is a proven useful tool for building quantitative structure-activity relationship (QSAR) models for toxicity prediction utilizing these new large datasets. In this mini review, a brief technical background on DNNs is provided, and the current state of chemical toxicity prediction models built with DNNs is reviewed. In addition, relevant toxicity data sources are summarized, possible limitations are discussed, and perspectives on DNN utilization in chemical toxicity prediction are given.
Affiliation(s)
- Weihao Tang
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Jingwen Chen
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Zhongyu Wang
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Hongbin Xie
- Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Dalian, China
| | - Huixiao Hong
- National Center for Toxicological Research, U.S. Food and Drug Administration, Jefferson, Arkansas, USA
| |
|
27
|
Hsieh JH, Ryan K, Sedykh A, Lin JA, Shapiro AJ, Parham F, Behl M. Application of Benchmark Concentration (BMC) Analysis on Zebrafish Data: A New Perspective for Quantifying Toxicity in Alternative Animal Models. Toxicol Sci 2019; 167:92-104. [PMID: 30321397 PMCID: PMC6317423 DOI: 10.1093/toxsci/kfy258] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 11/12/2022]
Abstract
Over the past decade, the zebrafish is increasingly being used as a model to screen for chemical-mediated toxicities including developmental toxicity (DT) and neurotoxicity (NT). One of the major challenges is lack of harmonization in data analysis approaches, thereby posing difficulty in comparing findings across laboratories. To address this, we sought to establish a unified data analysis strategy for both DT and NT data, by adopting the benchmark concentration (BMC) analysis. There are two critical aspects in the BMC analysis: having a toxicity endpoint amenable for BMC and selecting a proper benchmark response (BMR) for the endpoint. For the former, in addition to the typical endpoints in NT assay (eg, hyper/hypo- response quantified by distance moved), we also used endpoints that assess the differences in movement patterns between chemical-treated embryos and control embryos. For the latter, we standardized the selection of BMR, which is analogous to minimum activity threshold, based on intrinsic response variations in the endpoint. When comparing our BMC results with a traditionally used LOAEL method (lowest-observed-adverse-effect level), we found high active compound concordance (100% for DT vs 74% for NT); generally, the BMC was more sensitive than LOAEL (no. of BMC more sensitive/no. of concordant active compounds, 43/50 for DT vs 16/26 for NT). Using the BMC with standardized toxicity endpoints and an appropriate BMR, we may now have a unified data-analysis approach to comparing results across different zebrafish datasets, for a better understanding of strengths and challenges when using the zebrafish as a screening tool.
Affiliation(s)
- Jui-Hua Hsieh
- Kelly Government Solutions, Durham, North Carolina, 27709, USA
| | - Kristen Ryan
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
| | | | - Ja-An Lin
- Department of Biostatistics, The University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, 27516, USA
| | - Andrew J Shapiro
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
| | - Frederick Parham
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
| | - Mamta Behl
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, North Carolina, 27709, USA
| |
|
28
|
Wignall JA, Muratov E, Sedykh A, Guyton KZ, Tropsha A, Rusyn I, Chiu WA. Conditional Toxicity Value (CTV) Predictor: An In Silico Approach for Generating Quantitative Risk Estimates for Chemicals. ENVIRONMENTAL HEALTH PERSPECTIVES 2018; 126:057008. [PMID: 29847084 PMCID: PMC6071978 DOI: 10.1289/ehp2998] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Received: 10/21/2017] [Revised: 03/25/2018] [Accepted: 04/16/2018] [Indexed: 05/03/2023]
Abstract
BACKGROUND Human health assessments synthesize human, animal, and mechanistic data to produce toxicity values that are key inputs to risk-based decision making. Traditional assessments are data-, time-, and resource-intensive, and they cannot be developed for most environmental chemicals owing to a lack of appropriate data. OBJECTIVES As recommended by the National Research Council, we propose a solution for predicting toxicity values for data-poor chemicals through development of quantitative structure-activity relationship (QSAR) models. METHODS We used a comprehensive database of chemicals with existing regulatory toxicity values from U.S. federal and state agencies to develop quantitative QSAR models. We compared QSAR-based model predictions to those based on high-throughput screening (HTS) assays. RESULTS QSAR models for noncancer threshold-based values and cancer slope factors had cross-validation-based Q2 of 0.25-0.45, mean model errors of 0.70-1.11 log10 units, and applicability domains covering >80% of environmental chemicals. Toxicity values predicted from QSAR models developed in this study were more accurate and precise than those based on HTS assays or mean-based predictions. A publicly accessible web interface to make predictions for any chemical of interest is available at http://toxvalue.org. CONCLUSIONS An in silico tool that can predict toxicity values with an uncertainty of an order of magnitude or less can be used to quickly and quantitatively assess risks of environmental chemicals when traditional toxicity data or human health assessments are unavailable. This tool can fill a critical gap in the risk assessment and management of data-poor chemicals. https://doi.org/10.1289/EHP2998.
Affiliation(s)
| | - Eugene Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Alexander Sedykh
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Kathryn Z Guyton
- Monographs Section, International Agency for Research on Cancer, World Health Organization, Lyon, France
| | - Alexander Tropsha
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, USA
| | - Ivan Rusyn
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, USA
| | - Weihsueh A Chiu
- Department of Veterinary Integrative Biosciences, College of Veterinary Medicine and Biomedical Sciences, Texas A&M University, College Station, Texas, USA
| |
|
29
|
Tauler R, Parastar H. Big (Bio)Chemical Data Mining Using Chemometric Methods: A Need for Chemists. Angew Chem Int Ed Engl 2018; 61:e201801134. [DOI: 10.1002/anie.201801134] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/27/2018] [Indexed: 11/08/2022]
Affiliation(s)
- Roma Tauler
- IDAEA-CSIC Environmental Chemistry Jordi Girona 18-26 08034 Barcelona SPAIN
| | | |
|
30
|
Abstract
Compound activity identification is the primary goal in high-throughput screening (HTS) assays. However, assay artifacts including both systematic (e.g., compound auto-fluorescence) and nonsystematic (e.g., noise) complicate activity interpretation. In addition, other than the traditional potency parameter, half-maximal effect concentration (EC50), additional activity parameters (e.g., point-of-departure, POD) could be derived from HTS data for activity profiling. A data analysis pipeline has been developed to handle the artifacts and to provide compound activity characterization with either binary or continuous metrics. This chapter outlines the steps in the pipeline using Tox21 glucocorticoid receptor (GR) β-lactamase assays, including the formats to identify either agonists or antagonists, as well as the counter-screen assays for identifying artifacts as examples. The steps can be applied to other lower-throughput assays with concentration-response data.
|
31
|
Bureau R. Nontest Methods to Predict Acute Toxicity: State of the Art for Applications of In Silico Methods. Methods Mol Biol 2018; 1800:519-534. [PMID: 29934909 DOI: 10.1007/978-1-4939-7899-1_24] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/08/2023]
Abstract
The assessment of acute toxicity of chemicals by in silico methods is currently done by two methodologies, read-across and QSAR. The two approaches are strongly based on the similarity between the chemical for which a risk assessment is required and the reference chemical(s) for which the experimental data are known. Here, we describe the two methodologies with some main publications as illustrations and the in silico data associated with acute toxicity endpoints (ECHA, REACH) accessible via eChemPortal.
Affiliation(s)
- Ronan Bureau
- Centre d'Etudes et de Recherche sur le Médicament de Normandie (CERMN), Normandie Univ, UNICAEN, Caen, France.
| |
|
32
|
Grisoni F, Ballabio D, Todeschini R, Consonni V. Molecular Descriptors for Structure-Activity Applications: A Hands-On Approach. Methods Mol Biol 2018; 1800:3-53. [PMID: 29934886 DOI: 10.1007/978-1-4939-7899-1_1] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Indexed: 06/08/2023]
Abstract
Molecular descriptors capture diverse parts of the structural information of molecules and they are the support of many contemporary computer-assisted toxicological and chemical applications. After briefly introducing some fundamental concepts of structure-activity applications (e.g., molecular descriptor dimensionality, classical vs. fingerprint description, and activity landscapes), this chapter guides the readers through a step-by-step explanation of molecular descriptors rationale and application. To this end, the chapter illustrates a case study of a recently published application of molecular descriptors for modeling the activity on cytochrome P450.
Affiliation(s)
- Francesca Grisoni
- Department of Earth and Environmental Sciences, Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Milan, Italy.
| | - Davide Ballabio
- Department of Earth and Environmental Sciences, Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Milan, Italy
| | - Roberto Todeschini
- Department of Earth and Environmental Sciences, Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Milan, Italy
| | - Viviana Consonni
- Department of Earth and Environmental Sciences, Milano Chemometrics and QSAR Research Group, University of Milano-Bicocca, Milan, Italy
| |
|
33
|
Witt KL, Hsieh JH, Smith-Roe SL, Xia M, Huang R, Auerbach SS, Hur J, Tice RR. Assessment of the DNA damaging potential of environmental chemicals using a quantitative high-throughput screening approach to measure p53 activation. ENVIRONMENTAL AND MOLECULAR MUTAGENESIS 2017; 58:494-507. [PMID: 28714573 PMCID: PMC5555817 DOI: 10.1002/em.22112] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 04/03/2017] [Revised: 06/16/2017] [Accepted: 06/19/2017] [Indexed: 05/08/2023]
Abstract
Genotoxicity potential is a critical component of any comprehensive toxicological profile. Compounds that induce DNA or chromosomal damage often activate p53, a transcription factor essential to cell cycle regulation. Thus, within the US Tox21 Program, we screened a library of ∼10,000 (∼8,300 unique) environmental compounds and drugs for activation of the p53-signaling pathway using a quantitative high-throughput screening assay employing HCT-116 cells (p53+/+) containing a stably integrated β-lactamase reporter gene under control of the p53 response element (p53RE). Cells were exposed (-S9) for 16 hr at 15 concentrations (generally 1.2 nM to 92 μM) three times, independently. Excluding compounds that failed analytical chemistry analysis or were suspected of inducing assay interference, 365 (4.7%) of 7,849 unique compounds were concluded to activate p53. As part of an in-depth characterization of our results, we first compared them with results from traditional in vitro genotoxicity assays (bacterial mutation, chromosomal aberration); ∼15% of known, direct-acting genotoxicants in our library activated the p53RE. Mining the Comparative Toxicogenomics Database revealed that these p53 actives were significantly associated with increased expression of p53 downstream genes involved in DNA damage responses. Furthermore, 53 chemical substructures associated with genotoxicity were enriched in certain classes of p53 actives, for example, anthracyclines (antineoplastics) and vinca alkaloids (tubulin disruptors). Interestingly, the tubulin disruptors manifested unusual nonmonotonic concentration response curves suggesting activity through a unique p53 regulatory mechanism. Through the analysis of our results, we aim to define a role for this assay as one component of a comprehensive toxicological characterization of large compound libraries. Environ. Mol. Mutagen. 58:494-507, 2017. © 2017 Wiley Periodicals, Inc.
Affiliation(s)
- Kristine L. Witt
- National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC
- Corresponding author: Kristine L. Witt, NIEHS/DNTP, 919-541-2761
| | | | - Stephanie L. Smith-Roe
- National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC
| | - Menghang Xia
- National Institutes of Health Center for Advancing Translational Sciences, Bethesda, MD
| | - Ruili Huang
- National Institutes of Health Center for Advancing Translational Sciences, Bethesda, MD
| | - Scott S. Auerbach
- National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC
| | - Junguk Hur
- Department of Biomedical Sciences, School of Medicine and Health Sciences, University of North Dakota, Grand Forks, ND
| | - Raymond R. Tice
- National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC
| |
|
34
|
Hsieh JH, Huang R, Lin JA, Sedykh A, Zhao J, Tice RR, Paules RS, Xia M, Auerbach SS. Real-time cell toxicity profiling of Tox21 10K compounds reveals cytotoxicity dependent toxicity pathway linkage. PLoS One 2017; 12:e0177902. [PMID: 28531190 PMCID: PMC5439695 DOI: 10.1371/journal.pone.0177902] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.5] [Received: 11/08/2016] [Accepted: 05/04/2017] [Indexed: 01/01/2023]
Abstract
Cytotoxicity is a commonly used in vitro endpoint for evaluating chemical toxicity. In support of the U.S. Tox21 screening program, the cytotoxicity of ~10K chemicals was interrogated at 0, 8, 16, 24, 32, & 40 hours of exposure in a concentration dependent fashion in two cell lines (HEK293, HepG2) using two multiplexed, real-time assay technologies. One technology measures the metabolic activity of cells (i.e., cell viability, glo) while the other evaluates cell membrane integrity (i.e., cell death, flor). Using glo technology, more actives and greater temporal variations were seen in HEK293 cells, while results for the flor technology were more similar across the two cell types. Chemicals were grouped into classes based on their cytotoxicity kinetics profiles and these classes were evaluated for their associations with activity in the Tox21 nuclear receptor and stress response pathway assays. Some pathways, such as the activation of H2AX, were associated with the fast-responding cytotoxicity classes, while others, such as activation of TP53, were associated with the slow-responding cytotoxicity classes. By clustering pathways based on their degree of association to the different cytotoxicity kinetics labels, we identified clusters of pathways where active chemicals presented similar kinetics of cytotoxicity. Such linkages could be due to shared underlying biological processes between pathways, for example, activation of H2AX and heat shock factor. Others involving nuclear receptor activity are likely due to shared chemical structures rather than pathway level interactions. Based on the linkage between androgen receptor antagonism and Nrf2 activity, we surmise that a subclass of androgen receptor antagonists cause cytotoxicity via oxidative stress that is associated with Nrf2 activation. 
In summary, the real-time cytotoxicity screen provides informative chemical cytotoxicity kinetics data related to their cytotoxicity mechanisms, and with our analysis, it is possible to formulate mechanism-based hypotheses on the cytotoxic properties of the tested chemicals.
Affiliation(s)
- Jui-Hua Hsieh
- Kelly Government Solutions, Durham, North Carolina, United States of America
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, United States of America
| | - Ja-An Lin
- US Food and Drug Administration, Silver Spring, Maryland, United States of America
| | | | - Jinghua Zhao
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, United States of America
| | - Raymond R. Tice
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
| | - Richard S. Paules
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland, United States of America
| | - Scott S. Auerbach
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Durham, North Carolina, United States of America
| |
|
35
|
Svensson F, Norinder U, Bender A. Modelling compound cytotoxicity using conformal prediction and PubChem HTS data. Toxicol Res (Camb) 2017; 6:73-80. [PMID: 30090478 PMCID: PMC6061930 DOI: 10.1039/c6tx00252h] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.6] [Received: 06/10/2016] [Accepted: 10/28/2016] [Indexed: 12/28/2022]
Abstract
The assessment of compound cytotoxicity is an important part of the drug discovery process. Accurate predictions of cytotoxicity have the potential to expedite decision making and save considerable time and effort. In this work we apply class conditional conformal prediction to model the cytotoxicity of compounds based on 16 high throughput cytotoxicity assays from PubChem. The data span 16 cell lines and comprise more than 440 000 unique compounds. The data sets are heavily imbalanced with only 0.8% of the tested compounds being cytotoxic. We trained one classification model for each cell line and validated the performance with respect to validity and accuracy. The generated models deliver high quality predictions for both toxic and non-toxic compounds despite the imbalance between the two classes. On external data collected from the same assay provider as one of the investigated cell lines the model had a sensitivity of 74% and a specificity of 65% at the 80% confidence level among the compounds assigned to a single class. Compared to previous approaches for large scale cytotoxicity modelling, this represents a balanced performance in the prediction of the toxic and non-toxic classes. The conformal prediction framework also allows the modeller to control the error frequency of the predictions, allowing predictions of cytotoxicity outcomes with confidence.
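The class-conditional conformal prediction idea summarized in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the nonconformity measure (one minus the predicted probability of the candidate class), the function names, and the toy calibration values. The underlying classifier is left out; only its predicted probabilities are used.

```python
from collections import defaultdict


def calibrate(calib_probs, calib_labels):
    """Collect nonconformity scores per class (assumed: 1 - p(true class))."""
    scores = defaultdict(list)
    for probs, label in zip(calib_probs, calib_labels):
        scores[label].append(1.0 - probs[label])
    return dict(scores)


def p_values(scores, test_probs):
    """Class-conditional p-value for each candidate label of one compound.

    Each label is compared only against calibration compounds of that same
    class, which is what keeps the error rate controlled per class even when
    the classes are heavily imbalanced.
    """
    out = {}
    for label, class_scores in scores.items():
        alpha = 1.0 - test_probs[label]
        n_at_least = sum(1 for s in class_scores if s >= alpha)
        out[label] = (n_at_least + 1) / (len(class_scores) + 1)
    return out


def prediction_set(pvals, significance=0.2):
    """Labels that cannot be rejected at the given significance level."""
    return {label for label, p in pvals.items() if p > significance}


# Hypothetical calibration data: predicted class probabilities and true labels.
calib_probs = [
    {"toxic": 0.9, "nontoxic": 0.1},
    {"toxic": 0.6, "nontoxic": 0.4},
    {"toxic": 0.3, "nontoxic": 0.7},
    {"toxic": 0.2, "nontoxic": 0.8},
    {"toxic": 0.1, "nontoxic": 0.9},
    {"toxic": 0.4, "nontoxic": 0.6},
    {"toxic": 0.05, "nontoxic": 0.95},
    {"toxic": 0.3, "nontoxic": 0.7},
]
calib_labels = ["toxic", "toxic", "toxic",
                "nontoxic", "nontoxic", "nontoxic", "nontoxic", "nontoxic"]

scores = calibrate(calib_probs, calib_labels)
pvals = p_values(scores, {"toxic": 0.8, "nontoxic": 0.2})
print(pvals, prediction_set(pvals, significance=0.2))
```

A significance of 0.2 corresponds to the 80% confidence level mentioned in the abstract. A compound whose prediction set contains exactly one label is the "assigned to a single class" case; empty or two-label sets signal that the model cannot commit at that confidence.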
Affiliation(s)
- Fredrik Svensson
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| | - Ulf Norinder
- Swedish Toxicology Sciences Research Center, SE-151 36 Södertälje, Sweden
- Dept. Computer and Systems Sciences, Stockholm Univ., Box 7003, SE-164 07 Kista, Sweden
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, UK
| |
|
36
|
Abstract
INTRODUCTION With the emergence of the 'big data' era, the biomedical research community has great interest in exploiting publicly available chemical information for drug discovery. PubChem is an example of public databases that provide a large amount of chemical information free of charge. AREAS COVERED This article provides an overview of how PubChem's data, tools, and services can be used for virtual screening and reviews recent publications that discuss important aspects of exploiting PubChem for drug discovery. EXPERT OPINION PubChem offers comprehensive chemical information useful for drug discovery. It also provides multiple programmatic access routes, which are essential to build automated virtual screening pipelines that exploit PubChem data. In addition, PubChemRDF allows users to download PubChem data and load them into a local computing facility, facilitating data integration between PubChem and other resources. PubChem resources have been used in many studies for developing bioactivity and toxicity prediction models, discovering polypharmacologic (multi-target) ligands, and identifying new macromolecule targets of compounds (for drug-repurposing or off-target side effect prediction). These studies demonstrate the usefulness of PubChem as a key resource for computer-aided drug discovery and related area.
Affiliation(s)
- Sunghwan Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD, USA
| |
|
37
|
Zhu XW, Xin YJ, Chen QH. Chemical and in vitro biological information to predict mouse liver toxicity using recursive random forests. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2016; 27:559-572. [PMID: 27353437 DOI: 10.1080/1062936x.2016.1201142] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/17/2016] [Accepted: 06/09/2016] [Indexed: 06/06/2023]
Abstract
In this study, recursive random forests were used to build classification models for mouse liver toxicity. The mouse liver toxicity endpoint (67 toxic and 166 non-toxic) was a composition of four in vivo chronic systemic and carcinogenic toxicity endpoints (non-proliferative, neoplastic, proliferative and gross pathology). A multiple under-sampling approach and a shifted classification threshold of 0.288 (non-toxic < 0.288 and toxic ≥ 0.288) were used to cope with the unbalanced data. Our study showed that recursive random forests are very efficient in variable selection and for the development of predictive in silico models. Generally, over 95% redundant descriptors could be reduced from modelling for all the chemical, biological and hybrid models in this study. The predictive performance of chemical models (CCR of 0.73) is comparable with hybrid model performance (CCR of 0.74). Descriptors related to the octanol-water partition coefficient are vital for model performance. The in vitro endpoint of CYP2A2 played a key role in the development and interpretation of hybrid models. Identifying high-throughput screening assays relevant to liver toxicity would be key for improving in silico models of liver toxicity.
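The shifted threshold described in this abstract can be illustrated with a small sketch. It assumes that CCR is the mean of sensitivity and specificity, which is a common definition in the QSAR literature but is not stated in the abstract. The probabilities and labels are hypothetical values chosen only for illustration, and the function names are not taken from the paper.

```python
def classify(probabilities, threshold=0.288):
    """Label a compound toxic (1) when its predicted probability is >= threshold."""
    return [1 if p >= threshold else 0 for p in probabilities]


def correct_classification_rate(y_true, y_pred):
    """Mean of sensitivity and specificity (assumed definition of CCR)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in y_true")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Hypothetical data: 2 toxic and 4 non-toxic compounds (unbalanced).
y_true = [1, 1, 0, 0, 0, 0]
probs = [0.35, 0.20, 0.10, 0.30, 0.05, 0.25]

shifted = classify(probs)          # threshold 0.288, as in the abstract
default = classify(probs, 0.5)     # conventional threshold, for comparison

print(correct_classification_rate(y_true, shifted))
print(correct_classification_rate(y_true, default))
```

With a minority toxic class, predicted probabilities tend to sit below 0.5, so lowering the threshold trades some specificity for sensitivity. In this toy data the conventional threshold finds no toxic compounds at all.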
Affiliation(s)
- X-W Zhu
- a College of Resource and Environment, Qingdao Agricultural University, Qingdao, China
- b Qingdao Engineering Research Center for Rural Environment, Qingdao Agricultural University, Qingdao, China
| | - Y-J Xin
- a College of Resource and Environment, Qingdao Agricultural University, Qingdao, China
| | - Q-H Chen
- a College of Resource and Environment, Qingdao Agricultural University, Qingdao, China
|
38
|
Burton J, Worth AP, Tsakovska I, Diukendjieva A. In Silico Models for Acute Systemic Toxicity. Methods Mol Biol 2016; 1425:177-200. [PMID: 27311468 DOI: 10.1007/978-1-4939-3609-0_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2023]
Abstract
In this chapter, we give an overview of the regulatory requirements for acute systemic toxicity information in the European Union, and we review the structure-based computational models that are available and potentially useful in the assessment of acute systemic toxicity. The most recently published literature models for acute systemic toxicity are also discussed, and perspectives for future developments in this field are offered.
Affiliation(s)
- Julien Burton
- Systems Toxicology Unit and EURL ECVAM, Institute for Health and Consumer Protection, Joint Research Centre, European Commission, Ispra, Varese, Italy
| | - Andrew P Worth
- Systems Toxicology Unit and EURL ECVAM, Institute for Health and Consumer Protection, Joint Research Centre, European Commission, Ispra, Varese, Italy.
| | - Ivanka Tsakovska
- Department of QSAR & Molecular Modeling, Institute of Biophysics and Biomedical Engineering, Bulgarian Academy of Sciences, Sofia, Bulgaria
| | - Antonia Diukendjieva
- Department of QSAR & Molecular Modeling, Institute of Biophysics and Biomedical Engineering, Bulgarian Academy of Sciences, Sofia, Bulgaria
|
39
|
Allen CHG, Koutsoukas A, Cortés-Ciriano I, Murrell DS, Malliavin TE, Glen RC, Bender A. Improving the prediction of organism-level toxicity through integration of chemical, protein target and cytotoxicity qHTS data. Toxicol Res (Camb) 2016; 5:883-894. [PMID: 30090397 PMCID: PMC6062365 DOI: 10.1039/c5tx00406c] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/29/2015] [Accepted: 03/01/2016] [Indexed: 12/29/2022] Open
Abstract
Prediction of compound toxicity is essential because covering the vast chemical space requiring safety assessment using traditional experimentally-based, resource-intensive techniques is impossible. However, such prediction is nontrivial due to the complex causal relationship between compound structure and in vivo harm. Protein target annotations and in vitro experimental outcomes encode relevant bioactivity information complementary to chemicals' structures. This work tests the hypothesis that utilizing three complementary types of data will afford predictive models that outperform traditional models built using fewer data types. A tripartite, heterogeneous descriptor set for 367 compounds was comprised of (a) chemical descriptors, (b) protein target descriptors generated using an algorithm trained on 190 000 ligand-protein interactions from ChEMBL, and (c) descriptors derived from in vitro cell cytotoxicity dose-response data from a panel of human cell lines. 100 random forests classification models for predicting rat LD50 were built using every combination of descriptors. Successive integration of data types improved predictive performance; models built using the full dataset had an average external correct classification rate of 0.82, compared to 0.73-0.80 for models built using two data types and 0.67-0.78 for models built using one. Pairwise comparisons of models trained on the same data showed that including a third data domain on top of chemistry improved average correct classification rate by 1.4-2.4 points, with p-values <0.01. Additionally, the approach enhanced the models' applicability domains and proved useful for generating novel mechanism hypotheses. The use of tripartite heterogeneous bioactivity datasets is a useful technique for improving toxicity prediction. Both protein target descriptors - which have the practical value of being derived in silico - and cytotoxicity descriptors derived from experiment are suitable contributors to such datasets.
Affiliation(s)
- Chad H G Allen
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK; Tel: +44 (0)1223 762983
| | - Alexios Koutsoukas
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK; Tel: +44 (0)1223 762983
| | - Isidro Cortés-Ciriano
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3528, Structural Biology and Chemistry Department, Paris, France
| | - Daniel S Murrell
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK; Tel: +44 (0)1223 762983
| | - Thérèse E Malliavin
- Unité de Bioinformatique Structurale, Institut Pasteur and CNRS UMR 3528, Structural Biology and Chemistry Department, Paris, France
| | - Robert C Glen
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK; Tel: +44 (0)1223 762983
- Department of Surgery and Cancer, Faculty of Medicine, Imperial College London, Sir Alexander Fleming Building, South Kensington Campus, London SW7 2AZ, UK
| | - Andreas Bender
- Centre for Molecular Informatics, Department of Chemistry, Lensfield Road, Cambridge CB2 1EW, UK; Tel: +44 (0)1223 762983
|
40
|
Kim MT, Huang R, Sedykh A, Wang W, Xia M, Zhu H. Mechanism Profiling of Hepatotoxicity Caused by Oxidative Stress Using Antioxidant Response Element Reporter Gene Assay Models and Big Data. ENVIRONMENTAL HEALTH PERSPECTIVES 2016; 124:634-41. [PMID: 26383846 PMCID: PMC4858396 DOI: 10.1289/ehp.1509763] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 01/27/2015] [Accepted: 09/16/2015] [Indexed: 05/18/2023]
Abstract
BACKGROUND Hepatotoxicity accounts for a substantial number of drugs being withdrawn from the market. Using traditional animal models to detect hepatotoxicity is expensive and time-consuming. Alternative in vitro methods, in particular cell-based high-throughput screening (HTS) studies, have provided the research community with a large amount of data from toxicity assays. Among the various assays used to screen potential toxicants is the antioxidant response element beta lactamase reporter gene assay (ARE-bla), which identifies chemicals that have the potential to induce oxidative stress and was used to test > 10,000 compounds from the Tox21 program. OBJECTIVE The ARE-bla computational model and HTS data from a big data source (PubChem) were used to profile environmental and pharmaceutical compounds with hepatotoxicity data. METHODS Quantitative structure-activity relationship (QSAR) models were developed based on ARE-bla data. The models predicted the potential oxidative stress response for known liver toxicants when no ARE-bla data were available. Liver toxicants were used as probe compounds to search PubChem Bioassay and generate a response profile, which contained thousands of bioassays (> 10 million data points). By ranking the in vitro-in vivo correlations (IVIVCs), the most relevant bioassay(s) related to hepatotoxicity were identified. RESULTS The liver toxicants profile contained the ARE-bla and relevant PubChem assays. Potential toxicophores for well-known toxicants were created by identifying chemical features that existed only in compounds with high IVIVCs. CONCLUSION Profiling chemical IVIVCs created an opportunity to fully explore the source-to-outcome continuum of modern experimental toxicology using cheminformatics approaches and big data sources. CITATION Kim MT, Huang R, Sedykh A, Wang W, Xia M, Zhu H. 2016. Mechanism profiling of hepatotoxicity caused by oxidative stress using antioxidant response element reporter gene assay models and big data. 
Environ Health Perspect 124:634-641; http://dx.doi.org/10.1289/ehp.1509763.
Affiliation(s)
- Marlene Thai Kim
- Department of Chemistry, Rutgers University, Camden, New Jersey, USA
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
| | - Alexander Sedykh
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Multicase Inc., Beachwood, Ohio, USA
| | - Wenyi Wang
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Department of Health and Human Services, Bethesda, Maryland, USA
| | - Hao Zhu
- Department of Chemistry, Rutgers University, Camden, New Jersey, USA
- Center for Computational and Integrative Biology, Rutgers University, Camden, New Jersey, USA
- Address correspondence to H. Zhu, 315 Penn St., Rutgers University, Camden, NJ 08102 USA. Telephone: (856) 225-6781. E-mail:
|
41
|
Ribay K, Kim MT, Wang W, Pinolini D, Zhu H. Predictive Modeling of Estrogen Receptor Binding Agents Using Advanced Cheminformatics Tools and Massive Public Data. FRONTIERS IN ENVIRONMENTAL SCIENCE 2016; 4:12. [PMID: 27642585 PMCID: PMC5023020 DOI: 10.3389/fenvs.2016.00012] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Indexed: 05/21/2023]
Abstract
Estrogen receptor alpha (ERα) is a critical target for drug design as well as a potential source of toxicity when activated unintentionally. Thus, evaluating potential ERα binding agents is critical in both drug discovery and chemical toxicity areas. Computational tools, e.g., Quantitative Structure-Activity Relationship (QSAR) models, can predict potential ERα binding agents before chemical synthesis. The purpose of this project was to develop enhanced predictive models of ERα binding agents by utilizing advanced cheminformatics tools that can integrate publicly available bioassay data. The initial ERα binding agent data set, consisting of 446 binders and 8307 non-binders, was obtained from the Tox21 Challenge project organized by the NIH Chemical Genomics Center (NCGC). After removing the duplicates and inorganic compounds, this data set was used to create a training set (259 binders and 259 non-binders). This training set was used to develop QSAR models using chemical descriptors. The resulting models were then used to predict the binding activity of 264 external compounds, which were available to us after the models were developed. The cross-validation results for the training set [Correct Classification Rate (CCR) = 0.72] were much higher than the external predictivity for the unknown compounds (CCR = 0.59). To improve the conventional QSAR models, all compounds in the training set were used to search PubChem and generate a profile of their biological responses across thousands of bioassays. The most important bioassays were prioritized to generate a similarity index that was used to calculate the biosimilarity score between each pair of compounds. The nearest neighbors for each compound within the set were then identified and its ERα binding potential was predicted by its nearest neighbors in the training set. 
The hybrid model performance (CCR = 0.94 for cross validation; CCR = 0.68 for external prediction) showed significant improvement over the original QSAR models, particularly for the activity cliffs that induce prediction errors. The results of this study indicate that the response profile of chemicals from public data provides useful information for modeling and evaluation purposes. The public big data resources should be considered along with chemical structure information when predicting new compounds, such as unknown ERα binding agents.
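The nearest-neighbor step described in this abstract can be sketched as follows. The abstract does not give the similarity formula, so the score used here (the fraction of shared assays with matching outcomes) is a hypothetical stand-in, not the authors' similarity index. The assay names, profiles, and function names are also invented for illustration.

```python
def biosimilarity(profile_a, profile_b):
    """Fraction of shared assays with the same outcome (hypothetical score)."""
    shared = set(profile_a) & set(profile_b)
    if not shared:
        return 0.0
    agree = sum(1 for assay in shared if profile_a[assay] == profile_b[assay])
    return agree / len(shared)


def predict_binding(query_profile, training, k=2):
    """Average the labels of the k training compounds most similar to the query.

    training is a list of (bioassay_profile, label) pairs, where label is
    1 for a binder and 0 for a non-binder.
    """
    ranked = sorted(
        training,
        key=lambda item: biosimilarity(query_profile, item[0]),
        reverse=True,
    )
    neighbours = ranked[:k]
    return sum(label for _, label in neighbours) / len(neighbours)


# Hypothetical bioassay response profiles (1 = active, 0 = inactive).
training = [
    ({"A1": 1, "A2": 1, "A3": 0}, 1),
    ({"A1": 1, "A2": 0, "A3": 0}, 1),
    ({"A1": 0, "A2": 0, "A3": 1}, 0),
    ({"A1": 0, "A2": 1, "A3": 1}, 0),
]
query = {"A1": 1, "A2": 1, "A3": 0}

score = predict_binding(query, training, k=2)
print(score)  # a value near 1 suggests a binder, near 0 a non-binder
```

Restricting the comparison to shared assays is one simple way to handle the sparse, uneven coverage typical of public bioassay data. A real pipeline would also need to weight or select assays, as the abstract indicates.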
Affiliation(s)
- Kathryn Ribay
- Department of Chemistry, Rutgers University, Camden, NJ, USA
| | - Marlene T. Kim
- Department of Chemistry, Rutgers University, Camden, NJ, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Wenyi Wang
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Daniel Pinolini
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
| | - Hao Zhu
- Department of Chemistry, Rutgers University, Camden, NJ, USA
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ, USA
- Correspondence: Hao Zhu,
|
42
|
Martin TM. Prediction of in vitro and in vivo oestrogen receptor activity using hierarchical clustering. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2016; 27:17-30. [PMID: 26784454 DOI: 10.1080/1062936x.2015.1125945] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 05/24/2023]
Abstract
In this study, hierarchical clustering classification models were developed to predict in vitro and in vivo oestrogen receptor (ER) activity. Classification models were developed for binding, agonist, and antagonist in vitro ER activity and for mouse in vivo uterotrophic ER binding. In vitro classification models yielded balanced accuracies ranging from 0.65 to 0.85 for the external prediction set. In vivo ER classification models yielded balanced accuracies ranging from 0.72 to 0.83. If used as additional biological descriptors for in vivo models, in vitro scores were found to increase the prediction accuracy of in vivo ER models. If in vitro activity was used directly as a surrogate for in vivo activity, the results were poor (balanced accuracy ranged from 0.49 to 0.72). Under-sampling negative compounds in the training set was found to increase the coverage (fraction of chemicals which can be predicted) and increase prediction sensitivity.
Affiliation(s)
- T M Martin
- a National Risk Management Research Laboratory, US Environmental Protection Agency, Cincinnati, OH, USA
|
43
|
Patlewicz G, Fitzpatrick JM. Current and Future Perspectives on the Development, Evaluation, and Application of in Silico Approaches for Predicting Toxicity. Chem Res Toxicol 2016; 29:438-51. [PMID: 26686752 DOI: 10.1021/acs.chemrestox.5b00388] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Indexed: 01/24/2023]
Abstract
Exploiting non-testing approaches to predict toxicity early in the drug discovery development cycle is a helpful component in minimizing expensive drug failures due to toxicity being identified in late development or even during clinical trials. Changes in regulations in the industrial chemicals and cosmetics sectors in recent years have prompted a significant number of advances in the development, application, and assessment of non-testing approaches, such as (Q)SARs. Many efforts have also been undertaken to establish guiding principles for performing read-across within category and analogue approaches. This review offers a perspective, as taken from these sectors, of the current status of non-testing approaches, their evolution in light of the advances in high-throughput approaches and constructs such as adverse outcome pathways, and their potential relevance for drug discovery. It also proposes a workflow for how non-testing approaches could be practically integrated within testing and assessment strategies.
Affiliation(s)
- Grace Patlewicz
- National Center for Computational Toxicology (NCCT), U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
| | - Jeremy M Fitzpatrick
- National Center for Computational Toxicology (NCCT), U.S. Environmental Protection Agency, Research Triangle Park, North Carolina 27711, United States
|
44
|
Sedykh A. CurveP Method for Rendering High-Throughput Screening Dose-Response Data into Digital Fingerprints. Methods Mol Biol 2016; 1473:135-41. [PMID: 27518631 DOI: 10.1007/978-1-4939-6346-1_14] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Abstract
The nature of high-throughput screening (HTS) puts certain limits on optimal test conditions for each particular sample. Therefore, on top of the usual data normalization, additional parsing is often needed to account for incomplete readouts or various artifacts that arise from signal interference. CurveP is a heuristic, user-tunable, curve-cleaning algorithm that attempts to find a minimum set of corrections that would give a monotonic dose-response curve. After applying the corrections, the algorithm calculates a set of numeric features, which can be used as a fingerprint characterizing the sample or as a vector of independent variables (e.g., molecular descriptors in the case of chemical substance testing). The resulting output can be part of HTS data analysis or can be used as input for a broad spectrum of computational applications, such as Quantitative Structure-Activity Relationship (QSAR) modeling, computational toxicology, and bio- and cheminformatics.
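The idea of a minimum set of corrections that yields a monotonic curve can be illustrated with a toy example. This is not the CurveP algorithm, whose heuristics and tunable parameters are not described in the abstract. It is a simplified stand-in that keeps the longest non-decreasing subsequence of responses and overwrites every other point with the nearest preceding kept value, so the number of altered points is minimal. The function name and data are hypothetical.

```python
def monotonic_corrections(responses):
    """Return (corrected_curve, corrected_indices) for a non-decreasing curve.

    Points on a longest non-decreasing subsequence are kept; all other
    points are overwritten, which minimises the number of changed points.
    """
    n = len(responses)
    if n == 0:
        return [], []
    best = [1] * n    # length of the longest kept run ending at i
    prev = [-1] * n   # back-pointer to the previous kept index
    for i in range(n):
        for j in range(i):
            if responses[j] <= responses[i] and best[j] + 1 > best[i]:
                best[i] = best[j] + 1
                prev[i] = j
    end = max(range(n), key=lambda i: best[i])
    kept = set()
    while end != -1:
        kept.add(end)
        end = prev[end]
    corrected = list(responses)
    changed = []
    last = None
    for i in range(n):
        if i in kept:
            last = responses[i]
        else:
            changed.append(i)
            # Carry the last kept value forward; before the first kept
            # point, use the first kept value instead.
            corrected[i] = last if last is not None else responses[min(kept)]
    return corrected, changed


# Hypothetical % activity at increasing concentrations, with one spike.
curve = [0, 5, 40, 12, 30, 55, 80]
cleaned, flagged = monotonic_corrections(curve)
print(cleaned, flagged)
```

The list of flagged indices could itself serve as a crude quality feature for a curve, in the spirit of the numeric features the abstract mentions.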
Affiliation(s)
- Alexander Sedykh
- Multicase Inc., 23811 Chagrin Blvd., Ste 305, Beachwood, OH, 44122, USA.
|
45
|
Dong Z, Liu Y, Duan L, Bekele D, Naidu R. Uncertainties in human health risk assessment of environmental contaminants: A review and perspective. ENVIRONMENT INTERNATIONAL 2015; 85:120-32. [PMID: 26386465 DOI: 10.1016/j.envint.2015.09.008] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.0] [Received: 05/06/2015] [Revised: 08/31/2015] [Accepted: 09/02/2015] [Indexed: 05/24/2023]
Abstract
Addressing uncertainties in human health risk assessment is a critical issue when evaluating the effects of contaminants on public health. A range of uncertainties exist throughout the source-to-outcome continuum, including exposure assessment and hazard and risk characterisation. While various strategies have been applied to characterising uncertainty, classical approaches largely rely on how to maximise the available resources. Expert judgement, defaults and tools for characterising quantitative uncertainty attempt to fill the gap between data and regulatory requirements. The experience of researching 2,3,7,8-tetrachlorodibenzo-p-dioxin (TCDD) illustrates the sources of uncertainty and how to maximise available information to determine uncertainties, and thereby provide 'adequate' protection against contaminant exposure. As regulatory requirements and recurring issues increase, the assessment of complex scenarios involving a large number of chemicals requires more sophisticated tools. Recent advances in exposure and toxicology science provide a large data set for environmental contaminants and public health. In particular, biomonitoring information, in vitro data streams and computational toxicology are the crucial factors in the NexGen risk assessment, as well as in uncertainty minimisation. Although in this review we cannot yet predict how exposure science and modern toxicology will develop in the long term, current techniques from emerging science can be integrated to improve decision-making.
Affiliation(s)
- Zhaomin Dong
- The Faculty of Science and Information Technology, University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; Cooperative Research Centre for Contamination Assessment and Remediation of the Environment, Mawson Lakes, SA 5095, Australia
| | - Yanju Liu
- The Faculty of Science and Information Technology, University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; Cooperative Research Centre for Contamination Assessment and Remediation of the Environment, Mawson Lakes, SA 5095, Australia
| | - Luchun Duan
- The Faculty of Science and Information Technology, University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; Cooperative Research Centre for Contamination Assessment and Remediation of the Environment, Mawson Lakes, SA 5095, Australia
| | - Dawit Bekele
- The Faculty of Science and Information Technology, University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; Cooperative Research Centre for Contamination Assessment and Remediation of the Environment, Mawson Lakes, SA 5095, Australia
| | - Ravi Naidu
- The Faculty of Science and Information Technology, University of Newcastle, University Drive, Callaghan, NSW 2308, Australia; Cooperative Research Centre for Contamination Assessment and Remediation of the Environment, Mawson Lakes, SA 5095, Australia.
|
46
|
Abdo N, Wetmore BA, Chappell GA, Shea D, Wright FA, Rusyn I. In vitro screening for population variability in toxicity of pesticide-containing mixtures. ENVIRONMENT INTERNATIONAL 2015; 85:147-55. [PMID: 26386728 PMCID: PMC4773193 DOI: 10.1016/j.envint.2015.09.012] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Received: 05/19/2015] [Revised: 09/07/2015] [Accepted: 09/08/2015] [Indexed: 05/07/2023]
Abstract
Population-based human in vitro models offer exceptional opportunities for evaluating the potential hazard and mode of action of chemicals, as well as variability in responses to toxic insults among individuals. This study was designed to test the hypothesis that comparative population genomics with efficient in vitro experimental design can be used for evaluation of the potential for hazard, mode of action, and the extent of population variability in responses to chemical mixtures. We selected 146 lymphoblast cell lines from 4 ancestrally and geographically diverse human populations based on the availability of genome sequence and basal RNA-seq data. Cells were exposed to two pesticide mixtures - an environmental surface water sample comprised primarily of organochlorine pesticides and a laboratory-prepared mixture of 36 currently used pesticides - in concentration response and evaluated for cytotoxicity. On average, the two mixtures exhibited a similar range of in vitro cytotoxicity and showed considerable inter-individual variability across screened cell lines. However, when in vitro-to-in vivo extrapolation (IVIVE) coupled with reverse dosimetry was employed to convert the in vitro cytotoxic concentrations to oral equivalent doses and compared to the upper bound of predicted human exposure, we found that a nominally more cytotoxic chlorinated pesticide mixture is expected to have greater margin of safety (more than 5 orders of magnitude) as compared to the current use pesticide mixture (less than 2 orders of magnitude) due primarily to differences in exposure predictions. Multivariate genome-wide association mapping revealed an association between the toxicity of current use pesticide mixture and a polymorphism in rs1947825 in C17orf54. 
We conclude that a combination of in vitro human population-based cytotoxicity screening followed by dosimetric adjustment and comparative population genomics analyses enables quantitative evaluation of human health hazard from complex environmental mixtures. Additionally, such an approach yields testable hypotheses regarding potential toxicity mechanisms.
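The margin-of-safety comparison in this abstract reduces to simple arithmetic, sketched below. The sketch assumes the commonly used linear form of reverse dosimetry, in which the oral equivalent dose is the in vitro concentration divided by the steady-state plasma concentration predicted for a 1 mg/kg/day dose. The abstract does not state which form the authors used, and all numbers and function names here are hypothetical.

```python
import math


def oral_equivalent_dose(in_vitro_conc_um, css_um_per_unit_dose):
    """Oral equivalent dose in mg/kg/day (assumed linear reverse dosimetry).

    css_um_per_unit_dose is the steady-state plasma concentration (uM)
    predicted for a daily dose of 1 mg/kg/day.
    """
    if css_um_per_unit_dose <= 0:
        raise ValueError("Css must be positive")
    return in_vitro_conc_um / css_um_per_unit_dose


def margin_of_safety_orders(oed_mg_kg_day, exposure_mg_kg_day):
    """Margin of safety in orders of magnitude: log10(OED / exposure)."""
    if oed_mg_kg_day <= 0 or exposure_mg_kg_day <= 0:
        raise ValueError("dose and exposure must be positive")
    return math.log10(oed_mg_kg_day / exposure_mg_kg_day)


# Hypothetical inputs: cytotoxic at 50 uM, Css of 5 uM per 1 mg/kg/day.
oed = oral_equivalent_dose(50.0, 5.0)                # 10 mg/kg/day
low_exposure = margin_of_safety_orders(oed, 1e-5)    # about 6 orders
high_exposure = margin_of_safety_orders(oed, 0.2)    # about 1.7 orders

print(round(low_exposure, 2), round(high_exposure, 2))
```

The two calls use the same oral equivalent dose and differ only in predicted exposure, which mirrors the abstract's point that the difference in margins was driven primarily by exposure predictions rather than by in vitro potency.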
Affiliation(s)
- Nour Abdo
- Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC, USA; Department of Public Health, Jordan University of Science and Technology, Irbid, Jordan
| | - Barbara A Wetmore
- The Hamner Institutes for Health Sciences, Research Triangle Park, NC, USA
| | - Grace A Chappell
- Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC, USA; Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA
| | - Damian Shea
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA
| | - Fred A Wright
- Department of Biological Sciences, North Carolina State University, Raleigh, NC, USA; Department of Statistics and the Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA
| | - Ivan Rusyn
- Department of Environmental Sciences and Engineering, University of North Carolina, Chapel Hill, NC, USA; Department of Veterinary Integrative Biosciences, Texas A&M University, College Station, TX, USA.
|
47
|
Use of alternative assays to identify and prioritize organophosphorus flame retardants for potential developmental and neurotoxicity. Neurotoxicol Teratol 2015; 52:181-93. [DOI: 10.1016/j.ntt.2015.09.003] [Citation(s) in RCA: 129] [Impact Index Per Article: 12.9] [Received: 06/17/2015] [Revised: 09/02/2015] [Accepted: 09/02/2015] [Indexed: 12/26/2022]
|
48
|
Hsieh JH, Sedykh A, Huang R, Xia M, Tice RR. A Data Analysis Pipeline Accounting for Artifacts in Tox21 Quantitative High-Throughput Screening Assays. JOURNAL OF BIOMOLECULAR SCREENING 2015; 20:887-97. [PMID: 25904095 PMCID: PMC4568956 DOI: 10.1177/1087057115581317] [Citation(s) in RCA: 54] [Impact Index Per Article: 5.4] [Received: 12/03/2014] [Accepted: 03/19/2015] [Indexed: 11/16/2022]
Abstract
A main goal of the U.S. Tox21 program is to profile a 10K-compound library for activity against a panel of stress-related and nuclear receptor signaling pathway assays using a quantitative high-throughput screening (qHTS) approach. However, assay artifacts, including nonreproducible signals and assay interference (e.g., autofluorescence), complicate compound activity interpretation. To address these issues, we have developed a data analysis pipeline that includes an updated signal noise-filtering/curation protocol and an assay interference flagging system. To better characterize various types of signals, we adopted a weighted version of the area under the curve (wAUC) to quantify the amount of activity across the tested concentration range in combination with the assay-dependent point-of-departure (POD) concentration. Based on the 32 Tox21 qHTS assays analyzed, we demonstrate that signal profiling using wAUC affords the best reproducibility (Pearson's r = 0.91) in comparison with the POD alone (0.82) or the AC50 (i.e., half-maximal activity concentration, 0.81). Among the activity artifacts characterized, cytotoxicity is the major confounding factor; on average, about 8% of Tox21 compounds are affected, whereas autofluorescence affects less than 0.5%. To facilitate data evaluation, we implemented two graphical user interface applications, allowing users to rapidly evaluate the in vitro activity of Tox21 compounds.
Affiliation(s)
- Jui-Hua Hsieh
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
| | | | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD, USA
| | - Raymond R Tice
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, Research Triangle Park, NC, USA
|
49
|
Chen S, Hsieh JH, Huang R, Sakamuru S, Hsin LY, Xia M, Shockley KR, Auerbach S, Kanaya N, Lu H, Svoboda D, Witt KL, Merrick BA, Teng CT, Tice RR. Cell-Based High-Throughput Screening for Aromatase Inhibitors in the Tox21 10K Library. Toxicol Sci 2015; 147:446-57. [PMID: 26141389 DOI: 10.1093/toxsci/kfv141] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.7] [Indexed: 12/31/2022] Open
Abstract
Multiple mechanisms exist for endocrine disruption; one nonreceptor-mediated mechanism is via effects on aromatase, an enzyme critical for maintaining the normal in vivo balance of androgens and estrogens. We adapted the AroER tri-screen 96-well assay to 1536-well format to identify potential aromatase inhibitors (AIs) in the U.S. Tox21 10K compound library. In this assay, screening with compound alone identifies estrogen receptor alpha (ERα) agonists, screening in the presence of testosterone (T) identifies AIs and/or ERα antagonists, and screening in the presence of 17β-estradiol (E2) identifies ERα antagonists. Screening the Tox21 library in the presence of T identified 302 potential AIs. These compounds, along with 31 known AI actives and inactives, were rescreened using all 3 assay formats. Of the 333 compounds tested, 113 (34%; 63 actives, 50 marginal actives) were considered to be potential AIs independent of cytotoxicity and ER antagonism activity. Structure-activity analysis suggested the presence of both conventional (e.g., 1,2,4-triazole class) and novel AI structures. Due to their novel structures, 14 of the 63 potential AI actives, including both drugs and fungicides, were selected for confirmation in the biochemical tritiated water-release aromatase assay. Ten compounds were active in the assay; the remaining 4 were only active in the high-throughput screening assay, but with low efficacy. To further characterize these 10 novel AIs, we investigated their binding characteristics. The AroER tri-screen, in high-throughput format, accurately and efficiently identified chemicals in a large and diverse chemical library that selectively interact with aromatase.
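The decision logic of the three assay formats described in this abstract can be written as a small function. This is a simplified reading of the abstract only: the boolean inputs, labels, and function name are hypothetical, and real calls would also depend on cytotoxicity counter-screens and efficacy cut-offs that are not modeled here.

```python
def interpret_tri_screen(active_alone, inhibits_with_t, inhibits_with_e2):
    """Map outcomes of the three assay formats to candidate mechanisms.

    active_alone     : signal with compound alone (ER-alpha agonist mode)
    inhibits_with_t  : signal reduced in the presence of testosterone
    inhibits_with_e2 : signal reduced in the presence of 17-beta-estradiol
    """
    calls = []
    if active_alone:
        calls.append("ER_agonist")
    if inhibits_with_e2:
        calls.append("ER_antagonist")
    # Inhibition with T but not with E2 points to blocked conversion of T
    # to estrogen, i.e. a potential aromatase inhibitor.
    if inhibits_with_t and not inhibits_with_e2:
        calls.append("potential_AI")
    return calls


print(interpret_tri_screen(False, True, False))
print(interpret_tri_screen(False, True, True))
print(interpret_tri_screen(True, False, False))
```

A compound that inhibits in both the T and E2 formats is labeled only as an ERα antagonist in this sketch, because the assay alone cannot show whether it also inhibits aromatase.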
Affiliation(s)
- Shiuan Chen
- Department of Cancer Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010
| | - Jui-Hua Hsieh
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Ruili Huang
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland 20850
| | - Srilatha Sakamuru
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland 20850
| | - Li-Yu Hsin
- Department of Cancer Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010
| | - Menghang Xia
- National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, Maryland 20850
| | - Keith R Shockley
- Biostatistics and Computational Biology Branch, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Scott Auerbach
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Noriko Kanaya
- *Department of Cancer Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010
| | - Hannah Lu
- *Department of Cancer Biology, Beckman Research Institute of the City of Hope, Duarte, California 91010
| | - Daniel Svoboda
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Kristine L Witt
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - B Alex Merrick
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Christina T Teng
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| | - Raymond R Tice
- Division of the National Toxicology Program, National Institute of Environmental Health Sciences, National Institutes of Health, Research Triangle Park, North Carolina 27709
| |
50
Wang W, Kim MT, Sedykh A, Zhu H. Developing Enhanced Blood-Brain Barrier Permeability Models: Integrating External Bio-Assay Data in QSAR Modeling. Pharm Res 2015; 32:3055-65. [PMID: 25862462 DOI: 10.1007/s11095-015-1687-1] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Received: 12/19/2014] [Accepted: 03/20/2015] [Indexed: 02/02/2023]
Abstract
PURPOSE: Experimental Blood-Brain Barrier (BBB) permeability models for drug molecules are expensive and time-consuming. As alternative methods, several traditional Quantitative Structure-Activity Relationship (QSAR) models have been developed previously. In this study, we aimed to improve the predictivity of traditional QSAR BBB permeability models by employing relevant public bio-assay data in the modeling process. METHODS: We compiled a BBB permeability database consisting of 439 unique compounds from various resources. The database was split into a modeling set of 341 compounds and a validation set of 98 compounds. A consensus QSAR modeling workflow was applied to the modeling set to develop various QSAR models. A five-fold cross-validation approach was used to validate the developed models, and the resulting models were used to predict the external validation set compounds. Furthermore, we used previously published membrane transporter models to generate relevant transporter profiles for target compounds. The transporter profiles were used as additional biological descriptors to develop hybrid QSAR BBB models. RESULTS: The consensus QSAR models have R² = 0.638 for five-fold cross-validation and R² = 0.504 for external validation. The consensus model developed by pooling chemical and transporter descriptors showed better predictivity (R² = 0.646 for five-fold cross-validation and R² = 0.526 for external validation). Moreover, several external bio-assays that correlate with BBB permeability were identified using our automatic profiling tool. CONCLUSIONS: The BBB permeability models developed in this study can be useful for early evaluation of new compounds (e.g., new drug candidates). The combination of chemical and biological descriptors shows a promising direction to improve the current traditional QSAR models.
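As a rough illustration of what "consensus" prediction and an R² score mean in this setting, the sketch below averages the predictions of several models compound by compound and scores the result. It is not the authors' workflow: the function names are hypothetical, the numbers are made-up toy values rather than data from the study, and R² is assumed here to be the coefficient of determination (the abstract does not say which definition was used).

```python
from statistics import mean


def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    obs_mean = mean(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - obs_mean) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def consensus(predictions_per_model):
    """Average the predictions of several models, compound by compound."""
    return [mean(column) for column in zip(*predictions_per_model)]


# Toy values only, chosen so the two models err in opposite directions.
observed = [1.0, 2.0, 3.0, 4.0]
model_a = [1.5, 1.5, 3.5, 3.5]
model_b = [0.5, 2.5, 2.5, 4.5]

if __name__ == "__main__":
    print(r_squared(observed, model_a))
    print(r_squared(observed, model_b))
    print(r_squared(observed, consensus([model_a, model_b])))
```

In this toy case the individual errors cancel, so the averaged prediction scores higher than either model alone. Real models rarely cancel this cleanly, which is consistent with the modest gains the abstract reports.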
Affiliation(s)
- Wenyi Wang
- The Rutgers Center for Computational and Integrative Biology, Camden, New Jersey, 08102, USA