1
Li P, Hua L, Ma Z, Hu W, Liu Y, Zhu J. Conformalized Graph Learning for Molecular ADMET Property Prediction and Reliable Uncertainty Quantification. J Chem Inf Model 2024; 64:8705-8717. [PMID: 39571080 DOI: 10.1021/acs.jcim.4c01139] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/10/2024]
Abstract
Drug discovery and development is a complex and costly process, with a substantial portion of the expense dedicated to characterizing the absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties of new drug candidates. While the advent of deep learning and molecular graph neural networks (GNNs) has significantly enhanced in silico ADMET prediction capabilities, reliably quantifying prediction uncertainty remains a critical challenge. The performance of GNNs is influenced by both the volume and the quality of the data. Hence, determining the reliability and extent of a prediction is as crucial as achieving accurate predictions, especially for out-of-domain (OoD) compounds. This paper introduces a novel GNN model called conformalized fusion regression (CFR). CFR combines a GNN model with a joint mean-quantile regression loss and an ensemble-based conformal prediction (CP) method. Through rigorous evaluation across various ADMET tasks, we demonstrate that our framework provides accurate predictions, reliable probability calibration, and high-quality prediction intervals, outperforming existing uncertainty quantification methods.
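The abstract does not spell out how CFR calibrates its intervals, so the following is only a minimal sketch of a generic split-conformal calibration step for quantile predictions (often called conformalized quantile regression), not the authors' ensemble-based method. The function name `conformalize_intervals` and all argument names are hypothetical; the lower and upper quantile predictions are assumed to come from any already-trained model.

```python
import math
import numpy as np

def conformalize_intervals(cal_lo, cal_hi, cal_y, test_lo, test_hi, alpha=0.1):
    """Widen (or shrink) predicted quantile bands so that, under exchangeability,
    they cover the true value with probability at least 1 - alpha.
    Hypothetical helper for illustration; not the CFR implementation."""
    cal_lo, cal_hi, cal_y = (np.asarray(a, dtype=float) for a in (cal_lo, cal_hi, cal_y))
    # Nonconformity score: signed distance by which the true value
    # falls outside the predicted band (negative if it lies inside).
    scores = np.sort(np.maximum(cal_lo - cal_y, cal_y - cal_hi))
    n = len(scores)
    # Finite-sample rank: the ceil((n + 1) * (1 - alpha))-th smallest score.
    k = math.ceil((n + 1) * (1 - alpha))
    # If the rank exceeds n, no finite correction gives the guarantee.
    q = scores[k - 1] if k <= n else np.inf
    return np.asarray(test_lo, dtype=float) - q, np.asarray(test_hi, dtype=float) + q

# Usage: four calibration compounds, one test compound, 50% target coverage.
lo, hi = conformalize_intervals(
    cal_lo=[0, 0, 0, 0], cal_hi=[1, 1, 1, 1], cal_y=[0.5, 1.5, -1.0, 3.0],
    test_lo=[2.0], test_hi=[3.0], alpha=0.5,
)
```

The correction `q` is a single scalar learned on held-out calibration data, which is why the coverage guarantee depends on calibration and test compounds being exchangeable; for out-of-domain compounds that assumption may not hold.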
Affiliation(s)
- Peiyao Li
  - Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Lan Hua
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Zhechao Ma
  - Department of Computer Science and Technology, Hefei University of Technology, Hefei 230009, China
- Wenbo Hu
  - Department of Computer Science and Technology, Hefei University of Technology, Hefei 230009, China
- Ye Liu
  - Molecular Science, BeiGene (Beijing) Inc., Beijing 102206, China
- Jun Zhu
  - Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China
2
Liang T, Liu W, Tan K, Wu A, Lu X. Advancing Ionic Liquid Research with pSCNN: A Novel Approach for Accurate Normal Melting Temperature Predictions. ACS Omega 2024; 9:31694-31702. [PMID: 39072063 PMCID: PMC11270577 DOI: 10.1021/acsomega.4c02393] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/11/2024] [Revised: 04/12/2024] [Accepted: 06/25/2024] [Indexed: 07/30/2024]
Abstract
Ionic liquids (ILs), known for their distinct and tunable properties, offer a broad spectrum of potential applications across various fields, including chemistry, materials science, and energy storage. However, practical applications of ILs are often limited by their unfavorable physicochemical properties. Experimental screening becomes impractical due to the vast number of potential IL combinations. Therefore, the development of a robust and efficient model for predicting IL properties is imperative. Because the melting point is a defining feature of ILs, it is of practical significance to establish an accurate yet efficient model to predict the normal melting point of ILs (Tm), which may facilitate the discovery and design of novel ILs for specific applications. In this study, we present a pseudo-Siamese convolution neural network (pSCNN), inspired by the SCNN and focused on Tm. Utilizing a data set of 3098 ILs, we systematically assess various deep learning models (ANN, pSCNN, and Transformer-CNF), along with molecular descriptors (ECFP fingerprint and Mordred properties), for their performance in predicting the Tm of ILs. Remarkably, among the investigated modeling schemes, the pSCNN, coupled with filtered Mordred descriptors, demonstrates superior performance, yielding mean absolute error (MAE) and root-mean-square error (RMSE) values of 24.36 and 31.56 °C, respectively. Feature analysis further highlights the effectiveness of the pSCNN model. Moreover, the pSCNN method, with a pair of inputs, can be extended beyond ionic liquid melting point prediction.
Affiliation(s)
- Tao Liang
  - State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Wei Liu
  - State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Kai Tan
  - State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Anan Wu
  - State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
- Xin Lu
  - State Key Laboratory of Physical Chemistry of Solid Surface, Fujian Provincial Key Laboratory for Theoretical and Computational Chemistry, Department of Chemistry, College of Chemistry and Chemical Engineering, Xiamen University, Xiamen 361005, P. R. China
3
Dutschmann TM, Schlenker V, Baumann K. Chemoinformatic regression methods and their applicability domain. Mol Inform 2024; 43:e202400018. [PMID: 38803302 DOI: 10.1002/minf.202400018] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/16/2024] [Revised: 03/24/2024] [Accepted: 03/25/2024] [Indexed: 05/29/2024]
Abstract
The growing interest in chemoinformatic model uncertainty calls for a summary of the most widely used regression techniques and how to estimate their reliability. Regression models learn a mapping from the space of explanatory variables to the space of continuous output values. Among other limitations, the predictive performance of the model is restricted by the training data used for model fitting. Identification of unusual objects by outlier detection methods can improve model performance. Additionally, proper model evaluation necessitates defining the limitations of the model, often called the applicability domain. Comparable to certain classifiers, some regression techniques come with built-in methods or augmentations to quantify their (un)certainty, while others rely on generic procedures. The theoretical background of their working principles and how to deduce specific and general definitions for their domain of applicability shall be explained.
Affiliation(s)
- Thomas-Martin Dutschmann
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, 38106, Braunschweig, Germany
- Valerie Schlenker
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, 38106, Braunschweig, Germany
- Knut Baumann
  - Institute of Medicinal and Pharmaceutical Chemistry, University of Technology Braunschweig, 38106, Braunschweig, Germany
4
Tran-Nguyen VK, Junaid M, Simeon S, Ballester PJ. A practical guide to machine-learning scoring for structure-based virtual screening. Nat Protoc 2023; 18:3460-3511. [PMID: 37845361 DOI: 10.1038/s41596-023-00885-w] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 02/08/2022] [Accepted: 07/03/2023] [Indexed: 10/18/2023]
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol , can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
Affiliation(s)
- Muhammad Junaid
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
- Saw Simeon
  - Centre de Recherche en Cancérologie de Marseille, Marseille, France
5
Abstract
The problem of human trust is one of the most fundamental problems in applied artificial intelligence in drug discovery. In silico models have been widely used to accelerate the process of drug discovery in recent years. However, most of these models can only give reliable predictions within a limited chemical space that the training set covers (applicability domain). Predictions of samples falling outside the applicability domain are unreliable and sometimes dangerous for the drug-design decision-making process. Uncertainty quantification accordingly has drawn great attention to enable autonomous drug designing. By quantifying the confidence level of model predictions, the reliability of the predictions can be quantitatively represented to assist researchers in their molecular reasoning and experimental design. Here we summarize the state-of-the-art approaches to uncertainty quantification and underline how they can be used for drug design and discovery projects. Furthermore, we also outline four representative application scenarios of uncertainty quantification in drug discovery.
Affiliation(s)
- Jie Yu
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Dingyan Wang
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
- Mingyue Zheng
  - Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai 201203, China
  - University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing 100049, China
6
Huang DZ, Baber JC, Bahmanyar SS. The challenges of generalizability in artificial intelligence for ADME/Tox endpoint and activity prediction. Expert Opin Drug Discov 2021; 16:1045-1056. [PMID: 33739897 DOI: 10.1080/17460441.2021.1901685] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/05/2023]
Abstract
INTRODUCTION: Artificial intelligence (AI) has seen a massive resurgence in recent years with wide successes in computer vision, natural language processing, and games. The similar creation of robust and accurate AI models for ADME/Tox endpoint and activity prediction would be revolutionary to drug discovery pipelines. There have been numerous demonstrations of successful applications, but a key challenge remains: how generalizable are these predictive models? AREAS COVERED: The authors present a summary of current promising components of AI models in the context of early drug discovery where ADME/Tox endpoint and activity prediction is the main driver of the iterative drug design process. Following that is a review of applicability domains and dataset construction considerations which determine generalizability bottlenecks for AI deployment. Further reviewed is the role of promising learning frameworks - multitask, transfer, and meta learning - which leverage auxiliary data to overcome issues of generalizability. EXPERT OPINION: The authors conclude that the most promising direction toward integrating reliable and informative AI models into the drug discovery pipeline is a conjunction of learned feature representations, deep learning, and novel learning frameworks. Such a solution would address the sparse and incomplete datasets that are available for key endpoints related to drug discovery.
Affiliation(s)
- J Christian Baber
  - Scientific Informatics, Global Head of Scientific Informatics, Scientific Informatics, Takeda Pharmaceuticals, Cambridge, MA, USA
- Sogole Sami Bahmanyar
  - Computational Chemistry, Director of Computational Sciences, Computational Chemistry, Takeda Pharmaceuticals, San Diego, USA
7
Brown BP, Mendenhall J, Geanes AR, Meiler J. General Purpose Structure-Based Drug Discovery Neural Network Score Functions with Human-Interpretable Pharmacophore Maps. J Chem Inf Model 2021; 61:603-620. [PMID: 33496578 PMCID: PMC7903419 DOI: 10.1021/acs.jcim.0c01001] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 08/26/2020] [Indexed: 12/20/2022]
Abstract
The BioChemical Library (BCL) is an academic open-source cheminformatics toolkit comprising ligand-based virtual high-throughput screening (vHTS) tools such as quantitative structure-activity/property relationship (QSAR/QSPR) modeling, small molecule flexible alignment, small molecule conformer generation, and more. Here, we expand the capabilities of the BCL to include structure-based virtual screening. We introduce two new score functions, BCL-AffinityNet and BCL-DockANNScore, based on novel distance-dependent signed protein-ligand atomic property correlations. Both metrics are conventional feed-forward dropout neural networks trained on the new descriptors. We demonstrate that BCL-AffinityNet is one of the top performing score functions on the comparative assessment of score functions 2016 affinity prediction and affinity ranking tasks. We also demonstrate that BCL-AffinityNet performs well on the CSAR-NRC HiQ I and II test sets. Furthermore, we demonstrate that BCL-DockANNScore is competitive with multiple state-of-the-art methods on the docking power and screening power tasks. Finally, we show how our models can be decomposed into human-interpretable pharmacophore maps to aid in hit/lead optimization. Altogether, our results expand the utility of the BCL for structure-based scoring to aid small molecule discovery and design. BCL-AffinityNet, BCL-DockANNScore, and the pharmacophore mapping application, as well as the remainder of the BCL cheminformatics toolkit, are freely available with an academic license at the BCL Commons site hosted on http://meilerlab.org/.
Affiliation(s)
- Benjamin P. Brown
  - Chemical and Physical Biology Program, Medical Scientist Training Program, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Jeffrey Mendenhall
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Alexander R. Geanes
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
- Jens Meiler
  - Department of Chemistry, Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37232, United States
  - Departments of Pharmacology and Biomedical Informatics, Vanderbilt University, Nashville, Tennessee 37212, United States
  - Institute for Drug Discovery, Leipzig University Medical School, Leipzig SAC 04103, Germany
8
Lazic SE, Williams DP. Improving drug safety predictions by reducing poor analytical practices. Toxicology Research and Application 2020. [DOI: 10.1177/2397847320978633] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 12/23/2022] Open
Abstract
Predicting the safety of a drug from preclinical data is a major challenge in drug discovery, and progressing an unsafe compound into the clinic puts patients at risk and wastes resources. In drug safety pharmacology and related fields, methods and analytical decisions known to provide poor predictions are common and include creating arbitrary thresholds, binning continuous values, giving all assays equal weight, and multiple reuse of information. In addition, the metrics used to evaluate models often omit important criteria, and models' performance on new data is often not assessed rigorously. Prediction models with these problems are unlikely to perform well, and published models suffer from many of these issues. We describe these problems in detail, demonstrate their negative consequences, and propose simple solutions that are standard in other disciplines where predictive modelling is used.
Affiliation(s)
- Dominic P Williams
  - Functional and Mechanistic Safety, Clinical Pharmacology and Safety Sciences, AstraZeneca, R&D, Cambridge, UK
9
Wilm A, Norinder U, Agea MI, de Bruyn Kops C, Stork C, Kühnl J, Kirchmair J. Skin Doctor CP: Conformal Prediction of the Skin Sensitization Potential of Small Organic Molecules. Chem Res Toxicol 2020; 34:330-344. [PMID: 33295759 PMCID: PMC7887802 DOI: 10.1021/acs.chemrestox.0c00253] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Indexed: 01/03/2023]
Abstract
Skin sensitization potential or potency is an important end point in the safety assessment of new chemicals and new chemical mixtures. Formerly, animal experiments such as the local lymph node assay (LLNA) were the main form of assessment. Today, however, the focus lies on the development of nonanimal testing approaches (i.e., in vitro and in chemico assays) and computational models. In this work, we investigate, based on publicly available LLNA data, the ability of aggregated, Mondrian conformal prediction classifiers to differentiate between nonsensitizing and sensitizing compounds as well as between two levels of skin sensitization potential (weak to moderate sensitizers, and strong to extreme sensitizers). The advantage of the conformal prediction framework over other modeling approaches is that it assigns compounds to activity classes only if a defined minimum level of confidence is reached for the individual predictions. This eliminates the need for applicability domain criteria that often are arbitrary in their nature and less flexible. Our new binary classifier, named Skin Doctor CP, differentiates nonsensitizers from sensitizers with a higher reliability-to-efficiency ratio than the corresponding nonconformal prediction workflow that we presented earlier. When tested on a set of 257 compounds at the significance levels of 0.10 and 0.30, the model reached an efficiency of 0.49 and 0.92, and an accuracy of 0.83 and 0.75, respectively. In addition, we developed a ternary classification workflow to differentiate nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers. Although this model achieved satisfactory overall performance (accuracies of 0.90 and 0.73, and efficiencies of 0.42 and 0.90, at significance levels 0.10 and 0.30, respectively), it did not obtain satisfying class-wise results (at a significance level of 0.30, the validities obtained for nonsensitizers, weak to moderate sensitizers, and strong to extreme sensitizers were 0.70, 0.58, and 0.63, respectively). We argue that the model is, in consequence, unable to reliably identify strong to extreme sensitizers and suggest that other ternary models derived from the currently accessible LLNA data might suffer from the same problem. Skin Doctor CP is available via a public web service at https://nerdd.zbh.uni-hamburg.de/skinDoctorII/.
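To make the terms "Mondrian", "significance level", and "efficiency" concrete, here is a minimal sketch of a class-conditional inductive conformal classifier. It is not the Skin Doctor CP workflow (which aggregates several predictors); the function names are hypothetical, the nonconformity score (one minus the predicted class probability) is an assumed choice, and efficiency is taken here as the fraction of single-label prediction sets, one common definition that the abstract itself does not spell out.

```python
import numpy as np

def mondrian_p_values(cal_probs, cal_labels, test_probs):
    """Class-conditional (Mondrian) p-values from a held-out calibration set.
    Hypothetical helper for illustration only."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels)
    test_probs = np.asarray(test_probs, dtype=float)
    n_test, n_classes = test_probs.shape
    p = np.zeros((n_test, n_classes))
    for c in range(n_classes):
        # Calibrate class c only on calibration compounds whose true label is c.
        cal_scores = 1.0 - cal_probs[cal_labels == c, c]
        test_scores = 1.0 - test_probs[:, c]
        # Count calibration scores at least as nonconforming as each test score.
        ge = (cal_scores[None, :] >= test_scores[:, None]).sum(axis=1)
        p[:, c] = (ge + 1) / (len(cal_scores) + 1)
    return p

def prediction_sets(p_values, significance):
    """A class is kept in the prediction set if its p-value exceeds the significance level."""
    return p_values > significance

def efficiency(sets):
    """Fraction of compounds that receive exactly one class label."""
    return float((sets.sum(axis=1) == 1).mean())

# Usage: two calibration compounds per class, one test compound.
p = mondrian_p_values(
    cal_probs=[[0.9, 0.1], [0.8, 0.2], [0.3, 0.7], [0.2, 0.8]],
    cal_labels=[0, 0, 1, 1],
    test_probs=[[0.95, 0.05]],
)
single_label = efficiency(prediction_sets(p, significance=0.5))
```

Raising the significance level shrinks the prediction sets, which is consistent with the abstract reporting higher efficiency but lower accuracy at 0.30 than at 0.10.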
Affiliation(s)
- Anke Wilm
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
  - HITeC e.V., 22527 Hamburg, Germany
- Ulf Norinder
  - Department of Computer and Systems Sciences, Stockholm University, SE-16407 Kista, Sweden
  - Department of Pharmaceutical Biosciences, Uppsala University, SE-75124 Uppsala, Sweden
  - MTM Research Centre, School of Science and Technology, Örebro University, SE-70182 Örebro, Sweden
- M Isabel Agea
  - Department of Informatics and Chemistry, University of Chemistry and Technology Prague, 16628 Prague, Czech Republic
- Christina de Bruyn Kops
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
- Conrad Stork
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
- Jochen Kühnl
  - Front End Innovation, Beiersdorf AG, 22529 Hamburg, Germany
- Johannes Kirchmair
  - Center for Bioinformatics (ZBH), Department of Informatics, Universität Hamburg, 20146 Hamburg, Germany
  - Department of Pharmaceutical Chemistry, University of Vienna, 1090 Vienna, Austria
10
Mervin LH, Johansson S, Semenova E, Giblin KA, Engkvist O. Uncertainty quantification in drug design. Drug Discov Today 2020; 26:474-489. [PMID: 33253918 DOI: 10.1016/j.drudis.2020.11.027] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 05/21/2020] [Revised: 07/13/2020] [Accepted: 11/23/2020] [Indexed: 01/03/2023]
Abstract
Machine learning and artificial intelligence are increasingly being applied to the drug-design process as a result of the development of novel algorithms, growing access, the falling cost of computation and the development of novel technologies for generating chemically and biologically relevant data. There has been recent progress in fields such as molecular de novo generation, synthetic route prediction and, to some extent, property predictions. Despite this, most research in these fields has focused on improving the accuracy of the technologies, rather than on quantifying the uncertainty in the predictions. Uncertainty quantification will become a key component in autonomous decision making and will be crucial for integrating machine learning and chemistry automation to create an autonomous design-make-test-analyse cycle. This review covers the empirical, frequentist and Bayesian approaches to uncertainty quantification, and outlines how they can be used for drug design. We also outline the impact of uncertainty quantification on decision making.
Affiliation(s)
- Lewis H Mervin
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Simon Johansson
  - Department of Computer Science and Engineering, Chalmers University of Technology, Gothenburg, Sweden
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Elizaveta Semenova
  - Data Sciences and Quantitative Biology, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Kathryn A Giblin
  - Medicinal Chemistry, Research and Early Development, Oncology R&D, AstraZeneca, Cambridge, UK
- Ola Engkvist
  - Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
11
Nolte TM, Nauser T, Gubler L, Hendriks AJ, Peijnenburg WJGM. Thermochemical unification of molecular descriptors to predict radical hydrogen abstraction with low computational cost. Phys Chem Chem Phys 2020; 22:23215-23225. [PMID: 33029596 DOI: 10.1039/d0cp03750h] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Indexed: 11/21/2022]
Abstract
Chemistry describes transformation of matter with reaction equations and corresponding rate constants. However, accurate rate constants are not always easy to get. Here we focus on radical oxidation reactions. Analysis of over 500 published rate constants of hydroxyl radicals led us to hypothesize that a modified linear free-energy relationship (LFER) could be used to predict rate constants speedily, reliably and accurately. LFERs correlate the Gibbs activation-energy with the Gibbs energy of reaction. We calculated the latter as the sum of one-electron transfer and, if appropriate, proton transfer. We parametrized specific transition state effects to orbital delocalizability and the polarity of the reactant. The calculation time for 500 reactions is less than 8 hours on a standard desktop-PC. Rate constants were also calculated for hydrogen and methyl radicals; these controls show that the predictions are applicable to a broader set of oxidizing radicals. An accuracy of 30-40% (standard deviation) with reference to reported experimental values was found suitable for the screening of complex chemical systems for possibly relevant reactions. In particular, potentially relevant reactions can be singled out and scrutinized in detail when prioritizing chemicals for environmental risk assessment.
Affiliation(s)
- Tom M Nolte
  - Eidgenössische Technische Hochschule (ETH) Zurich, Laboratory of Inorganic Chemistry, Vladimir-Prelog-Weg 1, 8093 Zurich, Switzerland
12
Gilmour N, Kern PS, Alépée N, Boislève F, Bury D, Clouet E, Hirota M, Hoffmann S, Kühnl J, Lalko JF, Mewes K, Miyazawa M, Nishida H, Osmani A, Petersohn D, Sekine S, van Vliet E, Klaric M. Development of a next generation risk assessment framework for the evaluation of skin sensitisation of cosmetic ingredients. Regul Toxicol Pharmacol 2020; 116:104721. [DOI: 10.1016/j.yrtph.2020.104721] [Citation(s) in RCA: 26] [Impact Index Per Article: 5.2] [Received: 01/16/2020] [Revised: 06/16/2020] [Accepted: 06/19/2020] [Indexed: 12/17/2022]
13
Rifai EA, van Dijk M, Geerke DP. Recent Developments in Linear Interaction Energy Based Binding Free Energy Calculations. Front Mol Biosci 2020; 7:114. [PMID: 32626725 PMCID: PMC7311763 DOI: 10.3389/fmolb.2020.00114] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Received: 12/11/2019] [Accepted: 05/14/2020] [Indexed: 11/13/2022] Open
Abstract
The linear interaction energy (LIE) approach is an end-point method to compute binding affinities. As such it combines explicit conformational sampling (of the protein-bound and unbound-ligand states) with efficiency in calculating values for the protein-ligand binding free energy ΔG_bind. This perspective summarizes our recent efforts to use molecular simulation and empirically calibrated LIE models for accurate and efficient calculation of ΔG_bind for diverse sets of compounds binding to flexible proteins (e.g., Cytochrome P450s and other proteins of direct pharmaceutical or biochemical interest). Such proteins pose challenges for ΔG_bind computation, which we tackle using a previously introduced statistically weighted LIE scheme. Because calibrated LIE models require empirical fitting of scaling parameters, they need to be accompanied by an applicability domain (AD) definition to provide a measure of confidence for predictions for arbitrary query compounds within a reference frame defined by a collective chemical and interaction space. To enable AD assessment of LIE predictions (or other protein-structure and -dynamic based ΔG_bind calculations) we recently introduced strategies for AD assignment of LIE models, based on simulation and training data only. These strategies are reviewed here as well, together with available tools to facilitate and/or automate LIE computation (including software for combined statistically-weighted LIE calculations and AD assessment).
Affiliation(s)
- Eko Aditya Rifai
  - AIMMS Division of Molecular and Computational Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Marc van Dijk
  - AIMMS Division of Molecular and Computational Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
- Daan P Geerke
  - AIMMS Division of Molecular and Computational Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, Amsterdam, Netherlands
14
Minerali E, Foil DH, Zorn KM, Lane TR, Ekins S. Comparing Machine Learning Algorithms for Predicting Drug-Induced Liver Injury (DILI). Mol Pharm 2020; 17:2628-2637. [PMID: 32422053 DOI: 10.1021/acs.molpharmaceut.0c00326] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Indexed: 12/11/2022]
Abstract
Drug-induced liver injury (DILI) is one of the most unpredictable adverse reactions to xenobiotics in humans and the leading cause of postmarketing withdrawals of approved drugs. To date, these drugs have been collated by the FDA to form the DILIRank database, which classifies DILI severity and potential. These classifications have been used by various research groups in generating computational predictions for this type of liver injury. Recently, groups from Pfizer and AstraZeneca have collated DILI in vitro data and physicochemical properties for compounds that can be used along with data from the FDA to build machine learning models for DILI. In this study, we have used these data sets, as well as the Biopharmaceutics Drug Disposition Classification System data set, to generate Bayesian machine learning models with our in-house software, Assay Central. The performance of all machine learning models was assessed through both the internal 5-fold cross-validation metrics and prediction accuracy of an external test set of compounds with known hepatotoxicity. The best-performing Bayesian model was based on the DILI-concern category from the DILIRank database with an ROC of 0.814, a sensitivity of 0.741, a specificity of 0.755, and an accuracy of 0.746. A comparison of alternative machine learning algorithms, such as k-nearest neighbors, support vector classification, AdaBoosted decision trees, and deep learning methods, produced similar statistics to those generated with the Bayesian algorithm in Assay Central. This study demonstrates machine learning models grouped in a tool called MegaTox that can be used to predict early-stage clinical compounds, as well as recent FDA-approved drugs, to identify potential DILI.
Affiliation(s)
- Eni Minerali
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Daniel H Foil
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Kimberley M Zorn
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Thomas R Lane
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
| | - Sean Ekins
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
|
15
|
Amberg A, Anger LT, Bercu J, Bower D, Cross KP, Custer L, Harvey JS, Hasselgren C, Honma M, Johnson C, Jolly R, Kenyon MO, Kruhlak NL, Leavitt P, Quigley DP, Miller S, Snodin D, Stavitskaya L, Teasdale A, Trejo-Martin A, White AT, Wichard J, Myatt GJ. Extending (Q)SARs to incorporate proprietary knowledge for regulatory purposes: is aromatic N-oxide a structural alert for predicting DNA-reactive mutagenicity? Mutagenesis 2019; 34:67-82. [PMID: 30189015 DOI: 10.1093/mutage/gey020] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 04/30/2018] [Revised: 07/02/2018] [Accepted: 07/28/2018] [Indexed: 11/13/2022] [Open Access]
Abstract
(Quantitative) structure-activity relationship or (Q)SAR predictions of DNA-reactive mutagenicity are important to support both the design of new chemicals and the assessment of impurities, degradants, metabolites, extractables and leachables, as well as existing chemicals. Aromatic N-oxides represent a class of compounds that are often considered alerting for mutagenicity yet the scientific rationale of this structural alert is not clear and has been questioned. Because aromatic N-oxide-containing compounds may be encountered as impurities, degradants and metabolites, it is important to accurately predict mutagenicity of this chemical class. This article analysed a series of publicly available aromatic N-oxide data in search of supporting information. The article also used a previously developed structure-activity relationship (SAR) fingerprint methodology where a series of aromatic N-oxide substructures was generated and matched against public and proprietary databases, including pharmaceutical data. An assessment of the number of mutagenic and non-mutagenic compounds matching each substructure across all sources was used to understand whether the general class or any specific subclasses appear to lead to mutagenicity. This analysis resulted in a downgrade of the general aromatic N-oxide alert. However, it was determined there were enough public and proprietary data to assign the quindioxin and related chemicals as well as benzo[c][1,2,5]oxadiazole 1-oxide subclasses as alerts. The overall results of this analysis were incorporated into Leadscope's expert-rule-based model to enhance its predictive accuracy.
Affiliation(s)
- Alexander Amberg
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Höchst, Frankfurt am Main, Germany
| | - Lennart T Anger
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Höchst, Frankfurt am Main, Germany
| | - Joel Bercu
- Gilead Sciences, Nonclinical Safety and Pathobiology, Foster City, CA, USA
| | - Laura Custer
- Bristol-Myers Squibb, Drug Safety Evaluation, New Brunswick, NJ, USA
| | - James S Harvey
- GlaxoSmithKline Pre-Clinical Development, Ware, Hertfordshire, UK
| | - Masamitsu Honma
- National Institute of Health Sciences, Division of Genetics & Mutagenesis, Kamiyoga, Setagaya-ku, Tokyo, Japan
| | - Robert Jolly
- Toxicology Division, Eli Lilly and Company, Indianapolis, IN, USA
| | - Michelle O Kenyon
- Pfizer Worldwide Research and Development, Drug Safety, Genetic Toxicology, Groton, CT, USA
| | - Naomi L Kruhlak
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Silver Spring, MD, USA
| | - Penny Leavitt
- Bristol-Myers Squibb, Drug Safety Evaluation, New Brunswick, NJ, USA
| | - Lidiya Stavitskaya
- U.S. Food and Drug Administration, Center for Drug Evaluation and Research, Silver Spring, MD, USA
| | - Andrew Teasdale
- AstraZeneca, Pharmaceutical Technology and Development, Macclesfield, Cheshire, UK
| | - Angela T White
- GlaxoSmithKline Pre-Clinical Development, Ware, Hertfordshire, UK
| | - Joerg Wichard
- Bayer AG, Pharmaceuticals Division, Investigational Toxicology, Muellerstr, Berlin, Germany
|
16
|
Amberg A, Andaya RV, Anger LT, Barber C, Beilke L, Bercu J, Bower D, Brigo A, Cammerer Z, Cross KP, Custer L, Dobo K, Gerets H, Gervais V, Glowienke S, Gomez S, Van Gompel J, Harvey J, Hasselgren C, Honma M, Johnson C, Jolly R, Kemper R, Kenyon M, Kruhlak N, Leavitt P, Miller S, Muster W, Naven R, Nicolette J, Parenty A, Powley M, Quigley DP, Reddy MV, Sasaki JC, Stavitskaya L, Teasdale A, Trejo-Martin A, Weiner S, Welch DS, White A, Wichard J, Woolley D, Myatt GJ. Principles and procedures for handling out-of-domain and indeterminate results as part of ICH M7 recommended (Q)SAR analyses. Regul Toxicol Pharmacol 2019; 102:53-64. [PMID: 30562600 PMCID: PMC7500704 DOI: 10.1016/j.yrtph.2018.12.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 09/14/2018] [Revised: 12/10/2018] [Accepted: 12/14/2018] [Indexed: 02/08/2023]
Abstract
The International Council for Harmonization (ICH) M7 guideline describes a hazard assessment process for impurities that have the potential to be present in a drug substance or drug product. In the absence of adequate experimental bacterial mutagenicity data, (Q)SAR analysis may be used as a test to predict impurities' DNA reactive (mutagenic) potential. However, in certain situations, (Q)SAR software is unable to generate a positive or negative prediction either because of conflicting information or because the impurity is outside the applicability domain of the model. Such results present challenges in generating an overall mutagenicity prediction and highlight the importance of performing a thorough expert review. The following paper reviews pharmaceutical and regulatory experiences handling such situations. The paper also presents an analysis of proprietary data to help understand the likelihood of misclassifying a mutagenic impurity as non-mutagenic based on different combinations of (Q)SAR results. This information may be taken into consideration when supporting the (Q)SAR results with an expert review, especially when out-of-domain results are generated during a (Q)SAR evaluation.
Affiliation(s)
- Alexander Amberg
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926, Frankfurt am Main, Germany
| | - Lennart T Anger
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926, Frankfurt am Main, Germany
| | - Lisa Beilke
- Toxicology Solutions Inc., San Diego, CA, USA
| | - Joel Bercu
- Gilead Sciences, 333 Lakeside Drive, Foster City, CA, USA
| | - Dave Bower
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH, 43215, USA
| | - Alessandro Brigo
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Switzerland
| | - Zoryanna Cammerer
- Janssen Research & Development, 1400 McKean Road, Spring House, PA, 19477, USA
| | - Kevin P Cross
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH, 43215, USA
| | - Laura Custer
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ, 08903, USA
| | - Krista Dobo
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT, 06340, USA
| | - Helga Gerets
- UCB Biopharma SPRL, Chemin du Foriest, B-1420, Braine-l'Alleud, Belgium
| | - Susanne Glowienke
- Novartis Pharma AG, Pre-Clinical Safety, Werk Klybeck, CH-4057, Basel, Switzerland
| | - Stephen Gomez
- Consultant to Theravance Biopharma US, Inc., 901 Gateway Blvd, South San Francisco, CA, 94080, USA
| | - Jacky Van Gompel
- Janssen Pharmaceutical Companies of Johnson & Johnson, 2340, Beerse, Belgium
| | - James Harvey
- GlaxoSmithKline, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Robert Jolly
- Toxicology Division, Eli Lilly and Company, Indianapolis, IN, USA
| | - Raymond Kemper
- Vertex Pharmaceuticals Inc., Discovery and Investigative Toxicology, 50 Northern Ave, Boston, MA, USA
| | - Michelle Kenyon
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT, 06340, USA
| | - Naomi Kruhlak
- FDA Center for Drug Evaluation and Research, Silver Spring, MD, USA
| | - Penny Leavitt
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ, 08903, USA
| | - Scott Miller
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH, 43215, USA
| | - Wolfgang Muster
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Switzerland
| | - Alexis Parenty
- Novartis Pharma AG, Pre-Clinical Safety, Werk Klybeck, CH-4057, Basel, Switzerland
| | - Mark Powley
- Merck Research Laboratories, West Point, PA, 19486, USA
| | - Sandy Weiner
- Janssen Research & Development, 1400 McKean Road, Spring House, PA, 19477, USA
| | - Angela White
- GlaxoSmithKline, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Joerg Wichard
- Bayer Pharma AG, Investigational Toxicology, Muellerstr. 178, D-13353, Berlin, Germany
| | - David Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
| | - Glenn J Myatt
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH, 43215, USA.
|
17
|
Jain S, Ecker GF. In Silico Approaches to Predict Drug-Transporter Interaction Profiles: Data Mining, Model Generation, and Link to Cholestasis. Methods Mol Biol 2019; 1981:383-396. [PMID: 31016669 DOI: 10.1007/978-1-4939-9420-5_26] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/11/2022]
Abstract
Transport proteins play a crucial role in drug distribution, disposition, and clearance by mediating cellular drug influx and efflux. Inhibition of these transporters may lead to drug-drug interactions or even drug-induced liver injury, such as cholestasis, which poses a major challenge in the drug development process. Thus, computer-based (in silico) models that can predict the pharmacological and toxicological profiles of these small molecules with respect to liver transporters may help in the early prioritization of compounds and hence may lower the high attrition rates. In this chapter, we provide a protocol for in silico prediction of cholestasis by generating validated predictive models. In addition to the two-dimensional molecular descriptors, we include transporter inhibition predictions as descriptors and evaluate their influence on the performance of the cholestasis models.
Affiliation(s)
- Sankalp Jain
- Department of Pharmaceutical Chemistry, University of Vienna, Althanstrasse 14, Vienna, 1090, Austria
| | - Gerhard F Ecker
- Department of Pharmaceutical Chemistry, University of Vienna, Althanstrasse 14, Vienna, 1090, Austria.
|
18
|
Hanser T, Barber C, Guesné S, Marchaland JF, Werner S. Applicability Domain: Towards a More Formal Framework to Express the Applicability of a Model and the Confidence in Individual Predictions. Challenges and Advances in Computational Chemistry and Physics 2019. [DOI: 10.1007/978-3-030-16443-0_11] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
|
19
|
Ruiz IL, Gómez-Nieto MÁ. Study of the Applicability Domain of the QSAR Classification Models by Means of the Rivality and Modelability Indexes. Molecules 2018; 23:2756. [PMID: 30356020 PMCID: PMC6278359 DOI: 10.3390/molecules23112756] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/26/2018] [Revised: 10/14/2018] [Accepted: 10/22/2018] [Indexed: 11/30/2022] [Open Access]
Abstract
The reliability of a QSAR classification model depends on its capacity to achieve confident predictions of new compounds not considered in the building of the model. The results of this external validation process show the applicability domain (AD) of the QSAR model and, therefore, the robustness of the model to predict the property/activity of new molecules. In this paper we propose the use of the rivality and modelability indexes for the study of the characteristics of the datasets to be correctly modeled by a QSAR algorithm and to predict the reliability of the built model to prognosticate the property/activity of new molecules. The calculation of these indexes has a very low computational cost, not requiring the building of a model, thus being good tools for the analysis of the datasets in the first stages of the building of QSAR classification models. In our study, we have selected two benchmark datasets with a similar number of molecules but with very different modelability, and we have corroborated the predictive capacity of the rivality and modelability indexes regarding the classification models built using Support Vector Machine and Random Forest algorithms with 5-fold cross-validation and leave-one-out techniques. The results have shown the excellent ability of both indexes to predict outliers and the applicability domain of the QSAR classification models. In all cases, these values accurately predicted the statistical parameters of the QSAR models generated by the algorithms.
Affiliation(s)
- Irene Luque Ruiz
- Department of Computing and Numerical Analysis, Campus Universitario de Rabanales, Albert Einstein Building, University of Córdoba, E-14071 Córdoba, Spain.
| | - Miguel Ángel Gómez-Nieto
- Department of Computing and Numerical Analysis, Campus Universitario de Rabanales, Albert Einstein Building, University of Córdoba, E-14071 Córdoba, Spain.
|
20
|
Pastor M, Quintana J, Sanz F. Development of an Infrastructure for the Prediction of Biological Endpoints in Industrial Environments. Lessons Learned at the eTOX Project. Front Pharmacol 2018; 9:1147. [PMID: 30364191 PMCID: PMC6193068 DOI: 10.3389/fphar.2018.01147] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.1] [Received: 07/13/2018] [Accepted: 09/21/2018] [Indexed: 11/13/2022] [Open Access]
Abstract
In silico methods are increasingly being used for assessing the chemical safety of substances, as a part of integrated approaches involving in vitro and in vivo experiments. A paradigmatic example of these strategies is the eTOX project (http://www.etoxproject.eu), funded by the European Innovative Medicines Initiative (IMI), which aimed at producing high quality predictions of in vivo toxicity of drug candidates and resulted in generating about 200 models for diverse endpoints of toxicological interest. In an industry-oriented project like eTOX, apart from the predictive quality, the models need to meet other quality parameters related to the procedures for their generation and their intended use. For example, when the models are used for predicting the properties of drug candidates, the prediction system must guarantee the complete confidentiality of the compound structures. The interface of the system must be designed to provide non-expert users all the information required to choose the models and appropriately interpret the results. Moreover, procedures like installation, maintenance, documentation, validation and versioning, which are common in software development, must also be implemented for the models and for the prediction platform in which they are implemented. In this article we describe our experience in the eTOX project and the lessons learned after 7 years of close collaboration between industrial and academic partners. We believe that some of the solutions found and the tools developed could be useful for supporting similar initiatives in the future.
Affiliation(s)
- Ferran Sanz
- *Correspondence: Manuel Pastor, Ferran Sanz,
|
21
|
Myatt GJ, Ahlberg E, Akahori Y, Allen D, Amberg A, Anger LT, Aptula A, Auerbach S, Beilke L, Bellion P, Benigni R, Bercu J, Booth ED, Bower D, Brigo A, Burden N, Cammerer Z, Cronin MTD, Cross KP, Custer L, Dettwiler M, Dobo K, Ford KA, Fortin MC, Gad-McDonald SE, Gellatly N, Gervais V, Glover KP, Glowienke S, Van Gompel J, Gutsell S, Hardy B, Harvey JS, Hillegass J, Honma M, Hsieh JH, Hsu CW, Hughes K, Johnson C, Jolly R, Jones D, Kemper R, Kenyon MO, Kim MT, Kruhlak NL, Kulkarni SA, Kümmerer K, Leavitt P, Majer B, Masten S, Miller S, Moser J, Mumtaz M, Muster W, Neilson L, Oprea TI, Patlewicz G, Paulino A, Lo Piparo E, Powley M, Quigley DP, Reddy MV, Richarz AN, Ruiz P, Schilter B, Serafimova R, Simpson W, Stavitskaya L, Stidl R, Suarez-Rodriguez D, Szabo DT, Teasdale A, Trejo-Martin A, Valentin JP, Vuorinen A, Wall BA, Watts P, White AT, Wichard J, Witt KL, Woolley A, Woolley D, Zwickl C, Hasselgren C. In silico toxicology protocols. Regul Toxicol Pharmacol 2018; 96:1-17. [PMID: 29678766 DOI: 10.1016/j.yrtph.2018.04.014] [Citation(s) in RCA: 131] [Impact Index Per Article: 18.7] [Received: 12/03/2017] [Revised: 03/16/2018] [Accepted: 04/16/2018] [Indexed: 10/17/2022]
Abstract
The present publication surveys several applications of in silico (i.e., computational) toxicology approaches across different industries and institutions. It highlights the need to develop standardized protocols when conducting toxicity-related predictions. This contribution articulates the information needed for protocols to support in silico predictions for major toxicological endpoints of concern (e.g., genetic toxicity, carcinogenicity, acute toxicity, reproductive toxicity, developmental toxicity) across several industries and regulatory bodies. Such novel in silico toxicology (IST) protocols, when fully developed and implemented, will ensure in silico toxicological assessments are performed and evaluated in a consistent, reproducible, and well-documented manner across industries and regulatory bodies to support wider uptake and acceptance of the approaches. The development of IST protocols is an initiative developed through a collaboration among an international consortium to reflect the state-of-the-art in in silico toxicology for hazard identification and characterization. A general outline for describing the development of such protocols is included and it is based on in silico predictions and/or available experimental data for a defined series of relevant toxicological effects or mechanisms. The publication presents a novel approach for determining the reliability of in silico predictions alongside experimental data. In addition, we discuss how to determine the level of confidence in the assessment based on the relevance and reliability of the information.
Affiliation(s)
- Glenn J Myatt
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH 43215, USA.
| | - Ernst Ahlberg
- Predictive Compound ADME & Safety, Drug Safety & Metabolism, AstraZeneca IMED Biotech Unit, Mölndal, Sweden
| | - Yumi Akahori
- Chemicals Evaluation and Research Institute, 1-4-25 Kouraku, Bunkyo-ku, Tokyo 112-0004 Japan
| | - David Allen
- Integrated Laboratory Systems, Inc., Research Triangle Park, NC, USA
| | - Alexander Amberg
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926 Frankfurt am Main, Germany
| | - Lennart T Anger
- Sanofi, R&D Preclinical Safety Frankfurt, Industriepark Hoechst, D-65926 Frankfurt am Main, Germany
| | - Aynur Aptula
- Unilever, Safety and Environmental Assurance Centre, Colworth, Beds, UK
| | - Scott Auerbach
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
| | - Lisa Beilke
- Toxicology Solutions Inc., San Diego, CA, USA
| | - Joel Bercu
- Gilead Sciences, 333 Lakeside Drive, Foster City, CA, USA
| | - Ewan D Booth
- Syngenta, Product Safety Department, Jealott's Hill International Research Centre, Bracknell, Berkshire, RG42 6EY, UK
| | - Dave Bower
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH 43215, USA
| | - Alessandro Brigo
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Switzerland
| | - Natalie Burden
- National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs), Gibbs Building, 215 Euston Road, London NW1 2BE, UK
| | - Zoryana Cammerer
- Janssen Research & Development, 1400 McKean Road, Spring House, PA 19477, USA
| | - Mark T D Cronin
- School of Pharmacy and Chemistry, Liverpool John Moores University, Liverpool, L3 3AF, UK
| | - Kevin P Cross
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH 43215, USA
| | - Laura Custer
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ 08903, USA
| | - Krista Dobo
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT 06340, USA
| | - Kevin A Ford
- Global Blood Therapeutics, South San Francisco, CA 94080, USA
| | - Marie C Fortin
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers, The State University of New Jersey, 170 Frelinghuysen Rd, Piscataway, NJ 08855, USA
| | - Nichola Gellatly
- National Centre for the Replacement, Refinement and Reduction of Animals in Research (NC3Rs), Gibbs Building, 215 Euston Road, London NW1 2BE, UK
| | - Kyle P Glover
- Defense Threat Reduction Agency, Edgewood Chemical Biological Center, Aberdeen Proving Ground, MD 21010, USA
| | - Susanne Glowienke
- Novartis Pharma AG, Pre-Clinical Safety, Werk Klybeck, CH-4057, Basel, Switzerland
| | - Jacky Van Gompel
- Janssen Pharmaceutical Companies of Johnson & Johnson, 2340 Beerse, Belgium
| | - Steve Gutsell
- Unilever, Safety and Environmental Assurance Centre, Colworth, Beds, UK
| | - Barry Hardy
- Douglas Connect GmbH, Technology Park Basel, Hochbergerstrasse 60C, CH-4057 Basel / Basel-Stadt, Switzerland
| | - James S Harvey
- GlaxoSmithKline Pre-Clinical Development, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Jedd Hillegass
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ 08903, USA
| | - Jui-Hua Hsieh
- Kelly Government Solutions, Research Triangle Park, NC 27709, USA
| | - Chia-Wen Hsu
- FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - Kathy Hughes
- Existing Substances Risk Assessment Bureau, Health Canada, Ottawa, ON, K1A 0K9, Canada
| | - Robert Jolly
- Toxicology Division, Eli Lilly and Company, Indianapolis, IN, USA
| | - David Jones
- Medicines and Healthcare Products Regulatory Agency, 151 Buckingham Palace Road, London, SW1W 9SZ, UK
| | - Ray Kemper
- Vertex Pharmaceuticals Inc., Discovery and Investigative Toxicology, 50 Northern Ave, Boston, MA, USA
| | - Michelle O Kenyon
- Pfizer Global Research & Development, 558 Eastern Point Road, Groton, CT 06340, USA
| | - Marlene T Kim
- FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - Naomi L Kruhlak
- FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - Sunil A Kulkarni
- Existing Substances Risk Assessment Bureau, Health Canada, Ottawa, ON, K1A 0K9, Canada
| | - Klaus Kümmerer
- Institute for Sustainable and Environmental Chemistry, Leuphana University Lüneburg, Scharnhorststraße 1/C13.311b, 21335 Lüneburg, Germany
| | - Penny Leavitt
- Bristol-Myers Squibb, Drug Safety Evaluation, 1 Squibb Dr, New Brunswick, NJ 08903, USA
| | - Scott Masten
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
| | - Scott Miller
- Leadscope, Inc., 1393 Dublin Rd, Columbus, OH 43215, USA
| | - Janet Moser
- Chemical Security Analysis Center, Department of Homeland Security, 3401 Ricketts Point Road, Aberdeen Proving Ground, MD 21010-5405, USA; Battelle Memorial Institute, 505 King Avenue, Columbus, OH 43210, USA
| | - Moiz Mumtaz
- Agency for Toxic Substances and Disease Registry, US Department of Health and Human Services, Atlanta, GA, USA
| | - Wolfgang Muster
- Roche Pharmaceutical Research & Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, Switzerland
| | - Louise Neilson
- British American Tobacco, Research and Development, Regents Park Road, Southampton, Hampshire, SO15 8TL, UK
| | - Tudor I Oprea
- Translational Informatics Division, Department of Internal Medicine, Health Sciences Center, The University of New Mexico, NM, USA
| | - Grace Patlewicz
- U.S. Environmental Protection Agency, National Center for Computational Toxicology, Research Triangle Park, NC 27711, USA
| | - Alexandre Paulino
- SAPEC Agro, S.A., Avenida do Rio Tejo, Herdade das Praias, 2910-440 Setúbal, Portugal
| | - Elena Lo Piparo
- Chemical Food Safety Group, Nestlé Research Center, Lausanne, Switzerland
| | - Mark Powley
- FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - Andrea-Nicole Richarz
- European Commission, Joint Research Centre, Directorate for Health, Consumers and Reference Materials, Chemical Safety and Alternative Methods Unit, Via Enrico Fermi 2749, 21027 Ispra, VA, Italy
| | - Patricia Ruiz
- Agency for Toxic Substances and Disease Registry, US Department of Health and Human Services, Atlanta, GA, USA
| | - Benoit Schilter
- Chemical Food Safety Group, Nestlé Research Center, Lausanne, Switzerland
| | - Wendy Simpson
- Unilever, Safety and Environmental Assurance Centre, Colworth, Beds, UK
| | - Lidiya Stavitskaya
- FDA Center for Drug Evaluation and Research, Silver Spring, MD 20993, USA
| | - David T Szabo
- RAI Services Company, 950 Reynolds Blvd., Winston-Salem, NC 27105, USA
| | - Brian A Wall
- Colgate-Palmolive Company, Piscataway, NJ 08854, USA
| | - Pete Watts
- Bibra, Cantium House, Railway Approach, Wallington, Surrey, SM6 0DZ, UK
| | - Angela T White
- GlaxoSmithKline Pre-Clinical Development, Park Road, Ware, Hertfordshire, SG12 0DP, UK
| | - Joerg Wichard
- Bayer Pharma AG, Investigational Toxicology, Muellerstr. 178, D-13353 Berlin, Germany
| | - Kristine L Witt
- The National Institute of Environmental Health Sciences, Division of the National Toxicology Program, Research Triangle Park, NC 27709, USA
| | - Adam Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
| | - David Woolley
- ForthTox Limited, PO Box 13550, Linlithgow, EH49 7YU, UK
| | - Craig Zwickl
- Transendix LLC, 1407 Moores Manor, Indianapolis, IN 46229, USA
|
22
|
López-Massaguer O, Pinto-Gil K, Sanz F, Amberg A, Anger LT, Stolte M, Ravagli C, Marc P, Pastor M. Generating Modeling Data From Repeat-Dose Toxicity Reports. Toxicol Sci 2018; 162:287-300. [PMID: 29155963 PMCID: PMC5837688 DOI: 10.1093/toxsci/kfx254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022] [Open Access]
Abstract
Over the past decades, pharmaceutical companies have conducted a large number of high-quality in vivo repeat-dose toxicity (RDT) studies for regulatory purposes. As part of the eTOX project, a high number of these studies have been compiled and integrated into a database. This valuable resource can be queried directly, but it can be further exploited to build predictive models. As the studies were originally conducted to investigate the properties of individual compounds, the experimental conditions across the studies are highly heterogeneous. Consequently, the original data required normalization/standardization, filtering, categorization and integration to make possible any data analysis (such as building predictive models). Additionally, the primary objectives of the RDT studies were to identify toxicological findings, most of which do not directly translate to in vivo endpoints. This article describes a method to extract datasets containing comparable toxicological properties for a series of compounds amenable for building predictive models. The proposed strategy starts with the normalization of the terms used within the original reports. Then, comparable datasets are extracted from the database by applying filters based on the experimental conditions. Finally, carefully selected profiles of toxicological findings are mapped to endpoints of interest, generating QSAR-like tables. In this work, we describe in detail the strategy and tools used for carrying out these transformations and illustrate its application in a data sample extracted from the eTOX database. The suitability of the resulting tables for developing hazard-predicting models was investigated by building proof-of-concept models for in vivo liver endpoints.
Affiliation(s)
- Oriol López-Massaguer
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Kevin Pinto-Gil
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Ferran Sanz
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
| | - Lennart T Anger
- Sanofi, Preclinical Safety, 65926 Frankfurt am Main, Germany
| | - Manuela Stolte
- Sanofi, Preclinical Safety, 65926 Frankfurt am Main, Germany
| | - Carlo Ravagli
- Translational Medicine, Novartis Institute for Biomedical Research, CH-4002 Basel, Switzerland
| | - Philippe Marc
- Translational Medicine, Novartis Institute for Biomedical Research, CH-4002 Basel, Switzerland
| | - Manuel Pastor
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Institut Hospital del Mar d’Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
|
23
|
Nolte TM, Pinto-Gil K, Hendriks AJ, Ragas AMJ, Pastor M. Quantitative structure-activity relationships for primary aerobic biodegradation of organic chemicals in pristine surface waters: starting points for predicting biodegradation under acclimatization. Environmental Science: Processes & Impacts 2018; 20:157-170. [PMID: 29192704 DOI: 10.1039/c7em00375g] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Indexed: 05/27/2023]
Abstract
Microbial biomass and acclimation can affect the removal of organic chemicals in natural surface waters. In order to account for these effects and develop more robust models for biodegradation, we have compiled and curated removal data for un-acclimated (pristine) surface waters on which we developed quantitative structure-activity relationships (QSARs). Global analysis of the very heterogeneous dataset including neutral, anionic, cationic and zwitterionic chemicals (N = 233) using a random forest algorithm showed that useful predictions were possible (Q²ext = 0.4-0.5) though relatively large standard errors were associated (SDEP ∼0.7). Classification of the chemicals based on speciation state and metabolic pathway showed that biodegradation is influenced by the two, and that the dependence of biodegradation on chemical characteristics is non-linear. Class-specific QSAR analysis indicated that shape and charge distribution determine the biodegradation of neutral chemicals (R² ∼ 0.6), e.g. through membrane permeation or binding to P450 enzymes, whereas the average biodegradation of charged chemicals is 1 to 2 orders of magnitude lower, for which degradation depends more directly on cellular uptake (R² ∼ 0.6). Further analysis showed that specific chemical classes such as peptides and organic halogens are relatively less biodegradable in pristine surface waters, resulting in the need for the microbial consortia to acclimate. Additional literature data was used to verify an acclimation model (based on Monod-type kinetics) capable of extrapolating QSAR predictions to acclimating conditions such as in water treatment, downstream lakes and large rivers under μg L⁻¹ to mg L⁻¹ concentrations. The framework developed, despite being based on multiple assumptions, is promising and needs further validation using experimentation with more standardised and homogenised conditions as well as adequate characterization of the inoculum used.
Affiliation(s)
- Tom M Nolte
- Department of Environmental Science, Institute for Water and Wetland Research, Radboud University Nijmegen, P. O. Box 9010, 6500 GL Nijmegen, The Netherlands.
|
24
|
Rifai EA, van Dijk M, Vermeulen NPE, Geerke DP. Binding free energy predictions of farnesoid X receptor (FXR) agonists using a linear interaction energy (LIE) approach with reliability estimation: application to the D3R Grand Challenge 2. J Comput Aided Mol Des 2018; 32:239-249. [PMID: 28889350 PMCID: PMC5767202 DOI: 10.1007/s10822-017-0055-0] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/02/2017] [Accepted: 08/29/2017] [Indexed: 01/21/2023]
Abstract
Computational protein binding affinity prediction can play an important role in drug research but performing efficient and accurate binding free energy calculations is still challenging. In the context of phase 2 of the Drug Design Data Resource (D3R) Grand Challenge 2 we used our automated eTOX ALLIES approach to apply the (iterative) linear interaction energy (LIE) method and we evaluated its performance in predicting binding affinities for farnesoid X receptor (FXR) agonists. Efficiency was obtained by our pre-calibrated LIE models and molecular dynamics (MD) simulations at the nanosecond scale, while predictive accuracy was obtained for a small subset of compounds. Using our recently introduced reliability estimation metrics, we could classify predictions with higher confidence by featuring an applicability domain (AD) analysis in combination with protein-ligand interaction profiling. The outcomes of and agreement between our AD and interaction-profile analyses to distinguish and rationalize the performance of our predictions highlighted the relevance of sufficiently exploring protein-ligand interactions during training and it demonstrated the possibility to quantitatively and efficiently evaluate if this is achieved by using simulation data only.
Affiliation(s)
- Eko Aditya Rifai
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HZ, Amsterdam, The Netherlands
- Marc van Dijk
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HZ, Amsterdam, The Netherlands
- Nico P E Vermeulen
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HZ, Amsterdam, The Netherlands
- Daan P Geerke
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, Vrije Universiteit Amsterdam, De Boelelaan 1108, 1081 HZ, Amsterdam, The Netherlands.
|
25
|
Sanz F, Pognan F, Steger-Hartmann T, Díaz C, Cases M, Pastor M, Marc P, Wichard J, Briggs K, Watson DK, Kleinöder T, Yang C, Amberg A, Beaumont M, Brookes AJ, Brunak S, Cronin MTD, Ecker GF, Escher S, Greene N, Guzmán A, Hersey A, Jacques P, Lammens L, Mestres J, Muster W, Northeved H, Pinches M, Saiz J, Sajot N, Valencia A, van der Lei J, Vermeulen NPE, Vock E, Wolber G, Zamora I. Legacy data sharing to improve drug safety assessment: the eTOX project. Nat Rev Drug Discov 2017; 16:811-812. [PMID: 29026211 DOI: 10.1038/nrd.2017.177] [Citation(s) in RCA: 43] [Impact Index Per Article: 5.4] [Indexed: 11/09/2022]
Abstract
The sharing of legacy preclinical safety data among pharmaceutical companies and its integration with other information sources offers unprecedented opportunities to improve the early assessment of drug safety. Here, we discuss the experience of the eTOX project, which was established through the Innovative Medicines Initiative to explore this possibility.
Affiliation(s)
- Ferran Sanz
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- François Pognan
- Novartis Institute for Biomedical Research, Basel, CH-4002, Switzerland
- Carlos Díaz
- Synapse Research Management Partners, 08007 Barcelona, Spain
- Manuel Pastor
- Institut Hospital del Mar d'Investigacions Mèdiques (IMIM), Universitat Pompeu Fabra, 08003 Barcelona, Spain
- Philippe Marc
- Novartis Institute for Biomedical Research, Basel, CH-4002, Switzerland
- Chihae Yang
- Molecular Networks GmbH, 90411 Nürnberg, Germany
- Maria Beaumont
- GlaxoSmithKline Research and Development Ltd, Stevenage SG1 2NY, UK
- Søren Brunak
- Technical University of Denmark (DTU), 2800 Lyngby, Denmark
- Sylvia Escher
- Fraunhofer Institute for Toxicology and Experimental Medicine (ITEM), 30625 Hannover, Germany
- Nigel Greene
- Pfizer Ltd, Groton, Connecticut 06340, USA. Current affiliation: AstraZeneca, Waltham, Massachusetts 02451, USA
- Anne Hersey
- European Bioinformatics Institute, European Molecular Biology Laboratory, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SD, UK
- Marc Pinches
- AstraZeneca AB, SK10 2NA Cheshire, UK. Current affiliation: Lhasa Ltd, Leeds LS11 5PS, UK
- Javier Saiz
- Universitat Politècnica de València, 46022 València, Spain
- Alfonso Valencia
- ICREA, 08010 Barcelona, Spain & Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain
- Johan van der Lei
- Erasmus Universitair Medisch Centrum, 3015 CE Rotterdam, The Netherlands
- Esther Vock
- Boehringer Ingelheim International GmbH, 88379 Biberach an der Riss, Germany
- Ismael Zamora
- Lead Molecular Design S.L., 08172 Sant Cugat del Vallès, Spain
|
26
|
Speck-Planche A, Dias Soeiro Cordeiro MN. Speeding up Early Drug Discovery in Antiviral Research: A Fragment-Based in Silico Approach for the Design of Virtual Anti-Hepatitis C Leads. ACS Comb Sci 2017; 19:501-512. [PMID: 28437091 DOI: 10.1021/acscombsci.7b00039] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Indexed: 01/24/2023]
Abstract
Hepatitis C constitutes an unresolved global health problem. This infectious disease is caused by the hepatotropic hepatitis C virus (HCV), and it can lead to the occurrence of life-threatening medical conditions, such as cirrhosis and liver cancer. Nowadays, major clinical concerns have arisen because of the appearance of multidrug resistance (MDR) and the side effects especially associated with long-term treatments. In this work, we report the first multitasking model for quantitative structure-biological effect relationships (mtk-QSBER), focused on the simultaneous exploration of anti-HCV activity and in vitro safety profiles related to the absorption, distribution, metabolism, elimination, and toxicity (ADMET). The mtk-QSBER model was created from a data set formed by 40 158 cases, displaying accuracy higher than 95% in both training and prediction (test) sets. Several molecular fragments were selected, and their quantitative contributions to anti-HCV activity and ADMET profiles were calculated. By combining the analysis of the fragments with positive contributions and the physicochemical meanings of the different molecular descriptors in the mtk-QSBER, six new molecules were designed. These new molecules were predicted to exhibit potent anti-HCV activity and desirable in vitro ADMET properties. In addition, the designed molecules have good druglikeness according to Lipinski's rule of five and its variants.
Affiliation(s)
- Alejandro Speck-Planche
- LAQV@REQUIMTE/Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
|
27
|
Speck-Planche A, Cordeiro MNDS. De novo computational design of compounds virtually displaying potent antibacterial activity and desirable in vitro ADMET profiles. Med Chem Res 2017. [DOI: 10.1007/s00044-017-1936-4] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Indexed: 11/27/2022]
|
28
|
Nolte TM, Ragas AMJ. A review of quantitative structure-property relationships for the fate of ionizable organic chemicals in water matrices and identification of knowledge gaps. Environ Sci Process Impacts 2017; 19:221-246. [PMID: 28296985 DOI: 10.1039/c7em00034k] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Indexed: 05/25/2023]
Abstract
Many organic chemicals are ionizable by nature. After use and release into the environment, various fate processes determine their concentrations, and hence exposure to aquatic organisms. In the absence of suitable data, such fate processes can be estimated using Quantitative Structure-Property Relationships (QSPRs). In this review we compiled available QSPRs from the open literature and assessed their applicability towards ionizable organic chemicals. Using quantitative and qualitative criteria we selected the 'best' QSPRs for sorption, (a)biotic degradation, and bioconcentration. The results indicate that many suitable QSPRs exist, but some critical knowledge gaps remain. Specifically, future focus should be directed towards the development of QSPR models for biodegradation in wastewater and sediment systems, direct photolysis and reaction with singlet oxygen, as well as additional reactive intermediates. Adequate QSPRs for bioconcentration in fish exist, but more accurate assessments can be achieved using pharmacologically based toxicokinetic (PBTK) models. No adequate QSPRs exist for bioconcentration in non-fish species. Due to the high variability of chemical and biological species as well as environmental conditions in QSPR datasets, accurate predictions for specific systems and inter-dataset conversions are problematic, for which standardization is needed. For all QSPR endpoints, additional data requirements involve supplementing the current chemical space covered and accurately characterizing the test systems used.
Affiliation(s)
- Tom M Nolte
- Department of Environmental Science, Institute for Water and Wetland Research, Radboud University Nijmegen, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.
- Ad M J Ragas
- Department of Environmental Science, Institute for Water and Wetland Research, Radboud University Nijmegen, P.O. Box 9010, 6500 GL Nijmegen, The Netherlands.
|
29
|
Montanari F, Zdrazil B. How Open Data Shapes In Silico Transporter Modeling. Molecules 2017; 22:422. [PMID: 28272367 PMCID: PMC5553104 DOI: 10.3390/molecules22030422] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/02/2017] [Revised: 02/28/2017] [Accepted: 03/02/2017] [Indexed: 12/05/2022] Open
Abstract
Chemical compound bioactivity and related data are nowadays easily available from open data sources and the open medicinal chemistry literature for many transmembrane proteins. Computational ligand-based modeling of transporters has therefore experienced a shift from local (quantitative) models to more global, qualitative, predictive models. As the size and heterogeneity of the data set rise, careful data curation becomes even more important. This includes, for example, not only a tailored cutoff setting for the generation of binary classes, but also the proper assessment of the applicability domain. Powerful machine learning algorithms (such as multi-label classification) now allow the simultaneous prediction of multiple related targets. However, the more complex these models become, the less interpretable they are. We emphasize that transmembrane transporters are very peculiar, some of which act as off-targets rather than as real drug targets. Thus, careful selection of the right modeling technique is important, as well as cautious interpretation of results. We hope that, as more and more data become available, we will be able to ameliorate and specify our models, coming closer towards function elucidation and the development of safer medicine.
Affiliation(s)
- Floriane Montanari
- Pharmacoinformatics Research Group, Department of Pharmaceutical Chemistry, University of Vienna, A-1090 Vienna, Austria.
- Barbara Zdrazil
- Pharmacoinformatics Research Group, Department of Pharmaceutical Chemistry, University of Vienna, A-1090 Vienna, Austria.
|
30
|
Aniceto N, Freitas AA, Bender A, Ghafourian T. A novel applicability domain technique for mapping predictive reliability across the chemical space of a QSAR: reliability-density neighbourhood. J Cheminform 2016. [PMCID: PMC5395519 DOI: 10.1186/s13321-016-0182-y] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 01/20/2023] Open
Abstract
The ability to define the regions of chemical space where a predictive model can be safely used is a necessary condition to assure the reliability of new predictions. This implies that reliability must be determined across chemical space in the attempt to localize “safe” and “unsafe” regions for prediction. As a result we devised an applicability domain technique that addresses the data locally instead of handling it as a whole—the reliability-density neighbourhood (RDN). The main novelty aspect of this method is that it characterizes each single training instance according to the density of its neighbourhood in the training set, as well as its individual bias and precision. By scanning through the chemical space (by iteratively increasing the applicability domain area), it was observed that new test compounds are successively included into the applicability domain region in such a manner that strongly correlates to their predictive performance. This allows the mapping of local reliability across different locations in the training set space, and thus allows identifying regions where the model has low reliability. This method also showed matching profiles between two external sets, which is an indication that it performs robustly with new data. Another novel aspect in this technique is that it is paired with a specific feature selection algorithm. As a result, the impact of the feature set used was studied from which the top 20 features selected by ReliefF yielded the best results, as opposed to using the model’s features or the entire feature set as commonly done. As the third novel aspect, in this work we propose a new scoring function to help evaluate the quality of an applicability domain profile (i.e., the curve of accuracy vs the applicability domain measure in question). Overall, the RDN showed to be a promising method that can correctly sort new instances according to predictive performance. 
As a result, this technique can be received by an end-user as proof of concept for the performance of a QSAR model in new data, thus promoting the user’s trust in the QSAR output.
|
31
|
Hanser T, Barber C, Marchaland JF, Werner S. Applicability domain: towards a more formal definition. SAR QSAR Environ Res 2016; 27:893-909. [PMID: 27827546 DOI: 10.1080/1062936x.2016.1250229] [Citation(s) in RCA: 73] [Impact Index Per Article: 8.1] [Received: 08/15/2016] [Accepted: 10/16/2016] [Indexed: 06/06/2023]
Abstract
In recent years the applicability domain (AD) of a prediction system has become an important concern in (Q)SAR modelling, especially in the context of human safety assessment. Today AD is an active research topic, and many methods have been designed to estimate the adequacy of a model and the confidence in its outcome for a given prediction task. Unfortunately, the wide spectrum of techniques developed for this purpose is based on various definitions of the concept of AD, often taking into account different types of information. This variety of methodologies confuses the end users and makes the comparison of the AD for different models almost impossible. In this article, we demonstrate that AD is not a monolithic concept and can be broken down into three well-defined sub-domains assessing confidence at the model, prediction and decision levels, respectively. By leveraging this separation of concerns we have an opportunity to clarify, formalize and extend the definition of AD. We propose a framework that captures this new vision with the aim to initiate a global effort to converge towards a common AD definition within the (Q)SAR community.
Affiliation(s)
- T Hanser
- Research Group, Lhasa Limited (UK), Leeds, UK
- C Barber
- Research Group, Lhasa Limited (UK), Leeds, UK
- S Werner
- Research Group, Lhasa Limited (UK), Leeds, UK
|
32
|
Abburu S, Venkatraman V, Alsberg BK. TD-DFT based fine-tuning of molecular excitation energies using evolutionary algorithms. RSC Adv 2016. [DOI: 10.1039/c5ra22800j] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022] Open
Abstract
An evolutionary de novo design method is presented to fine-tune the excitation energies of molecules calculated using time-dependent density functional theory (TD-DFT).
Affiliation(s)
- Sailesh Abburu
- Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
- Vishwesh Venkatraman
- Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
- Bjørn K. Alsberg
- Department of Chemistry, Norwegian University of Science and Technology (NTNU), 7491 Trondheim, Norway
|
33
|
Capoferri L, Verkade-Vreeker MCA, Buitenhuis D, Commandeur JNM, Pastor M, Vermeulen NPE, Geerke DP. Linear Interaction Energy Based Prediction of Cytochrome P450 1A2 Binding Affinities with Reliability Estimation. PLoS One 2015; 10:e0142232. [PMID: 26551865 PMCID: PMC4638363 DOI: 10.1371/journal.pone.0142232] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 08/11/2015] [Accepted: 10/18/2015] [Indexed: 11/22/2022] Open
Abstract
Prediction of human Cytochrome P450 (CYP) binding affinities of small ligands, i.e., substrates and inhibitors, represents an important task for predicting drug-drug interactions. A quantitative assessment of the ligand binding affinity towards different CYPs can provide an estimate of inhibitory activity or an indication of isoforms prone to interact with the substrate of inhibitors. However, the accuracy of global quantitative models for CYP substrate binding or inhibition based on traditional molecular descriptors can be limited, because of the lack of information on the structure and flexibility of the catalytic site of CYPs. Here we describe the application of a method that combines protein-ligand docking, Molecular Dynamics (MD) simulations and Linear Interaction Energy (LIE) theory, to allow for quantitative CYP affinity prediction. Using this combined approach, a LIE model for human CYP 1A2 was developed and evaluated, based on a structurally diverse dataset for which the estimated experimental uncertainty was 3.3 kJ mol⁻¹. For the computed CYP 1A2 binding affinities, the model showed a root mean square error (RMSE) of 4.1 kJ mol⁻¹ and a standard error in prediction (SDEP) in cross-validation of 4.3 kJ mol⁻¹. A novel approach that includes information on both structural ligand description and protein-ligand interaction was developed for estimating the reliability of predictions, and was able to identify compounds from an external test set with a SDEP for the predicted affinities of 4.6 kJ mol⁻¹ (corresponding to 0.8 pKi units).
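The equivalence between the free-energy error and pKi units quoted at the end of this abstract follows from the standard relation between binding free energy and pKi, under which one pKi unit corresponds to RT ln(10). A minimal Python sketch of the arithmetic is given below. The temperature of 298.15 K is an assumption (the abstract does not state one), and the function name is illustrative.

```python
import math

# Molar gas constant in kJ mol^-1 K^-1.
R_KJ_PER_MOL_K = 8.314462618e-3


def kj_per_mol_to_pki_units(delta_g_kj_per_mol: float,
                            temperature_k: float = 298.15) -> float:
    """Express a free-energy difference (kJ/mol) as a span in pKi units.

    One pKi unit corresponds to RT * ln(10); the default temperature
    is an assumption of this sketch, not a value taken from the paper.
    """
    return delta_g_kj_per_mol / (R_KJ_PER_MOL_K * temperature_k * math.log(10))


# SDEP of the external test set quoted in the abstract.
print(round(kj_per_mol_to_pki_units(4.6), 2))  # 0.81 at 298.15 K
```

At this assumed temperature one pKi unit is about 5.7 kJ/mol, so 4.6 kJ/mol comes out close to the 0.8 pKi units stated in the abstract.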
Affiliation(s)
- Luigi Capoferri
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Marlies C. A. Verkade-Vreeker
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Danny Buitenhuis
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Jan N. M. Commandeur
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Manuel Pastor
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader, 88, E-08003 Barcelona, Spain
- Nico P. E. Vermeulen
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
- Daan P. Geerke
- AIMMS Division of Molecular Toxicology, Department of Chemistry and Pharmaceutical Sciences, VU University, De Boelelaan 1083, 1081 HV Amsterdam, The Netherlands
|
34
|
Kumar R, Chaudhary K, Singh Chauhan J, Nagpal G, Kumar R, Sharma M, Raghava GP. An in silico platform for predicting, screening and designing of antihypertensive peptides. Sci Rep 2015; 5:12512. [PMID: 26213115 PMCID: PMC4515604 DOI: 10.1038/srep12512] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 01/21/2015] [Accepted: 06/19/2015] [Indexed: 11/30/2022] Open
Abstract
High blood pressure or hypertension is an affliction that threatens millions of lives worldwide. Peptides from natural origin have been shown recently to be highly effective in lowering blood pressure. In the present study, we have framed a platform for predicting and designing novel antihypertensive peptides. Due to a large variation found in the length of antihypertensive peptides, we divided these peptides into four categories: (i) tiny peptides, (ii) small peptides, (iii) medium peptides and (iv) large peptides. First, we developed SVM-based regression models for tiny peptides using chemical descriptors and achieved maximum correlation of 0.701 and 0.543 for dipeptides and tripeptides, respectively. Second, classification models were developed for small peptides and achieved maximum accuracy of 76.67%, 72.04% and 77.39% for tetrapeptides, pentapeptides and hexapeptides, respectively. Third, we have developed a model for medium peptides using amino acid composition and achieved maximum accuracy of 82.61%. Finally, we have developed a model for large peptides using amino acid composition and achieved maximum accuracy of 84.21%. Based on the above study, a web-based platform has been developed for locating antihypertensive peptides in a protein, screening of peptides and designing of antihypertensive peptides.
Affiliation(s)
- Minakshi Sharma
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh-160036, India
- Gajendra P.S. Raghava
- Bioinformatics Centre, CSIR-Institute of Microbial Technology, Chandigarh-160036, India
|
35
|
Su M, Tan J, Lin CY. Development of HIV-1 integrase inhibitors: recent molecular modeling perspectives. Drug Discov Today 2015. [PMID: 26220090 DOI: 10.1016/j.drudis.2015.07.012] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Indexed: 12/20/2022]
Abstract
Of the three viral enzymes essential to HIV replication, HIV-1 integrase (IN) is gaining popularity as a target for the antiviral therapy of AIDS. Substantial work focusing on IN has been done over the past three decades, which has facilitated and led to the approval of three drugs. Here, we discuss in detail the development of IN inhibitors between January 2012 and May 2014, with a particular focus on molecular simulation. We highlight controversial aspects of computational drug design and refer to alternative practices where appropriate. The analysis of these computational approaches provides some useful clues to the possible future discovery of novel IN inhibitors.
Affiliation(s)
- Min Su
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China
- Jianjun Tan
- College of Life Science and Bioengineering, Beijing University of Technology, Beijing 100124, China.
- Chun-Yuan Lin
- Department of Computer Science and Information Engineering, Chang Gung University, Taoyuan 33302, Taiwan.
|
36
|
Hewitt M, Ellison CM, Cronin MTD, Pastor M, Steger-Hartmann T, Munoz-Muriendas J, Pognan F, Madden JC. Ensuring confidence in predictions: A scheme to assess the scientific validity of in silico models. Adv Drug Deliv Rev 2015; 86:101-11. [PMID: 25794480 DOI: 10.1016/j.addr.2015.03.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 09/10/2014] [Revised: 03/05/2015] [Accepted: 03/11/2015] [Indexed: 11/28/2022]
Abstract
The use of in silico tools within the drug development process to predict a wide range of properties including absorption, distribution, metabolism, elimination and toxicity has become increasingly important due to changes in legislation and both ethical and economic drivers to reduce animal testing. Whilst in silico tools have been used for decades there remains reluctance to accept predictions based on these methods particularly in regulatory settings. This apprehension arises in part due to lack of confidence in the reliability, robustness and applicability of the models. To address this issue we propose a scheme for the verification of in silico models that enables end users and modellers to assess the scientific validity of models in accordance with the principles of good computer modelling practice. We report here the implementation of the scheme within the Innovative Medicines Initiative project "eTOX" (electronic toxicity) and its application to the in silico models developed within the frame of this project.
Affiliation(s)
- Mark Hewitt
- School of Pharmacy, Faculty of Science and Engineering, University of Wolverhampton, City Campus, Wulfruna Street, WV1 1SB, England, United Kingdom; School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, England, United Kingdom.
- Claire M Ellison
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, England, United Kingdom.
- Mark T D Cronin
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, England, United Kingdom.
- Manuel Pastor
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader 88, E-08003 Barcelona, Spain.
- Thomas Steger-Hartmann
- Bayer HealthCare, Bayer Pharma AG, Investigational Toxicology, Müllerstraße 178, 13352 Berlin, Germany.
- Jordi Munoz-Muriendas
- Chemical Sciences, Computational Chemistry, GlaxoSmithKline, Stevenage, SG1 2NY, England, United Kingdom.
- Francois Pognan
- Biochemical & Cellular Toxicology, Discovery Investigative Safety - PreClinical Safety, Novartis Pharma AG, Werk Klybeck, Postfach, CH-4002 Basel, Switzerland.
- Judith C Madden
- School of Pharmacy and Biomolecular Sciences, Liverpool John Moores University, Byrom Street, Liverpool, L3 3AF, England, United Kingdom.
|
37
|
Sanz F, Carrió P, López O, Capoferri L, Kooi DP, Vermeulen NPE, Geerke DP, Montanari F, Ecker GF, Schwab CH, Kleinöder T, Magdziarz T, Pastor M. Integrative Modeling Strategies for Predicting Drug Toxicities at the eTOX Project. Mol Inform 2015; 34:477-84. [DOI: 10.1002/minf.201400193] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.6] [Received: 12/30/2014] [Accepted: 04/01/2015] [Indexed: 11/11/2022]
|
38
|
Sheridan RP. The Relative Importance of Domain Applicability Metrics for Estimating Prediction Errors in QSAR Varies with Training Set Diversity. J Chem Inf Model 2015; 55:1098-107. [DOI: 10.1021/acs.jcim.5b00110] [Citation(s) in RCA: 41] [Impact Index Per Article: 4.1] [Indexed: 11/29/2022]
Affiliation(s)
- Robert P. Sheridan
- Cheminformatics Department, RY800B-305, Merck Research Laboratories, Rahway, New Jersey 07065, United States
|
39
|
Carrió P, López O, Sanz F, Pastor M. eTOXlab, an open source modeling framework for implementing predictive models in production environments. J Cheminform 2015; 7:8. [PMID: 25774224 PMCID: PMC4358905 DOI: 10.1186/s13321-015-0058-6] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.7] [Received: 09/13/2014] [Accepted: 02/24/2015] [Indexed: 11/10/2022] Open
Abstract
Background: Computational models based on Quantitative Structure-Activity Relationship (QSAR) methodologies are widely used tools for predicting the biological properties of new compounds. In many instances, such models are used routinely in industry (e.g. the food, cosmetic or pharmaceutical industry) for the early assessment of the biological properties of new compounds. However, most of the tools currently available for developing QSAR models are not well suited for supporting the whole QSAR model life cycle in production environments. Results: We have developed eTOXlab, an open source modeling framework designed to be used at the core of a self-contained virtual machine that can be easily deployed in production environments, providing predictions as web services. eTOXlab consists of a collection of object-oriented Python modules with methods mapping common tasks of standard modeling workflows. This framework allows building and validating QSAR models as well as predicting the properties of new compounds using either a command line interface or a graphic user interface (GUI). Simple models can be easily generated by setting a few parameters, while more complex models can be implemented by overriding pieces of the original source code. eTOXlab benefits from the object-oriented capabilities of Python for providing high flexibility: any model implemented using eTOXlab inherits the features implemented in the parent model, like common tools and services or the automatic exposure of the models as prediction web services. The particular eTOXlab architecture as a self-contained, portable prediction engine allows building models with confidential information within corporate facilities, which can be safely exported and used for prediction without disclosing the structures of the training series. Conclusions: The software presented here provides full support to the specific needs of users that want to develop, use and maintain predictive models in corporate environments. The technologies used by eTOXlab (web services, VM, object-oriented programming) provide an elegant solution to common practical issues; the system can be installed easily in heterogeneous environments and integrates well with other software. Moreover, the system provides a simple and safe solution for building models with confidential structures that can be shared without disclosing sensitive information.
Affiliation(s)
- Pau Carrió
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader 88, E-08003 Barcelona, Spain
| | - Oriol López
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader 88, E-08003 Barcelona, Spain
| | - Ferran Sanz
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader 88, E-08003 Barcelona, Spain
| | - Manuel Pastor
- Research Programme on Biomedical Informatics (GRIB), Department of Experimental and Health Sciences, Universitat Pompeu Fabra, IMIM (Hospital del Mar Medical Research Institute), Dr. Aiguader 88, E-08003 Barcelona, Spain
| |
|
40
|
The eTOX data-sharing project to advance in silico drug-induced toxicity prediction. Int J Mol Sci 2014; 15:21136-54. [PMID: 25405742 PMCID: PMC4264217 DOI: 10.3390/ijms151121136] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.4] [Received: 09/25/2014] [Accepted: 10/20/2014] [Indexed: 11/16/2022] Open
Abstract
The high-quality in vivo preclinical safety data produced by the pharmaceutical industry during drug development, which follows numerous strict guidelines, are mostly not available in the public domain. These safety data are sometimes published as a condensed summary for the few compounds that reach the market, but the majority of studies are never made public and are often difficult to access in an automated way, sometimes even within the owning company itself. It is evident from many academic and industrial examples that useful data mining and model development require large and representative data sets and careful curation of the collected data. In 2010, under the auspices of the Innovative Medicines Initiative, the eTOX project started with the objective of extracting and sharing preclinical study data from paper or PDF archives of the toxicology departments of the 13 participating pharmaceutical companies and using such data to establish a detailed, well-curated database, which could then serve as a source for read-across approaches (early assessment of the potential toxicity of a drug candidate by comparison of similar structures and/or effects) and for training predictive models. The paper describes the efforts undertaken to allow effective data sharing (intellectual property (IP) protection and the setup of adequate controlled vocabularies) and to establish the database (currently with over 4000 studies contributed by the pharma companies, corresponding to more than 1400 compounds). In addition, the status of predictive model building and some specific features of the eTOX predictive system (eTOXsys) are presented as knowledge-based decision-support tools for the early stages of the drug development process.
|