Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Norinder U, Spjuth O, Svensson F. Using Predicted Bioactivity Profiles to Improve Predictive Modeling. J Chem Inf Model 2020;60:2830-2837. [PMID: 32374618 DOI: 10.1021/acs.jcim.0c00250] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Indexed: 12/28/2022]

Cited by Other Article(s)

Godinez WJ, Trifonov V, Fang B, Kuzu G, Pei L, Guiguemde WA, Martin EJ, King FJ, Jenkins JL, Skewes-Cox P. Compound Activity Prediction with Dose-Dependent Transcriptomic Profiles and Deep Learning. J Chem Inf Model 2024;64:2695-2704. [PMID: 38293736 DOI: 10.1021/acs.jcim.3c01855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/01/2024]

Mullowney MW, Duncan KR, Elsayed SS, Garg N, van der Hooft JJJ, Martin NI, Meijer D, Terlouw BR, Biermann F, Blin K, Durairaj J, Gorostiola González M, Helfrich EJN, Huber F, Leopold-Messer S, Rajan K, de Rond T, van Santen JA, Sorokina M, Balunas MJ, Beniddir MA, van Bergeijk DA, Carroll LM, Clark CM, Clevert DA, Dejong CA, Du C, Ferrinho S, Grisoni F, Hofstetter A, Jespers W, Kalinina OV, Kautsar SA, Kim H, Leao TF, Masschelein J, Rees ER, Reher R, Reker D, Schwaller P, Segler M, Skinnider MA, Walker AS, Willighagen EL, Zdrazil B, Ziemert N, Goss RJM, Guyomard P, Volkamer A, Gerwick WH, Kim HU, Müller R, van Wezel GP, van Westen GJP, Hirsch AKH, Linington RG, Robinson SL, Medema MH. Artificial intelligence for natural product drug discovery. Nat Rev Drug Discov 2023;22:895-916. [PMID: 37697042 DOI: 10.1038/s41573-023-00774-7] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Accepted: 07/20/2023] [Indexed: 09/13/2023]

Affiliation(s)

Michael W Mullowney: Duchossois Family Institute, The University of Chicago, Chicago, IL, USA
Katherine R Duncan: Strathclyde Institute of Pharmacy and Biomedical Sciences, University of Strathclyde, Glasgow, UK
Somayah S Elsayed: Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
Neha Garg: School of Chemistry and Biochemistry, Center for Microbial Dynamics and Infection, Georgia Institute of Technology, Atlanta, GA, USA
Justin J J van der Hooft: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands; Department of Biochemistry, University of Johannesburg, Johannesburg, South Africa
Nathaniel I Martin: Biological Chemistry Group, Institute of Biology, Leiden University, Leiden, The Netherlands
David Meijer: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
Barbara R Terlouw: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands
Friederike Biermann: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands; Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany; LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
Kai Blin: The Novo Nordisk Foundation Center for Biosustainability, Technical University of Denmark, Kongens Lyngby, Denmark
Janani Durairaj: Biozentrum, University of Basel, Basel, Switzerland
Marina Gorostiola González: Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands; ONCODE institute, Leiden, The Netherlands
Eric J N Helfrich: Institute of Molecular Bio Science, Goethe-University Frankfurt, Frankfurt am Main, Germany; LOEWE Center for Translational Biodiversity Genomics (TBG), Frankfurt am Main, Germany
Florian Huber: Center for Digitalization and Digitality, Hochschule Düsseldorf, Düsseldorf, Germany
Stefan Leopold-Messer: Institut für Mikrobiologie, Eidgenössische Technische Hochschule (ETH) Zürich, Zürich, Switzerland
Kohulan Rajan: Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller-University Jena, Jena, Germany
Tristan de Rond: School of Chemical Sciences, University of Auckland, Auckland, New Zealand
Jeffrey A van Santen: Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
Maria Sorokina: Institute for Inorganic and Analytical Chemistry, Friedrich-Schiller University, Jena, Germany; Pharmaceuticals R&D, Bayer AG, Berlin, Germany
Marcy J Balunas: Department of Microbiology and Immunology, University of Michigan, Ann Arbor, MI, USA; Department of Medicinal Chemistry, University of Michigan, Ann Arbor, MI, USA
Mehdi A Beniddir: Équipe "Chimie des Substances Naturelles", Université Paris-Saclay, CNRS, BioCIS, Orsay, France
Doris A van Bergeijk: Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
Laura M Carroll: Structural and Computational Biology Unit, EMBL, Heidelberg, Germany
Chase M Clark: Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
Djork-Arné Clevert: WRDM - Machine Learning Research, Pfizer, Berlin, Germany
Chris A Dejong: Adapsyn Bioscience, Hamilton, Ontario, Canada
Chao Du: Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands
Scarlet Ferrinho: Chemistry Department, University of St Andrews, St Andrews, UK
Francesca Grisoni: Institute for Complex Molecular Systems, Department of Biomedical Engineering, Eindhoven University of Technology, Eindhoven, The Netherlands; Centre for Living Technologies, Alliance TU/e, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
Albert Hofstetter: Laboratory of Physical Chemistry, ETH Zürich, Zürich, Switzerland
Willem Jespers: Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
Olga V Kalinina: Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany; Drug Bioinformatics, Medical Faculty, Saarland University, Homburg, Germany; Center for Bioinformatics, Saarland University, Saarbrücken, Germany
Satria A Kautsar: Department of Chemistry, Scripps Research, FL, USA
Hyunwoo Kim: College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University Seoul, Goyang-si, Republic of Korea
Tiago F Leao: Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba, Brazil
Joleen Masschelein: Center for Microbiology, VIB-KU Leuven, Heverlee, Belgium; Department of Biology, KU Leuven, Heverlee, Belgium
Evan R Rees: Division of Pharmaceutical Sciences, School of Pharmacy, University of Wisconsin-Madison, Madison, WI, USA
Raphael Reher: Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, Marburg, Germany; Institute of Pharmacy, Martin-Luther-University Halle-Wittenberg, Halle (Saale), Germany
Daniel Reker: Department of Biomedical Engineering, Duke University, Durham, NC, USA; Duke Microbiome Center, Duke University, Durham, NC, USA
Philippe Schwaller: Laboratory of Artificial Chemical Intelligence, Institut des Sciences et Ingénierie Chimiques, Ecole Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland
Marwin Segler: Microsoft Research, Cambridge, UK
Michael A Skinnider: Adapsyn Bioscience, Hamilton, Ontario, Canada; Michael Smith Laboratories, University of British Columbia, Vancouver, British Columbia, Canada
Allison S Walker: Department of Chemistry, Vanderbilt University, Nashville, TN, USA; Department of Biological Sciences, Vanderbilt University, Nashville, TN, USA
Egon L Willighagen: Department of Bioinformatics - BiGCaT, NUTRIM, Maastricht University, Maastricht, The Netherlands
Barbara Zdrazil: European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Cambridgeshire, UK
Nadine Ziemert: Interfaculty Institute for Microbiology and Infection Medicine Tuebingen (IMIT), Institute for Bioinformatics and Medical Informatics (IBMI), University of Tuebingen, Tuebingen, Germany
Rebecca J M Goss: Chemistry Department, University of St Andrews, St Andrews, UK
Pierre Guyomard: Bonsai team, CRIStAL - Centre de Recherche en Informatique Signal et Automatique de Lille, Université de Lille, Villeneuve d'Ascq Cedex, France
Andrea Volkamer: Center for Bioinformatics, Saarland University, Saarbrücken, Germany; In silico Toxicology and Structural Bioinformatics, Institute of Physiology, Charité - Universitätsmedizin Berlin, Berlin, Germany
William H Gerwick: Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA, USA
Hyun Uk Kim: Department of Chemical and Biomolecular Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon, Korea
Rolf Müller: Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany; Department of Pharmacy, Saarland University, Saarbrücken, Germany; German Center for infection research (DZIF), Braunschweig, Germany; Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
Gilles P van Wezel: Department of Molecular Biotechnology, Institute of Biology, Leiden University, Leiden, The Netherlands; Netherlands Institute of Ecology, NIOO-KNAW, Wageningen, The Netherlands
Gerard J P van Westen: Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Leiden, The Netherlands
Anna K H Hirsch: Helmholtz Institute for Pharmaceutical Research Saarland (HIPS), Helmholtz Centre for Infection Research (HZI), Saarbrücken, Germany; Department of Pharmacy, Saarland University, Saarbrücken, Germany; German Center for infection research (DZIF), Braunschweig, Germany; Helmholtz International Lab for Anti-Infectives, Saarbrücken, Germany
Roger G Linington: Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
Serina L Robinson: Department of Environmental Microbiology, Eawag: Swiss Federal Institute for Aquatic Science and Technology, Dübendorf, Switzerland
Marnix H Medema: Bioinformatics Group, Wageningen University, Wageningen, The Netherlands; Institute of Biology, Leiden University, Leiden, The Netherlands

Seal S, Yang H, Trapotsi MA, Singh S, Carreras-Puigvert J, Spjuth O, Bender A. Merging bioactivity predictions from cell morphology and chemical fingerprint models using similarity to training data. J Cheminform 2023;15:56. [PMID: 37268960 DOI: 10.1186/s13321-023-00723-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 9.0] [Received: 02/05/2023] [Accepted: 04/20/2023] [Indexed: 06/04/2023]

Walter M, Allen LN, de la Vega de León A, Webb SJ, Gillet VJ. Analysis of the benefits of imputation models over traditional QSAR models for toxicity prediction. J Cheminform 2022;14:32. [PMID: 35672779 PMCID: PMC9172131 DOI: 10.1186/s13321-022-00611-w] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 04/01/2022] [Accepted: 05/12/2022] [Indexed: 11/21/2022]

Morger A, Garcia de Lomana M, Norinder U, Svensson F, Kirchmair J, Mathea M, Volkamer A. Studying and mitigating the effects of data drifts on ML model performance at the example of chemical toxicity data. Sci Rep 2022;12:7244. [PMID: 35508546 PMCID: PMC9068909 DOI: 10.1038/s41598-022-09309-3] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 09/28/2021] [Accepted: 03/17/2022] [Indexed: 11/09/2022]

Abstract

Machine learning models are widely applied to predict molecular properties or the biological activity of small molecules on a specific protein. Models can be integrated in a conformal prediction (CP) framework which adds a calibration step to estimate the confidence of the predictions. CP models present the advantage of ensuring a predefined error rate under the assumption that test and calibration set are exchangeable. In cases where the test data have drifted away from the descriptor space of the training data, or where assay setups have changed, this assumption might not be fulfilled and the models are not guaranteed to be valid. In this study, the performance of internally valid CP models when applied to either newer time-split data or to external data was evaluated. In detail, temporal data drifts were analysed based on twelve datasets from the ChEMBL database. In addition, discrepancies between models trained on publicly-available data and applied to proprietary data for the liver toxicity and MNT in vivo endpoints were investigated. In most cases, a drastic decrease in the validity of the models was observed when applied to the time-split or external (holdout) test sets. To overcome the decrease in model validity, a strategy for updating the calibration set with data more similar to the holdout set was investigated. Updating the calibration set generally improved the validity, restoring it completely to its expected value in many cases. The restored validity is the first requisite for applying the CP models with confidence. However, the increased validity comes at the cost of a decrease in model efficiency, as more predictions are identified as inconclusive. This study presents a strategy to recalibrate CP models to mitigate the effects of data drifts. Updating the calibration sets without having to retrain the model has proven to be a useful approach to restore the validity of most models.
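
The calibration step and the calibration-set update described in this abstract can be illustrated with a minimal sketch of a class-conditional (Mondrian) inductive conformal classifier. This is not the authors' code. The nonconformity measure (one minus the predicted probability of the candidate label), the function names, and all toy scores below are assumptions chosen for illustration. The underlying model is left untouched; only the list of calibration scores is swapped.

```python
from typing import Dict, List, Set


def p_value(cal_scores: List[float], test_score: float) -> float:
    """Conformal p-value: share of calibration nonconformity scores that are
    at least as large as the test score, with the usual +1 correction."""
    n_at_least = sum(1 for s in cal_scores if s >= test_score)
    return (n_at_least + 1) / (len(cal_scores) + 1)


def prediction_set(
    cal_scores_by_class: Dict[int, List[float]],
    test_probs: Dict[int, float],
    significance: float,
) -> Set[int]:
    """Return every label whose p-value exceeds the significance level.

    Assumed nonconformity measure: 1 - predicted probability of the label.
    """
    labels = set()
    for label, cal_scores in cal_scores_by_class.items():
        test_score = 1.0 - test_probs[label]
        if p_value(cal_scores, test_score) > significance:
            labels.add(label)
    return labels


# Hypothetical calibration scores from data resembling the training set:
# the model is confident there, so the scores are small.
original_cal = {
    0: [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.35],
    1: [0.05, 0.1, 0.1, 0.15, 0.2, 0.2, 0.25, 0.3, 0.35],
}
# Hypothetical calibration scores from newer data closer to the holdout set:
# the same model is less reliable there, so the scores are larger.
updated_cal = {
    0: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9],
    1: [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.75, 0.8, 0.9],
}

# Predicted class probabilities for one holdout compound (toy values).
test_probs = {0: 0.28, 1: 0.72}

print(prediction_set(original_cal, test_probs, significance=0.2))
print(prediction_set(updated_cal, test_probs, significance=0.2))
```

With these toy numbers the original calibration set yields the single label 1, while the updated calibration set yields both labels. A two-label set is an inconclusive prediction, which mirrors the trade-off reported in the abstract: recalibration can restore validity at the cost of efficiency.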

Oguike OE, Ugwuishiwu CH, Asogwa CN, Nnadi CO, Obonga WO, Attama AA. Systematic review on the application of machine learning to quantitative structure-activity relationship modeling against Plasmodium falciparum. Mol Divers 2022;26:3447-3462. [PMID: 35064444 PMCID: PMC8782692 DOI: 10.1007/s11030-022-10380-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/05/2021] [Accepted: 01/07/2022] [Indexed: 11/29/2022]

Development and implementation of an enterprise-wide predictive model for early absorption, distribution, metabolism and excretion properties. Future Med Chem 2021;13:1639-1654. [PMID: 34528444 DOI: 10.4155/fmc-2021-0138] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/31/2022]

Mervin LH, Trapotsi MA, Afzal AM, Barrett IP, Bender A, Engkvist O. Probabilistic Random Forest improves bioactivity predictions close to the classification threshold by taking into account experimental uncertainty. J Cheminform 2021;13:62. [PMID: 34412708 PMCID: PMC8375213 DOI: 10.1186/s13321-021-00539-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/07/2021] [Accepted: 07/30/2021] [Indexed: 11/24/2022]

Abstract

Measurements of protein–ligand interactions have reproducibility limits due to experimental errors. Any model based on such assays will consequently have these unavoidable errors influencing its performance, and they should ideally be factored into modelling and output predictions, for example through the actual standard deviation of experimental measurements (σ) or the comparability of activity values between the aggregated heterogeneous activity units (i.e., K_i versus IC₅₀ values) during dataset assimilation. However, experimental errors are usually a neglected aspect of model generation. In order to improve upon the current state-of-the-art, we herein present a novel approach toward predicting protein–ligand interactions using a Probabilistic Random Forest (PRF) classifier. The PRF algorithm was applied toward in silico protein target prediction across ~550 tasks from ChEMBL and PubChem. Predictions were evaluated by taking into account various scenarios of experimental standard deviations in both training and test sets, and performance was assessed using fivefold stratified shuffled splits for validation. The largest benefit of incorporating the experimental deviation in PRF was observed for data points close to the binary threshold boundary, when such information was not considered in any way in the original RF algorithm. For example, when σ ranged between 0.4 and 0.6 log units and ideal probability estimates lay between 0.4 and 0.6, the PRF outperformed RF with a median absolute error margin of ~17%. In comparison, the baseline RF outperformed PRF for cases with high confidence to belong to the active class (far from the binary decision threshold), although the RF models gave errors smaller than the experimental uncertainty, which could indicate that they were overtrained and/or over-confident.
Finally, PRF models trained with putative inactives performed worse than PRF models trained without them, possibly because putative inactives were not assigned an experimental pXC₅₀ value and were therefore treated as inactives with a low uncertainty (which in practice might not be true). In conclusion, PRF can be useful for target prediction models, in particular for data where class boundaries overlap with the measurement uncertainty and where a substantial part of the training data is located close to the classification threshold.
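
To make the role of σ near the classification threshold concrete, here is a minimal sketch of turning a measured activity value into a probability of being active. The abstract does not spell out the formula, so the Gaussian error model, the function name, and the threshold of 5.0 log units are assumptions for illustration only.

```python
import math


def probability_active(measured: float, threshold: float, sigma: float) -> float:
    """P(true value > threshold), assuming the measurement error is Gaussian
    with standard deviation sigma (all values in log units, e.g. pXC50)."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (measured - threshold) / sigma
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Hypothetical measurements around an assumed activity threshold of 5.0.
for value in (4.0, 4.9, 5.0, 5.1, 6.0):
    print(value, round(probability_active(value, threshold=5.0, sigma=0.5), 3))
```

Under this assumption, a compound measured exactly at the threshold receives a probability of 0.5 rather than a hard label, and values a few tenths of a log unit either side stay close to 0.5. Compounds far from the threshold receive probabilities near 0 or 1, where a hard label loses little information.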

Wilm A, Garcia de Lomana M, Stork C, Mathai N, Hirte S, Norinder U, Kühnl J, Kirchmair J. Predicting the Skin Sensitization Potential of Small Molecules with Machine Learning Models Trained on Biologically Meaningful Descriptors. Pharmaceuticals (Basel) 2021;14:790. [PMID: 34451887 PMCID: PMC8402010 DOI: 10.3390/ph14080790] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/14/2021] [Revised: 08/03/2021] [Accepted: 08/06/2021] [Indexed: 02/06/2023]

Garcia de Lomana M, Morger A, Norinder U, Buesen R, Landsiedel R, Volkamer A, Kirchmair J, Mathea M. ChemBioSim: Enhancing Conformal Prediction of In Vivo Toxicity by Use of Predicted Bioactivities. J Chem Inf Model 2021;61:3255-3272. [PMID: 34153183 PMCID: PMC8317154 DOI: 10.1021/acs.jcim.1c00451] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/20/2021] [Indexed: 02/07/2023]

Abstract

Computational methods such as machine learning approaches have a strong track record of success in predicting the outcomes of in vitro assays. In contrast, their ability to predict in vivo endpoints is more limited due to the high number of parameters and processes that may influence the outcome. Recent studies have shown that the combination of chemical and biological data can yield better models for in vivo endpoints. The ChemBioSim approach presented in this work aims to enhance the performance of conformal prediction models for in vivo endpoints by combining chemical information with (predicted) bioactivity assay outcomes. Three in vivo toxicological endpoints, capturing genotoxic (MNT), hepatic (DILI), and cardiological (DICC) issues, were selected for this study due to their high relevance for the registration and authorization of new compounds. Since the sparsity of available biological assay data is challenging for predictive modeling, predicted bioactivity descriptors were introduced instead. Thus, a machine learning model for each of the 373 collected biological assays was trained and applied on the compounds of the in vivo toxicity data sets. Besides the chemical descriptors (molecular fingerprints and physicochemical properties), these predicted bioactivities served as descriptors for the models of the three in vivo endpoints. For this study, a workflow based on a conformal prediction framework (a method for confidence estimation) built on random forest models was developed. Furthermore, the most relevant chemical and bioactivity descriptors for each in vivo endpoint were preselected with lasso models. The incorporation of bioactivity descriptors increased the mean F1 scores of the MNT model from 0.61 to 0.70 and for the DICC model from 0.72 to 0.82 while the mean efficiencies increased by roughly 0.10 for both endpoints. In contrast, for the DILI endpoint, no significant improvement in model performance was observed. 
Besides pure performance improvements, an analysis of the most important bioactivity features allowed detection of novel and less intuitive relationships between the predicted biological assay outcomes used as descriptors and the in vivo endpoints. This study presents how the prediction of in vivo toxicity endpoints can be improved by the incorporation of biological information (which is not necessarily captured by chemical descriptors) in an automated workflow, without the need for adding experimental workload for the generation of bioactivity descriptors, as predicted outcomes of bioactivity assays were utilized. All bioactivity CP models for deriving the predicted bioactivities, as well as the in vivo toxicity CP models, can be freely downloaded from https://doi.org/10.5281/zenodo.4761225.
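
The descriptor construction described in this abstract, in which predicted assay outcomes are appended to the chemical descriptors, can be sketched as follows. This is a schematic illustration, not the ChemBioSim implementation: the function name and the two toy "assay models" are hypothetical stand-ins for the 373 trained assay models, and the lasso feature preselection and conformal prediction steps are omitted.

```python
from typing import Callable, List, Sequence

# A trained assay model is represented here as any callable that maps
# chemical descriptors to a predicted assay outcome (assumption).
AssayModel = Callable[[Sequence[float]], float]


def augment_with_predicted_bioactivities(
    chem_descriptors: Sequence[float],
    assay_models: Sequence[AssayModel],
) -> List[float]:
    """Append one predicted assay outcome per assay model to the chemical
    descriptors, giving the input vector for an in vivo endpoint model."""
    predicted = [model(chem_descriptors) for model in assay_models]
    return list(chem_descriptors) + predicted


# Hypothetical stand-ins for trained assay models.
def toy_assay_a(x: Sequence[float]) -> float:
    return min(1.0, sum(x) / len(x))


def toy_assay_b(x: Sequence[float]) -> float:
    return 1.0 if x[0] > 0.5 else 0.0


chem = [1.0, 0.0, 1.0, 0.0]  # toy fingerprint bits
combined = augment_with_predicted_bioactivities(chem, [toy_assay_a, toy_assay_b])
print(combined)
```

Because the bioactivity columns are predictions computed from the structure itself, no additional experiments are needed for a new compound, which is the point the abstract makes about avoiding extra experimental workload.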
