1
Pikalyova R, Akhmetshin T, Horvath D, Varnek A. CoLiNN: A Tool for Fast Chemical Space Visualization of Combinatorial Libraries Without Enumeration. Mol Inform 2025; 44:e202400263. [PMID: 40099935 PMCID: PMC11916640 DOI: 10.1002/minf.202400263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Revised: 02/03/2025] [Accepted: 02/28/2025] [Indexed: 03/20/2025]
Abstract
Visualization of the combinatorial library chemical space provides a comprehensive overview of available compound classes, their diversity, and physicochemical property distribution - key factors in drug discovery. Typically, this visualization requires time- and resource-consuming compound enumeration, standardization, descriptor calculation, and dimensionality reduction. In this study, we present the Combinatorial Library Neural Network (CoLiNN) designed to predict the projection of compounds on a 2D chemical space map using only their building blocks and reaction information, thus eliminating the need for compound enumeration. Trained on 2.5K virtual DNA-Encoded Libraries (DELs), CoLiNN demonstrated high predictive performance, accurately predicting the compound position on Generative Topographic Maps (GTMs). GTMs predicted by CoLiNN were found to be very similar to the maps built for enumerated structures. In the library comparison task, we compared the GTMs of DELs and the ChEMBL database. The similarity-based DELs/ChEMBL rankings obtained with "true" and CoLiNN-predicted GTMs were consistent. Therefore, CoLiNN has the potential to become the go-to tool for combinatorial compound library design - it can explore the library design space more efficiently by skipping the compound enumeration.
Affiliation(s)
- Regina Pikalyova
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4, rue B. Pascal, 67081 Strasbourg, France
- Tagir Akhmetshin
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4, rue B. Pascal, 67081 Strasbourg, France
- Dragos Horvath
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4, rue B. Pascal, 67081 Strasbourg, France
- Alexandre Varnek
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4, rue B. Pascal, 67081 Strasbourg, France
2
Gantzer P, Staub R, Harabuchi Y, Maeda S, Varnek A. Chemography-guided analysis of a reaction path network for ethylene hydrogenation with a model Wilkinson's catalyst. Mol Inform 2025; 44:e202400063. [PMID: 39121023 DOI: 10.1002/minf.202400063] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 07/11/2024] [Accepted: 07/19/2024] [Indexed: 08/11/2024]
Abstract
Visualization and analysis of large chemical reaction networks become rather challenging when conventional graph-based approaches are used. As an alternative, we propose to use the chemical cartography ("chemography") approach, describing the data distribution on a 2-dimensional map. Here, the Generative Topographic Mapping (GTM) algorithm - an advanced chemography approach - has been applied to visualize the reaction path network of a simplified Wilkinson's catalyst-catalyzed hydrogenation containing some 10^5 structures generated with the help of the Artificial Force Induced Reaction (AFIR) method using either Density Functional Theory or Neural Network Potential (NNP) for potential energy surface calculations. Using new atom-permutation-invariant 3D descriptors for structure encoding, we've demonstrated that GTM is able to cluster structures that share the same 2D representation, to visualize the potential energy surface, to provide insight into the reaction path exploration as a function of time, and to compare reaction path networks obtained with different methods of energy assessment.
Affiliation(s)
- Philippe Gantzer
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido, 001-0021, Japan
- Ruben Staub
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido, 001-0021, Japan
- Yu Harabuchi
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido, 001-0021, Japan
- Satoshi Maeda
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido, 001-0021, Japan
- Alexandre Varnek
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Sapporo, Hokkaido, 001-0021, Japan
  - Laboratory of Chemoinformatics, UMR 7140, CNRS, University of Strasbourg, Strasbourg, 67081, France
3
Orlov AA, Akhmetshin TN, Horvath D, Marcou G, Varnek A. From High Dimensions to Human Insight: Exploring Dimensionality Reduction for Chemical Space Visualization. Mol Inform 2025; 44:e202400265. [PMID: 39633514 PMCID: PMC11733715 DOI: 10.1002/minf.202400265] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/06/2024] [Revised: 11/08/2024] [Accepted: 11/09/2024] [Indexed: 12/07/2024]
Abstract
Dimensionality reduction is an important exploratory data analysis method that allows high-dimensional data to be represented in a human-interpretable lower-dimensional space. It is extensively applied in the analysis of chemical libraries, where chemical structure data - represented as high-dimensional feature vectors - are transformed into 2D or 3D chemical space maps. In this paper, commonly used dimensionality reduction techniques - Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), Uniform Manifold Approximation and Projection (UMAP), and Generative Topographic Mapping (GTM) - are evaluated in terms of neighborhood preservation and visualization capability of sets of small molecules from the ChEMBL database.
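As a rough illustration of what "neighborhood preservation" can mean, the sketch below scores a projection by the average fraction of each point's k nearest neighbors that are shared between the original and the reduced space. This is one common way to define such a score and is not necessarily the metric used in the paper; the function names and the toy coordinates are hypothetical.

```python
import math

def knn_indices(points, k):
    """Return, for each point, the set of indices of its k nearest neighbors (Euclidean)."""
    neighbors = []
    for i, p in enumerate(points):
        # Rank all other points by distance to p; ties are broken by index.
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: (math.dist(p, points[j]), j),
        )
        neighbors.append(set(order[:k]))
    return neighbors

def neighborhood_preservation(high_dim, low_dim, k):
    """Average fraction of each point's k nearest neighbors shared by both spaces."""
    high_nn = knn_indices(high_dim, k)
    low_nn = knn_indices(low_dim, k)
    shared = [len(h & l) / k for h, l in zip(high_nn, low_nn)]
    return sum(shared) / len(shared)

# Toy data (hypothetical): two well-separated clusters in 3D and a 2D projection
# that simply drops the third coordinate.
high_dim_points = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (10, 10, 10), (10, 11, 10), (11, 10, 10)]
low_dim_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
score = neighborhood_preservation(high_dim_points, low_dim_points, k=2)
print(score)
```

A score of 1 means every point keeps the same k nearest neighbors after projection; lower values indicate that the map has torn neighborhoods apart or merged unrelated points.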
Affiliation(s)
- Alexey A. Orlov
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Tagir N. Akhmetshin
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Dragos Horvath
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Gilles Marcou
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 4 Blaise Pascal Str., 67000 Strasbourg, France
4
Protopopov MV, Tararina VV, Bonachera F, Dzyuba IM, Kapeliukha A, Hlotov S, Chuk O, Marcou G, Klimchuk O, Horvath D, Yeghyan E, Savych O, Tarkhanova OO, Varnek A, Moroz YS. The freedom space - a new set of commercially available molecules for hit discovery. Mol Inform 2024; 43:e202400114. [PMID: 39171757 DOI: 10.1002/minf.202400114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/30/2024] [Revised: 07/18/2024] [Accepted: 07/24/2024] [Indexed: 08/23/2024]
Abstract
The advent of high-performance virtual screening techniques nowadays allows drug designers to explore ultra-large sets of candidate compounds in search of molecules predicted to have desired properties. However, the success of such an endeavor heavily relies on the pertinence (drug-likeness and, foremost, chemical feasibility) of these candidates; otherwise, virtual screening will return valueless "hits", by the garbage in/garbage out principle. The huge popularity of the judiciously enumerated Enamine REAL Space is clear proof of the strength of this Big Data trend in drug discovery. Here we describe a new dataset of make-on-demand compounds called the Freedom Space. It follows the principles of Enamine REAL Space and contains highly feasible molecules (synthesis success rate over 75 percent). However, the scaffold and chemography analysis revealed significant differences from both the REAL Space and the biologically annotated compounds from the ChEMBL database. The Freedom Space is a significant extension of the REAL Space and can be utilized for a more comprehensive exploration of the synthetically feasible chemical space in hit finding and hit-to-lead campaigns.
Affiliation(s)
- Mykola V Protopopov
  - Chemspace LLC, Kyiv, Ukraine
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Valentyna V Tararina
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - Enamine Ltd., Kyiv, Ukraine
- Fanny Bonachera
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Anna Kapeliukha
  - Chemspace LLC, Kyiv, Ukraine
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Serhii Hlotov
  - Chemspace LLC, Kyiv, Ukraine
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
- Oleksii Chuk
  - Chemspace LLC, Kyiv, Ukraine
  - Kyiv Academic University, 36 Vernadsky blvd., Kyiv, Ukraine
- Gilles Marcou
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Olga Klimchuk
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Dragos Horvath
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Erik Yeghyan
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Alexandre Varnek
  - Laboratoire de Chemoinformatique, University of Strasbourg, 4 rue Blaise Pascal, 67000, Strasbourg, France
- Yurii S Moroz
  - Taras Shevchenko National University of Kyiv, Kyiv, Ukraine
  - Enamine Ltd., Kyiv, Ukraine
5
Llompart P, Minoletti C, Baybekov S, Horvath D, Marcou G, Varnek A. Will we ever be able to accurately predict solubility? Sci Data 2024; 11:303. [PMID: 38499581 PMCID: PMC10948805 DOI: 10.1038/s41597-024-03105-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/01/2023] [Accepted: 02/29/2024] [Indexed: 03/20/2024] Open
Abstract
Accurate prediction of thermodynamic solubility by machine learning remains a challenge. Recent models often display good performance, but their reliability may be deceiving when used prospectively. This study investigates the origins of these discrepancies, following three directions: a historical perspective, an analysis of the aqueous solubility dataverse, and data quality. We investigated over 20 years of published solubility datasets and models, highlighting overlooked datasets and the overlaps between popular sets. We benchmarked recently published models on a novel curated solubility dataset and report poor performance. We also propose a workflow to curate aqueous solubility data, aiming at producing useful models for bench chemists. Our results demonstrate that some state-of-the-art models are not ready for public usage because they lack a well-defined applicability domain and overlook historical data sources. We report the impact of factors influencing the utility of the models: interlaboratory standard deviation, ionic state of the solute, and data sources. The models obtained herein and the quality-assessed datasets are publicly available.
Affiliation(s)
- P Llompart
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
  - IDD/CADD, Sanofi, Vitry-Sur-Seine, France
- S Baybekov
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- D Horvath
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- G Marcou
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
- A Varnek
  - Laboratory of Chemoinformatics, UMR7140, University of Strasbourg, Strasbourg, France
6
Abstract
DNA-encoded libraries (DELs) are widely used in the discovery of drug candidates, and understanding their design principles is critical for accessing better libraries. Most DELs are combinatorial in nature and are synthesized by assembling sets of building blocks in specific topologies. In this study, different aspects of library topology were explored and their effect on DEL properties and chemical diversity was analyzed. We introduce a descriptor for DEL topological assignment (DELTA) and use it to examine the landscape of possible DEL topologies and their coverage in the literature. A generative topographic mapping analysis revealed that the impact of library topology on chemical space coverage is secondary to building block selection. Furthermore, it became apparent that the descriptor used to analyze chemical space dictates how structures cluster, with the effects of topology being apparent when using three-dimensional descriptors but not with common two-dimensional descriptors. This outcome points to potential challenges of attempts to predict DEL productivity based on chemical space analyses alone. While topology is rather inconsequential for defining the chemical space of encoded compounds, it greatly affects possible interactions with target proteins as illustrated in docking studies using NAD/NADP binding proteins as model receptors.
Affiliation(s)
- William K Weigel
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
- Alba L Montoya
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
- Raphael M Franzini
  - Department of Medicinal Chemistry, Skaggs College of Pharmacy, University of Utah, 30 S 2000 E, Salt Lake City, Utah 84112, United States
  - Huntsman Cancer Institute, University of Utah, 2000 Circle of Hope Dr., Salt Lake City, Utah 84112, United States
7
Zabolotna Y, Bonachera F, Horvath D, Lin A, Marcou G, Klimchuk O, Varnek A. Chemspace Atlas: Multiscale Chemography of Ultralarge Libraries for Drug Discovery. J Chem Inf Model 2022; 62:4537-4548. [DOI: 10.1021/acs.jcim.2c00509] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Affiliation(s)
- Yuliana Zabolotna
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Fanny Bonachera
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Dragos Horvath
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Arkadii Lin
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Gilles Marcou
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Olga Klimchuk
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Alexandre Varnek
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
8
Zabolotna Y, Volochnyuk DM, Ryabukhin SV, Horvath D, Gavrilenko KS, Marcou G, Moroz YS, Oksiuta O, Varnek A. A Close-up Look at the Chemical Space of Commercially Available Building Blocks for Medicinal Chemistry. J Chem Inf Model 2021; 62:2171-2185. [PMID: 34928600 DOI: 10.1021/acs.jcim.1c00811] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Indexed: 12/14/2022]
Abstract
The ability to efficiently synthesize desired compounds can be a limiting factor for chemical space exploration in drug discovery. This ability is conditioned not only by the existence of well-studied synthetic protocols but also by the availability of corresponding reagents, so-called building blocks (BBs). In this work, we present a detailed analysis of the chemical space of 400 000 purchasable BBs. The chemical space was defined by corresponding synthons - fragments contributed to the final molecules upon reaction. They allow an analysis of BB physicochemical properties and diversity, unbiased by the leaving and protective groups in actual reagents. The main classes of BBs were analyzed in terms of their availability, rule-of-two-defined quality, and diversity. Available BBs were eventually compared to a reference set of biologically relevant synthons derived from ChEMBL fragmentation, in order to illustrate how well they cover the actual medicinal chemistry needs. This was performed on a newly constructed universal generative topographic map of synthon chemical space that enables visualization of both libraries and analysis of their overlapped and library-specific regions.
Affiliation(s)
- Yuliana Zabolotna
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Dmitriy M Volochnyuk
  - Institute of Organic Chemistry, National Academy of Sciences of Ukraine, Murmanska Street 5, Kyiv 02660, Ukraine
  - Enamine Ltd., 78 Chervonotkatska str., 02660 Kiev, Ukraine
- Sergey V Ryabukhin
  - The Institute of High Technologies, Kyiv National Taras Shevchenko University, 64 Volodymyrska Street, Kyiv 01601, Ukraine
  - Enamine Ltd., 78 Chervonotkatska str., 02660 Kiev, Ukraine
- Dragos Horvath
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Konstantin S Gavrilenko
  - Research-And-Education ChemBioCenter, National Taras Shevchenko University of Kyiv, Chervonotkatska str., 61, 03022 Kiev, Ukraine
  - Enamine Ltd., 78 Chervonotkatska str., 02660 Kiev, Ukraine
- Gilles Marcou
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Yurii S Moroz
  - Research-And-Education ChemBioCenter, National Taras Shevchenko University of Kyiv, Chervonotkatska str., 61, 03022 Kiev, Ukraine
  - Chemspace, Chervonotkatska Street 78, 02094 Kyiv, Ukraine
- Oleksandr Oksiuta
  - Institute of Organic Chemistry, National Academy of Sciences of Ukraine, Murmanska Street 5, Kyiv 02660, Ukraine
  - Chemspace, Chervonotkatska Street 78, 02094 Kyiv, Ukraine
- Alexandre Varnek
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, 001-0021 Sapporo, Japan
9
Zabolotna Y, Ertl P, Horvath D, Bonachera F, Marcou G, Varnek A. NP Navigator: A New Look at the Natural Product Chemical Space. Mol Inform 2021; 40:e2100068. [PMID: 34170632 DOI: 10.1002/minf.202100068] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 03/15/2021] [Accepted: 05/15/2021] [Indexed: 11/08/2022]
Abstract
Natural products (NPs), having been evolutionarily selected over millions of years to bind to biological macromolecules, have remained an important source of inspiration for medicinal chemists even after the advent of efficient drug discovery technologies such as combinatorial chemistry and high-throughput screening. Thus, there is a strong demand for efficient and user-friendly computational tools that allow the analysis of large libraries of NPs. In this context, we introduce NP Navigator - a freely available intuitive online tool for visualization and navigation through the chemical space of NPs and NP-like molecules. It is based on the hierarchical ensemble of generative topographic maps, featuring NPs from the COlleCtion of Open NatUral producTs (COCONUT), bioactive compounds from ChEMBL, and commercially available molecules from ZINC. NP Navigator allows efficient analysis of different aspects of NPs - chemotype distribution, physicochemical properties, biological activity, and commercial availability. The latter concerns not only purchasable NPs but also their close analogs that can be considered as synthetic mimetics of NPs or pseudo-NPs.
Affiliation(s)
- Yuliana Zabolotna
  - University of Strasbourg, Laboratory of Chemoinformatics, 4, rue B. Pascal, 67081, Strasbourg, France
- Peter Ertl
  - Novartis Institutes for BioMedical Research, Novartis Campus, CH-4056, Basel, Switzerland
- Dragos Horvath
  - University of Strasbourg, Laboratory of Chemoinformatics, 4, rue B. Pascal, 67081, Strasbourg, France
- Fanny Bonachera
  - University of Strasbourg, Laboratory of Chemoinformatics, 4, rue B. Pascal, 67081, Strasbourg, France
- Gilles Marcou
  - University of Strasbourg, Laboratory of Chemoinformatics, 4, rue B. Pascal, 67081, Strasbourg, France
- Alexandre Varnek
  - University of Strasbourg, Laboratory of Chemoinformatics, 4, rue B. Pascal, 67081, Strasbourg, France
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Sapporo, Kita-ku, 001-0021 Sapporo, Japan
10
Bort W, Baskin II, Gimadiev T, Mukanov A, Nugmanov R, Sidorov P, Marcou G, Horvath D, Klimchuk O, Madzhidov T, Varnek A. Discovery of novel chemical reactions by deep generative recurrent neural network. Sci Rep 2021; 11:3178. [PMID: 33542271 PMCID: PMC7862614 DOI: 10.1038/s41598-021-81889-y] [Citation(s) in RCA: 30] [Impact Index Per Article: 7.5] [Received: 08/08/2020] [Accepted: 01/06/2021] [Indexed: 12/18/2022] Open
Abstract
The "creativity" of Artificial Intelligence (AI) in terms of generating de novo molecular structures opened a novel paradigm in compound design, weaknesses (stability & feasibility issues of such structures) notwithstanding. Here we show that "creative" AI may be as successfully taught to enumerate novel chemical reactions that are stoichiometrically coherent. Furthermore, when coupled to reaction space cartography, de novo reaction design may be focused on the desired reaction class. A sequence-to-sequence autoencoder with bidirectional Long Short-Term Memory layers was trained on purpose-developed "SMILES/CGR" strings, encoding reactions of the USPTO database. The autoencoder latent space was visualized on a generative topographic map. Novel latent space points were sampled around a map area populated by Suzuki reactions and decoded to corresponding reactions. These can be critically analyzed by the expert, cleaned of irrelevant functional groups, and eventually experimentally attempted, thereby enlarging the synthetic purpose of popular synthetic pathways.
Affiliation(s)
- William Bort
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
- Igor I Baskin
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
  - Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
  - Department of Materials Science and Engineering, Technion - Israel Institute of Technology, 3200003, Haifa, Israel
- Timur Gimadiev
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan
- Artem Mukanov
  - Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
- Ramil Nugmanov
  - Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
- Pavel Sidorov
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan
- Gilles Marcou
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
- Dragos Horvath
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
- Olga Klimchuk
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
- Timur Madzhidov
  - Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry, Kazan Federal University, Kremlyovskaya str. 18, 420008, Kazan, Russia
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR 7140 CNRS, University of Strasbourg, 1, rue Blaise Pascal, 67000, Strasbourg, France
  - Institute for Chemical Reaction Design and Discovery (WPI-ICReDD), Hokkaido University, Kita 21 Nishi 10, Kita-ku, Sapporo, 001-0021, Japan
11
Chemical Graph Theory for Property Modeling in QSAR and QSPR—Charming QSAR & QSPR. MATHEMATICS 2020. [DOI: 10.3390/math9010060] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
Abstract
Quantitative structure-activity relationship (QSAR) and Quantitative structure-property relationship (QSPR) are mathematical models for the prediction of the chemical, physical or biological properties of chemical compounds. Usually, they are based on structural (grounded on fragment contribution) or calculated (centered on QSAR three-dimensional (QSAR-3D) or chemical descriptors) parameters. Hereby, we describe a Graph Theory approach for generating and mining molecular fragments to be used in QSAR or QSPR modeling based exclusively on fragment contributions. Merging of Molecular Graph Theory, Simplified Molecular Input Line Entry Specification (SMILES) notation, and the connection table data allows a precise way to differentiate and count the molecular fragments. Machine learning strategies generated models with outstanding root mean square error (RMSE) and R^2 values. We also present the software Charming QSAR & QSPR, written in Python, for the property prediction of chemical compounds while using this approach.
12
Horvath D, Marcou G, Varnek A. Trustworthiness, the Key to Grid-Based Map-Driven Predictive Model Enhancement and Applicability Domain Control. J Chem Inf Model 2020; 60:6020-6032. [DOI: 10.1021/acs.jcim.0c00998] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Dragos Horvath
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Gilles Marcou
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
- Alexandre Varnek
  - Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
13
Zabolotna Y, Lin A, Horvath D, Marcou G, Volochnyuk DM, Varnek A. Chemography: Searching for Hidden Treasures. J Chem Inf Model 2020; 61:179-188. [PMID: 33334102 DOI: 10.1021/acs.jcim.0c00936] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Indexed: 01/06/2023]
Abstract
The days when medicinal chemistry was limited to a few series of compounds of therapeutic interest are long gone. Nowadays, no human may succeed to acquire a complete overview of more than a billion existing or feasible compounds within which the potential "blockbuster drugs" are well hidden and yet only a few mouse clicks away. To reach these "hidden treasures", we adapted the generative topographic mapping method to enable efficient navigation through the chemical space, from a global overview to a structural pattern detection, covering, for the first time, the complete ZINC library of purchasable compounds, relative to 1.6 million biologically relevant ChEMBL molecules. About 40 000 hierarchical maps of the chemical space were constructed. Structural motifs inherent to only one library were identified. Roughly 20 000 off-market ChEMBL compound families represent incentives to enrich commercial catalogs. Alternatively, 125 000 ZINC-specific compound classes, absent in structure-activity bases, are novel paths to explore in medicinal chemistry. The complete list of these chemotypes can be downloaded using the link https://forms.gle/B6bUJj82t9EfmttV6.
Affiliation(s)
- Yuliana Zabolotna
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Arkadii Lin
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Dragos Horvath
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Gilles Marcou
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
- Dmitriy M Volochnyuk
  - Institute of Organic Chemistry, National Academy of Sciences of Ukraine, Murmanska Street 5, Kyiv 02660, Ukraine
  - Enamine Ltd., Chervonotkatska Street 78, Kyiv 02094, Ukraine
- Alexandre Varnek
  - University of Strasbourg, Laboratoire de Chemoinformatique, 4, rue B. Pascal, Strasbourg 67081, France
14
Horvath D, Orlov A, Osolodkin DI, Ishmukhametov AA, Marcou G, Varnek A. A Chemographic Audit of anti-Coronavirus Structure-activity Information from Public Databases (ChEMBL). Mol Inform 2020; 39:e2000080. [PMID: 32363750 PMCID: PMC7267182 DOI: 10.1002/minf.202000080] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 04/22/2020] [Accepted: 04/26/2020] [Indexed: 01/30/2023]
Abstract
Discovery of drugs against newly emerged pathogenic agents like the SARS-CoV-2 coronavirus (CoV) must be based on previous research against related species. Scientists need to get acquainted with, and develop a global overview of, the molecules tested so far. Chemography (here, Generative Topographic Mapping in particular) places structures on a human-readable 2D map (obtained by dimensionality reduction of the chemical space of molecular descriptors) and is thus well suited for such an audit. The goal is to map medicinal chemistry efforts so far targeted against CoVs. This includes comparing libraries tested against various virus species/genera, predicting their polypharmacological profiles, and highlighting often encountered chemotypes. Maps are challenged to provide predictive activity landscapes against viral proteins. Definition of "anti-CoV" map zones led to the selection of 380 potential anti-CoV agents residing therein, out of a vast pool of 800 M organic compounds.
Collapse
Affiliation(s)
- Dragos Horvath
- Chemoinformatics Laboratory, UMR 7140 CNRS/University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg
| | - Alexey Orlov
- Chemoinformatics Laboratory, UMR 7140 CNRS/University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg
- FSBSI “Chumakov FSC R&D IBP RAS”, Poselok Instituta Poliomielita 8 bd. 1, Poselenie Moskovsky, Moscow 108819, Russia
| | - Dmitry I. Osolodkin
- FSBSI “Chumakov FSC R&D IBP RAS”, Poselok Instituta Poliomielita 8 bd. 1, Poselenie Moskovsky, Moscow 108819, Russia
- Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Trubetskaya ul. 8, Moscow 119991, Russia
| | - Aydar A. Ishmukhametov
- FSBSI “Chumakov FSC R&D IBP RAS”, Poselok Instituta Poliomielita 8 bd. 1, Poselenie Moskovsky, Moscow 108819, Russia
- Institute of Translational Medicine and Biotechnology, Sechenov First Moscow State Medical University, Trubetskaya ul. 8, Moscow 119991, Russia
| | - Gilles Marcou
- Chemoinformatics Laboratory, UMR 7140 CNRS/University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg
| | - Alexandre Varnek
- Chemoinformatics Laboratory, UMR 7140 CNRS/University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg
| |
|
15
|
Chaube S, Goverapet Srinivasan S, Rai B. Applied machine learning for predicting the lanthanide-ligand binding affinities. Sci Rep 2020; 10:14322. [PMID: 32868845 PMCID: PMC7459320 DOI: 10.1038/s41598-020-71255-9] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 04/01/2020] [Accepted: 08/12/2020] [Indexed: 11/25/2022] Open
Abstract
Binding affinities of metal-ligand complexes are central to a multitude of applications such as drug design, chelation therapy, and the design of reagents for solvent extraction. While state-of-the-art molecular modelling approaches are usually employed to gather structural and chemical insights about metal complexation with ligands, their computational cost and limited ability to predict metal-ligand stability constants with reasonable accuracy render them impractical for screening large chemical spaces. In this context, leveraging vast amounts of experimental data to learn the metal-binding affinities of ligands becomes a promising alternative. Here, we develop a machine learning framework for predicting binding affinities (logK1) of lanthanide cations with several structurally diverse molecular ligands. Six supervised machine learning algorithms, namely Random Forest (RF), k-Nearest Neighbours (KNN), Support Vector Machines (SVM), Kernel Ridge Regression (KRR), Multi Layered Perceptrons (MLP) and Adaptive Boosting (AdaBoost), were trained on a dataset comprising thousands of experimental values of logK1 and validated in an external 10-fold cross-validation procedure. This was followed by a thorough feature engineering and feature importance analysis to identify the molecular, metallic and solvent features most relevant to binding affinity prediction, along with an evaluation of performance metrics against the dimensionality of the feature space. Having demonstrated the excellent predictive ability of our framework, we utilized the best performing AdaBoost model to predict the logK1 values of lanthanide cations with nearly 71 million compounds present in the PubChem database. Our methodology opens up an opportunity for significantly accelerating the screening and design of ligands for various targeted applications, from vast chemical spaces.
Affiliation(s)
- Suryanaman Chaube
- TCS Research, Tata Research Development and Design Center, 54-B Hadapsar Industrial Estate, Hadapsar, Pune, Maharashtra, 411013, India
| | - Sriram Goverapet Srinivasan
- TCS Research, Tata Research Development and Design Center, 54-B Hadapsar Industrial Estate, Hadapsar, Pune, Maharashtra, 411013, India.
| | - Beena Rai
- TCS Research, Tata Research Development and Design Center, 54-B Hadapsar Industrial Estate, Hadapsar, Pune, Maharashtra, 411013, India
| |
|
16
|
Baskin II, Lozano S, Durot M, Marcou G, Horvath D, Varnek A. Autoignition temperature: comprehensive data analysis and predictive models. SAR and QSAR in Environmental Research 2020; 31:597-613. [PMID: 32646236 DOI: 10.1080/1062936x.2020.1785933] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/18/2020] [Accepted: 06/18/2020] [Indexed: 06/11/2023]
Abstract
Here we report a new predictive model for autoignition temperature (AIT), an important physical parameter widely used to assess potential safety hazards of combustible materials. Available structure-AIT data extracted from different sources were critically analysed. Support vector regression (SVR) models on different data subsets were built in order to identify a reliable compound set on which a realistic model could be built. This led to the selection of a dataset containing 875 compounds annotated with AIT values. The SVR model based on this dataset performs reasonably well in cross-validation, with a determination coefficient r² = 0.77 and a mean absolute error MAE = 37.8°C. External validation on 20 industrial compounds missing from the training set confirmed its good predictive power (MAE = 28.7°C).
Affiliation(s)
- I I Baskin
- Laboratory of Chemoinformatics, University of Strasbourg, UMR 7140 CNRS/UniStra, Strasbourg, France
| | - S Lozano
- BioLab, Centre de Recherche de Solaize, Total, Solaize, France
| | - M Durot
- BioLab, Centre de Recherche de Solaize, Total, Solaize, France
| | - G Marcou
- Laboratory of Chemoinformatics, University of Strasbourg, UMR 7140 CNRS/UniStra, Strasbourg, France
| | - D Horvath
- Laboratory of Chemoinformatics, University of Strasbourg, UMR 7140 CNRS/UniStra, Strasbourg, France
| | - A Varnek
- Laboratory of Chemoinformatics, University of Strasbourg, UMR 7140 CNRS/UniStra, Strasbourg, France
| |
|
17
|
Horvath D, Marcou G, Varnek A. Generative topographic mapping in drug design. Drug Discovery Today: Technologies 2019; 32-33:99-107. [PMID: 33386101 DOI: 10.1016/j.ddtec.2020.06.003] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 04/07/2020] [Revised: 06/10/2020] [Accepted: 06/18/2020] [Indexed: 06/12/2023]
Abstract
This is a review article on Generative Topographic Mapping (GTM), a non-linear dimensionality reduction technique producing generative 2D maps of high-dimensional vector spaces, and its specific applications in Drug Design (chemical space cartography, compound library design and analysis, virtual screening, pharmacological profiling, de novo drug design, conformational space & docking interaction cartography, etc.). Written by chemoinformaticians for potential users among medicinal chemists and biologists, the article purposely avoids all underlying mathematics. First, the GTM concept is intuitively explained, based on the strong analogies with the rather popular Self-Organizing Maps (SOMs), which are well-established library analysis tools. GTM is basically a fuzzy-logic-based generalization of SOMs. In the second part of the review, some published GTM applications in drug design are briefly revisited.
Affiliation(s)
- Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France.
| | - Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7140 University of Strasbourg/CNRS, 4 rue Blaise Pascal, 67000 Strasbourg, France.
| |
|
18
|
Lin A, Beck B, Horvath D, Marcou G, Varnek A. Diversifying chemical libraries with generative topographic mapping. J Comput Aided Mol Des 2019; 34:805-815. [DOI: 10.1007/s10822-019-00215-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/10/2019] [Accepted: 07/15/2019] [Indexed: 01/28/2023]
|
19
|
Horvath D, Marcou G, Varnek A. Generative Topographic Mapping of the Docking Conformational Space. Molecules 2019; 24:molecules24122269. [PMID: 31216756 PMCID: PMC6631714 DOI: 10.3390/molecules24122269] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 05/20/2019] [Revised: 06/14/2019] [Accepted: 06/15/2019] [Indexed: 12/21/2022] Open
Abstract
Following previous efforts to render the Conformational Space (CS) of flexible compounds by Generative Topographic Mapping (GTM), this polyvalent mapping technique is here adapted to the docking problem. Contact fingerprints (CF) characterize ligands from the perspective of the binding site by monitoring protein atoms that are “touched” by those of the ligand. A “Contact” (CF) map was built by GTM-driven dimensionality reduction of the CF vector space. Alternatively, a “Hybrid” (Hy) map used a composite descriptor of CFs concatenated with ligand fragment descriptors. These maps indirectly represent the active site and integrate the binding information of multiple ligands. The concept is illustrated by a docking study into the ATP-binding site of CDK2, using the S4MPLE program to generate thousands of poses for each ligand. Both maps were challenged to (1) Discriminate native from non-native ligand poses, e.g., create RMSD-landscapes “colored” by the conformer ensemble of ligands of known binding modes in order to highlight “native” map zones (poses with RMSD to PDB structures < 2Å). Then, projection of poses of other ligands on such landscapes might serve to predict those falling in native zones as being well-docked. (2) Distinguish ligands–characterized by their ensemble of conformers–by their potency, e.g., testing the hypotheses whether zones privileged by potent binders are clearly separated from the ones preferred by decoys on the maps. Hybrid maps were better in both challenges and outperformed the classical energy and individual contact satisfaction scores in discriminating ligands by potency. Moreover, the intuitive visualization and analysis of docking CS may, as already mentioned, have several applications–from highlighting of key contacts to monitoring docking calculation convergence.
Affiliation(s)
- Dragos Horvath
- Laboratoire de Chemoinformatique, UMR7140 CNRS/Univ. of Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg, France.
| | - Gilles Marcou
- Laboratoire de Chemoinformatique, UMR7140 CNRS/Univ. of Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg, France.
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR7140 CNRS/Univ. of Strasbourg, 1, rue Blaise Pascal, 67000 Strasbourg, France.
| |
|
20
|
Marcou G, Flamme B, Beck G, Chagnes A, Mokshyna O, Horvath D, Varnek A. In silico Design, Virtual Screening and Synthesis of Novel Electrolytic Solvents. Mol Inform 2019; 38:e1900014. [DOI: 10.1002/minf.201900014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/28/2019] [Accepted: 05/07/2019] [Indexed: 11/11/2022]
Affiliation(s)
- G. Marcou
- Faculty of Chemistry – UMR7140, University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg, France
| | - B. Flamme
- Ecole Nationale Supérieure de Chimie de Paris, 11 Rue Pierre et Marie Curie, 75005 Paris, France
| | - G. Beck
- Faculty of Chemistry – UMR7140, University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg, France
| | - A. Chagnes
- Université de Lorraine, CNRS, GeoRessources, F-54000 Nancy, France
| | - O. Mokshyna
- Faculty of Chemistry – UMR7140, University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg, France
| | - D. Horvath
- Faculty of Chemistry – UMR7140, University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg, France
| | - A. Varnek
- Faculty of Chemistry – UMR7140, University of Strasbourg, 4, rue Blaise Pascal, 67000 Strasbourg, France
| |
|
21
|
Sosnin S, Vashurina M, Withnall M, Karpov P, Fedorov M, Tetko IV. A Survey of Multi-task Learning Methods in Chemoinformatics. Mol Inform 2019; 38:e1800108. [PMID: 30499195 PMCID: PMC6587441 DOI: 10.1002/minf.201800108] [Citation(s) in RCA: 42] [Impact Index Per Article: 7.0] [Received: 08/27/2018] [Accepted: 10/16/2018] [Indexed: 01/09/2023]
Abstract
Despite the increasing volume of available data, the proportion of experimentally measured data remains small compared to the virtual chemical space of possible chemical structures. Therefore, there is a strong interest in simultaneously predicting different ADMET and biological properties of molecules, which are frequently strongly correlated with one another. Such joint data analyses can increase the accuracy of models by exploiting their common representation and identifying common features between individual properties. In this work we review the recent developments in multi-learning approaches as well as cover the freely available tools and packages that can be used to perform such studies.
Affiliation(s)
- Sergey Sosnin
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
| | - Mariia Vashurina
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Michael Withnall
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Pavel Karpov
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
| | - Maxim Fedorov
- Center for Computational and Data-Intensive Science and Engineering, Skolkovo Institute of Science and Technology, Skolkovo Innovation Center, Moscow 143026, Russia
- University of Strathclyde, Department of Physics, John Anderson Building, 107 Rottenrow East, G4 0NG Glasgow, United Kingdom
| | - Igor V. Tetko
- Helmholtz Zentrum München – German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Ingolstädter Landstraße 1, D-85764 Neuherberg, Germany
- BIGCHEM GmbH, Ingolstädter Landstraße 1, b. 60w, D-85764 Neuherberg, Germany
| |
|
22
|
Sattarov B, Baskin II, Horvath D, Marcou G, Bjerrum EJ, Varnek A. De Novo Molecular Design by Combining Deep Autoencoder Recurrent Neural Networks with Generative Topographic Mapping. J Chem Inf Model 2019; 59:1182-1196. [PMID: 30785751 DOI: 10.1021/acs.jcim.8b00751] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Indexed: 11/28/2022]
Abstract
Here we show that Generative Topographic Mapping (GTM) can be used to explore the latent space of the SMILES-based autoencoders and generate focused molecular libraries of interest. We have built a sequence-to-sequence neural network with Bidirectional Long Short-Term Memory layers and trained it on the SMILES strings from ChEMBL23. Very high reconstruction rates of the test set molecules were achieved (>98%), which are comparable to the ones reported in related publications. Using GTM, we have visualized the autoencoder latent space on the two-dimensional topographic map. Targeted map zones can be used for generating novel molecular structures by sampling associated latent space points and decoding them to SMILES. The sampling method based on a genetic algorithm was introduced to optimize compound properties "on the fly". The generated focused molecular libraries were shown to contain original and a priori feasible compounds which, pending actual synthesis and testing, showed encouraging behavior in independent structure-based affinity estimation procedures (pharmacophore matching, docking).
Affiliation(s)
- Boris Sattarov
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
| | - Igor I Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 19991, Russia
| | - Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
| | - Esben Jannik Bjerrum
- Wildcard Pharmaceutical Consulting, Zeaborg Science Center, Frødings Allé 41, 2860 Søborg, Denmark
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
| |
|
23
|
Pros and cons of virtual screening based on public “Big Data”: In silico mining for new bromodomain inhibitors. Eur J Med Chem 2019; 165:258-272. [DOI: 10.1016/j.ejmech.2019.01.010] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 10/31/2018] [Revised: 12/24/2018] [Accepted: 01/05/2019] [Indexed: 12/22/2022]
|
24
|
Orlov AA, Khvatov EV, Koruchekov AA, Nikitina AA, Zolotareva AD, Eletskaya AA, Kozlovskaya LI, Palyulin VA, Horvath D, Osolodkin DI, Varnek A. Getting to Know the Neighbours with GTM: The Case of Antiviral Compounds. Mol Inform 2019; 38:e1800166. [DOI: 10.1002/minf.201800166] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 12/12/2018] [Accepted: 02/02/2019] [Indexed: 01/16/2023]
Affiliation(s)
- Alexey A. Orlov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | | | - Alexander A. Koruchekov
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia A. Nikitina
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Anastasia D. Zolotareva
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Anastasia A. Eletskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
| | - Liubov I. Kozlovskaya
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | | | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, Strasbourg 67081, France
| | - Dmitry I. Osolodkin
- FSBSI “Chumakov FSC R&D IBP RAS” Moscow 108819 Russia
- Lomonosov Moscow State University Moscow 119991 Russia
- Sechenov First Moscow State Medical University Moscow 119991 Russia
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, Strasbourg 67081, France
| |
|
25
|
Lin A, Horvath D, Marcou G, Beck B, Varnek A. Multi-task generative topographic mapping in virtual screening. J Comput Aided Mol Des 2019; 33:331-343. [PMID: 30739238 DOI: 10.1007/s10822-019-00188-x] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Received: 09/15/2018] [Accepted: 02/02/2019] [Indexed: 12/16/2022]
Abstract
The previously reported procedure to generate "universal" Generative Topographic Maps (GTMs) of the drug-like chemical space is in practice a multi-task learning process, in which both operational GTM parameters (example: map grid size) and hyperparameters (key example: the molecular descriptor space to be used) are chosen by an evolutionary process in order to fit/select "universal" GTM manifolds. After selection (a one-time task aimed at optimizing the compromise in terms of neighborhood behavior compliance, over a large pool of various biological targets), the manifolds are ready to provide "fit-free" predictive models for any further use. Using any structure-activity set (irrespective of whether the associated target served at the map fitting stage or not), the generation or "coloring" of a property landscape enables predicting the property for any external molecule, with zero additional fittable parameters involved. While previous works have signaled the excellent behavior of such models in aggressive three-fold cross-validation assessments of their predictive power, the present work explores their behavior in Virtual Screening (VS), here simulated using external DUD ligand and decoy series that are fully disjoint from the ChEMBL-extracted landscape coloring sets. Beyond the rather robust results of the universal GTM manifolds in this challenge, it could be shown that the descriptor spaces selected by the evolutionary multi-task learner were intrinsically able to serve as an excellent support for many other VS procedures, from parameter-free similarity searching, to local (target-specific) GTM models, to parameter-rich, nonlinear Random Forest and Neural Network approaches.
Affiliation(s)
- Arkadii Lin
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France; Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
| | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France
| | - Gilles Marcou
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France
| | - Bernd Beck
- Department of Medicinal Chemistry, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorferstrasse 65, 88397, Biberach an der Riss, Germany
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4, Blaise Pascal Str., 67081, Strasbourg, France.
| |
|
26
|
Casciuc I, Zabolotna Y, Horvath D, Marcou G, Bajorath J, Varnek A. Virtual Screening with Generative Topographic Maps: How Many Maps Are Required? J Chem Inf Model 2018; 59:564-572. [DOI: 10.1021/acs.jcim.8b00650] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.1] [Indexed: 11/30/2022]
Affiliation(s)
- Iuri Casciuc
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Yuliana Zabolotna
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Dragos Horvath
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Gilles Marcou
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| | - Jürgen Bajorath
- B-IT, Limes, Unit Chem. Biol. & Med. Chem., University of Bonn, 53115 Bonn, Germany
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique UMR 7140 CNRS, Institut LeBel 4, rue B. Pascal 67081 Strasbourg, France
| |
|
27
|
Kaneko H. Sparse Generative Topographic Mapping for Both Data Visualization and Clustering. J Chem Inf Model 2018; 58:2528-2535. [DOI: 10.1021/acs.jcim.8b00528] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Indexed: 11/28/2022]
Affiliation(s)
- Hiromasa Kaneko
- Department of Applied Chemistry, School of Science and Technology, Meiji University, 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa 214-8571, Japan
| |
|
28
|
Kaneko H. Data Visualization, Regression, Applicability Domains and Inverse Analysis Based on Generative Topographic Mapping. Mol Inform 2018; 38:e1800088. [PMID: 30259699 DOI: 10.1002/minf.201800088] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 07/27/2018] [Accepted: 08/30/2018] [Indexed: 01/11/2023]
Abstract
This paper introduces two generative topographic mapping (GTM) methods that can be used for data visualization, regression analysis, inverse analysis, and the determination of applicability domains (ADs). In GTM-multiple linear regression (GTM-MLR), the prior probability distribution of the descriptors or explanatory variables (X) is calculated with GTM, and the posterior probability distribution of the property/activity or objective variable (y) given X is calculated with MLR; inverse analysis is then performed using the product rule and Bayes' theorem. In GTM-regression (GTMR), X and y are combined and GTM is performed to obtain the joint probability distribution of X and y; this leads to the posterior probability distributions of y given X and of X given y, which are used for regression and inverse analysis, respectively. Simulations using linear and nonlinear datasets and quantitative structure-activity relationship (QSAR) and quantitative structure-property relationship (QSPR) datasets confirm that GTM-MLR and GTMR enable data visualization, regression analysis, and inverse analysis considering appropriate ADs. Python and MATLAB codes for the proposed algorithms are available at https://github.com/hkaneko1985/gtm-generativetopographicmapping.
Affiliation(s)
- Hiromasa Kaneko
- Department of Applied Chemistry, Meiji University 1-1-1 Higashi-Mita, Tama-ku, Kawasaki, Kanagawa, 214-8571, Japan
| |
|
29
|
Glavatskikh M, Madzhidov T, Horvath D, Nugmanov R, Gimadiev T, Malakhova D, Marcou G, Varnek A. Predictive Models for Kinetic Parameters of Cycloaddition Reactions. Mol Inform 2018; 38:e1800077. [DOI: 10.1002/minf.201800077] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 06/11/2018] [Accepted: 07/22/2018] [Indexed: 01/10/2023]
Affiliation(s)
- Marta Glavatskikh
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
| | - Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
| | - Dragos Horvath
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| | - Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
| | - Timur Gimadiev
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
| | - Daria Malakhova
- Laboratory of Chemoinformatics and Molecular Modeling; Butlerov Institute of Chemistry; Kazan Federal University; Kremlyovskaya str. 18 Kazan Russia
| | - Gilles Marcou
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique; UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| |
|
30
|
Glavatskikh M, Madzhidov T, Baskin II, Horvath D, Nugmanov R, Gimadiev T, Marcou G, Varnek A. Visualization and Analysis of Complex Reaction Data: The Case of Tautomeric Equilibria. Mol Inform 2018; 37:e1800056. [DOI: 10.1002/minf.201800056] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.9] [Received: 04/24/2018] [Accepted: 06/29/2018] [Indexed: 11/07/2022]
Affiliation(s)
- Marta Glavatskikh
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
| | - Timur Madzhidov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
| | - Igor I. Baskin
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
- Faculty of Physics; Lomonosov Moscow State University; Leninskie Gory 1/2 119991 Moscow Russia
| | - Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| | - Ramil Nugmanov
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
| | - Timur Gimadiev
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
- Laboratory of Chemoinformatics and Molecular Modeling, Butlerov Institute of Chemistry; Kazan Federal University; Kremlevskaya str. 18 Kazan Russia
| | - Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140 CNRS; Université de Strasbourg; 1, rue Blaise Pascal 67000 Strasbourg France
| |
|
31
|
Konovalov AI, Antipin IS, Burilov VA, Madzhidov TI, Kurbangalieva AR, Nemtarev AV, Solovieva SE, Stoikov II, Mamedov VA, Zakharova LY, Gavrilova EL, Sinyashin OG, Balova IA, Vasilyev AV, Zenkevich IG, Krasavin MY, Kuznetsov MA, Molchanov AP, Novikov MS, Nikolaev VA, Rodina LL, Khlebnikov AF, Beletskaya IP, Vatsadze SZ, Gromov SP, Zyk NV, Lebedev AT, Lemenovskii DA, Petrosyan VS, Nenaidenko VG, Negrebetskii VV, Baukov YI, Shmigol’ TA, Korlyukov AA, Tikhomirov AS, Shchekotikhin AE, Traven’ VF, Voskresenskii LG, Zubkov FI, Golubchikov OA, Semeikin AS, Berezin DB, Stuzhin PA, Filimonov VD, Krasnokutskaya EA, Fedorov AY, Nyuchev AV, Orlov VY, Begunov RS, Rusakov AI, Kolobov AV, Kofanov ER, Fedotova OV, Egorova AY, Charushin VN, Chupakhin ON, Klimochkin YN, Osyanin VA, Reznikov AN, Fisyuk AS, Sagitullina GP, Aksenov AV, Aksenov NA, Grachev MK, Maslennikova VI, Koroteev MP, Brel’ AK, Lisina SV, Medvedeva SM, Shikhaliev KS, Suboch GA, Tovbis MS, Mironovich LM, Ivanov SM, Kurbatov SV, Kletskii ME, Burov ON, Kobrakov KI, Kuznetsov DN. Modern Trends of Organic Chemistry in Russian Universities. Russian Journal of Organic Chemistry 2018. [DOI: 10.1134/s107042801802001x] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Indexed: 11/23/2022]
|
32
|
Abstract
Introduction: Activity landscapes (ALs) are representations and models of compound data sets annotated with a target-specific activity. In contrast to quantitative structure-activity relationship (QSAR) models, ALs aim at characterizing structure-activity relationships (SARs) on a large-scale level encompassing all active compounds for specific targets. The popularity of AL modeling has grown substantially with the public availability of large activity-annotated compound data sets. AL modeling crucially depends on molecular representations and similarity metrics used to assess structural similarity. Areas covered: The concepts of AL modeling are introduced and its basis in quantitatively assessing molecular similarity is discussed. The different types of AL modeling approaches are introduced. AL designs can broadly be divided into three categories: compound-pair based, dimensionality reduction, and network approaches. Recent developments for each of these categories are discussed focusing on the application of mathematical, statistical, and machine learning tools for AL modeling. AL modeling using chemical space networks is covered in more detail. Expert opinion: AL modeling has remained a largely descriptive approach for the analysis of SARs. Beyond mere visualization, the application of analytical tools from statistics, machine learning and network theory has aided in the sophistication of AL designs and provides a step forward in transforming ALs from descriptive to predictive tools. To this end, optimizing representations that encode activity relevant features of molecules might prove to be a crucial step.
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
|
33
|
Lin A, Horvath D, Afonina V, Marcou G, Reymond JL, Varnek A. Mapping of the Available Chemical Space versus the Chemical Universe of Lead-Like Compounds. ChemMedChem 2018; 13:540-554. [PMID: 29154440 DOI: 10.1002/cmdc.201700561] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.1] [Received: 09/15/2017] [Revised: 11/07/2017] [Indexed: 12/15/2022]
Abstract
This is, to our knowledge, the most comprehensive analysis to date based on generative topographic mapping (GTM) of fragment-like chemical space (40 million molecules with no more than 17 heavy atoms, both from the theoretically enumerated GDB-17 and real-world PubChem/ChEMBL databases). The challenge was to prove that a robust map of fragment-like chemical space can actually be built, in spite of a limited (≪10^5) maximal number of compounds ("frame set") usable for fitting the GTM manifold. An evolutionary map building strategy has been updated with a "coverage check" step, which discards manifolds failing to accommodate compounds outside the frame set. The evolved map has a good propensity to separate actives from inactives for more than 20 external structure-activity sets. It was proven to properly accommodate the entire collection of 40 million compounds. Next, it served as a library comparison tool to highlight biases of real-world molecules (PubChem and ChEMBL) versus the universe of all possible species represented by FDB-17, a fragment-like subset of GDB-17 containing 10 million molecules. Specific patterns, proper to some libraries and absent from others (diversity holes), were highlighted.
Affiliation(s)
- Arkadii Lin
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Dragos Horvath
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Valentina Afonina
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France; Laboratory of Chemoinformatics and Molecular Modeling, Department of Organic Chemistry, A.M. Butlerov Institute of Chemistry, Kazan Federal University, 18 Kremlyovskaya str., 420008, Kazan, Russia
| | - Gilles Marcou
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
| | - Jean-Louis Reymond
- Department of Chemistry and Biochemistry, University of Berne, 3 Freiestrasse, 3012, Berne, Switzerland
| | - Alexandre Varnek
- Laboratory of Chemoinformatics, Faculty of Chemistry, University of Strasbourg, 4 Blaise Pascal str., 67081, Strasbourg, France
|
34
|
Abstract
Various methods of machine learning, supervised and unsupervised, linear and nonlinear, classification and regression, in combination with various types of molecular descriptors, both "handcrafted" and "data-driven," are considered in the context of their use in computational toxicology. The use of multiple linear regression, variants of naïve Bayes classifier, k-nearest neighbors, support vector machine, decision trees, ensemble learning, random forest, several types of neural networks, and deep learning is the focus of attention of this review. The role of fragment descriptors, graph mining, and graph kernels is highlighted. The application of unsupervised methods, such as Kohonen's self-organizing maps and related approaches, which allow for combining predictions with data analysis and visualization, is also considered. The necessity of applying a wide range of machine learning methods in computational toxicology is underlined.
Affiliation(s)
- Igor I Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russian Federation.
- Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russian Federation.
|
35
|
From bird’s eye views to molecular communities: two-layered visualization of structure–activity relationships in large compound data sets. J Comput Aided Mol Des 2017; 31:961-977. [DOI: 10.1007/s10822-017-0070-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 06/01/2017] [Accepted: 09/21/2017] [Indexed: 01/18/2023]
|
36
|
Gaspar HA, Breen G. Drug enrichment and discovery from schizophrenia genome-wide association results: an analysis and visualisation approach. Sci Rep 2017; 7:12460. [PMID: 28963561 PMCID: PMC5622077 DOI: 10.1038/s41598-017-12325-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 05/15/2017] [Accepted: 09/06/2017] [Indexed: 12/27/2022] Open
Abstract
Using successful genome-wide association results in psychiatry for drug repurposing is an ongoing challenge. Databases collecting drug targets and gene annotations are growing and can be harnessed to shed a new light on psychiatric disorders. We used genome-wide association study (GWAS) summary statistics from the Psychiatric Genetics Consortium (PGC) Schizophrenia working group to build a drug repositioning model for schizophrenia. As sample size increases, schizophrenia GWAS results show increasing enrichment for known antipsychotic drugs, selective calcium channel blockers, and antiepileptics. Each of these therapeutical classes targets different gene subnetworks. We identify 123 Bonferroni-significant druggable genes outside the MHC, and 128 FDR-significant biological pathways related to neurons, synapses, genic intolerance, membrane transport, epilepsy, and mental disorders. These results suggest that, in schizophrenia, current well-powered GWAS results can reliably detect known schizophrenia drugs and thus may hold considerable potential for the identification of new therapeutic leads. Moreover, antiepileptics and calcium channel blockers may provide repurposing opportunities. This study also reveals significant pathways in schizophrenia that were not identified previously, and provides a workflow for pathway analysis and drug repurposing using GWAS results.
Affiliation(s)
- H A Gaspar
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, MRC Social, Genetic and Developmental Psychiatry (SGDP) Centre, London, UK.
- National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Trust, London, UK.
| | - G Breen
- King's College London, Institute of Psychiatry, Psychology and Neuroscience, MRC Social, Genetic and Developmental Psychiatry (SGDP) Centre, London, UK
- National Institute for Health Research Biomedical Research Centre, South London and Maudsley National Health Service Trust, London, UK
|
37
|
Predictive cartography of metal binders using generative topographic mapping. J Comput Aided Mol Des 2017; 31:701-714. [DOI: 10.1007/s10822-017-0033-6] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 04/02/2017] [Accepted: 06/11/2017] [Indexed: 12/27/2022]
|
38
|
Kayastha S, Horvath D, Gilberg E, Gütschow M, Bajorath J, Varnek A. Privileged Structural Motif Detection and Analysis Using Generative Topographic Maps. J Chem Inf Model 2017; 57:1218-1232. [DOI: 10.1021/acs.jcim.7b00128] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/28/2022]
Affiliation(s)
- Shilva Kayastha
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
| | - Dragos Horvath
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
| | - Erik Gilberg
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
- Pharmaceutical Institute, University of Bonn, An der Immenburg 4, 53121 Bonn, Germany
| | - Michael Gütschow
- Pharmaceutical Institute, University of Bonn, An der Immenburg 4, 53121 Bonn, Germany
| | - Jürgen Bajorath
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Dahlmannstr. 2, D-53113 Bonn, Germany
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
|
39
|
Horvath D, Baskin I, Marcou G, Varnek A. Generative Topographic Mapping of Conformational Space. Mol Inform 2017; 36. [PMID: 28421706 DOI: 10.1002/minf.201700036] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.1] [Received: 02/21/2017] [Accepted: 03/31/2017] [Indexed: 12/17/2022]
Abstract
Herein, Generative Topographic Mapping (GTM) was challenged to produce planar projections of the high-dimensional conformational space of complex molecules (the 1LE1 peptide). GTM is a probability-based mapping strategy, and its capacity to support property prediction models serves to objectively assess map quality (in terms of regression statistics). The properties to predict were total, non-bonded and contact energies, surface area and fingerprint darkness. Map building and selection was controlled by a previously introduced evolutionary strategy allowed to choose the best-suited conformational descriptors, options including classical terms and novel atom-centric autocorrelograms. The latter condense interatomic distance patterns into descriptors of rather low dimensionality, yet precise enough to differentiate between close favorable contacts and atom clashes. A subset of 20 K conformers of the 1LE1 peptide, randomly selected from a pool of 2 M geometries (generated by the S4MPLE tool), was employed for map building and cross-validation of property regression models. The GTM build-up challenge reached robust three-fold cross-validated determination coefficients of Q² = 0.7…0.8 for all modeled properties. Mapping of the full 2 M conformer set produced intuitive and information-rich property landscapes. Functional and folding subspaces appear as well-separated zones, even though RMSD with respect to the PDB structure was never used as a selection criterion of the maps.
Affiliation(s)
- Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
| | | | - Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140 CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, Strasbourg, 67000, France
|
40
|
QSAR modeling and chemical space analysis of antimalarial compounds. J Comput Aided Mol Des 2017; 31:441-451. [PMID: 28374255 DOI: 10.1007/s10822-017-0019-4] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 12/21/2016] [Accepted: 03/18/2017] [Indexed: 10/19/2022]
Abstract
Generative topographic mapping (GTM) has been used to visualize and analyze the chemical space of antimalarial compounds as well as to build predictive models linking structure of molecules with their antimalarial activity. For this, a database, including ~3000 molecules tested in one or several of 17 anti-Plasmodium activity assessment protocols, has been compiled by assembling experimental data from in-house and ChEMBL databases. GTM classification models built on subsets corresponding to individual bioassays perform similarly to the earlier reported SVM models. Zones preferentially populated by active and inactive molecules, respectively, clearly emerge in the class landscapes supported by the GTM model. Their analysis resulted in identification of privileged structural motifs of potential antimalarial compounds. Projection of marketed antimalarial drugs on this map allowed us to delineate several areas in the chemical space corresponding to different mechanisms of antimalarial activity. This helped us to make a suggestion about the mode of action of the molecules populating these zones.
|
41
|
Horvath D, Marcou G, Varnek A. Generative Topographic Mapping Approach to Chemical Space Analysis. Challenges and Advances in Computational Chemistry and Physics 2017. [DOI: 10.1007/978-3-319-56850-8_6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 12/05/2022]
|
42
|
Takeda S, Kaneko H, Funatsu K. Chemical-Space-Based de Novo Design Method To Generate Drug-Like Molecules. J Chem Inf Model 2016; 56:1885-1893. [PMID: 27632418 DOI: 10.1021/acs.jcim.6b00038] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Indexed: 01/04/2023]
Abstract
To discover drug compounds in chemical space containing an enormous number of compounds, a structure generator is required to produce virtual drug-like chemical structures. The de novo design algorithm for exploring chemical space (DAECS) visualizes the activity distribution on a two-dimensional plane corresponding to chemical space and generates structures in a target area on a plane selected by the user. In this study, we modify the DAECS to enable the user to select a target area to consider properties other than activity and improve the diversity of the generated structures by visualizing the drug-likeness distribution and the activity distribution, generating structures by substructure-based structural changes, including addition, deletion, and substitution of substructures, as well as the slight structural changes used in the DAECS. Through case studies using ligand data for the human adrenergic alpha2A receptor and the human histamine H1 receptor, the modified DAECS can generate high diversity drug-like structures, and the usefulness of the modification of the DAECS is verified.
Affiliation(s)
- Shunichi Takeda
- Department of Chemical Systems Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Hiromasa Kaneko
- Department of Chemical Systems Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Kimito Funatsu
- Department of Chemical Systems Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
|
43
|
Generative Topographic Mapping Approach to Modeling and Chemical Space Visualization of Human Intestinal Transporters. BioNanoScience 2016. [DOI: 10.1007/s12668-016-0246-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 10/21/2022]
|
44
|
Klimenko K, Marcou G, Horvath D, Varnek A. Chemical Space Mapping and Structure-Activity Analysis of the ChEMBL Antiviral Compound Set. J Chem Inf Model 2016; 56:1438-54. [PMID: 27410486 DOI: 10.1021/acs.jcim.6b00192] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Indexed: 12/25/2022]
Abstract
Curation, standardization and data fusion of the antiviral information present in the ChEMBL public database led to the definition of a robust data set, providing an association of antiviral compounds to seven broadly defined antiviral activity classes. Generative topographic mapping (GTM) subjected to evolutionary tuning was then used to produce maps of the antiviral chemical space, providing an optimal separation of compound families associated with the different antiviral classes. The ability to pinpoint the specific spots occupied (responsibility patterns) on a map by various classes of antiviral compounds opened the way for a GTM-supported search for privileged structural motifs, typical for each antiviral class. The privileged locations of antiviral classes were analyzed in order to highlight underlying privileged common structural motifs. Unlike in classical medicinal chemistry, where privileged structures are, almost always, predefined scaffolds, privileged structural motif detection based on GTM responsibility patterns has the decisive advantage of being able to automatically capture the nature ("resolution detail"-scaffold, detailed substructure, pharmacophore pattern, etc.) of the relevant structural motifs. Responsibility patterns were found to represent underlying structural motifs of various natures-from very fuzzy (groups of various "interchangeable" similar scaffolds), to the classical scenario in medicinal chemistry (underlying motif actually being the scaffold), to very precisely defined motifs (specifically substituted scaffolds).
Affiliation(s)
- Kyrylo Klimenko
- Laboratoire de Chemoinformatique, UMR 7140 CNRS/Université de Strasbourg, 1, rue Blaise Pascal, Strasbourg 67000, France; Department on Molecular Structure and Chemoinformatics, A.V. Bogatsky Physico-Chemical Institute of NAS of Ukraine, Lyustdorfskaya doroga, 86, Odessa 65080, Ukraine
| | - Gilles Marcou
- Laboratoire de Chemoinformatique, UMR 7140 CNRS/Université de Strasbourg, 1, rue Blaise Pascal, Strasbourg 67000, France
| | - Dragos Horvath
- Laboratoire de Chemoinformatique, UMR 7140 CNRS/Université de Strasbourg, 1, rue Blaise Pascal, Strasbourg 67000, France
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR 7140 CNRS/Université de Strasbourg, 1, rue Blaise Pascal, Strasbourg 67000, France
|
45
|
Abstract
Introduction: Neural networks are becoming a very popular method for solving machine learning and artificial intelligence problems. The variety of neural network types and their application to drug discovery requires expert knowledge to choose the most appropriate approach. Areas covered: In this review, the authors discuss traditional and newly emerging neural network approaches to drug discovery. Their focus is on backpropagation neural networks and their variants, self-organizing maps and associated methods, and a relatively new technique, deep learning. The most important technical issues are discussed including overfitting and its prevention through regularization, ensemble and multitask modeling, model interpretation, and estimation of applicability domain. Different aspects of using neural networks in drug discovery are considered: building structure-activity models with respect to various targets; predicting drug selectivity, toxicity profiles, ADMET and physicochemical properties; characteristics of drug-delivery systems and virtual screening. Expert opinion: Neural networks continue to grow in importance for drug discovery. Recent developments in deep learning suggest further improvements may be gained in the analysis of large chemical data sets. It is anticipated that neural networks will be more widely used in drug discovery in the future, and applied in non-traditional areas such as drug delivery systems, biologically compatible materials, and regenerative medicine.
Affiliation(s)
- Igor I Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Moscow, Russia; A.M. Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - David Winkler
- CSIRO Manufacturing, Clayton, VIC, Australia; Monash Institute for Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia; Latrobe Institute for Molecular Science, Bundoora, VIC, Australia; School of Chemical and Physical Sciences, Flinders University, Bedford Park, SA, Australia
| | - Igor V Tetko
- Helmholtz Zentrum München - German Research Center for Environmental Health (GmbH), Institute of Structural Biology, Neuherberg, Germany; BigChem GmbH, Neuherberg, Germany
|
46
|
Kaneko H, Funatsu K. Applicability Domains and Consistent Structure Generation. Mol Inform 2016; 36. [DOI: 10.1002/minf.201600032] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 03/02/2016] [Accepted: 04/25/2016] [Indexed: 11/08/2022]
Affiliation(s)
- Hiromasa Kaneko
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
| | - Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
|
47
|
Gaspar HA, Sidorov P, Horvath D, Baskin II, Marcou G, Varnek A. Generative Topographic Mapping Approach to Chemical Space Analysis. Frontiers in Molecular Design and Chemical Information Science - Herman Skolnik Award Symposium 2015: Jürgen Bajorath 2016. [DOI: 10.1021/bk-2016-1222.ch011] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 12/04/2022]
Affiliation(s)
- Héléna A. Gaspar
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Pavel Sidorov
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Dragos Horvath
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Igor I. Baskin
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Gilles Marcou
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
|
48
|
Sidorov P, Gaspar H, Marcou G, Varnek A, Horvath D. Mappability of drug-like space: towards a polypharmacologically competent map of drug-relevant compounds. J Comput Aided Mol Des 2015; 29:1087-108. [PMID: 26564142 DOI: 10.1007/s10822-015-9882-z] [Citation(s) in RCA: 45] [Impact Index Per Article: 4.5] [Received: 10/05/2015] [Accepted: 11/06/2015] [Indexed: 11/30/2022]
Abstract
Intuitive, visual rendering--mapping--of high-dimensional chemical spaces (CS), is an important topic in chemoinformatics. Such maps were so far dedicated to specific compound collections--either limited series of known activities, or large, even exhaustive enumerations of molecules, but without associated property data. Typically, they were challenged to answer some classification problem with respect to those same molecules, admired for their aesthetical virtues and then forgotten--because they were set-specific constructs. This work wishes to address the question whether a general, compound set-independent map can be generated, and the claim of "universality" quantitatively justified, with respect to all the structure-activity information available so far--or, more realistically, an exploitable but significant fraction thereof. The "universal" CS map is expected to project molecules from the initial CS into a lower-dimensional space that is neighborhood behavior-compliant with respect to a large panel of ligand properties. Such a map should be able to discriminate actives from inactives, or even support quantitative neighborhood-based, parameter-free property prediction (regression) models, for a wide panel of targets and target families. It should be polypharmacologically competent, without requiring any target-specific parameter fitting. This work describes an evolutionary growth procedure of such maps, based on generative topographic mapping, followed by the validation of their polypharmacological competence. Validation was achieved with respect to a maximum of exploitable structure-activity information, covering all of Homo sapiens proteins of the ChEMBL database, antiparasitic and antiviral data, etc. Five evolved maps satisfactorily solved hundreds of activity-based ligand classification challenges for targets, and even in vivo properties independent from training data. They also stood chemogenomics-related challenges, as cumulated responsibility vectors obtained by mapping of target-specific ligand collections were shown to represent validated target descriptors, complying with currently accepted target classification in biology. Therefore, they represent, in our opinion, a robust and well documented answer to the key question "What is a good CS map?"
Affiliation(s)
- Pavel Sidorov
- Laboratoire de Chémoinformatique, UMR 7140, CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, 67000, Strasbourg, France; Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Helena Gaspar
- Laboratoire de Chémoinformatique, UMR 7140, CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, 67000, Strasbourg, France
| | - Gilles Marcou
- Laboratoire de Chémoinformatique, UMR 7140, CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, 67000, Strasbourg, France
| | - Alexandre Varnek
- Laboratoire de Chémoinformatique, UMR 7140, CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, 67000, Strasbourg, France; Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan, Russia
| | - Dragos Horvath
- Laboratoire de Chémoinformatique, UMR 7140, CNRS-Univ. Strasbourg, 1 rue Blaise Pascal, 67000, Strasbourg, France
|
49
|
Gaspar HA, Baskin II, Marcou G, Horvath D, Varnek A. Stargate GTM: Bridging Descriptor and Activity Spaces. J Chem Inf Model 2015; 55:2403-10. [DOI: 10.1021/acs.jcim.5b00398] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 11/29/2022]
Affiliation(s)
- Héléna A. Gaspar
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
| | - Igor I. Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 119991, Russia
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
| | - Gilles Marcou
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
| | - Dragos Horvath
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
| | - Alexandre Varnek
- Laboratoire de Chemoinformatique, UMR 7140, Université de Strasbourg, 1 rue Blaise Pascal, Strasbourg 67000, France
- Laboratory of Chemoinformatics, Butlerov Institute of Chemistry, Kazan Federal University, Kazan 420008, Russia
|