1
Yang J, Cai Y, Zhao K, Xie H, Chen X. Concepts and applications of chemical fingerprint for hit and lead screening. Drug Discov Today 2022; 27:103356. [PMID: 36113834 DOI: 10.1016/j.drudis.2022.103356] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 12/07/2021] [Revised: 07/28/2022] [Accepted: 09/08/2022] [Indexed: 11/22/2022]
Abstract
Molecular fingerprints are used to represent chemical (structural, physicochemical, etc.) properties of large-scale chemical sets at low computational cost. They have a prominent role in transforming chemical data sets into consistent input formats (bit strings or numeric values) suitable for in silico approaches. In this review, we summarize and classify common and state-of-the-art fingerprints into eight different types (dictionary based, circular, topological, pharmacophore, protein-ligand interaction, shape based, reinforced, and multi). We also highlight applications of fingerprints in early drug research and development (R&D). Thus, this review provides a guide for the selection of appropriate fingerprints of compounds (or ligand-protein complexes) for use in drug R&D.
Affiliation(s)
- Jingbo Yang
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
- Yiyang Cai
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
- Kairui Zhao
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China
- Hongbo Xie
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China.
- Xiujie Chen
- Department of Pharmagenomics, College of Bioinformatics Science and Technology, Harbin Medical University, 150081 Harbin, Heilongjiang, China.
2
Abstract
Introduction: The popularity and success of advanced AI methods like deep neural networks have led to novel ways of exploring chemical space. Their opaque nature poses challenges for model evaluation regarding novelty, uniqueness, and distribution of the chemical space covered. However, these methods also promise to be able to explore uncharted chemical space in novel ways that do not rely directly on structural similarity. Areas covered: This review provides an overview of popular deep learning methods for chemical space exploration. Crucial aspects like choice of molecular representation, training for focused chemical space exploration, and criteria for assessing and validating chemical space coverage are discussed. Expert opinion: Deep learning offers great potential for chemical space exploration beyond conventional fragment-based methods. Given the rarity of prospective applications and considering the difficulty in assessing representativeness and comprehensiveness of chemical space covered, developing criteria for assessing and validating generative models is of great significance. Latent space models like variational autoencoders are conceptually appealing for inverse QSAR/QSPR approaches as neighborhood relationships in latent space can be trained to reflect property similarities. Future research in understanding and interpreting generative models might lead to a better understanding of biologically relevant properties of molecules.
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-it, Limes Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich Wilhelms-Universität, Bonn, Germany
3
Meyers J, Fabian B, Brown N. De novo molecular design and generative models. Drug Discov Today 2021; 26:2707-2715. [PMID: 34082136 DOI: 10.1016/j.drudis.2021.05.019] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Received: 01/15/2021] [Revised: 04/21/2021] [Accepted: 05/26/2021] [Indexed: 02/09/2023]
Abstract
Molecular design strategies are integral to therapeutic progress in drug discovery. Computational approaches for de novo molecular design have been developed over the past three decades and, recently, thanks in part to advances in machine learning (ML) and artificial intelligence (AI), the drug discovery field has gained practical experience. Here, we review these learnings and present de novo approaches according to the coarseness of their molecular representation: that is, whether molecular design is modeled on an atom-based, fragment-based, or reaction-based paradigm. Furthermore, we emphasize the value of strong benchmarks, describe the main challenges to using these methods in practice, and provide a viewpoint on further opportunities for exploration and challenges to be tackled in the upcoming years.
Affiliation(s)
- Nathan Brown
- BenevolentAI, 4-8 Maple Street, London W1T 5HD, UK
4
Lambrinidis G, Tsantili-Kakoulidou A. Multi-objective optimization methods in novel drug design. Expert Opin Drug Discov 2020; 16:647-658. [PMID: 33353441 DOI: 10.1080/17460441.2021.1867095] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Indexed: 12/23/2022]
Abstract
Introduction: In multi-objective drug design, optimization gains importance, being upgraded to a discipline that attracts its own research. Current strategies are broadly classified into single-objective optimization (SOO) and multi-objective optimization (MOO). Areas covered: Starting with SOO and the ways used to incorporate multiple criteria into it, the present review focuses on MOO techniques, their comparison, advantages, and restrictions. Pareto analysis and the concept of dominance stand at the core of MOO. The Pareto front, Pareto ranking, and limitations of Pareto-based methods, due to high dimensions and data uncertainty, are outlined. Desirability functions and the weighted sum approaches are described as stand-alone techniques to transform the MOO problem to SOO or in combination with Pareto analysis and evolutionary algorithms. Representative applications in different drug research areas are also discussed. Expert opinion: Despite their limitations, the use of combined MOO techniques, as well as being complementary to SOO or in conjunction with artificial intelligence, contributes dramatically to efficient drug design, assisting decisions and increasing success probabilities. For multi-target drug design, optimization is supported by network approaches, while applicability of MOO to other fields like drug technology or biological complexity opens new perspectives in the interrelated fields of medicinal chemistry and molecular biology.
Affiliation(s)
- George Lambrinidis
- Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis, Zografou, Athens, Greece
- Anna Tsantili-Kakoulidou
- Division of Pharmaceutical Chemistry, Department of Pharmacy, National and Kapodistrian University of Athens, Panepistimiopolis, Zografou, Athens, Greece
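The dominance relation and Pareto front mentioned in the abstract of entry 4 can be illustrated with a minimal sketch. It assumes every objective is to be minimized; the function names and the candidate scores are hypothetical and are not taken from the review.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if a dominates b, assuming all objectives are minimized.

    a dominates b when a is no worse on every objective and strictly
    better on at least one.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of the non-dominated points (the Pareto front)."""
    return [
        i
        for i, p in enumerate(points)
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i)
    ]


# Hypothetical candidates scored on two objectives, both to be minimized.
candidates = [(1.0, 5.0), (2.0, 3.0), (3.0, 4.0), (4.0, 1.0)]
front = pareto_front(candidates)
```

Here the third candidate is dominated by the second (worse on both objectives), so it drops out, while the remaining three represent different trade-offs. The pairwise comparison is quadratic in the number of candidates, which is adequate for a sketch but not for large libraries.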
5
Keyvanpour MR, Shirzad MB. An Analysis of QSAR Research Based on Machine Learning Concepts. Curr Drug Discov Technol 2020; 18:17-30. [PMID: 32178612 DOI: 10.2174/1570163817666200316104404] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 04/08/2019] [Revised: 08/22/2019] [Accepted: 10/28/2019] [Indexed: 11/22/2022]
Abstract
Quantitative Structure-Activity Relationship (QSAR) is a popular approach developed to correlate chemical molecules with their biological activities based on their chemical structures. Machine learning techniques have proved to be promising solutions to QSAR modeling. Due to the significant role of machine learning strategies in QSAR modeling, this area of research has attracted much attention from researchers. A considerable amount of literature has been published on machine learning based QSAR modeling methodologies, yet this domain still lacks a recent and comprehensive analysis of these algorithms. This study systematically reviews the application of machine learning algorithms in QSAR, aiming to provide an analytical framework. For this purpose, we present a framework called 'ML-QSAR'. This framework has been designed for future research to: a) facilitate the selection of proper strategies among existing algorithms according to the application area requirements, b) help to develop and ameliorate current methods, and c) provide a platform to study existing methodologies comparatively. In ML-QSAR, first a structured categorization of QSAR modeling research based on machine learning models is presented. Then several criteria are introduced in order to assess the models. Finally, guided by these criteria, a qualitative analysis is carried out.
Affiliation(s)
- Mehrnoush Barani Shirzad
- Data Mining Research Laboratory, Department of Computer Engineering, Alzahra University, Tehran, Iran
6
Sattarov B, Baskin II, Horvath D, Marcou G, Bjerrum EJ, Varnek A. De Novo Molecular Design by Combining Deep Autoencoder Recurrent Neural Networks with Generative Topographic Mapping. J Chem Inf Model 2019; 59:1182-1196. [PMID: 30785751 DOI: 10.1021/acs.jcim.8b00751] [Citation(s) in RCA: 85] [Impact Index Per Article: 14.2] [Indexed: 11/28/2022]
Abstract
Here we show that Generative Topographic Mapping (GTM) can be used to explore the latent space of the SMILES-based autoencoders and generate focused molecular libraries of interest. We have built a sequence-to-sequence neural network with Bidirectional Long Short-Term Memory layers and trained it on the SMILES strings from ChEMBL23. Very high reconstruction rates of the test set molecules were achieved (>98%), which are comparable to the ones reported in related publications. Using GTM, we have visualized the autoencoder latent space on the two-dimensional topographic map. Targeted map zones can be used for generating novel molecular structures by sampling associated latent space points and decoding them to SMILES. The sampling method based on a genetic algorithm was introduced to optimize compound properties "on the fly". The generated focused molecular libraries were shown to contain original and a priori feasible compounds which, pending actual synthesis and testing, showed encouraging behavior in independent structure-based affinity estimation procedures (pharmacophore matching, docking).
Affiliation(s)
- Boris Sattarov
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
- Igor I Baskin
- Faculty of Physics, M.V. Lomonosov Moscow State University, Leninskie Gory, Moscow 19991, Russia
- Dragos Horvath
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
- Gilles Marcou
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
- Esben Jannik Bjerrum
- Wildcard Pharmaceutical Consulting, Zeaborg Science Center, Frødings Allé 41, 2860 Søborg, Denmark
- Alexandre Varnek
- Laboratory of Chemoinformatics, UMR 7177 University of Strasbourg/CNRS, 4 rue B. Pascal, 67000 Strasbourg, France
7
Miyao T, Funatsu K. Finding Chemical Structures Corresponding to a Set of Coordinates in Chemical Descriptor Space. Mol Inform 2017; 36. [DOI: 10.1002/minf.201700030] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 02/13/2017] [Accepted: 04/04/2017] [Indexed: 11/10/2022]
Affiliation(s)
- Tomoyuki Miyao
- Department of Chemical System Engineering; The University of Tokyo; 7-3-1 Hongo, Bunkyo-ku Tokyo 113-8656 Japan
- Kimito Funatsu
- Department of Chemical System Engineering; The University of Tokyo; 7-3-1 Hongo, Bunkyo-ku Tokyo 113-8656 Japan
8
Bayesian molecular design with a chemical language model. J Comput Aided Mol Des 2017; 31:379-391. [PMID: 28281211 PMCID: PMC5393296 DOI: 10.1007/s10822-016-0008-z] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 10/25/2016] [Accepted: 12/31/2016] [Indexed: 11/05/2022]
Abstract
The aim of computational molecular design is the identification of promising hypothetical molecules with a predefined set of desired properties. We address the issue of accelerating the material discovery with state-of-the-art machine learning techniques. The method involves two different types of prediction; the forward and backward predictions. The objective of the forward prediction is to create a set of machine learning models on various properties of a given molecule. Inverting the trained forward models through Bayes’ law, we derive a posterior distribution for the backward prediction, which is conditioned by a desired property requirement. Exploring high-probability regions of the posterior with a sequential Monte Carlo technique, molecules that exhibit the desired properties can computationally be created. One major difficulty in the computational creation of molecules is the exclusion of the occurrence of chemically unfavorable structures. To circumvent this issue, we derive a chemical language model that acquires commonly occurring patterns of chemical fragments through natural language processing of ASCII strings of existing compounds, which follow the SMILES chemical language notation. In the backward prediction, the trained language model is used to refine chemical strings such that the properties of the resulting structures fall within the desired property region while chemically unfavorable structures are successfully removed. The present method is demonstrated through the design of small organic molecules with the property requirements on HOMO-LUMO gap and internal energy. The R package iqspr is available at the CRAN repository.
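As a compact restatement of the backward prediction described in the abstract of entry 8, Bayes' law can be written as below. The symbols are assumptions introduced here for illustration and do not appear in the abstract: S is a candidate structure (for example a SMILES string), Y its property vector, and U the desired property region. Reading the chemical language model as the prior p(S) is one interpretation of the abstract rather than a confirmed detail of the method.

```latex
p(S \mid Y \in U)
  \;=\; \frac{p(Y \in U \mid S)\, p(S)}{\sum_{S'} p(Y \in U \mid S')\, p(S')}
  \;\propto\; p(Y \in U \mid S)\, p(S)
```

Under this reading, the forward models supply the likelihood p(Y ∈ U | S), and sampling methods such as the sequential Monte Carlo technique named in the abstract need only the unnormalized right-hand side.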
9
Shoombuatong W, Prathipati P, Owasirikul W, Worachartcheewan A, Simeon S, Anuwongcharoen N, Wikberg JES, Nantasenamat C. Towards the Revival of Interpretable QSAR Models. Challenges and Advances in Computational Chemistry and Physics 2017. [DOI: 10.1007/978-3-319-56850-8_1] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.5] [Indexed: 12/20/2022]
10
Daeyaert F, Deem MW. A Pareto Algorithm for Efficient De Novo Design of Multi-functional Molecules. Mol Inform 2016; 36. [PMID: 28124835 DOI: 10.1002/minf.201600044] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 03/30/2016] [Accepted: 07/06/2016] [Indexed: 12/19/2022]
Abstract
We have introduced a Pareto sorting algorithm into Synopsis, a de novo design program that generates synthesizable molecules with desirable properties. We give a detailed description of the algorithm and illustrate its working in 2 different de novo design settings: the design of putative dual and selective FGFR and VEGFR inhibitors, and the successful design of organic structure determining agents (OSDAs) for the synthesis of zeolites. We show that the introduction of Pareto sorting not only enables the simultaneous optimization of multiple properties but also greatly improves the performance of the algorithm to generate molecules with hard-to-meet constraints. This in turn allows us to suggest approaches to address the problem of false positive hits in de novo structure based drug design by introducing structural and physicochemical constraints in the designed molecules, and by forcing essential interactions between these molecules and their target receptor.
Affiliation(s)
- Frits Daeyaert
- FD Computing, Stijn Streuvelsstraat 64, 2340, Beerse, Belgium; Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX, USA
- Michael W Deem
- Department of Bioengineering, Rice University, 6100 Main Street, Houston, TX, USA
11
Miyao T, Kaneko H, Funatsu K. Inverse QSPR/QSAR Analysis for Chemical Structure Generation (from y to x). J Chem Inf Model 2016; 56:286-99. [PMID: 26818135 DOI: 10.1021/acs.jcim.5b00628] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Indexed: 11/28/2022]
Abstract
Retrieving descriptor information (x information) from a value of an objective variable (y) is a fundamental problem in inverse quantitative structure-property relationship (inverse-QSPR) analysis but challenging because of the complexity of the preimage function. Herewith, we propose using a cluster-wise multiple linear regression (cMLR) model as a QSPR model for inverse-QSPR analysis. x information is acquired as a probability density function by combining cMLR and the prior distribution modeled with a mixture of Gaussians (GMMs). Three case studies were conducted to demonstrate various aspects of the potential of cMLR. It was found that the predictive power of cMLR was superior to that of MLR, especially for data with nonlinearity. Moreover, it turned out that the applicability domain could be considered since the posterior distribution inherits the prior distribution's feature (i.e., training data feature) and represents the possibility of having the desired property. Finally, a series of inverse analyses with the GMMs/cMLR was demonstrated with the aim to generate de novo structures having specific aqueous solubility.
Affiliation(s)
- Tomoyuki Miyao
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Hiromasa Kaneko
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
- Kimito Funatsu
- Department of Chemical System Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
12
Gawehn E, Hiss JA, Schneider G. Deep Learning in Drug Discovery. Mol Inform 2015; 35:3-14. [PMID: 27491648 DOI: 10.1002/minf.201501008] [Citation(s) in RCA: 309] [Impact Index Per Article: 30.9] [Received: 10/25/2015] [Accepted: 12/01/2015] [Indexed: 12/18/2022]
Abstract
Artificial neural networks had their first heyday in molecular informatics and drug discovery approximately two decades ago. Currently, we are witnessing renewed interest in adapting advanced neural network architectures for pharmaceutical research by borrowing from the field of "deep learning". Compared with some of the other life sciences, their application in drug discovery is still limited. Here, we provide an overview of this emerging field of molecular informatics, present the basic concepts of prominent deep learning methods and offer motivation to explore these techniques for their usefulness in computer-assisted drug discovery and design. We specifically emphasize deep neural networks, restricted Boltzmann machine networks and convolutional networks.
Affiliation(s)
- Erik Gawehn
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland, Fax: +41 44 633 13 79, Tel: +41 44 633 74 38
- Jan A Hiss
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland, Fax: +41 44 633 13 79, Tel: +41 44 633 74 38
- Gisbert Schneider
- Swiss Federal Institute of Technology (ETH), Department of Chemistry and Applied Biosciences, Vladimir-Prelog-Weg 4, CH-8093 Zurich, Switzerland, Fax: +41 44 633 13 79, Tel: +41 44 633 74 38.
13
Firth NC, Atrash B, Brown N, Blagg J. MOARF, an Integrated Workflow for Multiobjective Optimization: Implementation, Synthesis, and Biological Evaluation. J Chem Inf Model 2015; 55:1169-80. [DOI: 10.1021/acs.jcim.5b00073] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 11/30/2022]
Affiliation(s)
- Nicholas C. Firth
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K
- Butrus Atrash
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K
- Nathan Brown
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K
- Julian Blagg
- Cancer Research UK Cancer Therapeutics Unit, Division of Cancer Therapeutics, The Institute of Cancer Research, London, SM2 5NG, U.K
14
Saldana DA, Starck L, Mougin P, Rousseau B, Creton B. On the rational formulation of alternative fuels: melting point and net heat of combustion predictions for fuel compounds using machine learning methods. SAR and QSAR in Environmental Research 2013; 24:259-277. [PMID: 23574496 DOI: 10.1080/1062936x.2013.766634] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 06/02/2023]
Abstract
We report the development of predictive models for two fuel specifications: melting points (T(m)) and net heat of combustion (Δ(c)H). Compounds inside the scope of these models are those likely to be found in alternative fuels, i.e. hydrocarbons, alcohols and esters. Experimental T(m) and Δ(c)H values for these types of molecules have been gathered to generate a unique database. Various quantitative structure-property relationship (QSPR) approaches have been used to build models, ranging from methods leading to multi-linear models such as genetic function approximation (GFA), or partial least squares (PLS) to those leading to non-linear models such as feed-forward artificial neural networks (FFANN), general regression neural networks (GRNN), support vector machines (SVM), or graph machines. Except for the case of the graph machines method for which the only inputs are SMILES formulae, previously listed approaches working on molecular descriptors and functional group count descriptors were used to develop specific models for T(m) and Δ(c)H. For each property, the predictive models return slightly different responses for each molecular structure. Therefore, models labelled as 'consensus models' were built by averaging values computed with selected individual models. Predicted results were then compared with experimental data and with predictions of models in the literature.
Affiliation(s)
- D A Saldana
- IFP Energies Nouvelles, Rueil-Malmaison, France
15
Nicolotti O, Giangreco I, Introcaso A, Leonetti F, Stefanachi A, Carotti A. Strategies of multi-objective optimization in drug discovery and development. Expert Opin Drug Discov 2011; 6:871-84. [DOI: 10.1517/17460441.2011.588696] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.9] [Indexed: 11/05/2022]
16
Kutchukian PS, Shakhnovich EI. De novo design: balancing novelty and confined chemical space. Expert Opin Drug Discov 2010; 5:789-812. [PMID: 22827800 DOI: 10.1517/17460441.2010.497534] [Citation(s) in RCA: 62] [Impact Index Per Article: 4.1] [Indexed: 11/05/2022]
Abstract
Importance of the field: De novo drug design serves as a tool for the discovery of new ligands for macromolecular targets as well as optimization of known ligands. Recently developed tools aim to address the multi-objective nature of drug design in an unprecedented manner. Areas covered in this review: This article discusses recent advances in de novo drug design programs and accessory programs used to evaluate compounds post-generation. What the reader will gain: The reader is introduced to the challenges inherent in de novo drug design and will become familiar with current trends in de novo design. Furthermore, the reader will be better prepared to assess the value of a tool, and be equipped to design more elegant tools in the future. Take home message: De novo drug design can assist in the efficient discovery of new compounds with a high affinity for a given target. The inclusion of existing chemoinformatic methods with current structure-based de novo design tools provides a means of enhancing the therapeutic value of these generated compounds.
Affiliation(s)
- Peter S Kutchukian
- Harvard University, Chemistry and Chemical Biology Department, 12 Oxford Street, Cambridge, MA 02138, USA
17
Machado A, Tejera E, Cruz-Monteagudo M, Rebelo I. Application of desirability-based multi(bi)-objective optimization in the design of selective arylpiperazine derivates for the 5-HT1A serotonin receptor. Eur J Med Chem 2009; 44:5045-54. [DOI: 10.1016/j.ejmech.2009.09.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 01/29/2009] [Revised: 07/06/2009] [Accepted: 09/06/2009] [Indexed: 10/20/2022]
18
Patel SJ, Ng D, Mannan MS. QSPR Flash Point Prediction of Solvents Using Topological Indices for Application in Computer Aided Molecular Design. Ind Eng Chem Res 2009. [DOI: 10.1021/ie9000794] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Indexed: 11/28/2022]
Affiliation(s)
- Suhani J. Patel
- Mary Kay O’Connor Process Safety Center, Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122
- Dedy Ng
- Mary Kay O’Connor Process Safety Center, Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122
- M. Sam Mannan
- Mary Kay O’Connor Process Safety Center, Department of Chemical Engineering, Texas A&M University, College Station, Texas 77843-3122
19
Nicolaou CA, Apostolakis J, Pattichis CS. De novo drug design using multiobjective evolutionary graphs. J Chem Inf Model 2009; 49:295-307. [PMID: 19434831 DOI: 10.1021/ci800308h] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Indexed: 11/29/2022]
Abstract
Drug discovery and development is a complex, lengthy process, and failure of a candidate molecule can occur as a result of a combination of reasons, such as poor pharmacokinetics, lack of efficacy, or toxicity. Successful drug candidates necessarily represent a compromise between the numerous, sometimes competing objectives so that the benefits to patients outweigh potential drawbacks and risks. De novo drug design involves searching an immense space of feasible, druglike molecules to select those with the highest chances of becoming drugs using computational technology. Traditionally, de novo design has focused on designing molecules satisfying a single objective, such as similarity to a known ligand or an interaction score, and ignored the presence of the multiple objectives required for druglike behavior. Recently, methods have appeared in the literature that attempt to design molecules satisfying multiple predefined objectives and thereby produce candidate solutions with a higher chance of serving as viable drug leads. This paper describes the Multiobjective Evolutionary Graph Algorithm (MEGA), a new multiobjective optimization de novo design algorithmic framework that can be used to design structurally diverse molecules satisfying one or more objectives. The algorithm combines evolutionary techniques with graph-theory to directly manipulate graphs and perform an efficient global search for promising solutions. In the Experimental Section we present results from the application of MEGA for designing molecules that selectively bind to a known pharmaceutical target using the ChillScore interaction score family. The primary constraints applied to the design are based on the identified structure of the protein target and a known ligand currently marketed as a drug. 
A detailed explanation of the key elements of the specific implementation of the algorithm is given, including the methods for obtaining molecular building blocks, evolving the chemical graphs, and scoring the designed molecules. Our findings demonstrate that MEGA can produce structurally diverse candidate molecules representing a wide range of compromises of the supplied constraints and thus can be used as an "idea generator" to support expert chemists assigned with the task of molecular design.
Affiliation(s)
- Christos A Nicolaou
- Computer Science Department, University of Cyprus, 75 Kallipoleos Street, CY-1678 Nicosia, Cyprus.
20
Wong WW, Burkowski FJ. A constructive approach for discovering new drug leads: Using a kernel methodology for the inverse-QSAR problem. J Cheminform 2009; 1:4. [PMID: 20142987 PMCID: PMC2816860 DOI: 10.1186/1758-2946-1-4] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.1] [Received: 02/15/2009] [Accepted: 04/28/2009] [Indexed: 12/04/2022]
Abstract
Background
The inverse-QSAR problem seeks to find a new molecular descriptor from which one can recover the structure of a molecule that possesses a desired activity or property. Surprisingly, there are very few papers providing solutions to this problem. It is a difficult problem because the molecular descriptors involved with the inverse-QSAR algorithm must adequately address the forward QSAR problem for a given biological activity if the subsequent recovery phase is to be meaningful. In addition, one should be able to construct a feasible molecule from such a descriptor. The difficulty of recovering the molecule from its descriptor is the major limitation of most inverse-QSAR methods.
Results
In this paper, we describe the reversibility of our previously reported descriptor, the vector space model molecular descriptor (VSMMD) based on a vector space model that is suitable for kernel studies in QSAR modeling. Our inverse-QSAR approach can be described using five steps: (1) generate the VSMMD for the compounds in the training set; (2) map the VSMMD in the input space to the kernel feature space using an appropriate kernel function; (3) design or generate a new point in the kernel feature space using a kernel feature space algorithm; (4) map the feature space point back to the input space of descriptors using a pre-image approximation algorithm; (5) build the molecular structure template using our VSMMD molecule recovery algorithm.
Conclusion
The empirical results reported in this paper show that our strategy of using kernel methodology for an inverse-Quantitative Structure-Activity Relationship is sufficiently powerful to find a meaningful solution for practical problems.
Electronic supplementary material
The online version of this article (doi:10.1186/1758-2946-1-4) contains supplementary material, which is available to authorized users.
Affiliation(s)
- William Wl Wong
- The David R. Cheriton School of Computer Science, University of Waterloo, Waterloo, Ontario N2L 3G1, Canada
|
21
|
Cruz-Monteagudo M, Borges F, Cordeiro MNDS. Desirability-based multiobjective optimization for global QSAR studies: application to the design of novel NSAIDs with improved analgesic, antiinflammatory, and ulcerogenic profiles. J Comput Chem 2008; 29:2445-59. [PMID: 18452123 DOI: 10.1002/jcc.20994] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.2] [Indexed: 11/11/2022]
Abstract
Very few reports have been published on the application of multiobjective optimization (MOOP) techniques to quantitative structure-activity relationship (QSAR) studies, and none of them optimizes objectives related directly to the desired pharmaceutical profile of the drug. This work proposes, for the first time, a MOOP method based on Derringer's desirability function that allows global QSAR studies to consider simultaneously the pharmacological, pharmacokinetic, and toxicological profiles of a set of candidate molecules. The usefulness of the method is demonstrated by applying it to the simultaneous optimization of the analgesic, antiinflammatory, and ulcerogenic properties of a library of fifteen 3-(3-methylphenyl)-2-substituted amino-3H-quinazolin-4-one compounds. The levels of the predictor variables that concurrently produce the best possible compromise between these properties are found and used to design a set of new optimized drug candidates. Our results also suggest that the bulkiness of alkyl substituents at the C-2 position of the quinazoline ring plays a relevant role in the ulcerogenic properties of this family of compounds. Finally, and most importantly, the proposed desirability-based MOOP method is a valuable tool that should aid the future rational design of novel successful drugs.
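As a rough illustration of the desirability idea named in this abstract, the sketch below uses the standard one-sided Derringer-Suich desirability functions and combines them by a geometric mean. It is not the authors' implementation: the function names, the bounds, and the three candidate response values are hypothetical and chosen only for demonstration.

```python
import math


def desirability_max(y, low, target, weight=1.0):
    """One-sided desirability for a response that should be maximized.

    Returns 0 at or below `low`, 1 at or above `target`, and a power-law
    ramp in between. `weight` shapes the ramp (1.0 gives a linear ramp).
    """
    if y <= low:
        return 0.0
    if y >= target:
        return 1.0
    return ((y - low) / (target - low)) ** weight


def desirability_min(y, target, high, weight=1.0):
    """One-sided desirability for a response that should be minimized."""
    if y <= target:
        return 1.0
    if y >= high:
        return 0.0
    return ((high - y) / (high - target)) ** weight


def overall_desirability(ds):
    """Geometric mean of individual desirabilities.

    A single zero makes the overall value zero, so a candidate that is
    unacceptable on one property cannot be rescued by the others.
    """
    return math.prod(ds) ** (1.0 / len(ds))


# Hypothetical predicted responses for three candidates:
# (analgesic activity, antiinflammatory activity, ulcerogenic index).
candidates = {
    "cand_A": (0.80, 0.60, 1.2),
    "cand_B": (0.65, 0.75, 0.6),
    "cand_C": (0.90, 0.40, 2.5),
}

scores = {}
for name, (analgesic, antiinflammatory, ulcer) in candidates.items():
    d_values = [
        desirability_max(analgesic, low=0.3, target=1.0),         # assumed bounds
        desirability_max(antiinflammatory, low=0.3, target=1.0),  # assumed bounds
        desirability_min(ulcer, target=0.5, high=3.0),            # assumed bounds
    ]
    scores[name] = overall_desirability(d_values)

# Rank candidates from the best compromise to the worst.
ranking = sorted(scores, key=scores.get, reverse=True)
print(ranking)
```

The geometric mean is the usual choice for the overall desirability because it penalizes a poor value on any single property more strongly than an arithmetic mean would.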
Affiliation(s)
- Maykel Cruz-Monteagudo
- Physico-Chemical Molecular Research Unit, Department of Organic Chemistry, Faculty of Pharmacy, University of Porto, 4150-047 Porto, Portugal.
|
22
|
Guha R. On the interpretation and interpretability of quantitative structure-activity relationship models. J Comput Aided Mol Des 2008; 22:857-71. [PMID: 18784976 DOI: 10.1007/s10822-008-9240-5] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.4] [Received: 05/13/2008] [Accepted: 08/14/2008] [Indexed: 01/28/2023]
Abstract
The goal of a quantitative structure-activity relationship (QSAR) model is to encode the relationship between molecular structure and biological activity or physical property. Based on this encoding, such models can be used for predictive purposes. Assuming the use of relevant and meaningful descriptors, and a statistically significant model, extraction of the encoded structure-activity relationships (SARs) can provide insight into what makes a molecule active or inactive. Such analyses by QSAR models are useful in a number of scenarios, such as suggesting structural modifications to enhance activity, explaining outliers, and exploring novel SARs. In this paper we discuss the need for interpretation and give an overview of the factors that affect the interpretability of QSAR models. We then describe interpretation protocols for different types of models, highlighting the different types of interpretations, ranging from very broad, global trends to very specific, case-by-case descriptions of the SAR, using examples from the training set. Finally, we discuss a number of case studies where workers have provided some form of interpretation of a QSAR model.
Affiliation(s)
- Rajarshi Guha
- School of Informatics, Indiana University, Bloomington, IN 47408, USA.
|
23
|
Gillet VJ. New directions in library design and analysis. Curr Opin Chem Biol 2008; 12:372-8. [PMID: 18331851 DOI: 10.1016/j.cbpa.2008.02.015] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Received: 02/06/2008] [Accepted: 02/06/2008] [Indexed: 10/22/2022]
Abstract
The high costs associated with high-throughput screening (HTS), coupled with the limited coverage and bias of current screening collections, are such that diversity analysis continues to be an important criterion in lead generation. Whereas early approaches to diversity analysis were based on traditional descriptors such as two-dimensional fingerprints, a recent emphasis has been on assessing scaffold coverage to ensure that a variety of different chemotypes are represented. Moreover, whether designing diverse or focused libraries, it is widely recognised that designs should aim to achieve a balance in a number of different properties, and multiobjective optimisation provides an effective way of achieving such designs.
Affiliation(s)
- Valerie J Gillet
- Department of Information Studies, University of Sheffield, Sheffield, UK.
|
24
|
Ebalunode JO, Ouyang Z, Liang J, Zheng W. Novel Approach to Structure-Based Pharmacophore Search Using Computational Geometry and Shape Matching Techniques. J Chem Inf Model 2008; 48:889-901. [DOI: 10.1021/ci700368p] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Indexed: 11/29/2022]
Affiliation(s)
- Jerry Osagie Ebalunode
- Department of Pharmaceutical Sciences, BRITE Institute, North Carolina Central University, 1801 Fayetteville Street, Durham, North Carolina 27707, and Bioengineering Department, University of Illinois at Chicago, Chicago, Illinois 60612
- Zheng Ouyang
- Department of Pharmaceutical Sciences, BRITE Institute, North Carolina Central University, 1801 Fayetteville Street, Durham, North Carolina 27707, and Bioengineering Department, University of Illinois at Chicago, Chicago, Illinois 60612
- Jie Liang
- Department of Pharmaceutical Sciences, BRITE Institute, North Carolina Central University, 1801 Fayetteville Street, Durham, North Carolina 27707, and Bioengineering Department, University of Illinois at Chicago, Chicago, Illinois 60612
- Weifan Zheng
- Department of Pharmaceutical Sciences, BRITE Institute, North Carolina Central University, 1801 Fayetteville Street, Durham, North Carolina 27707, and Bioengineering Department, University of Illinois at Chicago, Chicago, Illinois 60612
|
25
|
Scott DJ, Manos S, Coveney PV. Design of Electroceramic Materials Using Artificial Neural Networks and Multiobjective Evolutionary Algorithms. J Chem Inf Model 2008; 48:262-73. [DOI: 10.1021/ci700269r] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Affiliation(s)
- D. J. Scott
- Centre for Computational Science, Department of Chemistry, University College London, Christopher Ingold Laboratories, 20 Gordon Street, London WC1H 0AJ, U.K
- S. Manos
- Centre for Computational Science, Department of Chemistry, University College London, Christopher Ingold Laboratories, 20 Gordon Street, London WC1H 0AJ, U.K
- P. V. Coveney
- Centre for Computational Science, Department of Chemistry, University College London, Christopher Ingold Laboratories, 20 Gordon Street, London WC1H 0AJ, U.K
|