1
Vogt M. Chemoinformatic approaches for navigating large chemical spaces. Expert Opin Drug Discov 2024; 19:403-414. [PMID: 38300511 DOI: 10.1080/17460441.2024.2313475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/12/2023] [Accepted: 01/30/2024] [Indexed: 02/02/2024]
Abstract
INTRODUCTION: Large chemical spaces (CSs) include traditional large compound collections, combinatorial libraries covering billions to trillions of molecules, DNA-encoded chemical libraries comprising complete combinatorial CSs in a single mixture, and virtual CSs explored by generative models. The diverse nature of these types of CSs requires different chemoinformatic approaches for navigation. AREAS COVERED: An overview of different types of large CSs is provided. Molecular representations and similarity metrics suitable for large CS exploration are discussed. A summary of navigation of CSs in generative models is provided. Methods for characterizing and comparing CSs are discussed. EXPERT OPINION: The size of large CSs might restrict navigation to specialized algorithms and limit it to considering neighborhoods of structurally similar molecules. Efficient navigation of large CSs not only requires methods that scale with size but also requires smart approaches that focus on better but not necessarily larger molecule selections. Deep generative models aim to provide such approaches by implicitly learning features relevant for targeted biological properties. It is unclear whether these models can fulfill this ideal, as validation is difficult as long as the covered CSs remain mainly virtual without experimental verification.
Affiliation(s)
- Martin Vogt
- Department of Life Science Informatics, B-IT, LIMES Program Unit Chemical Biology and Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Bonn, Germany
2
Hönig SMN, Flachsenberg F, Ehrt C, Neumann A, Schmidt R, Lemmen C, Rarey M. SpaceGrow: efficient shape-based virtual screening of billion-sized combinatorial fragment spaces. J Comput Aided Mol Des 2024; 38:13. [PMID: 38493240 PMCID: PMC10944417 DOI: 10.1007/s10822-024-00551-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Accepted: 02/13/2024] [Indexed: 03/18/2024]
Abstract
The growing size of make-on-demand chemical libraries is posing new challenges to cheminformatics. These ultra-large chemical libraries have become too large for exhaustive enumeration. With a combinatorial approach, the resource requirement instead scales approximately with the number of synthons rather than the number of molecules. This gives access to billions or trillions of compounds as so-called chemical spaces with moderate hardware and in a reasonable time frame. While extremely performant ligand-based 2D methods exist in this context, 3D methods still largely rely on exhaustive enumeration and therefore fail to apply. Here, we present SpaceGrow: a novel shape-based 3D approach for ligand-based virtual screening of billions of compounds within hours on a single CPU. Compared to a conventional superposition tool, SpaceGrow shows comparable pose reproduction capacity based on RMSD and superior ranking performance while being orders of magnitude faster. Result assessment of two differently sized subsets of the eXplore space reveals a higher probability of finding superior results in larger spaces, highlighting the potential of searching in ultra-large spaces. Furthermore, the application of SpaceGrow in a drug discovery workflow was investigated in four examples involving G protein-coupled receptors (GPCRs) with the aim of identifying compounds with similar binding capabilities and molecular novelty.
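The scaling argument in this abstract can be illustrated with a small sketch. It is not taken from the paper: the function names and the pool sizes are hypothetical, and it only assumes that a combinatorial space is described by one pool of synthons per reaction position.

```python
from math import prod


def space_size(synthon_pools):
    """Number of products one reaction encodes: the product of the pool sizes."""
    return prod(len(pool) for pool in synthon_pools)


def synthon_count(synthon_pools):
    """Number of building blocks that actually have to be stored and processed."""
    return sum(len(pool) for pool in synthon_pools)


# Hypothetical three-component reaction with 1,000 synthons per position.
pools = [range(1000), range(1000), range(1000)]

print(space_size(pools))     # 1000000000 virtual products
print(synthon_count(pools))  # 3000 synthons
```

Under these assumptions, a method that works on synthons touches thousands of objects where full enumeration would touch a billion.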
Affiliation(s)
- Sophia M N Hönig
- BioSolveIT, An der Ziegelei 79, 53757, Sankt Augustin, Germany
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761, Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761, Hamburg, Germany
- Robert Schmidt
- BioSolveIT, An der Ziegelei 79, 53757, Sankt Augustin, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761, Hamburg, Germany
3
Neumann A, Marrison L, Klein R. Relevance of the Trillion-Sized Chemical Space "eXplore" as a Source for Drug Discovery. ACS Med Chem Lett 2023; 14:466-472. [PMID: 37077402 PMCID: PMC10108389 DOI: 10.1021/acsmedchemlett.3c00021] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 01/19/2023] [Accepted: 03/10/2023] [Indexed: 03/18/2023]
Abstract
Within the past two decades, virtual combinatorial compound collections, so-called chemical spaces, have become an important molecule source for pharmaceutical research all over the world. The emergence of compound vendor chemical spaces with rapidly growing numbers of molecules raises questions about their application suitability and the quality of the content. Here, we examine the composition of the recently published and, so far, biggest chemical space, "eXplore", which comprises approximately 2.8 trillion virtual product molecules. The utility of eXplore to retrieve interesting chemistry around approved drugs and common Bemis-Murcko scaffolds has been assessed with several methods (FTrees, SpaceLight, SpaceMACS). Further, an overlap analysis of several vendor chemical spaces and a physicochemical property distribution analysis have been performed. Despite the straightforward chemical reactions underlying its setup, eXplore is demonstrated to provide relevant and, most importantly, easily accessible molecules for drug discovery campaigns.
Affiliation(s)
- Lester Marrison
- eMolecules, 3430 Carmel Mountain Road, Suite 250, San Diego, California 92121, United States
- Raphael Klein
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
4
Tingle B, Tang KG, Castanon M, Gutierrez JJ, Khurelbaatar M, Dandarchuluun C, Moroz YS, Irwin JJ. ZINC-22 - A Free Multi-Billion-Scale Database of Tangible Compounds for Ligand Discovery. J Chem Inf Model 2023; 63:1166-1176. [PMID: 36790087 PMCID: PMC9976280 DOI: 10.1021/acs.jcim.2c01253] [Citation(s) in RCA: 24] [Impact Index Per Article: 24.0] [Received: 10/07/2022] [Indexed: 02/16/2023]
Abstract
Purchasable chemical space has grown rapidly into the tens of billions of molecules, providing unprecedented opportunities for ligand discovery but straining the tools that might exploit these molecules at scale. We have therefore developed ZINC-22, a database of commercially accessible small molecules derived from multi-billion-scale make-on-demand libraries. The new database and tools enable analog searching in this vast new space via a facile GUI, CartBlanche, drawing on similarity methods that scale sublinearly in the number of molecules. The new library also uses data organization methods, enabling rapid lookup of molecules and their physical properties, including conformations, partial atomic charges, cLogP values, and solvation energies, all crucial for molecule docking, which had become slow with older database organizations in previous versions of ZINC. As the libraries have continued to grow, we have been interested in finding whether molecular diversity has suffered, for instance, because certain scaffolds have come to dominate via easy analoging. This has not occurred thus far, and chemical diversity continues to grow with database size, with a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size. Most new scaffolds come from compounds with the highest heavy atom count. Finally, we consider the implications for databases like ZINC as the libraries grow toward and beyond the trillion-molecule range. ZINC is freely available to everyone and may be accessed at cartblanche22.docking.org, via Globus, and in the Amazon AWS and Oracle OCI clouds.
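The scaffold growth rate quoted in this abstract ("a log increase in Bemis-Murcko scaffolds for every two-log unit increase in database size") can be read as an empirical power law. The symbols below are assumptions introduced for illustration and do not appear in the abstract: S is the number of scaffolds, N the number of molecules, and c a fitted constant.

```latex
\log_{10} S = \tfrac{1}{2}\,\log_{10} N + c
\quad\Longrightarrow\quad
S \propto N^{1/2}
```

On this reading, a database that grows 100-fold would be expected to contain roughly 10 times as many scaffolds.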
Affiliation(s)
- Benjamin I. Tingle
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Khanh G. Tang
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Mar Castanon
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- John J. Gutierrez
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Munkhzul Khurelbaatar
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Chinzorig Dandarchuluun
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
- Yurii S. Moroz
- Taras Shevchenko National University of Kyïv, 60 Volodymyrska Street, Kyïv 01601, Ukraine
- Chemspace LLC, 85 Chervonotkatska Street, Kyïv 02094, Ukraine
- John J. Irwin
- Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2550, San Francisco, California 94158-2330, United States
5
Transition from Animal-Based to Human Induced Pluripotent Stem Cells (iPSCs)-Based Models of Neurodevelopmental Disorders: Opportunities and Challenges. Cells 2023; 12:cells12040538. [PMID: 36831205 PMCID: PMC9954744 DOI: 10.3390/cells12040538] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/31/2022] [Revised: 01/25/2023] [Accepted: 02/02/2023] [Indexed: 02/11/2023]
Abstract
Neurodevelopmental disorders (NDDs) arise from the disruption of highly coordinated mechanisms underlying brain development, which results in impaired sensory, motor and/or cognitive functions. Although rodent models have offered very relevant insights to the field, the translation of findings to clinics, particularly regarding therapeutic approaches for these diseases, remains challenging. Part of the explanation for this failure may be the genetic differences (some targets are not conserved between species) and, most importantly, the differences in regulation of gene expression. This prompts the use of human-derived models to study NDDs. The generation of human induced pluripotent stem cells (hIPSCs) added a new suitable alternative to overcome species limitations, allowing for the study of human neuronal development while maintaining the genetic background of the donor patient. Several hIPSC models of NDDs have already proved their worth by mimicking several pathological phenotypes found in humans. In this review, we highlight the utility of hIPSCs to pave new paths for NDD research and development of new therapeutic tools, summarize the challenges and advances of hIPSC culture and neuronal differentiation protocols, and discuss the best way to take advantage of these models, illustrating this with examples of success for some NDDs.
6
Perebyinis M, Rognan D. Overlap of On-demand Ultra-large Combinatorial Spaces with On-the-shelf Drug-like Libraries. Mol Inform 2023; 42:e2200163. [PMID: 36072995 DOI: 10.1002/minf.202200163] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 07/14/2022] [Accepted: 09/07/2022] [Indexed: 01/12/2023]
Abstract
On-demand combinatorial spaces are shifting paradigms in early drug discovery by considerably increasing the searchable chemical space to several billions of compounds while securing their synthetic accessibility. Here, we systematically compared the on-the-shelf available drug-like chemical space (9 million compounds) to three on-demand ultra-large (ODUL) combinatorial fragment spaces (REAL, CHEMriya, GalaXi) covering 32 billion readily accessible molecules. Surprisingly, only one space (REAL) almost entirely intersects the currently available drug-like space, suggesting that it is the only ODUL widely suitable for in-stock hit expansion. Of course, expanding a preliminary ODUL hit in the same chemical space is the best possible strategy to rapidly generate structure-activity relationships. All three spaces remain well suited to early hit finding initiatives since they all provide numerous unique scaffolds that are not described by on-the-shelf collections.
Affiliation(s)
- Mariana Perebyinis
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 74 route du Rhin, F-67400, Illkirch, France
- Didier Rognan
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 74 route du Rhin, F-67400, Illkirch, France
7
Müller J, Klein R, Tarkhanova O, Gryniukova A, Borysko P, Merkl S, Ruf M, Neumann A, Gastreich M, Moroz YS, Klebe G, Glinca S. Magnet for the Needle in Haystack: "Crystal Structure First" Fragment Hits Unlock Active Chemical Matter Using Targeted Exploration of Vast Chemical Spaces. J Med Chem 2022; 65:15663-15678. [PMID: 36069712 DOI: 10.1021/acs.jmedchem.2c00813] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Abstract
Fragment-based drug discovery (FBDD) has successfully led to approved therapeutics for challenging and "undruggable" targets. In the context of FBDD, we introduce a novel, multidisciplinary method to identify active molecules from purchasable chemical space. Starting from four small-molecule fragment complexes of protein kinase A (PKA), a template-based docking screen using Enamine's multibillion REAL Space was performed. A total of 93 molecules out of 106 selected compounds were successfully synthesized. Forty compounds were active in at least one validation assay with the most active follow-up having a 13,500-fold gain in affinity. Crystal structures for six of the most promising binders were rapidly obtained, verifying the binding mode. The overall success rate for this novel fragment-to-hit approach was 40%, accomplished in only 9 weeks. The results challenge the established fragment prescreening paradigm since the standard industrial filters for fragment hit identification in a thermal shift assay would have missed the initial fragments.
Affiliation(s)
- Janis Müller
- CrystalsFirst GmbH, Marbacher Weg 6, 35037 Marburg, Germany
- Raphael Klein
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Olga Tarkhanova
- Chemspace LLC, 85 Chervonotkatska Street, Suite 1, 03190 Kyïv, Ukraine
- Petro Borysko
- Enamine Ltd., 78 Chervonotkatska Street 78, 02094 Kyïv, Ukraine
- Stefan Merkl
- CrystalsFirst GmbH, Marbacher Weg 6, 35037 Marburg, Germany
- Moritz Ruf
- CrystalsFirst GmbH, Marbacher Weg 6, 35037 Marburg, Germany
- Marcus Gastreich
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Yurii S Moroz
- Chemspace LLC, 85 Chervonotkatska Street, Suite 1, 03190 Kyïv, Ukraine
- Taras Shevchenko National University of Kyïv, 60 Volodymyrska Street 60, Kyïv 01601, Ukraine
- Gerhard Klebe
- Department for Pharmaceutical Chemistry, Philipps-University Marburg, Marbacher Weg 6, 35037 Marburg, Germany
- Serghei Glinca
- CrystalsFirst GmbH, Marbacher Weg 6, 35037 Marburg, Germany
8
Chemical space docking enables large-scale structure-based virtual screening to discover ROCK1 kinase inhibitors. Nat Commun 2022; 13:6447. [PMID: 36307407 PMCID: PMC9616902 DOI: 10.1038/s41467-022-33981-8] [Citation(s) in RCA: 23] [Impact Index Per Article: 11.5] [Received: 09/16/2021] [Accepted: 10/05/2022] [Indexed: 12/25/2022]
Abstract
With the ever-increasing number of synthesis-on-demand compounds for drug lead discovery, there is a great need for efficient search technologies. We present the successful application of a virtual screening method that combines two advances: (1) it avoids full library enumeration, and (2) products are evaluated by molecular docking, leveraging protein structural information. Crucially, these advances enable a structure-based technique that can efficiently explore libraries with billions of molecules and beyond. We apply this method to identify inhibitors of ROCK1 from almost one billion commercially available compounds. Out of 69 purchased compounds, 27 (39%) have Ki values < 10 µM. X-ray structures of two leads confirm their docked poses. This approach to docking scales roughly with the number of reagents that span a chemical space and is therefore multiple orders of magnitude faster than traditional docking.
9
Artificial intelligence and machine-learning approaches in structure and ligand-based discovery of drugs affecting central nervous system. Mol Divers 2022; 27:959-985. [PMID: 35819579 DOI: 10.1007/s11030-022-10489-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 04/27/2022] [Accepted: 06/21/2022] [Indexed: 12/11/2022]
Abstract
CNS disorders are indications with very high unmet medical need, a relatively small number of available drugs, and a subpar satisfaction level among patients and caregivers. Discovery of CNS drugs is an extremely expensive affair with its own unique challenges, leading to extremely high attrition rates and low efficiency. With the explosion of data in the information age, there is hardly any aspect of life that has not been touched by data-driven technologies such as artificial intelligence (AI) and machine learning (ML). Drug discovery is no exception: the emergence of big data via genomic, proteomic, biological, and chemical technologies has driven pharmaceutical giants to collaborate with AI-oriented companies to revolutionise drug discovery, with the goal of increasing the efficiency of the process. In recent years, many examples of innovative applications of AI and ML techniques in CNS drug discovery have been reported. Research on therapeutics for diseases such as schizophrenia, Alzheimer's and Parkinsonism has been given a new direction and thrust by these developments. AI and ML have been applied to both ligand-based and structure-based drug discovery and design of CNS therapeutics. In this review, we summarise the general aspects of AI and ML from the perspective of drug discovery, followed by comprehensive coverage of recent developments in the applications of AI/ML techniques in CNS drug discovery.
10
Bellmann L, Klein R, Rarey M. Calculating and Optimizing Physicochemical Property Distributions of Large Combinatorial Fragment Spaces. J Chem Inf Model 2022; 62:2800-2810. [PMID: 35653228 DOI: 10.1021/acs.jcim.2c00334] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Indexed: 11/29/2022]
Abstract
The distributions of physicochemical property values, like the octanol-water partition coefficient, are routinely calculated to describe and compare virtual chemical libraries. Traditionally, these distributions are derived by processing each member of a library individually and summarizing all values in a distribution. This process becomes impractical when operating on chemical spaces which surpass billions of compounds in size. In this work, we present a novel algorithmic method called SpaceProp for the property distribution calculation of large nonenumerable combinatorial fragment spaces. The novel method follows a combinatorial approach and is able to calculate physicochemical property distributions of prominent spaces like Enamine's REAL Space, WuXi's GalaXi Space, and OTAVA's CHEMriya Space for the first time. Furthermore, we present a first approach of optimizing property distributions directly in combinatorial fragment spaces.
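One way to see why a combinatorial calculation can avoid enumeration is sketched below. This is not the SpaceProp algorithm, which the abstract does not describe in detail. It is a minimal illustration under two stated assumptions: the property is additive over synthons (here a heavy-atom count, ignoring linker atoms), and every synthon in one pool can be combined with every synthon in the other. All names and values are hypothetical.

```python
from collections import Counter
from itertools import product


def convolve(hist_a, hist_b):
    """Combine two per-pool histograms of an additive property.

    Each histogram maps a property value to the number of synthons with
    that value. The result maps each summed value to the number of products.
    """
    out = Counter()
    for value_a, count_a in hist_a.items():
        for value_b, count_b in hist_b.items():
            out[value_a + value_b] += count_a * count_b
    return out


# Hypothetical heavy-atom counts for two synthon pools of one reaction.
pool_a = [5, 6, 6, 8]
pool_b = [7, 7, 9]

# Combinatorial route: work on the two small histograms only.
combined = convolve(Counter(pool_a), Counter(pool_b))

# Brute-force route for comparison: enumerate all 12 products.
brute = Counter(a + b for a, b in product(pool_a, pool_b))

print(combined == brute)  # True
```

The cost of the combinatorial route depends on the number of distinct property values per pool, not on the number of products, which is the property that matters for spaces too large to enumerate.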
Affiliation(s)
- Louis Bellmann
- Universität Hamburg, ZBH - Center for Bioinformatics, Research Group for Computational Molecular Design, Bundesstraße 43, 20146 Hamburg, Germany
- Raphael Klein
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Research Group for Computational Molecular Design, Bundesstraße 43, 20146 Hamburg, Germany
11
Warr WA, Nicklaus MC, Nicolaou CA, Rarey M. Exploration of Ultralarge Compound Collections for Drug Discovery. J Chem Inf Model 2022; 62:2021-2034. [PMID: 35421301 DOI: 10.1021/acs.jcim.2c00224] [Citation(s) in RCA: 35] [Impact Index Per Article: 17.5] [Indexed: 02/07/2023]
Abstract
Designing new medicines more cheaply and quickly is tightly linked to the quest of exploring chemical space more widely and efficiently. Chemical space is monumentally large, but recent advances in computer software and hardware have enabled researchers to navigate virtual chemical spaces containing billions of chemical structures. This review specifically concerns collections of many millions or even billions of enumerated chemical structures as well as even larger chemical spaces that are not fully enumerated. We present examples of chemical libraries and spaces and the means used to construct them, and we discuss new technologies for searching huge libraries and for searching combinatorially in chemical space. We also cover space navigation techniques and consider new approaches to de novo drug design and the impact of the "autonomous laboratory" on synthesis of designed compounds. Finally, we summarize some other challenges and opportunities for the future.
Affiliation(s)
- Wendy A Warr
- Wendy Warr & Associates, 6 Berwick Court, Holmes Chapel, Crewe, Cheshire CW4 7HZ, United Kingdom
- Marc C Nicklaus
- NCI, NIH, CADD Group, NCI-Frederick, Frederick, Maryland 21702, United States
- Christos A Nicolaou
- Discovery Chemistry, Lilly Research Laboratories, Eli Lilly and Company, Indianapolis, Indiana 46285, United States
- Matthias Rarey
- Universität Hamburg, ZBH Center for Bioinformatics, 20146 Hamburg, Germany
12
Wahl J, Sander T. Fully Automated Creation of Virtual Chemical Fragment Spaces Using the Open-Source Library OpenChemLib. J Chem Inf Model 2022; 62:2202-2211. [DOI: 10.1021/acs.jcim.1c01041] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 02/02/2023]
Affiliation(s)
- Joel Wahl
- Scientific Computing Drug Discovery, Idorsia Pharmaceuticals Ltd., Hegenheimermattweg 91, CH-4123 Allschwil, Switzerland
- Thomas Sander
- Scientific Computing Drug Discovery, Idorsia Pharmaceuticals Ltd., Hegenheimermattweg 91, CH-4123 Allschwil, Switzerland
13
Bellmann L, Penner P, Gastreich M, Rarey M. Comparison of Combinatorial Fragment Spaces and Its Application to Ultralarge Make-on-Demand Compound Catalogs. J Chem Inf Model 2022; 62:553-566. [PMID: 35050621 DOI: 10.1021/acs.jcim.1c01378] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
The set of chemical compounds shared by two or more chemical libraries is assessed routinely as a means of comparing these libraries for various applications. Traditionally, this is achieved by comparing the members of the chemical libraries individually for identity. This approach becomes impractical when operating on chemical libraries exceeding billions or even trillions of compounds in size. As a result, no such analysis exists for ultralarge chemical spaces like the Enamine REAL Space containing over 20 billion compounds. In this work, we present a novel tool called SpaceCompare for the overlap calculation of large, nonenumerable combinatorial fragment spaces. In contrast to existing methods, SpaceCompare utilizes topological fingerprints and the combinatorial character of these chemical spaces. The tool is able to determine the exact overlap of prominent spaces like Enamine's REAL Space, WuXi's GalaXi Space, and Otava's CHEMriya for the first time.
Affiliation(s)
- Louis Bellmann
- Universität Hamburg, ZBH - Center for Bioinformatics, Research Group for Computational Molecular Design, Bundesstraße 43, 20146 Hamburg, Germany
- Patrick Penner
- Universität Hamburg, ZBH - Center for Bioinformatics, Research Group for Computational Molecular Design, Bundesstraße 43, 20146 Hamburg, Germany
- Marcus Gastreich
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Research Group for Computational Molecular Design, Bundesstraße 43, 20146 Hamburg, Germany
14
15
Grygorenko OO. Enamine Ltd.: The Science and Business of Organic Chemistry and Beyond. European J Org Chem 2021. [DOI: 10.1002/ejoc.202101210] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 01/22/2023]
Affiliation(s)
- Oleksandr O. Grygorenko
- Enamine Ltd., Chervonotkatska 78, Kyiv 02094, Ukraine
- Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv 01601, Ukraine
16
Schmidt R, Klein R, Rarey M. Maximum Common Substructure Searching in Combinatorial Make-on-Demand Compound Spaces. J Chem Inf Model 2021; 62:2133-2150. [PMID: 34478299 DOI: 10.1021/acs.jcim.1c00640] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 01/24/2023]
Abstract
Commercial make-on-demand compound spaces have become increasingly popular within the past few years. Since these libraries are too large for enumeration, they are usually accessed using combinatorial fragment space technologies like FTrees-FS and SpaceLight. Although both search types are of high practical impact, they lack the ability to search for precise structural features on the atomic level. To address this important use case, we developed SpaceMACS, enabling efficient and precise maximum common induced substructure (MCIS) similarity and substructure searches within chemical fragment spaces. SpaceMACS enumerates a user-defined number of compounds in a multistep procedure. First, substructures of the query are extracted and matched to all fragments of the space. Then partial results are combined into actual compounds of the space. In this way, SpaceMACS identifies common substructures even if they cross fragment borders. We applied SpaceMACS to three commercial fragment spaces, searching for the 150 000 most similar analogs to a glucosyltransferase binder from the literature. We were able to find almost all building blocks used for the synthesis of the 90 listed analogs and a plethora of additional results. SpaceMACS is the missing link to enable rational drug discovery on make-on-demand combinatorial catalogs. No matter whether initial compound suggestions come from a de novo design, an AI-based compound generation, or a medicinal chemist's drawing board, the method gives access to the structurally closest chemically available analogs in seconds to at most minutes.
Affiliation(s)
- Robert Schmidt
- Universität Hamburg, ZBH-Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
- Raphael Klein
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Matthias Rarey
- Universität Hamburg, ZBH-Center for Bioinformatics, Bundesstraße 43, 20146 Hamburg, Germany
17
Radchenko DS, Naumchyk VS, Dziuba I, Kyrylchuk AA, Gubina KE, Moroz YS, Grygorenko OO. One-pot parallel synthesis of 1,3,5-trisubstituted 1,2,4-triazoles. Mol Divers 2021; 26:993-1004. [PMID: 33797670 DOI: 10.1007/s11030-021-10218-2] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 02/22/2021] [Accepted: 03/24/2021] [Indexed: 11/24/2022]
Abstract
An implementation of the three-component one-pot approach to unsymmetrical 1,3,5-trisubstituted-1,2,4-triazoles into combinatorial chemistry is described. The procedure is based on the coupling of amidines with carboxylic acids and subsequent cyclization with hydrazines. After the preliminary assessment of the reagent scope, the method had an 81% success rate in parallel synthesis. It was shown that a chemical space of over a billion readily accessible ("REAL") compounds may be generated based on the proposed methodology. Analysis of physicochemical parameters shows that the library contains significant fractions of both drug-like and "beyond-rule-of-five" members. More than 10 million accessible compounds meet the strictest lead-likeness criteria. Additionally, 195 million sp3-enriched compounds can be produced. This makes the proposed approach a valuable tool in medicinal chemistry.
Affiliation(s)
- Dmytro S Radchenko
- Enamine Ltd., Chervonotkatska Street 78, Kyiv, 02094, Ukraine
- Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv, 01601, Ukraine
- Igor Dziuba
- Chemspace, Chervonotkatska Street 78, Kyiv, 02094, Ukraine
- Andrii A Kyrylchuk
- Enamine Ltd., Chervonotkatska Street 78, Kyiv, 02094, Ukraine
- Institute of Organic Chemistry, National Academy of Sciences of Ukraine, Murmanska Street 5, Kyiv, 02094, Ukraine
- Kateryna E Gubina
- Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv, 01601, Ukraine
- Yurii S Moroz
- Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv, 01601, Ukraine
- Chemspace, Chervonotkatska Street 78, Kyiv, 02094, Ukraine
- Oleksandr O Grygorenko
- Enamine Ltd., Chervonotkatska Street 78, Kyiv, 02094, Ukraine
- Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv, 01601, Ukraine
18
Grygorenko OO, Radchenko DS, Dziuba I, Chuprina A, Gubina KE, Moroz YS. Generating Multibillion Chemical Space of Readily Accessible Screening Compounds. iScience 2020; 23:101681. [PMID: 33145486 PMCID: PMC7593547 DOI: 10.1016/j.isci.2020.101681] [Citation(s) in RCA: 69] [Impact Index Per Article: 17.3] [Received: 07/29/2020] [Revised: 09/17/2020] [Accepted: 10/10/2020] [Indexed: 11/25/2022]
Abstract
An approach to the generation of ultra-large chemical libraries of readily accessible (“REAL”) compounds is described. The strategy is based on the use of two- or three-step three-component reaction sequences and available starting materials with pre-validated chemical reactivity. After the preliminary parallel experiments, the methods with at least ∼80% synthesis success rate (such as acylation – deprotection – acylation of monoprotected diamines or amide formation – click reaction with functionalized azides) can be selected and used to generate the target chemical space. It is shown that by using only the two aforementioned reaction sequences, a nearly 29-billion compound library is easily obtained. According to the predicted physico-chemical descriptor values, the generated chemical space contains large fractions of both drug-like and “beyond rule-of-five” members, whereas the strictest lead-likeness criteria (the so-called Churcher's rules) are met by the lesser part, which still exceeds 22 million.
A strategy for ultra-large readily accessible (REAL) compound libraries is described. Pre-validated two- or three-step three-component reaction sequences are used. A 29-billion chemical space with ∼80% synthesis success rate has been easily obtained.
|
19
|
Irwin JJ, Tang KG, Young J, Dandarchuluun C, Wong BR, Khurelbaatar M, Moroz YS, Mayfield J, Sayle RA. ZINC20-A Free Ultralarge-Scale Chemical Database for Ligand Discovery. J Chem Inf Model 2020; 60:6065-6073. [PMID: 33118813 DOI: 10.1021/acs.jcim.0c00675] [Citation(s) in RCA: 257] [Impact Index Per Article: 64.3] [Indexed: 11/30/2022]
Abstract
Identifying and purchasing new small molecules to test in biological assays are enabling for ligand discovery, but as purchasable chemical space continues to grow into the tens of billions based on inexpensive make-on-demand compounds, simply searching this space becomes a major challenge. We have therefore developed ZINC20, a new version of ZINC with two major new features: billions of new molecules and new methods to search them. As a fully enumerated database, ZINC can be searched precisely using explicit atomic-level graph-based methods, such as SmallWorld for similarity and Arthor for pattern and substructure search, as well as 3D methods such as docking. Analysis of the new make-on-demand compound sets by these and related tools reveals startling features. For instance, over 97% of the core Bemis-Murcko scaffolds in make-on-demand libraries are unavailable from "in-stock" collections. Correspondingly, the number of new Bemis-Murcko scaffolds is rising almost as a linear fraction of the elaborated molecules. Thus, an 88-fold increase in the number of molecules in the make-on-demand versus the in-stock sets is built upon a 16-fold increase in the number of Bemis-Murcko scaffolds. The make-on-demand library is also more structurally diverse than physical libraries, with a massive increase in disc- and sphere-like shaped molecules. The new system is freely available at zinc20.docking.org.
Affiliation(s)
- John J Irwin
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Khanh G Tang
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Jennifer Young
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Chinzorig Dandarchuluun
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Benjamin R Wong
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Munkhzul Khurelbaatar
- Byers Hall, Department of Pharmaceutical Chemistry, University of California San Francisco, 1700 4th St, Mailcode 2330, Room BH508A, San Francisco, California 94158-2330, United States
| | - Yurii S Moroz
- Chemspace LLC, 85 Chervonotkatska Street, Suite 1, Kyiv 02094, Ukraine; Taras Shevchenko National University of Kyiv, Volodymyrska Street 60, Kyiv 01601, Ukraine
| | - John Mayfield
- NextMove Software Ltd, Innovation Centre, 320 Cambridge Science Park, Milton Road, Cambridge CB4 0WG, United Kingdom
| | - Roger A Sayle
- NextMove Software Ltd, Innovation Centre, 320 Cambridge Science Park, Milton Road, Cambridge CB4 0WG, United Kingdom
|
20
|
Zhao L, Ciallella HL, Aleksunes LM, Zhu H. Advancing computer-aided drug discovery (CADD) by big data and data-driven machine learning modeling. Drug Discov Today 2020; 25:1624-1638. [PMID: 32663517 PMCID: PMC7572559 DOI: 10.1016/j.drudis.2020.07.005] [Citation(s) in RCA: 66] [Impact Index Per Article: 16.5] [Received: 02/19/2020] [Revised: 06/26/2020] [Accepted: 07/06/2020] [Indexed: 02/06/2023]
Abstract
Advancing a new drug to market requires substantial investments in time as well as financial resources. Crucial bioactivities for drug candidates, including their efficacy, pharmacokinetics (PK), and adverse effects, need to be investigated during drug development. With advancements in chemical synthesis and biological screening technologies over the past decade, a large amount of biological data points for millions of small molecules have been generated and are stored in various databases. These accumulated data, combined with new machine learning (ML) approaches, such as deep learning, have shown great potential to provide insights into relevant chemical structures to predict in vitro, in vivo, and clinical outcomes, thereby advancing drug discovery and development in the big data era.
Affiliation(s)
- Linlin Zhao
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Heather L Ciallella
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA
| | - Lauren M Aleksunes
- Department of Pharmacology and Toxicology, Ernest Mario School of Pharmacy, Rutgers University, Piscataway, NJ 08854, USA
| | - Hao Zhu
- The Rutgers Center for Computational and Integrative Biology, Camden, NJ 08102, USA; Department of Chemistry, Rutgers University, Camden, NJ 08102, USA.
|
21
|
Unprecedented Potential for Neural Drug Discovery Based on Self-Organizing hiPSC Platforms. Molecules 2020; 25:1150. [PMID: 32143423 PMCID: PMC7179160 DOI: 10.3390/molecules25051150] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 01/31/2020] [Revised: 02/29/2020] [Accepted: 03/02/2020] [Indexed: 12/12/2022] Open
Abstract
Human induced pluripotent stem cells (hiPSCs) have transformed conventional drug discovery pathways in recent years. In particular, recent advances in hiPSC biology, including organoid technologies, have highlighted a new potential for neural drug discovery with clear advantages over the use of primary tissues. This is important considering the financial and social burden of neurological health care worldwide, directly impacting the life expectancy of many populations. Patient-derived iPSCs-neurons are invaluable tools for novel drug-screening and precision medicine approaches directly aimed at reducing the burden imposed by the increasing prevalence of neurological disorders in an aging population. 3-Dimensional self-assembled or so-called ‘organoid’ hiPSCs cultures offer key advantages over traditional 2D ones and may well be gamechangers in the drug-discovery quest for neurological disorders in the coming years.
|