1
Wang M, Li S, Wang J, Zhang O, Du H, Jiang D, Wu Z, Deng Y, Kang Y, Pan P, Li D, Wang X, Yao X, Hou T, Hsieh CY. ClickGen: Directed exploration of synthesizable chemical space via modular reactions and reinforcement learning. Nat Commun 2024; 15:10127. [PMID: 39578485 PMCID: PMC11584676 DOI: 10.1038/s41467-024-54456-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2024] [Accepted: 11/07/2024] [Indexed: 11/24/2024] Open
Abstract
Despite the significant potential of generative models, the low synthesizability of many generated molecules limits their real-world applications. In response to this issue, we develop ClickGen, a deep learning model that utilizes modular reactions like click chemistry to assemble molecules and incorporates reinforcement learning along with an inpainting technique to ensure that the proposed molecules display high diversity, novelty and strong binding tendency. ClickGen demonstrates superior performance over the other reaction-based generative models in terms of novelty, synthesizability, and docking conformation similarity for existing binders targeting the three proteins. We then proceeded to conduct wet-lab validation on ClickGen's proposed molecules for poly adenosine diphosphate-ribose polymerase 1 (PARP1). Due to the guaranteed high synthesizability and model-generated synthetic routes for reference, we successfully produced and tested the bioactivity of these novel compounds in just 20 days, much faster than the typically expected time frame when handling sufficiently novel molecules. In bioactivity assays, two lead compounds demonstrated superior anti-proliferative efficacy against cancer cell lines, low toxicity, and nanomolar-level inhibitory activity against PARP1. We demonstrate that ClickGen and related models may represent a new paradigm in molecular generation, bringing AI-driven, automated experimentation and closed-loop molecular design closer to realization.
Affiliation(s)
- Mingyang Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Shuai Li
- Institute of Traditional Chinese Medicine, Chengde Medical University, Chengde, 067000, Hebei, China
- Department of Pharmacy, College of Biology, Hunan University, Changsha, 410082, Hunan, China
- Jike Wang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018, Zhejiang, China
- Odin Zhang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Hongyan Du
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Dejun Jiang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Zhenxing Wu
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Yafeng Deng
- Institute of Traditional Chinese Medicine, Chengde Medical University, Chengde, 067000, Hebei, China
- Yu Kang
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Peichen Pan
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Dan Li
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Xiaorui Wang
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao, 999078, China
- Xiaojun Yao
- Faculty of Applied Sciences, Macao Polytechnic University, Macao, 999078, China
- Tingjun Hou
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
- Chang-Yu Hsieh
- College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058, Zhejiang, China
2
Djikic-Stojsic T, Bret G, Blond G, Girard N, Le Guen C, Marsol C, Schmitt M, Schneider S, Bihel F, Bonnet D, Gulea M, Kellenberger E. The IMS Library: from IN-Stock to Virtual. ChemMedChem 2024; 19:e202400381. [PMID: 39031900 DOI: 10.1002/cmdc.202400381] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/19/2024] [Revised: 06/18/2024] [Accepted: 06/19/2024] [Indexed: 07/22/2024]
Abstract
A chemical library is a key element in the early stages of pharmaceutical research. Its design encompasses various factors, such as diversity, size, and ease of synthesis, aimed at increasing the likelihood of success in drug discovery. This article explores the collaborative efforts of computational and synthetic chemists in tailoring chemical libraries for cost-effective and resource-efficient use, particularly in the context of academic research projects. It proposes chemoinformatics methodologies that address two pivotal questions: first, crafting a diverse panel of under 1000 compounds from an existing pool through synthetic efforts, leveraging the expertise of organic chemists; and second, expanding pharmacophoric diversity within this panel by creating a highly accessible virtual chemical library. Chemoinformatics tools were developed to analyse an initial panel of about 10,000 compounds and derive two tailored libraries: eIMS and vIMS. The eIMS Library comprises 578 diverse in-stock compounds ready for screening. Its virtual counterpart, vIMS, features novel compounds guided by chemists, ensuring synthetic accessibility. vIMS offers a broader array of binding motifs and improved drug-like characteristics achieved through the addition of diverse functional groups to eIMS scaffolds followed by filtering of reactive or unusual structures. The uniqueness of vIMS is emphasized through a comparison with commercial suppliers' virtual chemical space.
Affiliation(s)
- Teodora Djikic-Stojsic
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Guillaume Bret
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Gaëlle Blond
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Nicolas Girard
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Clothilde Le Guen
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Inovarion, 251 rue St Jacques, Paris, 75005, France
- Claire Marsol
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Martine Schmitt
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Séverine Schneider
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Frederic Bihel
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Dominique Bonnet
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Mihaela Gulea
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
- Esther Kellenberger
- Laboratoire d'Innovation Thérapeutique, UMR7200 CNRS - Université de Strasbourg, Faculté de Pharmacie, 74 route du Rhin, Illkirch-Graffenstaden, 67400, France
3
Song RX, Nicklaus MC, Tarasova NI. Correlation of protein binding pocket properties with hits' chemistries used in generation of ultra-large virtual libraries. J Comput Aided Mol Des 2024; 38:22. [PMID: 38753096 PMCID: PMC11098933 DOI: 10.1007/s10822-024-00562-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 04/22/2024] [Indexed: 05/19/2024]
Abstract
Although the size of virtual libraries of synthesizable compounds is growing rapidly, we are still enumerating only tiny fractions of the drug-like chemical universe. Our capability to mine these newly generated libraries also lags behind their growth. That is why fragment-based approaches that utilize on-demand virtual combinatorial libraries are gaining popularity in drug discovery. These à la carte libraries utilize synthetic blocks found to be effective binders in parts of target protein pockets and a variety of reliable chemistries to connect them. There is, however, no data on the potential impact of the chemistries used for making on-demand libraries on the hit rates during virtual screening. There are also no rules to guide the selection of these synthetic methods for production of custom libraries. We have used the SAVI (Synthetically Accessible Virtual Inventory) library, constructed using 53 reliable reaction types (transforms), to evaluate the impact of these chemistries on docking hit rates for 40 well-characterized protein pockets. The data shows that the virtual hit rates differ significantly for different chemistries, with cross-coupling reactions such as Sonogashira, Suzuki-Miyaura, Hiyama and Liebeskind-Srogl coupling producing the highest hit rates. Virtual hit rates appear to depend not only on the property of the formed chemical bond but also on the diversity of available building blocks and the scope of the reaction. The data identifies reactions that deserve wider use through increasing the number of corresponding building blocks and suggests the reactions that are more effective for pockets with certain physical and hydrogen bond-forming properties.
Affiliation(s)
- Robert X Song
- Cancer Innovation Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, 21702, USA
- Marc C Nicklaus
- Computer-Aided Drug Design Group, Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, MD, 21702, USA
- Nadya I Tarasova
- Cancer Innovation Laboratory, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Frederick, MD, 21702, USA
4
Schuck B, Brenk R. On the hunt for metalloenzyme inhibitors: Investigating the presence of metal-coordinating compounds in screening libraries and chemical spaces. Arch Pharm (Weinheim) 2024; 357:e2300648. [PMID: 38279543 DOI: 10.1002/ardp.202300648] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/08/2023] [Revised: 12/20/2023] [Accepted: 12/27/2023] [Indexed: 01/28/2024]
Abstract
Metalloenzymes play vital roles in various biological processes, requiring the search for inhibitors to develop treatment options for diverse diseases. While compound library screening is a conventional approach, the exploration of virtual chemical spaces housing trillions of compounds has emerged as an alternative strategy. In this study, we investigated the suitability of selected screening libraries and chemical spaces for discovering inhibitors of metalloenzymes featuring common ions (Mg2+, Mn2+, and Zn2+). First, metal-coordinating groups from ligands interacting with ions in the Protein Data Bank were extracted. Subsequently, the prevalence of these groups in two focused screening libraries (Life Chemicals' chelator library, comprising 6,428 compounds, and Otava's chelator fragment library, with 1,784 fragments) as well as two chemical spaces (GalaXi and REAL space, containing billions of virtual products) was investigated. In total, 1,223 metal-coordinating groups were identified, with about a quarter of these groups found within the examined libraries and spaces. Our results indicate that these can serve as valuable starting points for drug discovery targeting metalloenzymes. In addition, this study suggests ways to improve libraries and spaces for better success in finding potential inhibitors for metalloenzymes.
Affiliation(s)
- Bruna Schuck
- Department of Biomedicine, University of Bergen, Bergen, Norway
- Ruth Brenk
- Department of Biomedicine, University of Bergen, Bergen, Norway
- Computational Biology Unit, University of Bergen, Bergen, Norway
5
Neumann A, Marrison L, Klein R. Relevance of the Trillion-Sized Chemical Space "eXplore" as a Source for Drug Discovery. ACS Med Chem Lett 2023; 14:466-472. [PMID: 37077402 PMCID: PMC10108389 DOI: 10.1021/acsmedchemlett.3c00021] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 01/19/2023] [Accepted: 03/10/2023] [Indexed: 03/18/2023] Open
Abstract
Within the past two decades, virtual combinatorial compound collections, so-called chemical spaces, have become an important molecule source for pharmaceutical research all over the world. The emergence of compound vendor chemical spaces with rapidly growing numbers of molecules raises questions about their application suitability and the quality of the content. Here, we examine the composition of the recently published and, so far, biggest chemical space, "eXplore", which comprises approximately 2.8 trillion virtual product molecules. The utility of eXplore to retrieve interesting chemistry around approved drugs and common Bemis-Murcko scaffolds has been assessed with several methods (FTrees, SpaceLight, SpaceMACS). Further, the overlap between several vendor chemical spaces has been determined and a physicochemical property distribution analysis has been performed. Despite the straightforward chemical reactions underlying its setup, eXplore is demonstrated to provide relevant and, most importantly, easily accessible molecules for drug discovery campaigns.
Affiliation(s)
- Lester Marrison
- eMolecules, 3430 Carmel Mountain Road, Suite 250, San Diego, California 92121, United States
- Raphael Klein
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
6
Korn M, Ehrt C, Ruggiu F, Gastreich M, Rarey M. Navigating large chemical spaces in early-phase drug discovery. Curr Opin Struct Biol 2023; 80:102578. [PMID: 37019067 DOI: 10.1016/j.sbi.2023.102578] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/24/2022] [Revised: 01/28/2023] [Accepted: 02/26/2023] [Indexed: 04/07/2023]
Abstract
The size of actionable chemical spaces is surging, owing to a variety of novel techniques, both computational and experimental. As a consequence, novel molecular matter is now at our fingertips that cannot and should not be neglected in early-phase drug discovery. Huge, combinatorial, make-on-demand chemical spaces with high probability of synthetic success rise exponentially in content, generative machine learning models go hand in hand with synthesis prediction, and DNA-encoded libraries offer new ways of hit structure discovery. These technologies make it possible to search for new chemical matter in a much broader and deeper manner with less effort and fewer financial resources. These transformational developments require new cheminformatics approaches to make huge chemical spaces searchable and analyzable with low resources, and with as little energy consumption as possible. Substantial progress has been made in the past years with respect to computation as well as organic synthesis. First examples of bioactive compounds resulting from the successful use of these novel technologies demonstrate their power to contribute to tomorrow's drug discovery programs. This article gives a compact overview of the state of the art.
Affiliation(s)
- Malte Korn
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstr. 43, 20146 Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstr. 43, 20146 Hamburg, Germany
- Fiorella Ruggiu
- insitro, 279 E Grand Ave., CA 94608, South San Francisco, USA
- Marcus Gastreich
- BioSolveIT GmbH, An der Ziegelei 79, 53757 Sankt Augustin, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Bundesstr. 43, 20146 Hamburg, Germany
7
Revillo Imbernon J, Chiesa L, Kellenberger E. Mining the Protein Data Bank to inspire fragment library design. Front Chem 2023; 11:1089714. [PMID: 36846858 PMCID: PMC9950109 DOI: 10.3389/fchem.2023.1089714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2022] [Accepted: 01/27/2023] [Indexed: 02/12/2023] Open
Abstract
The fragment approach has emerged as a method of choice for drug design, as it allows difficult therapeutic targets to be addressed. Success lies in the choice of the screened chemical library and the biophysical screening method, and also in the quality of the selected fragment and structural information used to develop a drug-like ligand. It has recently been proposed that promiscuous compounds, i.e., those that bind to several proteins, present an advantage for the fragment approach because they are likely to give frequent hits in screening. In this study, we searched the Protein Data Bank for fragments with multiple binding modes and targeting different sites. We identified 203 fragments represented by 90 scaffolds, some of which are absent from or barely represented in commercial fragment libraries. In contrast to other available fragment libraries, the studied set is enriched in fragments with a marked three-dimensional character (download at 10.5281/zenodo.7554649).
Affiliation(s)
- Julia Revillo Imbernon
- Laboratoire d’Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS Université de Strasbourg, Illkirch-Graffenstaden, France
- Luca Chiesa
- Laboratoire d’Innovation Thérapeutique, Faculté de Pharmacie, UMR7200 CNRS Université de Strasbourg, Illkirch-Graffenstaden, France