1
Shen X, Zeng T, Chen N, Li J, Wu R. NIMO: A Natural Product-Inspired Molecular Generative Model Based on Conditional Transformer. Molecules 2024; 29:1867. [PMID: 38675687 PMCID: PMC11053988 DOI: 10.3390/molecules29081867] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/05/2024] [Revised: 04/11/2024] [Accepted: 04/13/2024] [Indexed: 04/28/2024] Open
Abstract
Natural products (NPs) have diverse biological activity and significant medicinal value. The structural diversity of NPs is the mainstay of drug discovery. Expanding the chemical space of NPs is an urgent need. Inspired by the concept of fragment-assembled pseudo-natural products, we developed a computational tool called NIMO, which is based on the transformer neural network model. NIMO employs two tailor-made motif extraction methods to map a molecular graph into a semantic motif sequence. All these generated motif sequences are used to train our molecular generative models. Various NIMO models were trained under different task scenarios by recognizing syntactic patterns and structure-property relationships. We further explored the performance of NIMO in structure-guided, activity-oriented, and pocket-based molecule generation tasks. Our results show that NIMO had excellent performance for molecule generation from scratch and structure optimization from a scaffold.
Affiliation(s)
- Xiaojuan Shen
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
- Tao Zeng
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
- Nianhang Chen
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
- Jiabo Li
  - ChemXAI Inc., 53 Barry Lane, Syosset, NY 11791, USA
- Ruibo Wu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, China
2
Liu D, Song T, Na K, Wang S. PED: a novel predictor-encoder-decoder model for Alzheimer drug molecular generation. Front Artif Intell 2024; 7:1374148. [PMID: 38690194 PMCID: PMC11058643 DOI: 10.3389/frai.2024.1374148] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/21/2024] [Accepted: 04/01/2024] [Indexed: 05/02/2024] Open
Abstract
Alzheimer's disease (AD) is a gradually advancing neurodegenerative disorder characterized by a concealed onset. Acetylcholinesterase (AChE) is an efficient hydrolase that catalyzes the hydrolysis of acetylcholine (ACh), which regulates the concentration of ACh at synapses and then terminates ACh-mediated neurotransmission. Inhibitors of AChE activity currently exist, but their side effects are inevitable. In the various application fields where AI has gained prominence, neural network-based models for molecular design have recently emerged and demonstrated encouraging outcomes. However, in the conditional molecular generation task, most current generative models need additional optimization algorithms to generate molecules with intended properties, which makes molecular generation inefficient. Consequently, we introduce a cognitive-conditional molecular design model, termed PED, which leverages the variational auto-encoder (VAE). Its primary function is to adeptly produce a molecular library tailored for specific properties. From this library, we can then identify molecules that inhibit AChE activity without adverse effects. These molecules serve as lead compounds, hastening AD treatment and concurrently enhancing the AI's cognitive abilities. In this study, we aim to fine-tune a VAE model pre-trained on the ZINC database using active compounds of AChE collected from BindingDB. Unlike other molecular generation models, PED can simultaneously perform both property prediction and molecule generation; consequently, it can generate molecules with intended properties without an additional optimization process. Evaluation experiments show that the proposed model performs better than other methods benchmarked on the same data sets. The results indicate that the model learns a good representation of the potential chemical space and can generate molecules with intended properties. Extensive experiments on benchmark datasets confirmed PED's efficiency and efficacy. Furthermore, we verified the binding ability of the molecules to AChE through molecular docking. The results showed that our molecular generation system for AD has excellent cognitive capacities and that the molecules within the molecular library could bind well to AChE and inhibit its activity, thus preventing the hydrolysis of ACh.
Affiliation(s)
- Dayan Liu
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Tao Song
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
- Kang Na
  - The Ninth Department of Health Care Administration, The Second Medical Center, Chinese PLA General Hospital, Beijing, China
- Shudong Wang
  - College of Computer Science and Technology, China University of Petroleum (East China), Qingdao, China
3
Pang C, Qiao J, Zeng X, Zou Q, Wei L. Deep Generative Models in De Novo Drug Molecule Generation. J Chem Inf Model 2024; 64:2174-2194. [PMID: 37934070 DOI: 10.1021/acs.jcim.3c01496] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/08/2023]
Abstract
The discovery of new drugs has important implications for human health. Traditional methods for drug discovery rely on experiments to optimize the structure of lead molecules, which is time-consuming and costly. Recently, artificial intelligence has exhibited promising and efficient performance for drug-like molecule generation. In particular, deep generative models have achieved great success in de novo generation of drug-like molecules with desired properties, showing massive potential for novel drug discovery. In this study, we review the recent progress of molecule generation using deep generative models, mainly focusing on molecule representations, public databases, data processing tools, and advanced artificial intelligence-based molecule generation frameworks. In particular, we present a comprehensive comparison of state-of-the-art deep generative models for molecule generation and a summary of commonly used molecular design strategies. We identify research gaps and challenges of molecule generation such as the need for better databases, missing 3D information in molecular representation, and the lack of high-precision evaluation metrics. We suggest future directions for molecular generation and drug discovery.
Affiliation(s)
- Chao Pang
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Jianbo Qiao
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Xiangxiang Zeng
  - College of Information Science and Engineering, Hunan University, Changsha 410082, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Leyi Wei
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
4
Lu H, Wei Z, Wang X, Zhang K, Liu H. GraphGPT: A Graph Enhanced Generative Pretrained Transformer for Conditioned Molecular Generation. Int J Mol Sci 2023; 24:16761. [PMID: 38069085 PMCID: PMC10706000 DOI: 10.3390/ijms242316761] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/15/2023] [Revised: 11/16/2023] [Accepted: 11/23/2023] [Indexed: 12/18/2023] Open
Abstract
Condition-based molecular generation can generate a large number of molecules with particular properties, expanding the virtual drug screening library and accelerating the process of drug discovery. In this study, we combined a molecular graph structure and sequential representations using a generative pretrained transformer (GPT) architecture for generating molecules conditionally. The incorporation of graph structure information facilitated a better comprehension of molecular topological features, and the sequential contextual understanding of the GPT architecture facilitated molecular generation. The experiments indicate that our model efficiently produces molecules with the desired properties, with validity and uniqueness metrics that are close to 100%. Faced with the typical task of generating molecules based on a scaffold in drug discovery, our model is able to preserve scaffold information and generate molecules with low similarity and specified properties.
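The abstract reports validity and uniqueness without defining them. The sketch below uses one common convention, which is an assumption here and may differ from the paper's: validity is the fraction of generated strings that parse as molecules, and uniqueness is the fraction of distinct strings among the valid ones. The validator `balanced_parentheses` is a hypothetical toy stand-in; a real evaluation would parse each string with a cheminformatics toolkit and compare canonical forms.

```python
from typing import Callable, Iterable, Tuple


def balanced_parentheses(smiles: str) -> bool:
    """Toy stand-in validator (assumption): non-empty, balanced branch parentheses."""
    depth = 0
    for ch in smiles:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth -= 1
            if depth < 0:
                return False
    return bool(smiles) and depth == 0


def validity_and_uniqueness(
    samples: Iterable[str],
    is_valid: Callable[[str], bool] = balanced_parentheses,
) -> Tuple[float, float]:
    """Return (valid / generated, distinct valid / valid)."""
    samples = list(samples)
    valid = [s for s in samples if is_valid(s)]
    validity = len(valid) / len(samples) if samples else 0.0
    # Plain string comparison; canonicalization is deliberately left out of this sketch.
    uniqueness = len(set(valid)) / len(valid) if valid else 0.0
    return validity, uniqueness


# Illustrative generated strings, not data from the paper.
generated = ["CCO", "CCO", "CC(=O)O", "CC(C", "c1ccccc1"]
validity, uniqueness = validity_and_uniqueness(generated)
print(validity, uniqueness)
```

Passing a different `is_valid` callable swaps in a stricter check without changing how the two ratios are computed.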
Affiliation(s)
- Hao Liu
  - College of Computer Science and Technology, Ocean University of China, Qingdao 266100, China
5
Ohue M, Kojima Y, Kosugi T. Generating Potential Protein-Protein Interaction Inhibitor Molecules Based on Physicochemical Properties. Molecules 2023; 28:5652. [PMID: 37570623 PMCID: PMC10420264 DOI: 10.3390/molecules28155652] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/21/2023] [Revised: 07/06/2023] [Accepted: 07/24/2023] [Indexed: 08/13/2023] Open
Abstract
Protein-protein interactions (PPIs) are associated with various diseases; hence, they are important targets in drug discovery. However, the physicochemical empirical properties of PPI-targeted drugs are distinct from those of conventional small molecule oral pharmaceuticals, which adhere to the "rule of five (RO5)". Therefore, developing PPI-targeted drugs using conventional methods, such as molecular generation models, is challenging. In this study, we propose a molecular generation model based on deep reinforcement learning that is specialized for the production of PPI inhibitors. By introducing a scoring function that can represent the properties of PPI inhibitors, we successfully generated potential PPI inhibitor compounds. These newly constructed virtual compounds possess the desired properties for PPI inhibitors, and they show similarity to commercially available PPI libraries. The virtual compounds are freely available as a virtual library.
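For readers unfamiliar with the "rule of five" mentioned above, the following sketch checks the four commonly cited thresholds (molecular weight at most 500 Da, logP at most 5, at most 5 hydrogen-bond donors, at most 10 hydrogen-bond acceptors). The `Descriptors` class, the one-violation tolerance, and the example values are illustrative assumptions; this is not the scoring function proposed in the paper.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Descriptors:
    """Precomputed physicochemical descriptors (hypothetical container)."""
    molecular_weight: float  # Da
    logp: float
    h_bond_donors: int
    h_bond_acceptors: int


def ro5_violations(d: Descriptors) -> List[str]:
    """List which of the four rule-of-five thresholds are exceeded."""
    violations = []
    if d.molecular_weight > 500:
        violations.append("molecular weight > 500 Da")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


def passes_ro5(d: Descriptors, max_violations: int = 1) -> bool:
    """Assumed convention: tolerate at most `max_violations` exceeded thresholds."""
    return len(ro5_violations(d)) <= max_violations


# Made-up descriptor values for illustration only.
small_like = Descriptors(320.4, 2.1, 2, 5)
ppi_like = Descriptors(650.8, 6.1, 2, 9)
print(passes_ro5(small_like), ro5_violations(ppi_like))
```

A larger, more lipophilic compound such as `ppi_like` fails this filter, which illustrates why a generator tuned to RO5-like chemistry is a poor fit for PPI inhibitors.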
Affiliation(s)
- Masahito Ohue
  - Department of Computer Science, School of Computing, Tokyo Institute of Technology, Kanagawa 226-8501, Japan
6
Zhu JF, Hao ZK, Liu Q, Yin Y, Lu CQ, Huang ZY, Chen EH. Towards Exploring Large Molecular Space: An Efficient Chemical Genetic Algorithm. J Comput Sci Technol 2022; 37:1464-1477. [PMID: 36594005 PMCID: PMC9797891 DOI: 10.1007/s11390-021-0970-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2020] [Accepted: 04/20/2021] [Indexed: 06/17/2023]
Abstract
Generating molecules with desired properties is an important task in chemistry and pharmacy. An efficient method may have a positive impact on finding drugs to treat diseases like COVID-19. Data mining and artificial intelligence may be good ways to find an efficient method. Recently, both generative models based on deep learning and work based on genetic algorithms have made some progress in generating molecules and optimizing their properties. However, existing methods need to be improved in efficiency and performance. To solve these problems, we propose a method named the Chemical Genetic Algorithm for Large Molecular Space (CALM). Specifically, CALM employs a scalable and efficient molecular representation called the molecular matrix. We then design corresponding crossover, mutation, and mask operators inspired by domain knowledge and previous studies. We apply our genetic algorithm to several tasks related to molecular property optimization and constrained molecular optimization. The results of these tasks show that our approach outperforms the other state-of-the-art deep learning and genetic algorithm methods, where the z tests performed on the results of several experiments show that our method is more than 99% likely to be significant. At the same time, based on the experimental results, we point out an insufficiency in the experimental evaluation standard that affects the fair evaluation of previous work. Supplementary information: The online version contains supplementary material available at 10.1007/s11390-021-0970-3.
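The crossover and mutation operators named in the abstract follow the general pattern of a genetic algorithm. The sketch below shows that pattern on plain bit strings with a toy fitness (the number of ones). It is an assumption-laden illustration only: CALM's molecular matrix representation, its mask operator, and its property objectives are not reproduced, and every function name and parameter here is hypothetical.

```python
import random
from typing import List, Tuple


def fitness(bits: List[int]) -> int:
    """Toy objective (assumption): count of ones in the bit string."""
    return sum(bits)


def crossover(a: List[int], b: List[int], rng: random.Random) -> List[int]:
    """Single-point crossover: prefix of one parent, suffix of the other."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]


def mutate(bits: List[int], rate: float, rng: random.Random) -> List[int]:
    """Flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]


def evolve(pop_size: int = 20, length: int = 16, generations: int = 30,
           rate: float = 0.05, seed: int = 0) -> Tuple[List[int], List[int]]:
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(length)]
                  for _ in range(pop_size)]
    history = []  # best fitness at the start of each generation
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[: pop_size // 2]  # truncation selection
        children = [population[0]]             # elitism: keep the current best
        while len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rate, rng))
        population = children
    population.sort(key=fitness, reverse=True)
    history.append(fitness(population[0]))
    return population[0], history


best, history = evolve()
print(fitness(best), history[0], history[-1])
```

Because the best individual is copied unchanged into each new generation, the recorded best fitness can never decrease; swapping in a molecular representation would mean replacing `fitness`, `crossover`, and `mutate` while keeping the same loop.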
Affiliation(s)
- Jian-Fu Zhu
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- Zhong-Kai Hao
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- Qi Liu
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- Yu Yin
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- Cheng-Qiang Lu
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- Zhen-Ya Huang
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
- En-Hong Chen
  - Anhui Province Key Laboratory of Big Data Analysis and Application, School of Computer Science and Technology, University of Science and Technology of China, Hefei, 230026 China
7
Zeng T, Hess BA, Zhang F, Wu R. Bio-inspired chemical space exploration of terpenoids. Brief Bioinform 2022; 23:6586263. [PMID: 35576010 DOI: 10.1093/bib/bbac197] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/01/2022] [Revised: 04/26/2022] [Accepted: 04/28/2022] [Indexed: 11/12/2022] Open
Abstract
Many computational methods are devoted to rapidly generating pseudo-natural products to expand the open-ended border of chemical space for natural products. However, accessibility and chemical interpretation were often ignored or underestimated in conventional library/fragment-based or rule-based strategies, thus hampering experimental synthesis. Herein, a bio-inspired strategy (named TeroGen) is developed to mimic the two key biosynthetic stages (cyclization and decoration) of terpenoid natural products, by utilizing physically based simulations and deep learning models, respectively. The precision and efficiency are validated for different categories of terpenoids, and in practice, more than 30 000 sesterterpenoids (10 times as many as the known sesterterpenoids) are predicted to be linked in a reaction network, and their synthetic accessibility and chemical interpretation are estimated by thermodynamics and kinetics. Since it could not only greatly expand the chemical space of terpenoids but also enumerate plausible biosynthetic routes, TeroGen is promising for accelerating heterologous biosynthesis and the biomimetic and chemical synthesis of complicated terpenoids and derivatives.
Affiliation(s)
- Tao Zeng
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, P.R. China
- Fan Zhang
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, P.R. China
- Ruibo Wu
  - School of Pharmaceutical Sciences, Sun Yat-sen University, Guangzhou 510006, P.R. China
8
Wang M, Sun H, Wang J, Pang J, Chai X, Xu L, Li H, Cao D, Hou T. Comprehensive assessment of deep generative architectures for de novo drug design. Brief Bioinform 2021; 23:6470970. [PMID: 34929743 DOI: 10.1093/bib/bbab544] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 10/27/2021] [Revised: 11/24/2021] [Accepted: 11/25/2021] [Indexed: 01/20/2023] Open
Abstract
Deep learning (DL)-based de novo drug design represents a new trend in pharmaceutical research, and numerous DL-based methods have been developed for the generation of novel compounds with desired properties. However, a comprehensive understanding of the advantages and disadvantages of these methods is still lacking. In this study, the performances of different generative models were evaluated by analyzing the properties of the generated molecules in different scenarios, such as goal-directed (rediscovery, optimization and scaffold hopping of active compounds) and target-specific (generation of novel compounds for a given target) tasks. Overall, the DL-based models have significant advantages over the baseline models built by the traditional methods in learning the physicochemical property distributions of the training sets and may be more suitable for target-specific tasks. However, neither the baselines nor the DL-based generative models can fully exploit the scaffolds of the training sets, and the molecules generated by the DL-based methods even have lower scaffold diversity than those generated by the traditional models. Moreover, our assessment illustrates that the DL-based methods do not exhibit obvious advantages over the genetic algorithm-based baselines in goal-directed tasks. We believe that our study provides valuable guidance for the effective use of generative models in de novo drug design.
Affiliation(s)
- Mingyang Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Huiyong Sun
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing 210009, Jiangsu, P. R. China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Jinping Pang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Xin Chai
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
- Lei Xu
  - Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, Jiangsu, China
- Honglin Li
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, P. R. China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
9
Kolesniková L, León I, Alonso ER, Mata S, Alonso JL. An Innovative Approach for the Generation of Species of the Interstellar Medium. Angew Chem Int Ed Engl 2021; 60:24461-24466. [PMID: 34496111 PMCID: PMC8597129 DOI: 10.1002/anie.202110325] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/02/2021] [Revised: 09/02/2021] [Indexed: 11/16/2022]
Abstract
The large amount of unstable species in the realm of interstellar chemistry drives an urgent need to develop efficient methods for the in situ generations of molecules that enable their spectroscopic characterizations. Such laboratory experiments are fundamental to decode the molecular universe by matching the interstellar and terrestrial spectra. We propose an approach based on laser ablation of nonvolatile solid organic precursors. The generated chemical species are cooled in a supersonic expansion and probed by high‐resolution microwave spectroscopy. We present a proof of concept through a simultaneous formation of interstellar compounds and the first generation of aminocyanoacetylene using diaminomaleonitrile as a prototypical precursor. With this micro‐laboratory, we open the door to generation of unsuspected species using precursors not typically accessible to traditional techniques such as electric discharge and pyrolysis.
Affiliation(s)
- Lucie Kolesniková
  - Department of Analytical Chemistry, University of Chemistry and Technology, Technická 5, 16628, Prague 6, Czech Republic
- Iker León
  - Grupo de Espectroscopia Molecular (GEM), Edificio Quifima, Área de Química-Física, Laboratorios de Espectroscopia y Bioespectroscopia, Parque Científico UVa, Unidad Asociada CSIC, Universidad de Valladolid, 47011, Valladolid, Spain
- Elena R Alonso
  - Instituto Biofisika (UPV/EHU, CSIC), University of the Basque Country, 48940, Leioa, Spain
  - Departamento de Química Física, Facultad de Ciencia y Tecnología, Universidad del País Vasco, Barrio Sarriena s/n, 48940, Leioa, Spain
- Santiago Mata
  - Grupo de Espectroscopia Molecular (GEM), Edificio Quifima, Área de Química-Física, Laboratorios de Espectroscopia y Bioespectroscopia, Parque Científico UVa, Unidad Asociada CSIC, Universidad de Valladolid, 47011, Valladolid, Spain
- Jose Luis Alonso
  - Grupo de Espectroscopia Molecular (GEM), Edificio Quifima, Área de Química-Física, Laboratorios de Espectroscopia y Bioespectroscopia, Parque Científico UVa, Unidad Asociada CSIC, Universidad de Valladolid, 47011, Valladolid, Spain
10
Lai X, Yang P, Wang K, Yang Q, Yu D. MGRNN: Structure Generation of Molecules Based on Graph Recurrent Neural Networks. Mol Inform 2021; 40:e2100091. [PMID: 34411448 DOI: 10.1002/minf.202100091] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/01/2021] [Accepted: 07/18/2021] [Indexed: 11/11/2022]
Abstract
Molecular structure generation is a critical problem for materials science and has attracted growing attention. The problem is challenging since it requires generating chemically valid molecular structures. Inspired by recent work on deep generative models, we propose a graph recurrent neural network model for drug molecular structure generation, called MGRNN (Molecular Graph Recurrent Neural Networks). MGRNN combines the advantages of an iterative molecular generation algorithm with efficient training strategies. Moreover, MGRNN shows: (i) efficient computation for training; (ii) high model robustness to data; and (iii) an iterative sampling process, which allows the use of chemical domain expertise for valency checking. Experimental results show that MGRNN is able to generate 69 % chemically valid molecules even without chemical knowledge and 100 % valid molecules with chemical rules.
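The valency checking mentioned in point (iii) can be pictured as a filter applied before each proposed bond is accepted during iterative sampling. The sketch below is a minimal illustration under stated assumptions: the molecule is a list of element symbols plus `(i, j, order)` bond tuples, only neutral atoms with the usual maximum valences are handled, and the function names are hypothetical rather than MGRNN's actual rules.

```python
from typing import List, Tuple

# Usual maximum valences for neutral atoms (charges and hypervalency ignored).
MAX_VALENCE = {"H": 1, "C": 4, "N": 3, "O": 2, "F": 1}

Bond = Tuple[int, int, int]  # (atom index, atom index, bond order)


def used_valence(atom_index: int, bonds: List[Bond]) -> int:
    """Sum of bond orders already attached to the given atom."""
    return sum(order for i, j, order in bonds if atom_index in (i, j))


def can_add_bond(atoms: List[str], bonds: List[Bond],
                 i: int, j: int, order: int) -> bool:
    """Accept a proposed bond only if neither endpoint exceeds its valence."""
    if i == j:
        return False
    for idx in (i, j):
        if used_valence(idx, bonds) + order > MAX_VALENCE[atoms[idx]]:
            return False
    return True


# Partial structure: a carbon double-bonded to one oxygen, plus a free oxygen.
atoms = ["C", "O", "O"]
bonds = [(0, 1, 2)]
print(can_add_bond(atoms, bonds, 0, 2, 2))  # second C=O fits carbon's valence of 4
print(can_add_bond(atoms, bonds, 1, 2, 1))  # the bonded oxygen is already saturated
```

A sampler that rejects or resamples any proposal failing `can_add_bond` never emits an over-valent atom, which is one way chemical rules can raise validity to 100 % by construction.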
Affiliation(s)
- Xin Lai
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Peisong Yang
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Kunfeng Wang
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
- Qingyuan Yang
  - College of Chemical Engineering, Beijing University of Chemical Technology, Beijing, 100029, China
- Duli Yu
  - College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
  - Beijing Advanced Innovation Center for Soft Matter Science and Engineering, Beijing University of Chemical Technology, Beijing, 100029, China
11
Yang J, Hou L, Liu KM, He WB, Cai Y, Yang FQ, Hu YJ. ChemGenerator: a web server for generating potential ligands for specific targets. Brief Bioinform 2020; 22:6055961. [PMID: 33381797 DOI: 10.1093/bib/bbaa407] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 10/12/2020] [Revised: 11/23/2020] [Accepted: 12/10/2020] [Indexed: 12/15/2022] Open
Abstract
In drug discovery, one of the most important tasks is to find novel and biologically active molecules. Given that only the tip of the iceberg of drugs has been found in nearly a century of experimental exploration, it is of great significance to use in silico methods to expand chemical databases and profile drug-target linkages. In this study, a web server named ChemGenerator was proposed to generate novel actives for specific targets based on users' input. ChemGenerator relies on an autoencoder-based algorithm of Recurrent Neural Networks with Long Short-Term Memory, trained on 7 million molecular Simplified Molecular-Input Line-Entry System (SMILES) strings as the basic model, and further develops target-guided generation by transfer learning. As a result, ChemGenerator achieves a lower loss (<0.01) than the existing reference model (0.2~0.4) and shows good performance in the case of Epidermal Growth Factor Receptor. ChemGenerator is now freely accessible to the public at http://smiles.tcmobile.org. Compared with endless molecular enumeration and time-consuming, expensive experiments, this work demonstrates an efficient alternative for the first virtual screening step in drug discovery.
Affiliation(s)
- Jing Yang
  - Institute of Chinese Medical Sciences, State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macao SAR, China
- Ling Hou
  - Institute of Chinese Medical Sciences, State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macao SAR, China
- Kun-Meng Liu
  - Institute of Chinese Medical Sciences, State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macao SAR, China
- Wen-Bin He
  - Shanxi Key Laboratory of Chinese Medicine Encephalopathy, Shanxi University of Chinese Medicine, Jinzhong, Shanxi, China
- Yong Cai
  - Beijing Normal University, Zhuhai, China
- Feng-Qing Yang
  - School of Chemistry and Chemical Engineering, Chongqing University, Chongqing, China
- Yuan-Jia Hu
  - Institute of Chinese Medical Sciences, State Key Laboratory of Quality Research in Chinese Medicine, University of Macau, Macao SAR, China
12
Abstract
The use of computer tools to solve chemistry-related problems has given rise to a large and increasing number of publications in recent decades. This new field of science is now well recognized and labelled chemoinformatics. Among all chemoinformatics techniques, the use of statistically based approaches for property prediction has been the subject of numerous studies reflecting both new developments and many applications. The resulting predictive models, which relate a property to molecular features (descriptors), are gathered under the acronym QSPR, for Quantitative Structure Property Relationships. Apart from the obvious use of such models to predict property values for new compounds, their use to virtually synthesize new molecules (de novo design) is currently a subject of high interest. Inverse-QSPR (i-QSPR) methods have hence been developed to accelerate the discovery of new materials that meet a set of specifications. In the proposed manuscript, we review existing i-QSPR methodologies published in the open literature so as to highlight the developments, applications, improvements and limitations of each.
Affiliation(s)
- Philippe Gantzer
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
- Benoit Creton
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France
- Carlos Nieto-Draghi
  - IFP Energies nouvelles, 1 et 4 avenue de Bois-Préau, 92852, Rueil-Malmaison, France