51
Hierarchical Molecular Graph Self-Supervised Learning for property prediction. Commun Chem 2023; 6:34. [PMID: 36801953 PMCID: PMC9938270 DOI: 10.1038/s42004-023-00825-5] [Citation(s) in RCA: 33] [Impact Index Per Article: 16.5] [Received: 09/19/2022] [Accepted: 01/31/2023] [Indexed: 02/19/2023] Open
Abstract
Molecular graph representation learning has shown considerable strength in molecular analysis and drug discovery. Due to the difficulty of obtaining molecular property labels, pre-training models based on self-supervised learning has become increasingly popular in molecular representation learning. Notably, Graph Neural Networks (GNN) are employed as the backbones to encode implicit representations of molecules in most existing works. However, vanilla GNN encoders ignore chemical structural information and functions implied in molecular motifs, and obtaining the graph-level representation via the READOUT function hinders the interaction of graph and node representations. In this paper, we propose Hierarchical Molecular Graph Self-supervised Learning (HiMol), which introduces a pre-training framework to learn molecule representation for property prediction. First, we present a Hierarchical Molecular Graph Neural Network (HMGNN), which encodes motif structure and extracts node-motif-graph hierarchical molecular representations. Then, we introduce Multi-level Self-supervised Pre-training (MSP), in which corresponding multi-level generative and predictive tasks are designed as self-supervised signals of HiMol model. Finally, superior molecular property prediction results on both classification and regression tasks demonstrate the effectiveness of HiMol. Moreover, the visualization performance in the downstream dataset shows that the molecule representations learned by HiMol can capture chemical semantic information and properties.
52
Schoenmaker L, Béquignon OJM, Jespers W, van Westen GJP. UnCorrupt SMILES: a novel approach to de novo design. J Cheminform 2023; 15:22. [PMID: 36788579 PMCID: PMC9926805 DOI: 10.1186/s13321-023-00696-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/13/2022] [Accepted: 02/06/2023] [Indexed: 02/16/2023] Open
Abstract
Generative deep learning models have emerged as a powerful approach for de novo drug design as they aid researchers in finding new molecules with desired properties. Despite continuous improvements in the field, a subset of the outputs that sequence-based de novo generators produce cannot be progressed due to errors. Here, we propose to fix these invalid outputs post hoc. In similar tasks, transformer models from the field of natural language processing have been shown to be very effective. Therefore, here this type of model was trained to translate invalid Simplified Molecular-Input Line-Entry System (SMILES) into valid representations. The performance of this SMILES corrector was evaluated on four representative methods of de novo generation: a recurrent neural network (RNN), a target-directed RNN, a generative adversarial network (GAN), and a variational autoencoder (VAE). This study has found that the percentage of invalid outputs from these specific generative models ranges between 4 and 89%, with different models having different error-type distributions. Post hoc correction of SMILES was shown to increase model validity. The SMILES corrector trained with one error per input alters 60-90% of invalid generator outputs and fixes 35-80% of them. However, a higher error detection and performance was obtained for transformer models trained with multiple errors per input. In this case, the best model was able to correct 60-95% of invalid generator outputs. Further analysis showed that these fixed molecules are comparable to the correct molecules from the de novo generators based on novelty and similarity. Additionally, the SMILES corrector can be used to expand the amount of interesting new molecules within the targeted chemical space. Introducing different errors into existing molecules yields novel analogs with a uniqueness of 39% and a novelty of approximately 20%. 
The results of this research demonstrate that SMILES correction is a viable post hoc extension and can enhance the search for better drug candidates.
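The uniqueness and novelty figures quoted in this abstract are set-based metrics. The following is a minimal sketch of one common way to compute them, not the authors' implementation: the function names and the toy SMILES lists are hypothetical, and the paper's exact definitions may differ. It compares raw strings, whereas a real pipeline would first canonicalize SMILES with a cheminformatics toolkit.

```python
def uniqueness(generated):
    """Fraction of generated strings that are distinct (assumed definition)."""
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated, reference):
    """Fraction of distinct generated strings absent from the reference set
    (assumed definition; some papers divide by all generated strings instead)."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - set(reference)) / len(unique)


# Hypothetical toy data: raw SMILES strings, not canonicalized.
generated = ["CCO", "CCO", "CCN", "c1ccccc1", "CCC"]
reference = ["CCO", "CCC"]

print(uniqueness(generated))          # 4 distinct out of 5 generated
print(novelty(generated, reference))  # 2 of the 4 distinct strings are new
```

Dividing novelty by the number of distinct strings rather than by all generated strings is a design choice; either convention should be stated when reporting results.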
Affiliation(s)
- Linde Schoenmaker
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5; ISNI 0000 0001 2312 1970)
- Olivier J. M. Béquignon
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5; ISNI 0000 0001 2312 1970)
- Willem Jespers
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5; ISNI 0000 0001 2312 1970)
- Gerard J. P. van Westen
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands (grid.5132.5; ISNI 0000 0001 2312 1970)
53
Wang L, Song Y, Wang H, Zhang X, Wang M, He J, Li S, Zhang L, Li K, Cao L. Advances of Artificial Intelligence in Anti-Cancer Drug Design: A Review of the Past Decade. Pharmaceuticals (Basel) 2023; 16:253. [PMID: 37259400 PMCID: PMC9963982 DOI: 10.3390/ph16020253] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 12/29/2022] [Revised: 01/25/2023] [Accepted: 02/06/2023] [Indexed: 10/13/2023] Open
Abstract
Anti-cancer drug design has been acknowledged as a complicated, expensive, time-consuming, and challenging task. How to reduce the research costs and speed up the development process of anti-cancer drug designs has become a challenging and urgent question for the pharmaceutical industry. Computer-aided drug design methods have played a major role in the development of cancer treatments for over three decades. Recently, artificial intelligence has emerged as a powerful and promising technology for faster, cheaper, and more effective anti-cancer drug designs. This study is a narrative review that reviews a wide range of applications of artificial intelligence-based methods in anti-cancer drug design. We further clarify the fundamental principles of these methods, along with their advantages and disadvantages. Furthermore, we collate a large number of databases, including the omics database, the epigenomics database, the chemical compound database, and drug databases. Other researchers can consider them and adapt them to their own requirements.
Affiliation(s)
- Kang Li
- Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
- Lei Cao
- Department of Biostatistics, School of Public Health, Harbin Medical University, Harbin 150081, China
54
New avenues in artificial-intelligence-assisted drug discovery. Drug Discov Today 2023; 28:103516. [PMID: 36736583 DOI: 10.1016/j.drudis.2023.103516] [Citation(s) in RCA: 25] [Impact Index Per Article: 12.5] [Received: 08/07/2022] [Revised: 12/08/2022] [Accepted: 01/26/2023] [Indexed: 02/05/2023]
Abstract
Over the past decade, the amount of biomedical data available has grown at unprecedented rates. Increased automation technology and larger data volumes have encouraged the use of machine learning (ML) or artificial intelligence (AI) techniques for mining such data and extracting useful patterns. Because the identification of chemical entities with desired biological activity is a crucial task in drug discovery, AI technologies have the potential to accelerate this process and support decision making. In addition, the advent of deep learning (DL) has shown great promise in addressing diverse problems in drug discovery, such as de novo molecular design. Herein, we will appraise the current state-of-the-art in AI-assisted drug discovery, discussing the recent applications covering generative models for chemical structure generation, scoring functions to improve binding affinity and pose prediction, and molecular dynamics to assist in the parametrization, featurization and generalization tasks. Finally, we will discuss current hurdles and the strategies to overcome them, as well as potential future directions.
55
Wang J, Mao J, Wang M, Le X, Wang Y. Explore drug-like space with deep generative models. Methods 2023; 210:52-59. [PMID: 36682423 DOI: 10.1016/j.ymeth.2023.01.004] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 12/08/2022] [Revised: 01/05/2023] [Accepted: 01/17/2023] [Indexed: 01/20/2023] Open
Abstract
The process of design/discovery of drugs involves the identification and design of novel molecules that have the desired properties and bind well to a given disease-relevant target. One of the main challenges to effectively identify potential drug candidates is to explore the vast drug-like chemical space to find novel chemical structures with desired physicochemical properties and biological characteristics. Moreover, the chemical space of currently available molecular libraries is only a small fraction of the total possible drug-like chemical space. Deep molecular generative models have received much attention and provide an alternative approach to the design and discovery of molecules. To efficiently explore the drug-like space, we first constructed the drug-like dataset and then performed the generative design of drug-like molecules using a Conditional Randomized Transformer approach with the molecular access system (MACCS) fingerprint as a condition and compared it with previously published molecular generative models. The results show that the deep molecular generative model explores the wider drug-like chemical space. The generated drug-like molecules share the chemical space with known drugs, and the drug-like space captured by the combination of quantitative estimation of drug-likeness (QED) and quantitative estimate of protein-protein interaction targeting drug-likeness (QEPPI) can cover a larger drug-like space. Finally, we show the potential application of the model in design of inhibitors of MDM2-p53 protein-protein interaction. Our results demonstrate the potential application of deep molecular generative models for guided exploration in drug-like chemical space and molecular design.
Affiliation(s)
- Jianmin Wang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Korea
- Jiashun Mao
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Korea
- Meng Wang
- Department of Biostatistics, School of Public Health, Harbin Medical University
- Xiangyang Le
- Department of Medicinal Chemistry, Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, China
- Yunyun Wang
- School of Pharmacy and Jiangsu Province Key Laboratory for Inflammation and Molecular Drug Target, Nantong University, Nantong 226001, China
56
Wan H, Liu Q, Ju Y. Utilize a few features to classify presynaptic and postsynaptic neurotoxins. Comput Biol Med 2023; 152:106380. [PMID: 36473343 DOI: 10.1016/j.compbiomed.2022.106380] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/18/2022] [Revised: 10/21/2022] [Accepted: 11/28/2022] [Indexed: 12/02/2022]
Abstract
Neurotoxins are a class of proteins that have a significant damaging effect on nerve tissue. Neurotoxins are classified into presynaptic neurotoxins and postsynaptic neurotoxins, and accurate identification of neurotoxins plays a key role in drug development. In this study, 90 presynaptic neurotoxins and 165 postsynaptic neurotoxins were classified. The features of the presynaptic and postsynaptic neurotoxin sequences were extracted using the AutoProp feature extraction method, and feature selection was performed using the maximum relevance maximum distance (MRMD) program. Finally, only two features were retained to achieve 84.7% classification accuracy. Moreover, it was found that the two retained features were present in the conserved sites and motifs of presynaptic neurotoxins and could represent the critical structures of presynaptic neurotoxins. This method demonstrates that using a few key features to classify proteins can effectively identify critical protein structures.
Affiliation(s)
- Hao Wan
- Institute of Advanced Cross-field Science, College of Life Science, Qingdao University, Qingdao, China
- Qing Liu
- Department of Anesthesiology, Hospital (T.C.M) Affiliated to Southwest Medical University, Luzhou, China.
- Ying Ju
- School of Informatics, Xiamen University, Xiamen, China.
57
Application of a deep generative model produces novel and diverse functional peptides against microbial resistance. Comput Struct Biotechnol J 2022; 21:463-471. [PMID: 36618982 PMCID: PMC9804011 DOI: 10.1016/j.csbj.2022.12.029] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 08/18/2022] [Revised: 12/13/2022] [Accepted: 12/16/2022] [Indexed: 12/23/2022] Open
Abstract
Antimicrobial resistance could threaten millions of lives in the immediate future. Antimicrobial peptides (AMPs) are an alternative to conventional antibiotics against infectious diseases. Despite the potential contribution of AMPs to the antibiotic field, their development and optimization have encountered serious challenges. Cutting-edge methods with novel and improved selectivity toward resistant targets must be established to create AMP-driven treatments. Here, we present AMPTrans-lstm, a deep generative network-based approach for the rational design of AMPs. The AMPTrans-lstm pipeline involves pre-training, transfer learning, and module identification. The AMPTrans-lstm model has two sub-models, namely, a long short-term memory (LSTM) sampler and a Transformer converter, which can be connected in series to make full use of the stability of the LSTM and the novelty of the Transformer model. These elements can generate AMP candidates, which can then be tailored for specific applications. By analyzing the generated sequences and the training AMPs, we show that AMPTrans-lstm can expand the design space of the training AMPs and produce reasonable, novel AMP sequences. AMPTrans-lstm can generate functional peptides against antimicrobial resistance with good novelty and diversity, making it an efficient AMP design tool.
58
Tan Y, Dai L, Huang W, Guo Y, Zheng S, Lei J, Chen H, Yang Y. DRlinker: Deep Reinforcement Learning for Optimization in Fragment Linking Design. J Chem Inf Model 2022; 62:5907-5917. [PMID: 36404642 DOI: 10.1021/acs.jcim.2c00982] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/22/2022]
Abstract
Fragment-based drug discovery is a widely used strategy for drug design in both academia and the pharmaceutical industry. Although fragments can be linked to generate candidate compounds by the latest deep generative models, generating linkers with specified attributes remains underdeveloped. In this study, we present a novel framework, DRlinker, to control fragment linking toward compounds with given attributes through reinforcement learning. The method has been shown to be effective for many tasks, from controlling the linker length and log P and optimizing the predicted bioactivity of compounds to various multiobjective tasks. Specifically, our model successfully generated 91.0% and 93.9% of compounds complying with the desired linker length and log P, respectively, and improved the 7.5 pChEMBL value in bioactivity optimization. Finally, a quasi-scaffold-hopping study revealed that DRlinker could generate nearly 30% of molecules with high 3D similarity but low 2D similarity to the lead inhibitor, demonstrating the benefits and applicability of DRlinker in actual fragment-based drug design.
Affiliation(s)
- Youhai Tan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Lingxue Dai
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
- Weifeng Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Yinfeng Guo
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Shuangjia Zheng
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China; Galixir Technologies, Beijing 100083, China
- Jinping Lei
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
- Hongming Chen
- Guangzhou Laboratory, No. 9 XinDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005, China
- Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
59
Chan L, Kumar R, Verdonk M, Poelking C. A multilevel generative framework with hierarchical self-contrasting for bias control and transparency in structure-based ligand design. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00564-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
60
Lin Y, Zhang Y, Wang D, Yang B, Shen YQ. Computer especially AI-assisted drug virtual screening and design in traditional Chinese medicine. Phytomedicine 2022; 107:154481. [PMID: 36215788 DOI: 10.1016/j.phymed.2022.154481] [Citation(s) in RCA: 42] [Impact Index Per Article: 14.0] [Received: 02/21/2022] [Revised: 09/14/2022] [Accepted: 09/27/2022] [Indexed: 06/16/2023]
Abstract
BACKGROUND: Traditional Chinese medicine (TCM) is a significant part of global pharmaceutical science, and the abundant molecular compounds it contains are a valuable potential source for designing and screening new drugs. However, because the number of natural molecular compounds has not been estimated and drug discovery poses diverse problems, such as precise screening of molecular compounds and the evaluation of efficacy, physicochemical properties and pharmacokinetics, it is arduous for researchers to design or screen applicable compounds with older methods. With the recent rapid development of computer technology, especially artificial intelligence (AI), innovation in virtual screening has increased the efficiency and accuracy of discovering new drugs. PURPOSE: This study systematically reviewed the application of computational approaches and AI in virtual screening and design of TCM drugs and presented a perspective on computer-aided TCM development. STUDY DESIGN: We conducted a systematic review following the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines and then screened the most representative articles for our research. METHODS: The systematic review was performed following the PRISMA guidelines. The databases PubMed, EMBASE, Web of Science and CNKI were searched for publications that focused on computer-aided drug virtual screening and design in TCM. RESULTS: In total, 42 articles were included in the literature review. These studies were of great significance to the treatment and cost control of many challenging diseases such as COVID-19, diabetes and Alzheimer's disease (AD). Computational approaches and AI were widely used for virtual screening in TCM development, including structure-based virtual screening (SBVS) and ligand-based virtual screening (LBVS). Computational technologies were also extensively applied to absorption, distribution, metabolism, excretion and toxicity (ADMET) prediction of candidate drugs and to new drug design, both crucial stages of drug discovery. CONCLUSIONS: Computational and AI applications play an important role in virtual screening and drug design in TCM, with broad application prospects.
Affiliation(s)
- Yumeng Lin
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Chinese Academy of Medical Sciences Research Unit of Oral Carcinogenesis and Management, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- You Zhang
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Chinese Academy of Medical Sciences Research Unit of Oral Carcinogenesis and Management, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Dongyang Wang
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Chinese Academy of Medical Sciences Research Unit of Oral Carcinogenesis and Management, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Bowen Yang
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Chinese Academy of Medical Sciences Research Unit of Oral Carcinogenesis and Management, West China Hospital of Stomatology, Sichuan University, Chengdu, China
- Ying-Qiang Shen
- State Key Laboratory of Oral Diseases, National Clinical Research Center for Oral Diseases, Chinese Academy of Medical Sciences Research Unit of Oral Carcinogenesis and Management, West China Hospital of Stomatology, Sichuan University, Chengdu, China.
61
Li Y, Zhang L, Wang Y, Zou J, Yang R, Luo X, Wu C, Yang W, Tian C, Xu H, Wang F, Yang X, Li L, Yang S. Generative deep learning enables the discovery of a potent and selective RIPK1 inhibitor. Nat Commun 2022; 13:6891. [PMID: 36371441 PMCID: PMC9653409 DOI: 10.1038/s41467-022-34692-w] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 04/12/2022] [Accepted: 11/03/2022] [Indexed: 11/13/2022] Open
Abstract
The retrieval of hit/lead compounds with novel scaffolds during early drug development is an important but challenging task. Various generative models have been proposed to create drug-like molecules. However, the capacity of these generative models to design wet-lab-validated and target-specific molecules with novel scaffolds has hardly been verified. We herein propose a generative deep learning (GDL) model, a distribution-learning conditional recurrent neural network (cRNN), to generate tailor-made virtual compound libraries for given biological targets. The GDL model is then applied to RIPK1. Virtual screening against the generated tailor-made compound library and subsequent bioactivity evaluation lead to the discovery of a potent and selective RIPK1 inhibitor with a previously unreported scaffold, RI-962. This compound displays potent in vitro activity in protecting cells from necroptosis, and good in vivo efficacy in two inflammatory models. Collectively, the findings prove the capacity of our GDL model in generating hit/lead compounds with unreported scaffolds, highlighting a great potential of deep learning in drug discovery.
Affiliation(s)
- Yueshan Li
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Liting Zhang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Yifei Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Jun Zou
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Ruicheng Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Xinling Luo
- Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Chengyong Wu
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Wei Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Chenyu Tian
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Haixing Xu
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Falu Wang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Xin Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Linli Li
- Key Laboratory of Drug Targeting and Drug Delivery System of Ministry of Education, West China School of Pharmacy, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
- Shengyong Yang
- State Key Laboratory of Biotherapy and Cancer Center, West China Hospital, Sichuan University, 610041 Chengdu, Sichuan, China (grid.13291.38; ISNI 0000 0001 0807 1581)
62
Zhang Y, Luo M, Wu P, Wu S, Lee TY, Bai C. Application of Computational Biology and Artificial Intelligence in Drug Design. Int J Mol Sci 2022; 23:13568. [PMID: 36362355 PMCID: PMC9658956 DOI: 10.3390/ijms232113568] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 10/08/2022] [Revised: 10/29/2022] [Accepted: 11/03/2022] [Indexed: 08/24/2023] Open
Abstract
Traditional drug design requires a great amount of research time and developmental expense. Booming computational approaches, including computational biology, computer-aided drug design, and artificial intelligence, have the potential to improve the efficiency of drug discovery by minimizing time and financial cost. In recent years, computational approaches have been widely used to improve the efficacy and effectiveness of the drug discovery pipeline, leading to the approval of many new drugs for marketing. The present review emphasizes the applications of these indispensable computational approaches in aiding target identification, lead discovery, and lead optimization. Some challenges of using these approaches for drug design are also discussed. Moreover, we propose a methodology for integrating various computational techniques into new drug discovery and design.
Affiliation(s)
- Yue Zhang
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Mengqi Luo
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Peng Wu
- School of Biomedical Engineering, Health Science Center, Shenzhen University, Shenzhen 518055, China
- Song Wu
- South China Hospital, Health Science Center, Shenzhen University, Shenzhen 518116, China
- Tzong-Yi Lee
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
- Chen Bai
- School of Life and Health Sciences, School of Medicine, The Chinese University of Hong Kong, Shenzhen 518172, China
- Warshel Institute for Computational Biology, Shenzhen 518172, China
63
Kong W, Hu Y, Zhang J, Tan Q. Application of SMILES-based molecular generative model in new drug design. Front Pharmacol 2022; 13:1046524. [PMCID: PMC9606214 DOI: 10.3389/fphar.2022.1046524] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2022] [Accepted: 10/03/2022] [Indexed: 11/13/2022] Open
Affiliation(s)
- Weiya Kong
- School of Sports Medicine and Rehabilitation, Beijing Sport University, Beijing, China
- Yuejuan Hu
- Nursing Department of Fenyang College of Shanxi Medical University, Fenyang, China
- Jiao Zhang
- Innovation and Entrepreneurship College of Hunan University of Finance and Economics, Changsha, China
- Qiaoyin Tan
- College of Teacher Education, Zhejiang Normal University, Jinhua, China
- *Correspondence: Qiaoyin Tan,
64
D'Souza S, Kv P, Balaji S. Training recurrent neural networks as generative neural networks for molecular structures: how does it impact drug discovery? Expert Opin Drug Discov 2022; 17:1071-1079. [PMID: 36216812 DOI: 10.1080/17460441.2022.2134340] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/29/2022]
Abstract
INTRODUCTION: Deep learning approaches have become popular in recent years in de novo drug design. Generative models for molecule generation and optimization have shown promising results. Models trained on different chemical data could generate molecules that were similar to the query molecule, thus supporting lead optimization. Recurrent neural network-based generative models have demonstrated application in low-data drug discovery, fragment-based drug design and lead optimization. AREAS COVERED: In this review, we provide an overview of recurrent neural network models and their variants for molecule generation with recent examples. The input representations of molecules as SMILES and molecular graphs are discussed. The evaluation benchmarks and metrics used in generative neural network models are also highlighted. For this, the ScienceDirect, Web of Science, and Google Scholar databases were searched with the article's keywords and their combinations to retrieve the most relevant and up-to-date information. EXPERT OPINION: The simplicity of SMILES notation makes it suitable for training a sequence-based model such as a recurrent neural network. However, models that can be trained on molecular graphs to generate synthesizable molecular structures could open new possibilities for valid molecule generation and synthetic feasibility.
Affiliation(s)
- Sofia D'Souza
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
- Prema Kv
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
- Seetharaman Balaji
- Department of Computer Science and Engineering, Manipal Institute of Technology, MAHE, Manipal, India
|
65
|
Li C, Wang C, Sun M, Zeng Y, Yuan Y, Gou Q, Wang G, Guo Y, Pu X. Correlated RNN Framework to Quickly Generate Molecules with Desired Properties for Energetic Materials in the Low Data Regime. J Chem Inf Model 2022; 62:4873-4887. [PMID: 35998331 DOI: 10.1021/acs.jcim.2c00997] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Abstract
Motivated by the challenge of deep learning in the low-data regime and the urgent demand for intelligent design of highly energetic materials, we explore a correlated deep learning framework, which consists of three recurrent neural networks (RNNs) correlated by a transfer learning strategy, to efficiently generate new energetic molecules with a high detonation velocity when very little data is available. To avoid dependence on an external big data set, data augmentation by fragment shuffling of 303 energetic compounds is used to produce 500,000 molecules to pretrain an RNN, through which the model can learn sufficient structural knowledge. The pretrained RNN is then fine-tuned on the 303 energetic compounds to generate 7153 molecules similar to the energetic compounds. To screen molecules with a high detonation velocity more reliably, SMILES enumeration augmentation coupled with the pretrained knowledge is used to build an RNN-based prediction model, through which R² is boosted from 0.4446 to 0.9572. The comparable performance with the transfer learning strategy based on an existing big database (ChEMBL) in producing energetic molecules and drug-like ones further supports the effectiveness and generality of our strategy in the low-data regime. High-precision quantum mechanics calculations further confirm that 35 new molecules present a higher detonation velocity and lower synthetic accessibility than the classic explosive RDX, along with good thermal stability. In particular, three new molecules are comparable to caged CL-20 in detonation velocity. All source code and the data set are freely available at https://github.com/wangchenghuidream/RNNMGM.
Affiliation(s)
- Chuan Li
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Chenghui Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Ming Sun
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Yan Zeng
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Yuan Yuan
- College of Management, Southwest University for Nationalities, Chengdu 610041, China
- Qiaolin Gou
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Guangchuan Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
- Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
- Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
|
66
|
Ton AT, Pandey M, Smith JR, Ban F, Fernandez M, Cherkasov A. Targeting SARS-CoV-2 Papain-Like Protease in the Post-Vaccine Era. Trends Pharmacol Sci 2022; 43:906-919. [PMID: 36114026 PMCID: PMC9399131 DOI: 10.1016/j.tips.2022.08.008] [Citation(s) in RCA: 22] [Impact Index Per Article: 7.3] [Received: 05/17/2022] [Revised: 08/10/2022] [Accepted: 08/19/2022] [Indexed: 11/29/2022]
Abstract
While vaccines remain at the forefront of global healthcare responses, pioneering therapeutics against SARS-CoV-2 are expected to fill the gaps for waning immunity. Rapid development and approval of orally available direct-acting antivirals targeting crucial SARS-CoV-2 proteins marked the beginning of the era of small-molecule drugs for COVID-19. In that regard, the papain-like protease (PLpro) can be considered a major SARS-CoV-2 therapeutic target due to its dual biological role in suppressing host innate immune responses and in ensuring viral replication. Here, we summarize the challenges of targeting PLpro and innovative early-stage PLpro-specific small molecules. We propose that state-of-the-art computer-aided drug design (CADD) methodologies will play a critical role in the discovery of PLpro compounds as a novel class of COVID-19 drugs.
Affiliation(s)
- Anh-Tien Ton
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Mohit Pandey
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Jason R Smith
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada; Department of Chemistry, Simon Fraser University, Burnaby, Canada
- Fuqiang Ban
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Michael Fernandez
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
- Artem Cherkasov
- Vancouver Prostate Centre, University of British Columbia, Vancouver, BC, Canada
|
67
|
Wang J, Chu Y, Mao J, Jeon HN, Jin H, Zeb A, Jang Y, Cho KH, Song T, No KT. De novo molecular design with deep molecular generative models for PPI inhibitors. Brief Bioinform 2022; 23:6643455. [PMID: 35830870 DOI: 10.1093/bib/bbac285] [Citation(s) in RCA: 30] [Impact Index Per Article: 10.0] [Received: 04/07/2022] [Revised: 06/14/2022] [Accepted: 06/20/2022] [Indexed: 12/27/2022] Open
Abstract
We construct a protein-protein interaction (PPI) targeted drug-likeness dataset and propose a deep molecular generative framework to generate novel drug-like molecules from the features of the seed compounds. This framework draws inspiration from published molecular generative models, uses the key features associated with PPI inhibitors as input, and develops deep molecular generative models for de novo molecular design of PPI inhibitors. For the first time, a quantitative estimation index for compounds targeting PPI was applied to the evaluation of a molecular generation model for de novo design of PPI-targeted compounds. Our results indicate that the generated molecules had better PPI-targeted drug-likeness and drug-likeness. Additionally, our model exhibits performance comparable to several other state-of-the-art molecule generation models. The generated molecules share chemical space with iPPI-DB inhibitors, as demonstrated by chemical space analysis. The peptide characterization-oriented design of PPI inhibitors and the ligand-based design of PPI inhibitors are explored. Finally, we suggest that this framework will be an important step forward for the de novo design of PPI-targeted therapeutics.
Affiliation(s)
- Jianmin Wang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
- Yanyi Chu
- State Key Laboratory of Microbial Metabolism, Shanghai-Islamabad Belgrade Joint Innovation Center on Antibacterial Resistances, Joint International Research Laboratory of Metabolic & Developmental Sciences and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200030, P.R. China
- Jiashun Mao
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
- Hyeon-Nae Jeon
- Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea; Biotechnology, College of Life Science and Biotechnology, Yonsei University, Seoul 03722, Republic of Korea
- Haiyan Jin
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
- Amir Zeb
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Department of Natural and Basic Sciences, University of Turbat, 92600, Pakistan
- Yuil Jang
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
- Kwang-Hwi Cho
- School of Systems Biomedical Science, Soongsil University, Seoul, Republic of Korea
- Tao Song
- School of Computer Science and Technology, China University of Petroleum, Qingdao, 266580, Shandong, China
- Kyoung Tai No
- The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon 21983, Republic of Korea; Bioinformatics and Molecular Design Research Center (BMDRC), Incheon 21983, Republic of Korea
|
68
|
Yoshimori A, Bajorath J. Computational analysis, alignment and extension of analogue series from medicinal chemistry. Future Sci OA 2022; 8:FSO804. [PMID: 36248066 PMCID: PMC9540237 DOI: 10.2144/fsoa-2022-0033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Accepted: 06/10/2022] [Indexed: 11/23/2022] Open
Affiliation(s)
- Atsushi Yoshimori
- Institute for Theoretical Medicine, Inc., 26-1 Muraoka-Higashi 2-chome, Fujisawa, Kanagawa, 2510012, Japan
- Jürgen Bajorath
- Department of Life Science Informatics & Data Science, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, Bonn, D-53115, Germany
|
69
|
Cox PB, Gupta R. Contemporary Computational Applications and Tools in Drug Discovery. ACS Med Chem Lett 2022; 13:1016-1029. [DOI: 10.1021/acsmedchemlett.1c00662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/12/2022] Open
Affiliation(s)
- Philip B. Cox
- Drug Discovery Science and Technology, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064-6217, United States
- Rishi Gupta
- Drug Discovery Science and Technology, AbbVie, 1 North Waukegan Road, North Chicago, Illinois 60064-6217, United States
|
70
|
Goldman B, Kearnes S, Kramer T, Riley P, Walters WP. Defining Levels of Automated Chemical Design. J Med Chem 2022; 65:7073-7087. [PMID: 35511951 PMCID: PMC9150065 DOI: 10.1021/acs.jmedchem.2c00334] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 03/02/2022] [Indexed: 01/07/2023]
Abstract
One application area of computational methods in drug discovery is the automated design of small molecules. Despite the large number of publications describing methods and their application in both retrospective and prospective studies, there is a lack of agreement on terminology and key attributes to distinguish these various systems. We introduce Automated Chemical Design (ACD) Levels to clearly define the level of autonomy along the axes of ideation and decision making. To fully illustrate this framework, we provide literature exemplars and place some notable methods and applications into the levels. The ACD framework provides a common language for describing automated small molecule design systems and enables medicinal chemists to better understand and evaluate such systems.
Affiliation(s)
- Brian Goldman
- Relay Therapeutics, 399 Binney Street, Cambridge, Massachusetts 02139, United States
- Steven Kearnes
- Relay Therapeutics, 399 Binney Street, Cambridge, Massachusetts 02139, United States
- Trevor Kramer
- Relay Therapeutics, 399 Binney Street, Cambridge, Massachusetts 02139, United States
- Patrick Riley
- Relay Therapeutics, 399 Binney Street, Cambridge, Massachusetts 02139, United States
- W. Patrick Walters
- Relay Therapeutics, 399 Binney Street, Cambridge, Massachusetts 02139, United States
|
71
|
Bajorath J. Artificial intelligence in interdisciplinary life science and drug discovery research. Future Sci OA 2022; 8:FSO792. [PMID: 35369273 PMCID: PMC8965817 DOI: 10.2144/fsoa-2022-0010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/22/2021] [Accepted: 02/23/2022] [Indexed: 11/23/2022] Open
Affiliation(s)
- Jürgen Bajorath
- Department of Life Science Informatics & Data Science, B-IT, LIMES Program Unit Chemical Biology & Medicinal Chemistry, Rheinische Friedrich-Wilhelms-Universität, Friedrich-Hirzebruch-Allee 5/6, Bonn, D-53115, Germany
|
72
|
Bai X, Yin Y. Exploration and augmentation of pharmacological space via adversarial auto-encoder model for facilitating kinase-centric drug development. J Cheminform 2021; 13:95. [PMID: 34872613 PMCID: PMC8650415 DOI: 10.1186/s13321-021-00574-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/10/2020] [Accepted: 11/20/2021] [Indexed: 11/10/2022] Open
Abstract
Predicting compound-protein interactions (CPIs) is of great importance for drug discovery and repositioning, yet it remains challenging, mainly because the sparse nature of CPI matrices results in poor generalization performance. Hence, unlike typical CPI prediction models focused on representation learning or model selection, we propose a deep neural network-based strategy, PCM-AAE, that re-explores and augments the pharmacological space of kinase inhibitors by introducing the adversarial auto-encoder model (AAE) to improve the generalization of the prediction model. To complete the data space, we constructed Ensemble of PCM-AAE (EPA), an ensemble model that quickly and accurately yields quantitative predictions of binding affinity between any human kinase and inhibitor. In rigorous internal validation, EPA showed excellent performance, consistently outperforming the model trained with the imbalanced set, especially for targets with relatively few training data points. The improved prediction accuracy of EPA on external datasets supports its generalization ability, making it possible to gracefully handle previously unseen kinases and inhibitors. EPA showed promising potential when directly applied to virtual screening and off-target prediction, demonstrating its practicality in hit prediction. Our strategy is expected to facilitate kinase-centric drug development, as well as to solve more challenging prediction problems with insufficient data points.
Affiliation(s)
- Xinyu Bai
- Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, China
- Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, People's Republic of China
- Yuxin Yin
- Department of Pathology, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, China
- Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, 100191, People's Republic of China
- Peking-Tsinghua Center for Life Sciences, Peking University Health Science Center, Beijing, 100191, China
|