1
Noury H, Rahdar A, Romanholo Ferreira LF, Jamalpoor Z. AI-driven innovations in smart multifunctional nanocarriers for drug and gene delivery: A mini-review. Crit Rev Oncol Hematol 2025; 210:104701. [PMID: 40086770 DOI: 10.1016/j.critrevonc.2025.104701] [Received: 02/04/2025] [Revised: 03/07/2025] [Accepted: 03/07/2025] [Indexed: 03/16/2025] Open
Abstract
The convergence of artificial intelligence (AI) and nanomedicine has revolutionized the design of smart multifunctional nanocarriers (SMNs) for drug and gene delivery, offering unprecedented precision, efficiency, and personalization in therapeutic applications. AI-driven approaches enhance the development of these nanocarriers by accelerating their design, optimizing drug loading and release kinetics, improving biocompatibility, and predicting interactions with biological barriers. This review explores the transformative role of AI in the fabrication and functionalization of SMNs, emphasizing its impact on overcoming challenges in targeted drug delivery, controlled release, and theranostics. We discuss the integration of AI with advanced nanomaterials (such as polymeric, lipidic, and inorganic nanoparticles), highlighting their potential in oncology and hematology. Furthermore, we examine recent clinical and preclinical case studies demonstrating AI-assisted nanocarrier development for personalized medicine. The synergy between AI and nanotechnology paves the way for next-generation precision therapeutics, addressing critical limitations in traditional drug delivery systems. However, challenges in data standardization, regulatory compliance, and translational scalability remain. This review underscores the need for interdisciplinary collaboration to fully unlock AI's potential in nanomedicine, ultimately advancing the clinical application of SMNs for more effective and safer patient care.
Affiliation(s)
- Hamid Noury
  - Health Research Center, Chamran Hospital, Tehran, Iran
- Abbas Rahdar
  - Department of Physics, Faculty of Sciences, University of Zabol, Zabol 538-98615, Iran
- Zahra Jamalpoor
  - Trauma and Surgery Research Center, Aja University of Medical Sciences, Tehran, Iran
2
Shoaib M, Tariq A, Liu Y, Yang M, Qu L, Yang L, Song J. Recent update on the development of HPV16 inhibitors for cervical cancer. Crit Rev Oncol Hematol 2025; 210:104703. [PMID: 40107437 DOI: 10.1016/j.critrevonc.2025.104703] [Received: 12/19/2024] [Revised: 03/10/2025] [Accepted: 03/12/2025] [Indexed: 03/22/2025] Open
Abstract
Persistent infection with human papillomavirus (HPV) can lead to cervical cancer (CC), which is the fourth most commonly diagnosed cancer in women globally. In this review, we describe the HPV genome and the development of CC. Additionally, we summarize recently discovered small molecules that act as inhibitors of HPV-16. These molecules were identified through experimental and in silico studies aimed at preventing or treating CC. HPV-16 and HPV-18 are the most common subtypes of HPV that cause CC globally. The E6 oncoprotein of HPV-16 is considered the primary cause of CC progression. Therefore, E6 is the most frequently targeted protein for developing specific and novel therapeutic inhibitors to treat HPV-related cancers. In recent years, various HPV inhibitors have been identified by means of experimental and in silico studies. In addition, artificial intelligence-based medical diagnostic tools have grown more popular as they are capable of screening and diagnosing HPV-related cancer.
Affiliation(s)
- Muhammad Shoaib
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China
- Amina Tariq
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China
- Yanchen Liu
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China
- Mingwei Yang
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China
- Lingbo Qu
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China; Institute of Chemistry, Henan Academy of Science, Zhengzhou, Henan 450046, China
- Longhua Yang
  - School of Pharmaceutical Sciences & Key Laboratory of Advanced Drug Preparation Technologies, Ministry of Education, Zhengzhou University, Zhengzhou, Henan 450001, China
- Jinshuai Song
  - College of Chemistry, Pingyuan Laboratory, and State Key Laboratory of Cotton Bio-breeding and Integrated Utilization, Zhengzhou University, Zhengzhou, Henan 450001, China
3
Kyro GW, Martin MT, Watt ED, Batista VS. CardioGenAI: a machine learning-based framework for re-engineering drugs for reduced hERG liability. J Cheminform 2025; 17:30. [PMID: 40045386 PMCID: PMC11881490 DOI: 10.1186/s13321-025-00976-8] [Received: 08/11/2024] [Accepted: 02/21/2025] [Indexed: 03/09/2025] Open
Abstract
The link between in vitro hERG ion channel inhibition and subsequent in vivo QT interval prolongation, a critical risk factor for the development of arrhythmias such as Torsade de Pointes, is so well established that in vitro hERG activity alone is often sufficient to end the development of an otherwise promising drug candidate. It is therefore of tremendous interest to develop advanced methods for identifying hERG-active compounds in the early stages of drug development, as well as for proposing redesigned compounds with reduced hERG liability and preserved primary pharmacology. In this work, we present CardioGenAI, a machine learning-based framework for re-engineering both developmental and commercially available drugs for reduced hERG activity while preserving their pharmacological activity. The framework incorporates novel state-of-the-art discriminative models for predicting hERG channel activity, as well as activity against the voltage-gated NaV1.5 and CaV1.2 channels due to their potential implications in modulating the arrhythmogenic potential induced by hERG channel blockade. We applied the complete framework to pimozide, an FDA-approved antipsychotic agent that demonstrates high affinity to the hERG channel, and generated 100 refined candidates. Remarkably, among the candidates is fluspirilene, a compound which is of the same class of drugs as pimozide (diphenylmethanes) and therefore has similar pharmacological activity, yet exhibits over 700-fold weaker binding to hERG. Furthermore, we demonstrated the framework's ability to optimize hERG, NaV1.5 and CaV1.2 profiles of multiple FDA-approved compounds while maintaining the physicochemical nature of the original drugs. We envision that this method can effectively be applied to developmental compounds exhibiting hERG liabilities to provide a means of rescuing drug development programs that have stalled due to hERG-related safety concerns.
Additionally, the discriminative models can also serve independently as effective components of virtual screening pipelines. We have made all of our software open-source at https://github.com/gregory-kyro/CardioGenAI to facilitate integration of the CardioGenAI framework for molecular hypothesis generation into drug discovery workflows.
Scientific contribution: This work introduces CardioGenAI, an open-source machine learning-based framework designed to re-engineer drugs for reduced hERG liability while preserving their pharmacological activity. The complete CardioGenAI framework can be applied to developmental compounds exhibiting hERG liabilities to provide a means of rescuing drug discovery programs facing hERG-related challenges. In addition, the framework incorporates novel state-of-the-art discriminative models for predicting hERG, NaV1.5 and CaV1.2 channel activity, which can function independently as effective components of virtual screening pipelines.
Affiliation(s)
- Gregory W Kyro
  - Department of Chemistry, Yale University, New Haven, CT, 06511, USA
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Matthew T Martin
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Eric D Watt
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Victor S Batista
  - Department of Chemistry, Yale University, New Haven, CT, 06511, USA
4
Sako M, Yasuo N, Sekijima M. DiffInt: A Diffusion Model for Structure-Based Drug Design with Explicit Hydrogen Bond Interaction Guidance. J Chem Inf Model 2025; 65:71-82. [PMID: 39696768 PMCID: PMC11733934 DOI: 10.1021/acs.jcim.4c01385] [Received: 08/01/2024] [Revised: 11/04/2024] [Accepted: 12/05/2024] [Indexed: 12/20/2024]
Abstract
The design of drug molecules is a critical stage in the drug discovery process. Structure-based drug design has long played an important role in efficient development. Significant progress has been made in recent years in the generation of 3D molecules via deep generative models. However, while many existing models have succeeded in incorporating structural information on target proteins, they have not been able to address important interactions between protein and drug molecules, especially hydrogen bonds. In this study, we propose DiffInt as a novel structure-based approach that explicitly addresses interactions. The model naturally incorporates hydrogen bonds between protein and ligand molecules by treating them as pseudoparticles. The experimental results show that DiffInt reproduces hydrogen bonds, and the hydrogen binding energies significantly outperform those of existing models. To facilitate the use of our tool for generating new drug molecules based on any protein's three-dimensional structure, we have made the source code and trained model available on GitHub (https://github.com/sekijima-lab/DiffInt) under the MIT license, with the execution environment provided on Google Colab.
Affiliation(s)
- Masami Sako
  - Department of Computer Science, Institute of Science Tokyo, Yokohama, Kanagawa 226-8501, Japan
- Nobuaki Yasuo
  - Academy for Convergence of Materials and Informatics (TAC-MI), Institute of Science Tokyo, Meguro-ku, Tokyo 152-8550, Japan
- Masakazu Sekijima
  - Department of Computer Science, Institute of Science Tokyo, Yokohama, Kanagawa 226-8501, Japan
5
Liu H, Qin Y, Niu Z, Xu M, Wu J, Xiao X, Lei J, Ran T, Chen H. How Good are Current Pocket-Based 3D Generative Models?: The Benchmark Set and Evaluation of Protein Pocket-Based 3D Molecular Generative Models. J Chem Inf Model 2024; 64:9260-9275. [PMID: 39629985 DOI: 10.1021/acs.jcim.4c01598] [Indexed: 12/24/2024]
Abstract
The development of three-dimensional (3D) molecular generative models based on protein pockets has recently attracted a lot of attention. This type of model aims to achieve the simultaneous generation of molecular graphs and 3D binding conformations under the constraint of protein binding. Various pocket-based generative models have been proposed; however, there is currently a lack of systematic and objective evaluation metrics for these models. To address this issue, a comprehensive benchmark data set, named POKMOL-3D, is proposed to evaluate protein pocket-based 3D molecular generative models. It includes 32 protein targets together with their known active compounds as a test set to evaluate the versatility of generation models in mimicking real-world scenarios. Additionally, a series of two-dimensional (2D) and 3D evaluation metrics, including some newly created ones, was integrated to assess the quality of generated molecular structures and their binding conformations. It is expected that this work can enhance our comprehension of the strengths and weaknesses of current 3D generative models and stimulate discussion of the challenges and useful guidance for developing the next wave of molecular generative models.
Affiliation(s)
- Haoyang Liu
  - State Key Laboratory of Medicinal Chemical Biology and College of Life Sciences, Nankai University, 94 Weijin Road, Tianjin 300071, China
  - Division of Drug and Vaccine Research, Guangzhou National Laboratory, Guangzhou 510005, Guangdong, China
- Yifei Qin
  - School of Pharmacy and Food Engineering, Wuyi University, Jiangmen 529020, Guangdong, China
- Zhangming Niu
  - National Heart and Lung Institute, Imperial College London, London SW7 2AZ, U.K.
  - MindRank AI, Hangzhou 311113, Zhejiang, China
  - AI Research Center, MindRank Technologies Limited, London EC2N 2AX, U.K.
- Mingyuan Xu
  - Division of Drug and Vaccine Research, Guangzhou National Laboratory, Guangzhou 510005, Guangdong, China
- Jiaqiang Wu
  - School of Pharmacy and Food Engineering, Wuyi University, Jiangmen 529020, Guangdong, China
- Xianglu Xiao
  - MindRank AI, Hangzhou 311113, Zhejiang, China
  - AI Research Center, MindRank Technologies Limited, London EC2N 2AX, U.K.
  - Bioengineering Department and Imperial-X, Imperial College London, London W12 7SL, U.K.
- Jinping Lei
  - School of Pharmaceutical Science, Sun Yat-Sen University, Guangzhou 510006, China
- Ting Ran
  - Division of Drug and Vaccine Research, Guangzhou National Laboratory, Guangzhou 510005, Guangdong, China
- Hongming Chen
  - Division of Drug and Vaccine Research, Guangzhou National Laboratory, Guangzhou 510005, Guangdong, China
  - School of Basic Medical Sciences, Guangzhou Laboratory, Guangzhou Medical University, Guangzhou 511436, China
6
Bernatavicius A, Šícho M, Janssen APA, Hassen AK, Preuss M, van Westen GJP. AlphaFold Meets De Novo Drug Design: Leveraging Structural Protein Information in Multitarget Molecular Generative Models. J Chem Inf Model 2024; 64:8113-8122. [PMID: 39475544 PMCID: PMC11558674 DOI: 10.1021/acs.jcim.4c00309] [Received: 02/23/2024] [Revised: 10/15/2024] [Accepted: 10/15/2024] [Indexed: 11/12/2024]
Abstract
Recent advancements in deep learning and generative models have significantly expanded the applications of virtual screening for drug-like compounds. Here, we introduce a multitarget transformer model, PCMol, that leverages the latent protein embeddings derived from AlphaFold2 as a means of conditioning a de novo generative model on different targets. Incorporating rich protein representations allows the model to capture their structural relationships, enabling the chemical space interpolation of active compounds and target-side generalization to new proteins based on embedding similarities. In this work, we benchmark against other existing target-conditioned transformer models to illustrate the validity of using AlphaFold protein representations over raw amino acid sequences. We show that low-dimensional projections of these protein embeddings cluster appropriately based on target families and that model performance declines when these representations are intentionally corrupted. We also show that the PCMol model generates diverse, potentially active molecules for a wide array of proteins, including those with sparse ligand bioactivity data. The generated compounds display higher similarity to known active ligands of held-out targets and have comparable molecular docking scores while maintaining novelty. Additionally, we demonstrate the important role of data augmentation in bolstering the performance of generative models in low-data regimes. The software package and AlphaFold protein embeddings are freely available at https://github.com/CDDLeiden/PCMol.
Affiliation(s)
- Andrius Bernatavicius
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
  - Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333CA Leiden, The Netherlands
- Martin Šícho
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
  - CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28 Prague, Czech Republic
- Antonius P. A. Janssen
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
  - Leiden Institute of Chemistry, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
- Alan Kai Hassen
  - Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333CA Leiden, The Netherlands
- Mike Preuss
  - Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, 2333CA Leiden, The Netherlands
- Gerard J. P. van Westen
  - Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, 2333CC Leiden, The Netherlands
7
Yang Y, Chen G, Li J, Li J, Zhang O, Zhang X, Li L, Hao J, Wang E, Heng PA. Enabling target-aware molecule generation to follow multi objectives with Pareto MCTS. Commun Biol 2024; 7:1074. [PMID: 39223327 PMCID: PMC11368924 DOI: 10.1038/s42003-024-06746-w] [Received: 04/03/2024] [Accepted: 08/16/2024] [Indexed: 09/04/2024] Open
Abstract
Target-aware drug discovery has greatly accelerated the drug discovery process to design small-molecule ligands with high binding affinity to disease-related protein targets. Conditioned on targeted proteins, previous works utilize various kinds of deep generative models and have shown great potential in generating molecules with strong protein-ligand binding interactions. However, beyond binding affinity, effective drug molecules must manifest other essential properties such as high drug-likeness, which are not explicitly addressed by current target-aware generative methods. In this article, aiming to bridge the gap of multi-objective target-aware molecule generation in the field of deep learning-based drug discovery, we propose ParetoDrug, a Pareto Monte Carlo Tree Search (MCTS) generation algorithm. ParetoDrug searches molecules on the Pareto Front in chemical space using MCTS to enable synchronous optimization of multiple properties. Specifically, ParetoDrug utilizes pretrained atom-by-atom autoregressive generative models to guide exploration toward desired molecules during the MCTS search. In addition, when selecting the next atom symbol, a scheme named ParetoPUCT is proposed to balance exploration and exploitation. Benchmark experiments and case studies demonstrate that ParetoDrug is highly effective in traversing the large and complex chemical space to discover novel compounds with satisfactory binding affinities and drug-like properties for various multi-objective target-aware drug discovery tasks.
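The notion of a Pareto front used in this abstract can be illustrated with a minimal sketch. This is not the ParetoDrug or ParetoPUCT implementation; the molecule names and the two objective scores (treated here as binding affinity and drug-likeness, both to be maximized) are made-up placeholders, and `dominates` and `pareto_front` are hypothetical helper names.

```python
from typing import List, Sequence, Tuple


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` is at least as good as `b` on every objective
    and strictly better on at least one (all objectives are maximized)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates: List[Tuple[str, Tuple[float, ...]]]) -> List[str]:
    """Keep the names of candidates that no other candidate dominates."""
    front = []
    for name, score in candidates:
        if not any(dominates(other, score) for _, other in candidates):
            front.append(name)
    return front


# Placeholder scores: (binding affinity score, drug-likeness score).
candidates = [
    ("mol_A", (0.90, 0.40)),
    ("mol_B", (0.70, 0.80)),
    ("mol_C", (0.60, 0.70)),  # worse than mol_B on both objectives
    ("mol_D", (0.95, 0.30)),
]

print(pareto_front(candidates))
```

Here `mol_C` is dropped because `mol_B` beats it on both objectives, while the remaining three each represent a different trade-off that no other candidate improves on everywhere.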
Affiliation(s)
- Yaodong Yang
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Jinpeng Li
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
- Jianye Hao
  - Noah's Ark Lab, Huawei, Shenzhen, China
- Pheng-Ann Heng
  - Department of Computer Science and Engineering, The Chinese University of Hong Kong, Hong Kong, China
8
Krishnan SR, Bung N, Srinivasan R, Roy A. Target-specific novel molecules with their recipe: Incorporating synthesizability in the design process. J Mol Graph Model 2024; 129:108734. [PMID: 38442440 DOI: 10.1016/j.jmgm.2024.108734] [Received: 12/03/2023] [Revised: 02/14/2024] [Accepted: 02/15/2024] [Indexed: 03/07/2024]
Abstract
Application of artificial intelligence (AI) in drug discovery has led to several success stories in recent times. While traditional methods mostly relied upon screening large chemical libraries for early-stage drug design, de novo design can help identify novel target-specific molecules by sampling from a much larger chemical space. Although this has increased the possibility of finding diverse and novel molecules from previously unexplored chemical space, it has also posed a great challenge for medicinal chemists to synthesize at least some of the de novo designed novel molecules for experimental validation. To address this challenge, in this work, we propose a novel forward synthesis-based generative AI method, which is used to explore the synthesizable chemical space. The method uses a structure-based drug design framework, where the target protein structure and a target-specific seed fragment from co-crystal structures can be the initial inputs. A random fragment from a purchasable fragment library can also be the input if a target-specific fragment is unavailable. Then, template-based forward synthesis route prediction and molecule generation are performed in parallel using the Monte Carlo Tree Search (MCTS) method, where the subsequent fragments for molecule growth can again be obtained from a purchasable fragment library. The rewards for each iteration of MCTS are computed using a drug-target affinity (DTA) model based on the docking pose of the generated reaction intermediates at the binding site of the target protein of interest. With the help of the proposed method, it is now possible to overcome one of the major obstacles posed to AI-based drug design approaches through the ability of the method to design novel target-specific synthesizable molecules.
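The abstract does not state which selection rule the authors use inside MCTS. As a hedged illustration only, the sketch below shows the standard UCT (UCB1) score that many MCTS variants use to choose which child to expand next. The `Node` class, the fragment labels, the reward totals (standing in for DTA-model rewards), and the exploration constant `c = 1.4` are all assumptions made for this example.

```python
import math
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One tree node; `total_reward` stands in for accumulated affinity rewards."""
    label: str
    visits: int = 0
    total_reward: float = 0.0
    children: List["Node"] = field(default_factory=list)


def uct_score(child: Node, parent_visits: int, c: float = 1.4) -> float:
    """Mean reward (exploitation) plus a bonus for rarely visited children."""
    if child.visits == 0:
        return math.inf  # unvisited children are always tried first
    exploit = child.total_reward / child.visits
    explore = c * math.sqrt(math.log(parent_visits) / child.visits)
    return exploit + explore


def select_child(parent: Node, c: float = 1.4) -> Node:
    """Pick the child with the highest UCT score."""
    return max(parent.children, key=lambda ch: uct_score(ch, parent.visits, c))


frag_1 = Node("frag_1", visits=6, total_reward=4.2)
frag_2 = Node("frag_2", visits=4, total_reward=3.0)
frag_3 = Node("frag_3")  # never visited
root = Node("seed", visits=10, children=[frag_1, frag_2, frag_3])

print(select_child(root).label)
```

With these placeholder numbers the unvisited `frag_3` is selected first; among the visited children, `frag_2` scores higher than `frag_1` because it has both a slightly higher mean reward and fewer visits.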
Affiliation(s)
- Navneet Bung
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Rajgopal Srinivasan
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
- Arijit Roy
  - TCS Research (Life Sciences Division), Tata Consultancy Services Limited, Hyderabad, 500081, India
9
Zheng M, Sun G, Li X, Fan Y. EGPDI: identifying protein-DNA binding sites based on multi-view graph embedding fusion. Brief Bioinform 2024; 25:bbae330. [PMID: 38975896 PMCID: PMC11229037 DOI: 10.1093/bib/bbae330] [Received: 03/20/2024] [Revised: 06/08/2024] [Accepted: 06/26/2024] [Indexed: 07/09/2024] Open
Abstract
Mechanisms of protein-DNA interactions are involved in a wide range of biological activities and processes. Accurately identifying binding sites between proteins and DNA is crucial for analyzing genetic material, exploring protein functions, and designing novel drugs. In recent years, several computational methods have been proposed as alternatives to time-consuming and expensive traditional experiments. However, accurately predicting protein-DNA binding sites remains a challenge. Existing computational methods often rely on handcrafted features and a single-model architecture, leaving room for improvement. We propose a novel computational method, called EGPDI, based on multi-view graph embedding fusion. This approach involves the integration of Equivariant Graph Neural Networks (EGNN) and Graph Convolutional Networks II (GCNII), independently configured to thoroughly mine the global and local node embedding representations. An advanced gated multi-head attention mechanism is subsequently employed to capture the attention weights of the dual embedding representations, thereby facilitating the integration of node features. In addition, extra node features from protein language models are introduced to provide more structural information. To our knowledge, this is the first time that multi-view graph embedding fusion has been applied to the task of protein-DNA binding site prediction. The results of five-fold cross-validation and independent testing demonstrate that EGPDI outperforms state-of-the-art methods. Further comparative experiments and case studies also verify the superiority and generalization ability of EGPDI.
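The general idea of fusing two embedding views with a learned gate can be sketched in a few lines. This is a deliberately simplified, single-gate, elementwise version and not the gated multi-head attention mechanism of EGPDI; the function name `gated_fusion`, the weight vectors, and the example embeddings are hypothetical.

```python
import math
from typing import List


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(
    h_global: List[float],
    h_local: List[float],
    w_global: List[float],
    w_local: List[float],
    bias: List[float],
) -> List[float]:
    """Blend two node embeddings dimension by dimension.

    For each dimension a gate g in (0, 1) is computed from both views,
    and the output is g * h_global + (1 - g) * h_local.
    """
    fused = []
    for hg, hl, wg, wl, b in zip(h_global, h_local, w_global, w_local, bias):
        g = sigmoid(wg * hg + wl * hl + b)
        fused.append(g * hg + (1.0 - g) * hl)
    return fused


# Placeholder embeddings for one residue from the two views.
h_global = [1.0, 2.0]
h_local = [3.0, 4.0]

# With zero weights and bias every gate is 0.5, so the result is a plain average.
averaged = gated_fusion(h_global, h_local, [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
print(averaged)

# A large positive bias pushes the gate toward 1, so the global view dominates.
mostly_global = gated_fusion(h_global, h_local, [0.0, 0.0], [0.0, 0.0], [20.0, 20.0])
print(mostly_global)
```

In a trained model the gate parameters would be learned, so the network can decide per dimension how much to trust each view.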
Affiliation(s)
- Mengxin Zheng
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Guicong Sun
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Xueping Li
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
- Yongxian Fan
  - School of Computer Science and Information Security, Guilin University of Electronic Technology, Guilin 541004, China
10
Liu Y, Yu H, Duan X, Zhang X, Cheng T, Jiang F, Tang H, Ruan Y, Zhang M, Zhang H, Zhang Q. TransGEM: a molecule generation model based on Transformer with gene expression data. Bioinformatics 2024; 40:btae189. [PMID: 38632084 PMCID: PMC11078772 DOI: 10.1093/bioinformatics/btae189] [Received: 11/21/2023] [Revised: 03/26/2024] [Accepted: 04/16/2024] [Indexed: 04/19/2024] Open
Abstract
MOTIVATION: It is difficult to generate new molecules with desirable bioactivity through ligand-based de novo drug design, and receptor-based de novo drug design is constrained by disease target information availability. The combination of artificial intelligence and phenotype-based de novo drug design can generate new bioactive molecules, independent of disease target information. Gene expression profiles can be used to characterize biological phenotypes. The Transformer model can be utilized to capture the associations between gene expression profiles and molecular structures due to its remarkable ability in processing contextual information.
RESULTS: We propose TransGEM (Transformer-based model from gene expression to molecules), which is a phenotype-based de novo drug design model. A specialized gene expression encoder is used to embed gene expression difference values between diseased cell lines and their corresponding normal tissue cells into the TransGEM model. The results demonstrate that the TransGEM model can generate molecules with desirable evaluation metrics and property distributions. Case studies illustrate that the TransGEM model can generate structurally novel molecules with good binding affinity to disease target proteins. The majority of genes with high attention scores obtained from the TransGEM model are associated with the onset of the disease, indicating the potential of these genes as disease targets. Therefore, this study provides a new paradigm for de novo drug design, and it will promote phenotype-based drug discovery.
AVAILABILITY AND IMPLEMENTATION: The code is available at https://github.com/hzauzqy/TransGEM.
Affiliation(s)
- Yanguang Liu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hailong Yu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xinya Duan
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Xiaomin Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Ting Cheng
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Feng Jiang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hao Tang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Yao Ruan
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Miao Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Hongyu Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
- Qingye Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, P.R. China
11
Huang L, Xu T, Yu Y, Zhao P, Chen X, Han J, Xie Z, Li H, Zhong W, Wong KC, Zhang H. A dual diffusion model enables 3D molecule generation and lead optimization based on target pockets. Nat Commun 2024; 15:2657. [PMID: 38531837 DOI: 10.1038/s41467-024-46569-1] [Received: 02/13/2023] [Accepted: 03/01/2024] [Indexed: 03/28/2024] Open
Abstract
Structure-based generative chemistry is essential in computer-aided drug discovery, exploring a vast chemical space to design ligands with high binding affinity for targets. However, traditional in silico methods are limited by computational inefficiency, while machine learning approaches face bottlenecks due to auto-regressive sampling. To address these concerns, we have developed a conditional deep generative model, PMDM, for 3D molecule generation fitting specified targets. PMDM consists of a conditional equivariant diffusion model with both local and global molecular dynamics, enabling PMDM to consider the conditioned protein information to generate molecules efficiently. Comprehensive experiments indicate that PMDM outperforms baseline models across multiple evaluation metrics. To evaluate the applications of PMDM under real drug design scenarios, we conduct lead compound optimization for SARS-CoV-2 main protease (Mpro) and Cyclin-dependent Kinase 2 (CDK2). The selected lead optimization molecules are synthesized and evaluated for their in vitro activities against CDK2, displaying improved CDK2 activity.
Affiliation(s)
- Lei Huang
  - City University of Hong Kong, Hong Kong, SAR, China
  - Tencent AI Lab, Shenzhen, China
- Yang Yu
  - Tencent AI Lab, Shenzhen, China
- Jing Han
  - Regor Therapeutics Group, Shanghai, China
- Zhi Xie
  - Regor Therapeutics Group, Shanghai, China
- Hailong Li
  - Regor Therapeutics Group, Shanghai, China
- Ka-Chun Wong
  - City University of Hong Kong, Hong Kong, SAR, China
12
|
Tang Y, Moretti R, Meiler J. Recent Advances in Automated Structure-Based De Novo Drug Design. J Chem Inf Model 2024; 64:1794-1805. [PMID: 38485516 PMCID: PMC10966644 DOI: 10.1021/acs.jcim.4c00247] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/11/2024] [Revised: 02/26/2024] [Accepted: 02/29/2024] [Indexed: 03/26/2024]
Abstract
As the number of determined and predicted protein structures and the size of druglike 'make-on-demand' libraries soar, the time-consuming nature of structure-based computer-aided drug design calls for innovative computational algorithms. De novo drug design introduces in silico heuristics to accelerate searching in the vast chemical space. This review focuses on recent advances in structure-based de novo drug design, ranging from conventional fragment-based methods, evolutionary algorithms, and Metropolis Monte Carlo methods to deep generative models. Because de novo drug design has historically struggled to generate readily available drug-like molecules, we highlight the synthetic accessibility efforts in each category and the benchmarking strategies used to validate each proposed framework.
Affiliation(s)
- Yidan Tang
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
- Rocco Moretti
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
- Jens Meiler
  - Department of Chemistry, Vanderbilt University, Nashville, Tennessee 37235, United States
  - Center for Structural Biology, Vanderbilt University, Nashville, Tennessee 37240, United States
  - Institute of Drug Discovery, Faculty of Medicine, University of Leipzig, 04103 Leipzig, Germany
|
13
|
Wang M, Wu Z, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Yao X, Bing Z, Hsieh CY, Hou T. Genetic Algorithm-Based Receptor Ligand: A Genetic Algorithm-Guided Generative Model to Boost the Novelty and Drug-Likeness of Molecules in a Sampling Chemical Space. J Chem Inf Model 2024; 64:1213-1228. [PMID: 38302422 DOI: 10.1021/acs.jcim.3c01964] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Deep learning-based de novo molecular design has recently gained significant attention. While numerous DL-based generative models have been successfully developed for designing novel compounds, the majority of the generated molecules lack sufficiently novel scaffolds or high drug-like profiles. The aforementioned issues may not be fully captured by commonly used metrics for the assessment of molecular generative models, such as novelty, diversity, and quantitative estimation of the drug-likeness score. To address these limitations, we proposed a genetic algorithm-guided generative model called GARel (genetic algorithm-based receptor-ligand interaction generator), a novel framework for training a DL-based generative model to produce drug-like molecules with novel scaffolds. To efficiently train the GARel model, we utilized dense net to update the parameters based on molecules with novel scaffolds and drug-like features. To demonstrate the capability of the GARel model, we used it to design inhibitors for three targets: AA2AR, EGFR, and SARS-CoV-2. The results indicate that GARel-generated molecules feature more diverse and novel scaffolds and possess more desirable physicochemical properties and favorable docking scores. Compared with other generative models, GARel makes significant progress in balancing novelty and drug-likeness, providing a promising direction for the further development of DL-based de novo design methodology with potential impacts on drug discovery.
Affiliation(s)
- Mingyang Wang
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Zhengjian Wu
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - School of Computer Science, Wuhan University, Wuhan 430072, Hubei, China
- Jike Wang
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
  - CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Gaoqi Weng
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yu Kang
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Peichen Pan
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Dan Li
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Yafeng Deng
  - CarbonSilicon AI Technology Co., Ltd., Hangzhou 310018, Zhejiang, China
- Xiaojun Yao
  - Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, Macau Institute for Applied Research in Medicine and Health, State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa, Macau 999078, China
- Zhitong Bing
  - Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou, Gansu 730000, China
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
14
|
Kim DN, McNaughton AD, Kumar N. Leveraging Artificial Intelligence to Expedite Antibody Design and Enhance Antibody-Antigen Interactions. Bioengineering (Basel) 2024; 11:185. [PMID: 38391671 PMCID: PMC10886287 DOI: 10.3390/bioengineering11020185] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 01/30/2024] [Accepted: 02/06/2024] [Indexed: 02/24/2024] Open
Abstract
This perspective sheds light on the transformative impact of recent computational advancements in the field of protein therapeutics, with a particular focus on the design and development of antibodies. Cutting-edge computational methods have revolutionized our understanding of protein-protein interactions (PPIs), enhancing the efficacy of protein therapeutics in preclinical and clinical settings. Central to these advancements is the application of machine learning and deep learning, which offers unprecedented insights into the intricate mechanisms of PPIs and facilitates precise control over protein functions. Despite these advancements, the complex structural nuances of antibodies pose ongoing challenges in their design and optimization. Our review provides a comprehensive exploration of the latest deep learning approaches, including language models and diffusion techniques, and their role in surmounting these challenges. We also present a critical analysis of these methods, offering insights to drive further progress in this rapidly evolving field. The paper includes practical recommendations for the application of these computational techniques, supplemented with independent benchmark studies. These studies focus on key performance metrics such as accuracy and the ease of program execution, providing a valuable resource for researchers engaged in antibody design and development. Through this detailed perspective, we aim to contribute to the advancement of antibody design, equipping researchers with the tools and knowledge to navigate the complexities of this field.
Affiliation(s)
- Neeraj Kumar
  - Pacific Northwest National Laboratory, 902 Battelle Blvd., Richland, WA 99352, USA; (D.N.K.); (A.D.M.)
|
15
|
Kyro GW, Morgunov A, Brent RI, Batista VS. ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation. J Chem Inf Model 2024; 64:653-665. [PMID: 38287889 DOI: 10.1021/acs.jcim.3c01456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation. When applied to c-Abl kinase, a protein with FDA-approved small-molecule inhibitors, the model learns to generate molecules similar to the inhibitors without prior knowledge of their existence and even reproduces two of them exactly. We also show that the methodology is effective for a protein without any commercially available small-molecule inhibitors, the HNH domain of the CRISPR-associated protein 9 (Cas9) enzyme. To facilitate implementation and reproducibility, we made all of our software available through the open-source ChemSpaceAL Python package.
Affiliation(s)
- Gregory W Kyro
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
- Anton Morgunov
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
- Rafael I Brent
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
- Victor S Batista
  - Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
|
16
|
Zeng C, Jian Y, Zhuo C, Li A, Zeng C, Zhao Y. Evaluation of DNA-protein complex structures using the deep learning method. Phys Chem Chem Phys 2023; 26:130-143. [PMID: 38063012 DOI: 10.1039/d3cp04980a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/22/2023]
Abstract
Biological processes such as transcription, repair, and regulation require interactions between DNA and proteins. To unravel their functions, it is imperative to determine the high-resolution structures of DNA-protein complexes. However, experimental methods for this purpose are costly and technically demanding. Consequently, there is an urgent need for computational techniques to identify the structures of DNA-protein complexes. Despite technological advancements, accurately identifying DNA-protein complexes through computational methods still poses a challenge. Our team has developed a cutting-edge deep-learning approach called DDPScore that assesses DNA-protein complex structures. DDPScore utilizes a 4D convolutional neural network to overcome limited training data. This approach effectively captures local and global features while comprehensively considering the conformational changes arising from the flexibility during the DNA-protein docking process. DDPScore consistently outperformed the available methods in comprehensive DNA-protein complex docking evaluations, even for the flexible docking challenges. DDPScore has a wide range of applications in predicting and designing structures of DNA-protein complexes.
Affiliation(s)
- Chengwei Zeng
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Yiren Jian
  - Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
- Chen Zhuo
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Anbang Li
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
- Chen Zeng
  - Department of Physics, The George Washington University, Washington, DC 20052, USA
- Yunjie Zhao
  - Institute of Biophysics and Department of Physics, Central China Normal University, Wuhan, 430079, China
|
17
|
Barghout RA, Xu Z, Betala S, Mahadevan R. Advances in generative modeling methods and datasets to design novel enzymes for renewable chemicals and fuels. Curr Opin Biotechnol 2023; 84:103007. [PMID: 37931573 DOI: 10.1016/j.copbio.2023.103007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2023] [Revised: 09/12/2023] [Accepted: 09/13/2023] [Indexed: 11/08/2023]
Abstract
Biotechnology has revolutionized the development of sustainable energy sources by harnessing biomass as a feedstock for energy production. However, challenges such as recalcitrant feedstocks and inefficient metabolic pathways hinder the large-scale integration of renewable energy systems. Enzyme engineering has emerged as a powerful tool to address these challenges by enhancing enzyme activity, specificity, and stability. Generative machine learning (ML) models have shown great promise in accelerating protein design, allowing for the generation of novel protein sequences with desired properties by navigating vast spaces. This review paper aims to summarize the state of the art in generative models for protein design and how they can be applied to bioenergy applications, including the underlying architectures and training strategies. Additionally, it highlights the importance of high-quality datasets for training and evaluating generative models, organizes available datasets for generative protein design, and discusses the potential of applying generative models to strain design for bioenergy production.
Affiliation(s)
- Rana A Barghout
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St, Toronto, ON, Canada
- Zhiqing Xu
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St, Toronto, ON, Canada
- Siddharth Betala
  - Department of Biotechnology, Bhupat and Jyoti Mehta School of Biosciences, Indian Institute of Technology Madras, Chennai, India
- Radhakrishnan Mahadevan
  - Department of Chemical Engineering & Applied Chemistry, University of Toronto, 200 College St, Toronto, ON, Canada
|
18
|
Lv Q, Zhou F, Liu X, Zhi L. Artificial intelligence in small molecule drug discovery from 2018 to 2023: Does it really work? Bioorg Chem 2023; 141:106894. [PMID: 37776682 DOI: 10.1016/j.bioorg.2023.106894] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/15/2023] [Revised: 09/24/2023] [Accepted: 09/25/2023] [Indexed: 10/02/2023]
Abstract
Utilizing artificial intelligence (AI) in drug design represents an advanced approach for identifying targets and developing new drugs. Integrating AI techniques significantly reduces the workload involved in drug development and enhances the efficiency of early-stage drug discovery. This review aims to present a comprehensive overview of the utilization of AI methods in the field of small-molecule drug design, with a specific focus on four key areas: protein structure prediction, molecular virtual screening, molecular design, and absorption, distribution, metabolism, excretion, and toxicity (ADMET) prediction. Additionally, the role and limitations of AI in drug development are explored, and the impact of AI on decision-making processes is studied. It is important to note that while AI can bring numerous benefits to the early stage of drug development, the direction and quality of decision-making should still be emphasized, as AI should be considered a tool rather than a decisive factor.
Affiliation(s)
- Qi Lv
  - School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Feilong Zhou
  - School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Xinhua Liu
  - School of Pharmacy, Inflammation and Immune Mediated Diseases Laboratory of Anhui Province, Hefei 230032, PR China
- Liping Zhi
  - School of Health Management, Anhui Medical University, Hefei 230032, PR China
|
19
|
Du H, Jiang D, Zhang O, Wu Z, Gao J, Zhang X, Wang X, Deng Y, Kang Y, Li D, Pan P, Hsieh CY, Hou T. A flexible data-free framework for structure-based de novo drug design with reinforcement learning. Chem Sci 2023; 14:12166-12181. [PMID: 37969589 PMCID: PMC10631243 DOI: 10.1039/d3sc04091g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/06/2023] [Accepted: 10/11/2023] [Indexed: 11/17/2023] Open
Abstract
Contemporary structure-based molecular generative methods have demonstrated their potential to model the geometric and energetic complementarity between ligands and receptors, thereby facilitating the design of molecules with favorable binding affinity and target specificity. Despite the introduction of deep generative models for molecular generation, the atom-wise generation paradigm that partially contradicts chemical intuition limits the validity and synthetic accessibility of the generated molecules. Additionally, the dependence of deep learning models on large-scale structural data has hindered their adaptability across different targets. To overcome these challenges, we present a novel search-based framework, 3D-MCTS, for structure-based de novo drug design. Distinct from prevailing atom-centric methods, 3D-MCTS employs a fragment-based molecular editing strategy. The fragments decomposed from small-molecule drugs are recombined under predefined retrosynthetic rules, offering improved drug-likeness and synthesizability, overcoming the inherent limitations of atom-based approaches. Leveraging multi-threaded parallel simulations combined with a real-time energy constraint-based pruning strategy, 3D-MCTS achieves remarkable efficiency. At a fixed computational cost, it outperforms other state-of-the-art (SOTA) methods by producing molecules with enhanced binding affinity. Furthermore, its fragment-based approach ensures the generation of more dependable binding conformations, exhibiting a success rate 43.6% higher than that of other SOTAs. This advantage becomes even more pronounced when handling targets that significantly deviate from the training dataset. 3D-MCTS is capable of achieving thirty times more hits with high binding affinity than traditional virtual screening methods, which demonstrates the superior ability of 3D-MCTS to explore chemical space. 
Moreover, the flexibility of our framework makes it easy to incorporate domain knowledge during the process, thereby enabling the generation of molecules with desirable pharmacophores and enhanced binding affinity. The adaptability of 3D-MCTS is further showcased in metalloprotein applications, highlighting its potential across various drug design scenarios.
Affiliation(s)
- Hongyan Du
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Dejun Jiang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Odin Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Zhenxing Wu
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Junbo Gao
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xujun Zhang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Xiaorui Wang
  - Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
  - Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery, State Key Laboratory of Quality Research in Chinese Medicine, Macau Institute for Applied Research in Medicine and Health, Macau University of Science and Technology, Macao 999078, China
- Yafeng Deng
  - Hangzhou Carbonsilicon AI Technology Co., Ltd, Hangzhou 310018, Zhejiang, China
- Yu Kang
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Dan Li
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Peichen Pan
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Chang-Yu Hsieh
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
- Tingjun Hou
  - College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, Zhejiang, China
|
20
|
Dollar O, Joshi N, Pfaendtner J, Beck DAC. Efficient 3D Molecular Design with an E(3) Invariant Transformer VAE. J Phys Chem A 2023; 127:7844-7852. [PMID: 37670244 DOI: 10.1021/acs.jpca.3c04188] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/07/2023]
Abstract
This work introduces a three-dimensional (3D) invariant graph-to-string transformer variational autoencoder (VAE), Vagrant, for generating molecules with accurate density functional theory (DFT)-level properties. Vagrant learns to model the joint probability distribution of a 3D molecular structure and its properties by encoding molecular structures into a 3D-aware latent space. Directed navigation through this latent space implicitly optimizes the 3D structure of a molecule, and the latent embedding can be used to condition a generative transformer to predict the candidate structure as a one-dimensional (1D) sequence. Additionally, we introduce two novel sampling methods that exploit the latent characteristics of a VAE to improve performance. We show that our method outperforms comparable 3D autoregressive and diffusion methods for predicting quantum chemical property values of novel molecules in terms of both sample quality and computational efficiency.
Affiliation(s)
- Orion Dollar
  - Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- Nisarg Joshi
  - Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- Jim Pfaendtner
  - Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
- David A C Beck
  - Department of Chemical Engineering, University of Washington, Seattle, Washington 98195, United States
  - eScience Institute, University of Washington, Seattle, Washington 98195, United States
|
21
|
Li JN, Yang G, Zhao PC, Wei XX, Shi JY. CProMG: controllable protein-oriented molecule generation with desired binding affinity and drug-like properties. Bioinformatics 2023; 39:i326-i336. [PMID: 37387157 DOI: 10.1093/bioinformatics/btad222] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Indexed: 07/01/2023] Open
Abstract
MOTIVATION: Deep learning-based molecule generation has become a new paradigm of de novo molecule design because it enables fast and directional exploration of the vast chemical space. However, it is still an open issue to generate molecules that bind to specific proteins with high binding affinities while having the desired drug-like physicochemical properties.
RESULTS: To address these issues, we devise a novel framework for controllable protein-oriented molecule generation, named CProMG, which contains a 3D protein embedding module, a dual-view protein encoder, a molecule embedding module, and a novel drug-like molecule decoder. By fusing the hierarchical views of proteins, it significantly enhances the representation of protein binding pockets by associating amino acid residues with their constituent atoms. By jointly embedding molecule sequences, their drug-like properties, and their binding affinities with respect to proteins, it autoregressively generates novel molecules with specific properties in a controllable manner by measuring the proximity of molecule tokens to protein residues and atoms. The comparison with state-of-the-art deep generative methods demonstrates the superiority of CProMG. Furthermore, the progressive control of properties demonstrates the effectiveness of CProMG when controlling binding affinity and drug-like properties. The ablation studies then reveal how its crucial components, including hierarchical protein views, Laplacian position encoding, and property control, each contribute to the model. Last, a protein case study illustrates the novelty of CProMG and its ability to capture crucial interactions between protein pockets and molecules. It is anticipated that this work can boost de novo molecule design.
AVAILABILITY AND IMPLEMENTATION: The code and data underlying this article are freely available at https://github.com/lijianing0902/CProMG.
Affiliation(s)
- Jia-Ning Li
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Guang Yang
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Peng-Cheng Zhao
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Xue-Xin Wei
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
- Jian-Yu Shi
  - School of Life Sciences, Northwestern Polytechnical University, Xi'an 710072, China
|
22
|
Thomas M, Bender A, de Graaf C. Integrating structure-based approaches in generative molecular design. Curr Opin Struct Biol 2023; 79:102559. [PMID: 36870277 DOI: 10.1016/j.sbi.2023.102559] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/11/2022] [Revised: 01/23/2023] [Accepted: 01/31/2023] [Indexed: 03/06/2023]
Abstract
Generative molecular design for drug discovery and development has seen a recent resurgence, promising to improve the efficiency of the design-make-test-analyse cycle by computationally exploring much larger chemical spaces than traditional virtual screening techniques. However, most generative models thus far have only utilized small-molecule information to train and condition de novo molecule generators. Here, we instead focus on recent approaches that incorporate protein structure into de novo molecule optimization in an attempt to maximize the predicted on-target binding affinity of generated molecules. We categorize these structure integration principles as either distribution learning or goal-directed optimization and, in each case, by whether the approach is protein structure-explicit or implicit with respect to the generative model. We discuss recent approaches in the context of this categorization and provide our perspective on the future direction of the field.
Affiliation(s)
- Morgan Thomas
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK
- Andreas Bender
  - Centre for Molecular Informatics, Department of Chemistry, University of Cambridge, Cambridge, CB2 1EW, UK. https://twitter.com/@AndreasBenderUK
- Chris de Graaf
  - Sosei Heptares, Steinmetz Building, Granta Park, Great Abington, Cambridge, CB21 6DG, UK. https://twitter.com/@Chris_de_Graaf
|
23
|
Chen Y, Wang Z, Wang L, Wang J, Li P, Cao D, Zeng X, Ye X, Sakurai T. Deep generative model for drug design from protein target sequence. J Cheminform 2023; 15:38. [PMID: 36978179 PMCID: PMC10052801 DOI: 10.1186/s13321-023-00702-2] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 07/31/2022] [Accepted: 02/18/2023] [Indexed: 03/30/2023] Open
Abstract
Drug discovery for a protein target is a laborious and costly process. Deep learning (DL) methods have been applied to drug discovery and have successfully generated novel molecular structures, and they can substantially reduce development time and costs. However, most of them rely on prior knowledge, either by drawing on the structure and properties of known molecules to generate similar candidate molecules or by extracting information on the binding sites of protein pockets to obtain molecules that can bind to them. In this paper, DeepTarget, an end-to-end DL model, is proposed to generate novel molecules relying solely on the amino acid sequence of the target protein, reducing the heavy reliance on prior knowledge. DeepTarget includes three modules: Amino Acid Sequence Embedding (AASE), Structural Feature Inference (SFI), and Molecule Generation (MG). AASE generates embeddings from the amino acid sequence of the target protein. SFI infers the potential structural features of the synthesized molecule, and MG seeks to construct the eventual molecule. The validity of the generated molecules was demonstrated on a benchmark platform for molecular generation models. The interaction between the generated molecules and the target proteins was also verified on the basis of two metrics, drug-target affinity and molecular docking. The results of the experiments indicated the efficacy of the model for direct molecule generation conditioned solely on amino acid sequence.
Affiliation(s)
- Yangyang Chen
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Zixu Wang
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Lei Wang
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China
- Jianmin Wang
  - The Interdisciplinary Graduate Program in Integrative Biotechnology and Translational Medicine, Yonsei University, Incheon, 21983, Republic of Korea
  - Bioinformatics and Molecular Design Research Center (BMDRC), Incheon, 21983, Republic of Korea
- Pengyong Li
  - School of Computer Science and Technology, Xidian University, Xi'an, 710071, China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha, 410013, Hunan, China
- Xiangxiang Zeng
  - College of Computer Science and Electronic Engineering, Hunan University, Changsha, 410082, Hunan, People's Republic of China
- Xiucai Ye
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
- Tetsuya Sakurai
  - Department of Computer Science, University of Tsukuba, Tsukuba, 3058577, Japan
|
24
|
Duran-Frigola M, Cigler M, Winter GE. Advancing Targeted Protein Degradation via Multiomics Profiling and Artificial Intelligence. J Am Chem Soc 2023; 145:2711-2732. [PMID: 36706315 PMCID: PMC9912273 DOI: 10.1021/jacs.2c11098] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 10/19/2022] [Indexed: 01/28/2023]
Abstract
Only around 20% of the human proteome is considered to be druggable with small-molecule antagonists. This leaves some of the most compelling therapeutic targets outside the reach of ligand discovery. The concept of targeted protein degradation (TPD) promises to overcome some of these limitations. In brief, TPD is dependent on small molecules that induce the proximity between a protein of interest (POI) and an E3 ubiquitin ligase, causing ubiquitination and degradation of the POI. In this perspective, we want to reflect on current challenges in the field, and discuss how advances in multiomics profiling, artificial intelligence, and machine learning (AI/ML) will be vital in overcoming them. The presented roadmap is discussed in the context of small-molecule degraders but is equally applicable for other emerging proximity-inducing modalities.
Affiliation(s)
- Miquel Duran-Frigola
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
- Ersilia Open Source Initiative, 28 Belgrave Road, CB1 3DE, Cambridge, United Kingdom
| | - Marko Cigler
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
| | - Georg E. Winter
- CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria
|
25
|
Danel T, Łęski J, Podlewska S, Podolak IT. Docking-based generative approaches in the search for new drug candidates. Drug Discov Today 2023; 28:103439. [PMID: 36372330 DOI: 10.1016/j.drudis.2022.103439] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 06/30/2022] [Revised: 10/08/2022] [Accepted: 11/08/2022] [Indexed: 11/13/2022]
Abstract
Despite the popularity of virtual screening (VS) of existing compound libraries, the search for new potential drug candidates also takes advantage of generative protocols, where new compound suggestions are enumerated using various algorithms. To increase the activity potency of generative approaches, they have recently been coupled with molecular docking, a leading methodology of structure-based drug design (SBDD). In this review, we summarize progress since docking-based generative models emerged. We propose a new taxonomy for these methods and discuss their importance for the field of computer-aided drug design (CADD). In addition, we discuss the most promising directions for the further development of generative protocols coupled with docking.
Affiliation(s)
- Tomasz Danel
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland.
| | - Jan Łęski
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
| | - Sabina Podlewska
- Maj Institute of Pharmacology, Polish Academy of Sciences, Department of Medicinal Chemistry, 31-343 Kraków, Smętna Street 12, Poland
| | - Igor T Podolak
- Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
|
26
|
Wang M, Wang J, Weng G, Kang Y, Pan P, Li D, Deng Y, Li H, Hsieh CY, Hou T. ReMODE: a deep learning-based web server for target-specific drug design. J Cheminform 2022; 14:84. [PMID: 36510307 PMCID: PMC9743675 DOI: 10.1186/s13321-022-00665-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/14/2022] [Accepted: 12/01/2022] [Indexed: 12/14/2022] Open
Abstract
Deep learning (DL) and machine learning have contributed significantly to basic biology research and drug discovery in the past few decades. Recent advances in DL-based generative models have led to superior developments in de novo drug design. However, data availability, deep data processing, and the lack of user-friendly DL tools and interfaces make it difficult to apply these DL techniques to drug design. We hereby present ReMODE (Receptor-based MOlecular DEsign), a new web server based on a DL algorithm for target-specific ligand design, which integrates different functional modules to enable users to develop customizable drug design tasks. As designed, the ReMODE server can construct target-specific tasks for the protein targets selected by users. The server also provides some extensions: users can optimize the drug-likeness or synthetic accessibility of the generated molecules and control other physicochemical properties; users can also choose a sub-structure/scaffold as a starting point for fragment-based drug design. The ReMODE server also enables users to optimize the pharmacophore matching and docking conformations of the generated molecules. We believe that the ReMODE server will benefit researchers in drug discovery. ReMODE is publicly available at http://cadd.zju.edu.cn/relation/remode/ .
Affiliation(s)
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China; CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018 Zhejiang, People’s Republic of China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China; CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018 Zhejiang, People’s Republic of China
| | - Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China; CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018 Zhejiang, People’s Republic of China
| | - Yu Kang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China
| | - Peichen Pan
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China
| | - Dan Li
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China
| | - Yafeng Deng
- CarbonSilicon AI Technology Co., Ltd, Hangzhou, 310018 Zhejiang, People’s Republic of China
| | - Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai, 200237, People’s Republic of China
| | - Chang-Yu Hsieh
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou, 310058 Zhejiang, People’s Republic of China
|
27
|
Chan L, Kumar R, Verdonk M, Poelking C. A multilevel generative framework with hierarchical self-contrasting for bias control and transparency in structure-based ligand design. Nat Mach Intell 2022. [DOI: 10.1038/s42256-022-00564-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2022]
|
28
|
Yuan Q, Chen S, Wang Y, Zhao H, Yang Y. Alignment-free metal ion-binding site prediction from protein sequence through pretrained language model and multi-task learning. Brief Bioinform 2022; 23:6770088. [PMID: 36274238 DOI: 10.1093/bib/bbac444] [Citation(s) in RCA: 21] [Impact Index Per Article: 7.0] [Received: 05/21/2022] [Revised: 09/02/2022] [Accepted: 09/17/2022] [Indexed: 12/14/2022] Open
Abstract
More than one-third of the proteins in the Protein Data Bank contain metal ions. Correct identification of metal ion-binding residues is important for understanding protein functions and designing novel drugs. Due to the small size and high versatility of metal ions, it remains challenging to computationally predict their binding sites from protein sequence. Existing sequence-based methods have low accuracy due to the lack of structural information, and are time-consuming owing to their use of multiple sequence alignment. Here, we propose LMetalSite, an alignment-free sequence-based predictor for binding sites of the four most frequently seen metal ions in BioLiP (Zn2+, Ca2+, Mg2+ and Mn2+). LMetalSite leverages a pretrained language model to rapidly generate informative sequence representations and employs a transformer to capture long-range dependencies. Multi-task learning is adopted to compensate for the scarcity of training data and capture the intrinsic similarities between different metal ions. LMetalSite was shown to surpass state-of-the-art structure-based methods by more than 19.7, 14.4, 36.8 and 12.6% in area under the precision-recall curve on the four independent tests, respectively. Further analyses indicated that the self-attention modules are effective at learning the structural contexts of residues from protein sequence. We provide the data sets, source codes and trained models of LMetalSite at https://github.com/biomed-AI/LMetalSite.
Affiliation(s)
- Qianmu Yuan
- School of Computer Science and Engineering at Sun Yat-sen University, Guangzhou 510000, China
| | - Sheng Chen
- School of Computer Science and Engineering at Sun Yat-sen University, Guangzhou 510000, China
| | - Yu Wang
- Peng Cheng National Laboratory at Shenzhen, Guangzhou 510000, China
| | - Huiying Zhao
- Sun Yat-sen Memorial Hospital at Sun Yat-sen University, Guangzhou 510000, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China, and Key Laboratory of Machine Intelligence and Advanced Computing of MOE, Sun Yat-sen University, Guangzhou 510000, China
|
29
|
Sauer S, Matter H, Hessler G, Grebner C. Optimizing interactions to protein binding sites by integrating docking-scoring strategies into generative AI methods. Front Chem 2022; 10:1012507. [PMID: 36339033 PMCID: PMC9629386 DOI: 10.3389/fchem.2022.1012507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/14/2022] Open
Abstract
The identification and optimization of promising lead molecules is essential for drug discovery. Recently, artificial intelligence (AI)-based generative methods have provided complementary approaches for generating molecules under specific design constraints relevant to drug design. The goal of our study is to incorporate protein 3D information directly into generative design by flexible docking plus an adapted protein-ligand scoring function, thereby moving towards automated structure-based design. First, the protein-ligand scoring function RFXscore, integrating individual scoring terms, ligand descriptors, and combined terms, was derived using the PDBbind database and internal data. Next, design results for different workflows are compared to solely ligand-based reward schemes. Our newly proposed, optimal workflow for structure-based generative design is shown to produce promising results, especially for exploration scenarios in which diverse structures fitting a protein binding site are requested. The best results are obtained using docking followed by RFXscore, while, depending on the exact application scenario, it was also found useful to combine this approach with other metrics that bias structure generation into "drug-like" chemical space, such as target-activity machine learning models.
Affiliation(s)
- Christoph Grebner
- Synthetic Molecular Design, Integrated Drug Discovery, Sanofi, Frankfurt, Germany
|
30
|
A pocket-based 3D molecule generative model fueled by experimental electron density. Sci Rep 2022; 12:15100. [PMID: 36068257 PMCID: PMC9448726 DOI: 10.1038/s41598-022-19363-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 07/17/2022] [Accepted: 08/29/2022] [Indexed: 11/08/2022] Open
Abstract
We report for the first time the use of experimental electron density (ED) as training data for the generation of drug-like three-dimensional molecules based on the structure of a target protein pocket. Similar to a structural biologist building molecules based on their ED, our model functions with two main components: a generative adversarial network (GAN) to generate the ligand ED in the input pocket and an ED interpretation module for molecule generation. The model was tested on three targets: a kinase (hematopoietic progenitor kinase 1), protease (SARS-CoV-2 main protease), and nuclear receptor (vitamin D receptor), and evaluated with a reference dataset composed of over 8000 compounds that have their activities reported in the literature. The evaluation considered the chemical validity, chemical space distribution-based diversity, and similarity with reference active compounds concerning the molecular structure and pocket-binding mode. Our model can generate molecules with similar structures to classical active compounds and novel compounds sharing similar binding modes with active compounds, making it a promising tool for library generation supporting high-throughput virtual screening. The ligand ED generated can also be used to support fragment-based drug design. Our model is available as an online service to academic users via https://edmg.stonewise.cn/#/create .
|
31
|
Qian H, Lin C, Zhao D, Tu S, Xu L. AlphaDrug: protein target specific de novo molecular generation. PNAS Nexus 2022; 1:pgac227. [PMID: 36714828 PMCID: PMC9802440 DOI: 10.1093/pnasnexus/pgac227] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 05/07/2022] [Accepted: 10/02/2022] [Indexed: 11/06/2022]
Abstract
Traditional drug discovery is very laborious, expensive, and time-consuming, due to the huge combinatorial complexity of the discrete molecular search space. Researchers have turned to machine learning methods for help to tackle this difficult problem. However, most existing methods are either virtual screening on the available database of compounds by protein-ligand affinity prediction, or unconditional molecular generation, which does not take into account the information of the protein target. In this paper, we propose a protein target-oriented de novo drug design method, called AlphaDrug. Our method is able to automatically generate molecular drug candidates in an autoregressive way, and the drug candidates can dock into the given target protein well. To fulfill this goal, we devise a modified transformer network for the joint embedding of protein target and the molecule, and a Monte Carlo tree search (MCTS) algorithm for the conditional molecular generation. In the transformer variant, we impose a hierarchy of skip connections from protein encoder to molecule decoder for efficient feature transfer. The transformer variant computes the probabilities of next atoms based on the protein target and the molecule intermediate. We use the probabilities to guide the look-ahead search by MCTS to enhance or correct the next-atom selection. Moreover, MCTS is also guided by a value function implemented by a docking program, such that the paths with many low docking values are seldom chosen. Experiments on diverse protein targets demonstrate the effectiveness of our methods, indicating that AlphaDrug is a potentially promising solution to target-specific de novo drug design.
Affiliation(s)
- Hao Qian
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Centre for Cognitive Machines and Computational Health (CMaCH), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Cheng Lin
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Centre for Cognitive Machines and Computational Health (CMaCH), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Dengwei Zhao
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Centre for Cognitive Machines and Computational Health (CMaCH), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Shikui Tu
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Centre for Cognitive Machines and Computational Health (CMaCH), Shanghai Jiao Tong University, Shanghai 200240, China
| | - Lei Xu
- Department of Computer Science and Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
- Centre for Cognitive Machines and Computational Health (CMaCH), Shanghai Jiao Tong University, Shanghai 200240, China
|
32
|
Zhang J, Chen H. De Novo Molecule Design Using Molecular Generative Models Constrained by Ligand-Protein Interactions. J Chem Inf Model 2022; 62:3291-3306. [PMID: 35793555 DOI: 10.1021/acs.jcim.2c00177] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 11/28/2022]
Abstract
In recent years, molecular deep generative models have attracted much attention for their application in de novo drug design. A data-driven molecular deep generative model approximates the high-dimensional distribution of chemical space by learning from a large number of molecular structures. So far, most molecular generative models rely on purely 2D ligand information in structure generation. Here, we propose a novel molecular deep generative model which adopts a recurrent neural network (RNN) architecture coupled with a ligand-protein interaction fingerprint as a constraint. The fingerprint was constructed from ligand docking poses and represents the 3D binding mode of ligands in the protein pocket. In the current work, generative models constrained with interaction fingerprints were trained and compared with normal RNN models. It has been shown that models trained with ligand-protein interaction fingerprint constraints have a clear tendency to generate compounds that maintain similar binding modes. Our results demonstrate the potential application of the interaction fingerprint-constrained generative model for targeted molecule generation and guided exploration of drug-like chemical space.
Affiliation(s)
- Jie Zhang
- Guangdong Provincial Key Laboratory of Laboratory Animals, Guangdong Laboratory Animals Monitoring Institute, Guangzhou 510663, P. R. China.,State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, P. R. China.,Bioland Laboratory (Guangzhou Regenerative Medicine and Health─Guangdong Laboratory), Guangzhou 510530, P. R. China
| | - Hongming Chen
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health─Guangdong Laboratory), Guangzhou 510530, P. R. China.,Guangzhou International Bio Island, Guangzhou Laboratory, No. 9 XinDaoHuanBei Road, Guangzhou 510005, China
|
33
|
Wang M, Hsieh CY, Wang J, Wang D, Weng G, Shen C, Yao X, Bing Z, Li H, Cao D, Hou T. RELATION: A Deep Generative Model for Structure-Based De Novo Drug Design. J Med Chem 2022; 65:9478-9492. [PMID: 35713420 DOI: 10.1021/acs.jmedchem.2c00732] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Indexed: 01/30/2023]
Abstract
Deep learning (DL)-based de novo molecular design has recently gained considerable traction. Many DL-based generative models have been successfully developed to design novel molecules, but most of them are ligand-centric and the role of the 3D geometries of target binding pockets in molecular generation has not been well-exploited. Here, we proposed a new 3D-based generative model called RELATION. In the RELATION model, the BiTL algorithm was specifically designed to extract and transfer the desired geometric features of the protein-ligand complexes to a latent space for generation. The pharmacophore conditioning and docking-based Bayesian sampling were applied to efficiently navigate the vast chemical space for the design of molecules with desired geometric properties and pharmacophore features. As a proof of concept, the RELATION model was used to design inhibitors for two targets, AKT1 and CDK2. The calculation results demonstrated that the RELATION model could efficiently generate novel molecules with favorable binding affinity and pharmacophore features.
Affiliation(s)
- Mingyang Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Chang-Yu Hsieh
- Tencent, Tencent Quantum Lab, Shenzhen 518057, Guangdong, P. R. China
| | - Jike Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Dong Wang
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Gaoqi Weng
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Chao Shen
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
| | - Xiaojun Yao
- Dr. Neher's Biophysics Laboratory for Innovative Drug Discovery Macau Institute for Applied Research in Medicine and Health State Key Laboratory of Quality Research in Chinese Medicine, Macau University of Science and Technology, Taipa 999078, Macau, P. R. China
| | - Zhitong Bing
- Institute of Modern Physics, Chinese Academy of Sciences, Lanzhou 730000, P. R. China
| | - Honglin Li
- Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, P. R. China
| | - Dongsheng Cao
- Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, P. R. China
| | - Tingjun Hou
- Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences and Cancer Center, Zhejiang University, Hangzhou 310058, Zhejiang, P. R. China
|
34
|
Xie W, Wang F, Li Y, Lai L, Pei J. Advances and Challenges in De Novo Drug Design Using Three-Dimensional Deep Generative Models. J Chem Inf Model 2022; 62:2269-2279. [PMID: 35544331 DOI: 10.1021/acs.jcim.2c00042] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Indexed: 11/30/2022]
Abstract
A persistent goal for de novo drug design is to generate novel chemical compounds with desirable properties in a labor-, time-, and cost-efficient manner. Deep generative models provide alternative routes to this goal. Numerous model architectures and optimization strategies have been explored in recent years, most of which have been developed to generate two-dimensional molecular structures. Some generative models aiming at three-dimensional (3D) molecule generation have also been proposed, gaining attention for their unique advantages and potential to directly design drug-like molecules in a target-conditioning manner. This review highlights current developments in 3D molecular generative models combined with deep learning and discusses future directions for de novo drug design.
Affiliation(s)
- Weixin Xie
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Fanhao Wang
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Yibo Li
- Center for Life Sciences, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
| | - Luhua Lai
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China.,Peking-Tsinghua Center for Life Science at BNLMS, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China
| | - Jianfeng Pei
- Center for Quantitative Biology, Academy for Advanced Interdisciplinary Studies, Peking University, Beijing 100871, China
|
35
|
Bai Q, Liu S, Tian Y, Xu T, Banegas‐Luna AJ, Pérez‐Sánchez H, Huang J, Liu H, Yao X. Application advances of deep learning methods for de novo drug design and molecular dynamics simulation. WIREs Computational Molecular Science 2022. [DOI: 10.1002/wcms.1581] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/30/2022]
Affiliation(s)
- Qifeng Bai
- Key Lab of Preclinical Study for New Drugs of Gansu Province Institute of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Lanzhou University Lanzhou Gansu China
| | - Shuo Liu
- School of Pharmacy Lanzhou University Lanzhou Gansu China
| | - Yanan Tian
- School of Pharmacy Lanzhou University Lanzhou Gansu China
| | - Tingyang Xu
- Tencent AI Lab, Shenzhen Tencent Computer Ltd Shenzhen China
| | - Antonio Jesús Banegas‐Luna
- Structural Bioinformatics and High Performance Computing Research Group (BIO‐HPC), Computer Engineering Department UCAM Universidad Católica de Murcia Murcia Spain
| | - Horacio Pérez‐Sánchez
- Structural Bioinformatics and High Performance Computing Research Group (BIO‐HPC), Computer Engineering Department UCAM Universidad Católica de Murcia Murcia Spain
| | - Junzhou Huang
- Tencent AI Lab, Shenzhen Tencent Computer Ltd Shenzhen China
| | - Huanxiang Liu
- School of Pharmacy Lanzhou University Lanzhou Gansu China
| | - Xiaojun Yao
- College of Chemistry and Chemical Engineering Lanzhou University Lanzhou Gansu China
|
36
|
Shi W, Singha M, Srivastava G, Pu L, Ramanujam J, Brylinski M. Pocket2Drug: An Encoder-Decoder Deep Neural Network for the Target-Based Drug Design. Front Pharmacol 2022; 13:837715. [PMID: 35359869 PMCID: PMC8962739 DOI: 10.3389/fphar.2022.837715] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/17/2021] [Accepted: 02/10/2022] [Indexed: 11/13/2022] Open
Abstract
Computational modeling is an essential component of modern drug discovery. One of its most important applications is to select promising drug candidates for pharmacologically relevant target proteins. Because of continuing advances in structural biology, putative binding sites for small organic molecules are being discovered in numerous proteins linked to various diseases. These valuable data offer new opportunities to build efficient computational models predicting binding molecules for target sites through the application of data mining and machine learning. In particular, deep neural networks are powerful techniques capable of learning from complex data in order to make informed drug binding predictions. In this communication, we describe Pocket2Drug, a deep graph neural network model to predict binding molecules for a given ligand binding site. This approach first learns the conditional probability distribution of small molecules from a large dataset of pocket structures with supervised training, followed by the sampling of drug candidates from the trained model. Comprehensive benchmarking simulations show that using Pocket2Drug significantly improves the chances of finding molecules binding to target pockets compared to traditional drug selection procedures. Specifically, known binders are generated for as many as 80.5% of targets in a testing set composed of data dissimilar from those used to train the deep graph neural network model. Overall, Pocket2Drug is a promising computational approach to inform the discovery of novel biopharmaceuticals.
Affiliation(s)
- Wentao Shi
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
| | - Manali Singha
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
| | - Gopal Srivastava
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
| | - Limeng Pu
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
| | - J. Ramanujam
- Division of Electrical and Computer Engineering, Louisiana State University, Baton Rouge, LA, United States
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, United States
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, United States
- *Correspondence: Michal Brylinski,
|
37
|
Yuan Q, Chen S, Rao J, Zheng S, Zhao H, Yang Y. AlphaFold2-aware protein-DNA binding site prediction using graph transformer. Brief Bioinform 2022; 23:6509729. [PMID: 35039821 DOI: 10.1093/bib/bbab564] [Citation(s) in RCA: 51] [Impact Index Per Article: 17.0] [Received: 10/09/2021] [Revised: 11/24/2021] [Accepted: 12/09/2021] [Indexed: 12/13/2022] Open
Abstract
Protein-DNA interactions play crucial roles in biological systems, and identifying protein-DNA binding sites is the first step towards a mechanistic understanding of various biological activities (such as transcription and repair) and the design of novel drugs. Accurately identifying DNA-binding residues from protein sequence alone remains a challenging task. Currently, most existing sequence-based methods only consider contextual features of the sequential neighbors, which limits their ability to capture spatial information. Based on the recent breakthrough in protein structure prediction by AlphaFold2, we propose an accurate predictor, GraphSite, for identifying DNA-binding residues based on the structural models predicted by AlphaFold2. Here, we convert the binding site prediction problem into a graph node classification task and employ a transformer-based variant model to take the protein structural information into account. By leveraging predicted protein structures and a graph transformer, GraphSite substantially improves over the latest sequence-based and structure-based methods. The algorithm is further confirmed on an independent test set of 181 proteins, where GraphSite surpasses the state-of-the-art structure-based method by 16.4% in area under the precision-recall curve and 11.2% in Matthews correlation coefficient. We provide the datasets, the predicted structures and the source codes along with the pre-trained models of GraphSite at https://github.com/biomed-AI/GraphSite. The GraphSite web server is freely available at https://biomed.nscc-gz.cn/apps/GraphSite.
Affiliation(s)
- Qianmu Yuan
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Sheng Chen
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Jiahua Rao
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Shuangjia Zheng
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
- Huiying Zhao
  - Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou 510000, China
- Yuedong Yang
  - School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510000, China
  - Key Laboratory of Machine Intelligence and Advanced Computing of MOE, Sun Yat-sen University, Guangzhou 510000, China
|
38
|
Wang M, Wang Z, Sun H, Wang J, Shen C, Weng G, Chai X, Li H, Cao D, Hou T. Deep learning approaches for de novo drug design: An overview. Curr Opin Struct Biol 2021; 72:135-144. [PMID: 34823138 DOI: 10.1016/j.sbi.2021.10.001] [Citation(s) in RCA: 58] [Impact Index Per Article: 14.5] [Received: 06/05/2021] [Revised: 08/28/2021] [Accepted: 10/10/2021] [Indexed: 01/01/2023]
Abstract
De novo drug design is the process of generating novel lead compounds with desirable pharmacological and physiochemical properties. The application of deep learning (DL) in de novo drug design has become a hot topic, and many DL-based approaches have been developed for molecular generation tasks. Generally, these approaches were developed as per four frameworks: recurrent neural networks; encoder-decoder; reinforcement learning; and generative adversarial networks. In this review, we first introduced the molecular representation and assessment metrics used in DL-based de novo drug design. Then, we summarized the features of each architecture. Finally, the potential challenges and future directions of DL-based molecular generation were prospected.
Affiliation(s)
- Mingyang Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Zhe Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Huiyong Sun
  - Department of Medicinal Chemistry, China Pharmaceutical University, Nanjing 210009, Jiangsu, PR China
- Jike Wang
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Chao Shen
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Gaoqi Weng
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Xin Chai
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
- Honglin Li
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
  - Shanghai Key Laboratory of New Drug Design, School of Pharmacy, East China University of Science & Technology, Shanghai 200237, PR China
- Dongsheng Cao
  - Xiangya School of Pharmaceutical Sciences, Central South University, Changsha 410013, Hunan, PR China
- Tingjun Hou
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, PR China
|
39
|
Imrie F, Hadfield TE, Bradley AR, Deane CM. Deep generative design with 3D pharmacophoric constraints. Chem Sci 2021; 12:14577-14589. [PMID: 34881010 PMCID: PMC8580048 DOI: 10.1039/d1sc02436a] [Citation(s) in RCA: 39] [Impact Index Per Article: 9.8] [Received: 05/03/2021] [Accepted: 10/18/2021] [Indexed: 12/30/2022] Open
Abstract
Generative models have increasingly been proposed as a solution to the molecular design problem. However, it has proved challenging to control the design process or incorporate prior knowledge, limiting their practical use in drug discovery. In particular, generative methods have made limited use of three-dimensional (3D) structural information even though this is critical to binding. This work describes a method to incorporate such information and demonstrates the benefit of doing so. We combine an existing graph-based deep generative model, DeLinker, with a convolutional neural network to utilise physically-meaningful 3D representations of molecules and target pharmacophores. We apply our model, DEVELOP, to both linker and R-group design, demonstrating its suitability for both hit-to-lead and lead optimisation. The 3D pharmacophoric information results in improved generation and allows greater control of the design process. In multiple large-scale evaluations, we show that including 3D pharmacophoric constraints results in substantial improvements in the quality of generated molecules. On a challenging test set derived from PDBbind, our model improves the proportion of generated molecules with high 3D similarity to the original molecule by over 300%. In addition, DEVELOP recovers 10× more of the original molecules compared to the baseline DeLinker method. Our approach is general-purpose, readily modifiable to alternate 3D representations, and can be incorporated into other generative frameworks. Code is available at https://github.com/oxpig/DEVELOP.
Affiliation(s)
- Fergus Imrie
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Thomas E Hadfield
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Anthony R Bradley
  - Exscientia Ltd, The Schrödinger Building, Oxford Science Park, Oxford OX4 4GE, UK
- Charlotte M Deane
  - Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
|