1
Tan JZE, Wee J, Gong X, Xia K. Topology-Enhanced Machine Learning Model (Top-ML) for Anticancer Peptide Prediction. J Chem Inf Model 2025; 65:4232-4242. [PMID: 40229641 DOI: 10.1021/acs.jcim.5c00476] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/16/2025]
Abstract
Recently, therapeutic peptides have demonstrated great promise for cancer treatment. To explore powerful anticancer peptides, artificial intelligence (AI)-based approaches have been developed to systematically screen potential candidates. However, the lack of efficient featurization of peptides has become a bottleneck for these machine-learning models. In this paper, we propose a topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction. Our Top-ML employs peptide topological features derived from its sequence "connection" information characterized by spectral descriptors. Our Top-ML model, employing an Extra-Trees classifier, has been validated on the AntiCP 2.0 and mACPpred 2.0 benchmark data sets, achieving state-of-the-art performance or results comparable to existing deep learning models, while providing greater interpretability. Our results highlight the potential of leveraging novel topology-based featurization to accelerate the identification of anticancer peptides.
Affiliation(s)
- Joshua Zhi En Tan
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- JunJie Wee
  - Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Xue Gong
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
2
Jiang J, Chen L, Zhu Y, Shi Y, Qiu H, Zhang B, Zhou T, Wei GW. Proteomic Learning of Gamma-Aminobutyric Acid (GABA) Receptor-Mediated Anesthesia. J Chem Inf Model 2025; 65:3655-3668. [PMID: 40094320 PMCID: PMC12004937 DOI: 10.1021/acs.jcim.5c00114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 02/27/2025] [Accepted: 03/04/2025] [Indexed: 03/19/2025]
Abstract
Anesthetics are crucial in surgical procedures and therapeutic interventions, but they come with side effects and varying levels of effectiveness, calling for novel anesthetic agents that offer more precise and controllable effects. Targeting gamma-aminobutyric acid (GABA) receptors, the primary inhibitory receptors in the central nervous system, could enhance their inhibitory action, potentially reducing side effects while improving the potency of anesthetics. In this study, we introduce proteomic learning of GABA receptor-mediated anesthesia based on 24 GABA receptor subtypes by considering over 4000 proteins in protein-protein interaction (PPI) networks and over 1.5 million known binding compounds. We develop a corresponding drug-target interaction network to identify potential lead compounds for novel anesthetic design. To ensure robust proteomic learning predictions, we curated a data set comprising 136 targets from a pool of 980 targets within the PPI networks. We employed three machine learning algorithms, integrating advanced natural language processing (NLP) models such as pretrained transformers and autoencoder embeddings. Through a comprehensive screening process, we evaluated the side effects and repurposing potential of over 180,000 drug candidates targeting the GABRA5 receptor. Additionally, we assessed the ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties of these candidates to identify those with near-optimal characteristics. This approach also involved optimizing the structures of existing anesthetics. Our work presents an innovative strategy for the development of new anesthetic drugs, optimization of anesthetic use, and a deeper understanding of potential anesthesia-related side effects.
Affiliation(s)
- Jian Jiang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
  - Department of Mathematics, Michigan State University, East Lansing 48824, Michigan, United States
- Long Chen
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Yueying Zhu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Yazhou Shi
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Huahai Qiu
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Bengong Zhang
  - Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Tianshou Zhou
  - Key Laboratory of Computational Mathematics, Guangdong Province, and School of Mathematics, Sun Yat-sen University, Guangzhou 510006, P. R. China
- Guo-Wei Wei
  - Department of Mathematics, Michigan State University, East Lansing 48824, Michigan, United States
  - Department of Electrical and Computer Engineering, Michigan State University, East Lansing 48824, Michigan, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing 48824, Michigan, United States
3
Kyro GW, Martin MT, Watt ED, Batista VS. CardioGenAI: a machine learning-based framework for re-engineering drugs for reduced hERG liability. J Cheminform 2025; 17:30. [PMID: 40045386 PMCID: PMC11881490 DOI: 10.1186/s13321-025-00976-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2024] [Accepted: 02/21/2025] [Indexed: 03/09/2025] Open
Abstract
The link between in vitro hERG ion channel inhibition and subsequent in vivo QT interval prolongation, a critical risk factor for the development of arrhythmias such as Torsade de Pointes, is so well established that in vitro hERG activity alone is often sufficient to end the development of an otherwise promising drug candidate. It is therefore of tremendous interest to develop advanced methods for identifying hERG-active compounds in the early stages of drug development, as well as for proposing redesigned compounds with reduced hERG liability and preserved primary pharmacology. In this work, we present CardioGenAI, a machine learning-based framework for re-engineering both developmental and commercially available drugs for reduced hERG activity while preserving their pharmacological activity. The framework incorporates novel state-of-the-art discriminative models for predicting hERG channel activity, as well as activity against the voltage-gated NaV1.5 and CaV1.2 channels due to their potential implications in modulating the arrhythmogenic potential induced by hERG channel blockade. We applied the complete framework to pimozide, an FDA-approved antipsychotic agent that demonstrates high affinity to the hERG channel, and generated 100 refined candidates. Remarkably, among the candidates is fluspirilene, a compound which is of the same class of drugs as pimozide (diphenylmethanes) and therefore has similar pharmacological activity, yet exhibits over 700-fold weaker binding to hERG. Furthermore, we demonstrated the framework's ability to optimize hERG, NaV1.5 and CaV1.2 profiles of multiple FDA-approved compounds while maintaining the physicochemical nature of the original drugs. We envision that this method can effectively be applied to developmental compounds exhibiting hERG liabilities to provide a means of rescuing drug development programs that have stalled due to hERG-related safety concerns. Additionally, the discriminative models can also serve independently as effective components of virtual screening pipelines. We have made all of our software open-source at https://github.com/gregory-kyro/CardioGenAI to facilitate integration of the CardioGenAI framework for molecular hypothesis generation into drug discovery workflows.
Scientific contribution
This work introduces CardioGenAI, an open-source machine learning-based framework designed to re-engineer drugs for reduced hERG liability while preserving their pharmacological activity. The complete CardioGenAI framework can be applied to developmental compounds exhibiting hERG liabilities to provide a means of rescuing drug discovery programs facing hERG-related challenges. In addition, the framework incorporates novel state-of-the-art discriminative models for predicting hERG, NaV1.5 and CaV1.2 channel activity, which can function independently as effective components of virtual screening pipelines.
Affiliation(s)
- Gregory W Kyro
  - Department of Chemistry, Yale University, New Haven, CT, 06511, USA
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Matthew T Martin
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Eric D Watt
  - Drug Safety Research & Development, Pfizer Research & Development, Groton, CT, 06340, USA
- Victor S Batista
  - Department of Chemistry, Yale University, New Haven, CT, 06511, USA
4
Fu S, Chen Z, Luo Z, Nie M, Fu T, Zhou Y, Yang Q, Zhu F, Ni F. Chem(Pro)2: the atlas of chemoproteomic probes labelling human proteins. Nucleic Acids Res 2025; 53:D1651-D1662. [PMID: 39436046 PMCID: PMC11701659 DOI: 10.1093/nar/gkae943] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/15/2024] [Revised: 09/25/2024] [Accepted: 10/11/2024] [Indexed: 10/23/2024] Open
Abstract
Chemoproteomic probes (CPPs) are widely regarded as powerful molecular biology tools that enable the highly efficient discovery of both binding proteins and modes of action for the studied compounds. They have been successfully used to validate targets and identify binders. The design of CPPs is considered extremely challenging and calls for generalization from a large amount of probe data. However, no existing database provides such data on CPPs. Herein, a database entitled 'Chem(Pro)2' was therefore developed to systematically describe the atlas of diverse types of CPPs labelling human proteins in living cells or lysates. With the booming application of chemoproteomic techniques and artificial intelligence in current chemical biology studies, Chem(Pro)2 is expected to facilitate AI-based learning of interaction patterns among molecules for discovering innovative targets and new drugs. Chem(Pro)2 is open to all users without any login requirement at: https://idrblab.org/chemprosquare/.
Affiliation(s)
- Songsen Fu
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
  - LeadArt Biotechnologies Ltd., Ningbo 315201, China
- Zhen Chen
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Zhiming Luo
  - LeadArt Biotechnologies Ltd., Ningbo 315201, China
- Meiyun Nie
  - LeadArt Biotechnologies Ltd., Ningbo 315201, China
- Tingting Fu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Ying Zhou
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
- Qingxia Yang
  - Zhejiang Provincial Key Laboratory of Precision Diagnosis and Therapy for Major Gynecological Diseases, Women's Hospital, Zhejiang University School of Medicine, Hangzhou 310058, China
- Feng Zhu
  - College of Pharmaceutical Sciences, The Second Affiliated Hospital, Zhejiang University School of Medicine, State Key Laboratory of Advanced Drug Delivery and Release Systems, Zhejiang University, Hangzhou 310058, China
  - Innovation Institute for Artificial Intelligence in Medicine of Zhejiang University, Alibaba-Zhejiang University Joint Research Center of Future Digital Healthcare, Hangzhou 330110, China
- Feng Ni
  - Institute of Drug Discovery Technology, Ningbo University, Ningbo 315211, China
  - LeadArt Biotechnologies Ltd., Ningbo 315201, China
5
Kim H, Ryu S, Jung N, Yang J, Seok C. CSearch: chemical space search via virtual synthesis and global optimization. J Cheminform 2024; 16:137. [PMID: 39639340 PMCID: PMC11622599 DOI: 10.1186/s13321-024-00936-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/09/2024] [Accepted: 11/22/2024] [Indexed: 12/07/2024] Open
Abstract
The two key components of computational molecular design are virtually generating molecules and predicting the properties of these generated molecules. This study focuses on an effective method for molecular generation through virtual synthesis and global optimization of a given objective function. Using a pre-trained graph neural network (GNN) objective function to approximate the docking energies of compounds for four target receptors, we generated highly optimized compounds with 300-400 times less computational effort compared to virtual compound library screening. These optimized compounds exhibit similar synthesizability and diversity to known binders with high potency and are notably novel compared to library chemicals or known ligands. This method, called CSearch, can be effectively utilized to generate chemicals optimized for a given objective function. With the GNN function approximating docking energies, CSearch generated molecules with predicted binding poses to the target receptors similar to known inhibitors, demonstrating its effectiveness in producing drug-like binders.
Scientific contribution
We have developed a method for effectively exploring the chemical space of drug-like molecules using a global optimization algorithm with fragment-based virtual synthesis. The compounds generated using this method optimize the given objective function efficiently and are synthesizable like commercial library compounds. Furthermore, they are diverse, novel drug-like molecules with properties similar to known inhibitors for target receptors.
Affiliation(s)
- Hakjean Kim
  - Department of Chemistry, Seoul National University, Seoul, 08826, Republic of Korea
- Nuri Jung
  - Department of Chemistry, Seoul National University, Seoul, 08826, Republic of Korea
- Chaok Seok
  - Department of Chemistry, Seoul National University, Seoul, 08826, Republic of Korea
  - Galux Inc, Seoul, 08738, Republic of Korea
6
Zhang Y, Shen C, Xia K. Multi-Cover Persistence (MCP)-based machine learning for polymer property prediction. Brief Bioinform 2024; 25:bbae465. [PMID: 39323091 PMCID: PMC11424509 DOI: 10.1093/bib/bbae465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2024] [Revised: 08/07/2024] [Accepted: 09/05/2024] [Indexed: 09/27/2024] Open
Abstract
Accurate and efficient prediction of polymer properties is crucial for polymer design. Recently, data-driven artificial intelligence (AI) models have demonstrated great promise in polymer property analysis. Despite this progress, a pivotal challenge in all AI-driven models remains the effective representation of molecules. Here we introduce Multi-Cover Persistence (MCP)-based molecular representation and featurization for the first time. Our MCP-based polymer descriptors are combined with machine learning models, in particular Gradient Boosting Tree (GBT) models, for polymer property prediction. Unlike all previous molecular representations, polymer molecular structure and interactions are represented as MCP, which utilizes Delaunay slices at different dimensions and Rhomboid tiling to characterize the complicated geometric and topological information within the data. Statistical features from the generated persistent barcodes are used as polymer descriptors and further combined with the GBT model. Our model has been extensively validated on polymer benchmark datasets. Our models can outperform traditional fingerprint-based models and have accuracy similar to that of geometric deep learning models. In particular, our model tends to be more effective on large-sized monomer structures, demonstrating the great potential of MCP in characterizing more complicated polymer data. This work underscores the potential of MCP in polymer informatics, presenting a novel perspective on molecular representation and its application in polymer science.
Affiliation(s)
- Yipeng Zhang
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
- Cong Shen
  - Department of Mathematics, National University of Singapore, Singapore 119076, Singapore
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371, Singapore
7
Lavecchia A. Navigating the frontier of drug-like chemical space with cutting-edge generative AI models. Drug Discov Today 2024; 29:104133. [PMID: 39103144 DOI: 10.1016/j.drudis.2024.104133] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/21/2024] [Revised: 07/20/2024] [Accepted: 07/31/2024] [Indexed: 08/07/2024]
Abstract
Deep generative models (GMs) have transformed the exploration of drug-like chemical space (CS) by generating novel molecules through complex, nontransparent processes, bypassing direct structural similarity. This review examines five key architectures for CS exploration: recurrent neural networks (RNNs), variational autoencoders (VAEs), generative adversarial networks (GANs), normalizing flows (NF), and Transformers. It discusses molecular representation choices, training strategies for focused CS exploration, evaluation criteria for CS coverage, and related challenges. Future directions include refining models, exploring new notations, improving benchmarks, and enhancing interpretability to better understand biologically relevant molecular properties.
Affiliation(s)
- Antonio Lavecchia
  - 'Drug Discovery' Laboratory, Department of Pharmacy, University of Naples Federico II, I-80131 Naples, Italy.
8
Matsukiyo Y, Tengeiji A, Li C, Yamanishi Y. Transcriptionally Conditional Recurrent Neural Network for De Novo Drug Design. J Chem Inf Model 2024; 64:5844-5852. [PMID: 39049516 DOI: 10.1021/acs.jcim.4c00531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/27/2024]
Abstract
Computational molecular generation methods that generate chemical structures from gene expression profiles have been actively developed for de novo drug design. However, most omics-based methods involve complex models consisting of multiple neural networks, which require pretraining. In this study, we propose a straightforward molecular generation method called GxRNN (gene expression profile-based recurrent neural network), employing a single recurrent neural network (RNN) that necessitates no pretraining for omics-based drug design. Specifically, our method utilizes the desired gene expression profile as input for the RNN, conditioning it to generate molecules likely to induce a similar profile. In a case study involving ten target proteins, GxRNN exhibited superior structural reproducibility of known ligands, surpassing several existing methods. This advancement positions our proposed method as a promising tool for facilitating de novo drug design.
Affiliation(s)
- Yuki Matsukiyo
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
  - Department of Complex Systems Science, Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
- Atsushi Tengeiji
  - Modality Research Laboratories I, Daiichi Sankyo Co., Ltd., 1-2-58 Hiromachi, Shinagawa, Tokyo 140-8710, Japan
- Chen Li
  - Department of Complex Systems Science, Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
- Yoshihiro Yamanishi
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
  - Department of Complex Systems Science, Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
9
Jin H, Merz KM. LigandDiff: de Novo Ligand Design for 3D Transition Metal Complexes with Diffusion Models. J Chem Theory Comput 2024; 20:4377-4384. [PMID: 38743854 PMCID: PMC11137811 DOI: 10.1021/acs.jctc.4c00232] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2024] [Revised: 05/06/2024] [Accepted: 05/07/2024] [Indexed: 05/16/2024]
Abstract
Transition metal complexes are a class of compounds with varied and versatile properties, making them of great technological importance. Their applications cover a wide range of fields, either as metallodrugs in medicine or as materials, catalysts, batteries, solar cells, etc. The demand for the novel design of transition metal complexes with new properties remains of great interest. However, the traditional high-throughput screening approach is inherently expensive and laborious since it depends on human expertise. Here, we present LigandDiff, a generative model for the de novo design of novel transition metal complexes. Unlike the existing methods that simply extract and combine ligands with the metal to get new complexes, LigandDiff aims at designing configurationally novel ligands from scratch, which opens new pathways for the discovery of organometallic complexes. Moreover, it overcomes the limitations of current methods, where the diversity of new complexes highly relies on the diversity of available ligands, while LigandDiff can design numerous novel ligands without human intervention. Our results indicate that LigandDiff designs unique and novel ligands under different contexts, and these generated ligands are synthetically accessible. Moreover, LigandDiff shows good transferability by generating successful ligands for any transition metal complex.
Affiliation(s)
- Hongni Jin
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
- Kenneth M. Merz
  - Department of Chemistry, Michigan State University, East Lansing, Michigan 48824, United States
  - Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
10
Ye G. De novo drug design as GPT language modeling: large chemistry models with supervised and reinforcement learning. J Comput Aided Mol Des 2024; 38:20. [PMID: 38647700 PMCID: PMC11035455 DOI: 10.1007/s10822-024-00559-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/08/2023] [Accepted: 03/22/2024] [Indexed: 04/25/2024]
Abstract
In recent years, generative machine learning algorithms have been successful in designing innovative drug-like molecules. SMILES is a sequence-like language used in most effective drug design models. Because of the data's sequential structure, models such as recurrent neural networks and transformers can design pharmacological compounds with optimized efficacy. Large language models have advanced recently, but their implications for drug design have not yet been explored. Although one study successfully pre-trained a large chemistry model (LCM), its application to specific tasks in drug discovery is unknown. In this study, the drug design task is modeled as a causal language modeling problem. Thus, the procedure of reward modeling, supervised fine-tuning, and proximal policy optimization was used to transfer the LCM to drug design, similar to OpenAI's ChatGPT and InstructGPT procedures. By combining the SMILES sequence with chemical descriptors, the novel efficacy evaluation model outperformed those of previous studies. After proximal policy optimization, the drug design model generated molecules with 99.2% having efficacy pIC50 > 7 towards the amyloid precursor protein, with 100% of the generated molecules being valid and novel. This demonstrated the applicability of LCMs in drug discovery, with benefits including less data consumption while fine-tuning. The applicability of LCMs to drug discovery opens the door for larger studies involving reinforcement learning with human feedback, where chemists provide feedback to LCMs and generate higher-quality molecules. LCMs' ability to design similar molecules from datasets paves the way for more accessible, non-patented alternatives to drug molecules.
Affiliation(s)
- Gavin Ye
  - Columbia Grammar & Preparatory School, New York, NY, USA.
11
Matsukiyo Y, Yamanaka C, Yamanishi Y. De Novo Generation of Chemical Structures of Inhibitor and Activator Candidates for Therapeutic Target Proteins by a Transformer-Based Variational Autoencoder and Bayesian Optimization. J Chem Inf Model 2024; 64:2345-2355. [PMID: 37768595 DOI: 10.1021/acs.jcim.3c00824] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Indexed: 09/29/2023]
Abstract
Deep generative models for molecular generation have been gaining much attention as structure generators to accelerate drug discovery. However, most previously developed methods are chemistry-centric approaches, and comprehensive biological responses in the cell have not been taken into account. In this study, we propose a novel computational method, TRIOMPHE-BOA (transcriptome-based inference and generation of molecules with desired phenotypes using the Bayesian optimization algorithm), to generate new chemical structures of inhibitor or activator candidates for therapeutic target proteins by integrating chemically and genetically perturbed transcriptome profiles. In the algorithm, the substructures of multiple molecules that were selected based on the transcriptome analysis are fused in the design of new chemical structures by exploring the latent space of a Transformer-based variational autoencoder using Bayesian optimization. Our results demonstrate the usefulness of the proposed method in terms of having high reproducibility of existing ligands for 10 therapeutic target proteins when compared with previous methods. Moreover, this method can be applied to proteins without detailed 3D structures or known ligands and is expected to become a powerful tool for more efficient hit identification.
Affiliation(s)
- Yuki Matsukiyo
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Chikashige Yamanaka
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
- Yoshihiro Yamanishi
  - Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka 820-8502, Japan
  - Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, Aichi 464-8601, Japan
12
Pang C, Qiao J, Zeng X, Zou Q, Wei L. Deep Generative Models in De Novo Drug Molecule Generation. J Chem Inf Model 2024; 64:2174-2194. [PMID: 37934070 DOI: 10.1021/acs.jcim.3c01496] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Indexed: 11/08/2023]
Abstract
The discovery of new drugs has important implications for human health. Traditional methods for drug discovery rely on experiments to optimize the structure of lead molecules, which are time-consuming and costly. Recently, artificial intelligence has exhibited promising and efficient performance for drug-like molecule generation. In particular, deep generative models have achieved great success in de novo generation of drug-like molecules with desired properties, showing massive potential for novel drug discovery. In this study, we review the recent progress of molecule generation using deep generative models, mainly focusing on molecule representations, public databases, data processing tools, and advanced artificial intelligence-based molecule generation frameworks. In particular, we present a comprehensive comparison of state-of-the-art deep generative models for molecule generation and a summary of commonly used molecular design strategies. We identify research gaps and challenges of molecule generation such as the need for better databases, missing 3D information in molecular representation, and the lack of high-precision evaluation metrics. We suggest future directions for molecular generation and drug discovery.
Affiliation(s)
- Chao Pang
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Jianbo Qiao
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
- Xiangxiang Zeng
  - College of Information Science and Engineering, Hunan University, Changsha 410082, China
- Quan Zou
  - Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
- Leyi Wei
  - School of Software, Shandong University, Jinan 250100, China
  - Joint SDU-NTU Centre for Artificial Intelligence Research (C-FAIR), Shandong University, Jinan 250100, China
13
Qiu Y, Cheng F. Artificial intelligence for drug discovery and development in Alzheimer's disease. Curr Opin Struct Biol 2024; 85:102776. [PMID: 38335558 DOI: 10.1016/j.sbi.2024.102776] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 12/29/2023] [Accepted: 01/15/2024] [Indexed: 02/12/2024]
Abstract
The complex molecular mechanism and pathophysiology of Alzheimer's disease (AD) limit the development of effective therapeutics or prevention strategies. Artificial Intelligence (AI)-guided drug discovery combined with genetics/multi-omics (genomics, epigenomics, transcriptomics, proteomics, and metabolomics) analysis contributes to the understanding of the pathophysiology and precision medicine of the disease, including AD and AD-related dementia. In this review, we summarize the AI-driven methodologies for AD-agnostic drug discovery and development, including de novo drug design, virtual screening, and prediction of drug-target interactions, all of which have shown potential. In particular, AI-based drug repurposing emerges as a compelling strategy to identify new indications for existing drugs for AD. We provide several emerging AD targets from human genetics and multi-omics findings and highlight recent AI-based technologies and their applications in drug discovery using AD as a prototypical example. In closing, we discuss future challenges and directions in AI-based drug discovery for AD and other neurodegenerative diseases.
Affiliation(s)
- Yunguang Qiu
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA. https://twitter.com/YunguangQiu
| | - Feixiong Cheng
- Genomic Medicine Institute, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA; Department of Molecular Medicine, Cleveland Clinic Lerner College of Medicine, Case Western Reserve University, Cleveland, OH 44195, USA; Cleveland Clinic Genome Center, Lerner Research Institute, Cleveland Clinic, Cleveland, OH 44195, USA.
|
14
|
Kyro GW, Morgunov A, Brent RI, Batista VS. ChemSpaceAL: An Efficient Active Learning Methodology Applied to Protein-Specific Molecular Generation. J Chem Inf Model 2024; 64:653-665. [PMID: 38287889 DOI: 10.1021/acs.jcim.3c01456] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/31/2024]
Abstract
The incredible capabilities of generative artificial intelligence models have inevitably led to their application in the domain of drug discovery. Within this domain, the vastness of chemical space motivates the development of more efficient methods for identifying regions with molecules that exhibit desired characteristics. In this work, we present a computationally efficient active learning methodology and demonstrate its applicability to targeted molecular generation. When applied to c-Abl kinase, a protein with FDA-approved small-molecule inhibitors, the model learns to generate molecules similar to the inhibitors without prior knowledge of their existence and even reproduces two of them exactly. We also show that the methodology is effective for a protein without any commercially available small-molecule inhibitors, the HNH domain of the CRISPR-associated protein 9 (Cas9) enzyme. To facilitate implementation and reproducibility, we made all of our software available through the open-source ChemSpaceAL Python package.
Affiliation(s)
- Gregory W Kyro
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Anton Morgunov
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Rafael I Brent
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
| | - Victor S Batista
- Department of Chemistry, Yale University, New Haven, Connecticut 06511-8499, United States
|
15
|
Zou J, Yu J, Hu P, Zhao L, Shi S. STAGAN: An approach for improve the stability of molecular graph generation based on generative adversarial networks. Comput Biol Med 2023; 167:107691. [PMID: 37976819 DOI: 10.1016/j.compbiomed.2023.107691] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/21/2022] [Revised: 09/18/2023] [Accepted: 11/06/2023] [Indexed: 11/19/2023]
Abstract
With the wide application of deep learning in drug discovery, deep generative models have shown their advantages in drug molecule generation. Generative adversarial networks can be used to learn the internal structure of molecules, but the training process may be unstable, with problems such as gradient disappearance and model collapse, which may lead to the generation of molecules that do not conform to chemical rules or that share a single style. In this paper, a novel method called STAGAN is proposed to address the difficulty of model training by adding a new gradient penalty term to the discriminator and designing a parallel batch normalization layer for the generator. As an illustration of the method, STAGAN generated more valid and unique molecules than previous models on training datasets from QM9 and ZINC-250K. This indicates that the proposed method can effectively address the instability problem in the model training process and can provide more instructive guidance for the further study of molecular graph generation.
Affiliation(s)
- Jinping Zou
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Jialin Yu
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Pengwei Hu
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Long Zhao
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Shaoping Shi
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China; Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China.
|
16
|
Xu M, Chen H. Tree-Invent: A Novel Multipurpose Molecular Generative Model Constrained with a Topological Tree. J Chem Inf Model 2023; 63:7067-7082. [PMID: 37962855 DOI: 10.1021/acs.jcim.3c01626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2023]
Abstract
De novo molecular design plays an important role in drug discovery. Here, a novel generative model, Tree-Invent, was proposed to integrate topological constraints in the generation of a molecular graph. In this model, a molecular graph is represented as a topological tree in which a ring system, a nonring atom, and a chemical bond are regarded as the ring node, single node, and edge, respectively. The molecule generation is driven by three independent submodels for carrying out operations of node addition, ring generation, and node connection. One unique feature of the generative model is that the topological tree structure can be specified as a constraint for structure generation, which provides more precise control of structure generation. Combined with reinforcement learning, the Tree-Invent model could efficiently explore targeted chemical space. Moreover, the Tree-Invent model is flexible enough to be used in versatile molecule design settings such as scaffold decoration, scaffold hopping, and linker generation.
Affiliation(s)
- Mingyuan Xu
- Guangzhou National Laboratory, No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou, Guangdong 510005, China
| | - Hongming Chen
- Guangzhou National Laboratory, No. 9 XingDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou, Guangdong 510005, China
|
17
|
Wang R, Feng H, Wei GW. ChatGPT in Drug Discovery: A Case Study on Anticocaine Addiction Drug Development with Chatbots. J Chem Inf Model 2023; 63:7189-7209. [PMID: 37956228 PMCID: PMC11021135 DOI: 10.1021/acs.jcim.3c01429] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/15/2023]
Abstract
The birth of ChatGPT, a cutting-edge language model-based chatbot developed by OpenAI, ushered in a new era in AI. However, due to potential pitfalls, its role in rigorous scientific research is not clear yet. This paper vividly showcases its innovative application within the field of drug discovery. Focused specifically on developing anticocaine addiction drugs, the study employs GPT-4 as a virtual guide, offering strategic and methodological insights to researchers working on generative models for drug candidates. The primary objective is to generate optimal drug-like molecules with desired properties. By leveraging the capabilities of ChatGPT, the study introduces a novel approach to the drug discovery process. This symbiotic partnership between AI and researchers transforms how drug development is approached. Chatbots become facilitators, steering researchers toward innovative methodologies and productive paths for creating effective drug candidates. This research sheds light on the collaborative synergy between human expertise and AI assistance, wherein ChatGPT's cognitive abilities enhance the design and development of pharmaceutical solutions. This paper not only explores the integration of advanced AI in drug discovery but also reimagines the landscape by advocating for AI-powered chatbots as trailblazers in revolutionizing therapeutic innovation.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
|
18
|
Zou J, Zhao L, Shi S. Generation of focused drug molecule library using recurrent neural network. J Mol Model 2023; 29:361. [PMID: 37932607 DOI: 10.1007/s00894-023-05772-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/30/2023] [Accepted: 10/26/2023] [Indexed: 11/08/2023]
Abstract
CONTEXT: With the wide application of deep learning in drug research and development, de novo molecular design methods based on recurrent neural networks (RNNs) have strong advantages in drug molecule generation. The RNN model can be used to learn the internal chemical structure of molecules, which is similar to a natural language processing task. Although techniques for generating target-specific molecular libraries based on RNN models are mature, research related to drug design and screening continues around the clock. Research based on de novo drug design methods to generate larger quantities of valid compounds is necessary. METHODS: In this study, a molecular generation model based on RNN was designed, which abandoned the traditional stacked RNN and introduced the Nested long short-term memory network structure. To enrich the library of focused molecules for specific targets, we fine-tuned the model using active molecules from novel coronavirus pneumonia and screened the molecules using machine learning models. Following rigorous screening, the selected molecules underwent molecular docking with the SARS-CoV-2 M-pro receptor using AutoDock2.4 to identify the top 3 potential inhibitors. Subsequently, 100-ns molecular dynamics simulations were conducted using Amber22. Molecule parameterization involved the GAFF2 force field, while the proteins were modeled using the ff19SB force field, with solvation facilitated by a truncated octahedral TIP3P solvent environment. Upon completion of molecular dynamics simulations, stability of ligand-protein complexes was assessed by analysis of RMSD, H-bonds, and MM-GBSA. The reasonable results indicate that the model can complete the task of de novo drug design and that the generated molecules have the potential to be ideal drug molecules.
Affiliation(s)
- Jinping Zou
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Long Zhao
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China
| | - Shaoping Shi
- Department of Mathematics, School of Mathematics and Computer Sciences, Nanchang University, Nanchang, 330031, China.
- Institute of Mathematics and Interdisciplinary Sciences, Nanchang University, Nanchang, 330031, China.
|
19
|
Wang R, Feng H, Wei GW. ChatGPT in Drug Discovery: A Case Study on Anti-Cocaine Addiction Drug Development with Chatbots. ARXIV 2023:arXiv:2308.06920v2. [PMID: 37645039 PMCID: PMC10462169] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/31/2023]
Abstract
The birth of ChatGPT, a cutting-edge language model-based chatbot developed by OpenAI, ushered in a new era in AI. However, due to potential pitfalls, its role in rigorous scientific research is not clear yet. This paper vividly showcases its innovative application within the field of drug discovery. Focused specifically on developing anti-cocaine addiction drugs, the study employs GPT-4 as a virtual guide, offering strategic and methodological insights to researchers working on generative models for drug candidates. The primary objective is to generate optimal drug-like molecules with desired properties. By leveraging the capabilities of ChatGPT, the study introduces a novel approach to the drug discovery process. This symbiotic partnership between AI and researchers transforms how drug development is approached. Chatbots become facilitators, steering researchers towards innovative methodologies and productive paths for creating effective drug candidates. This research sheds light on the collaborative synergy between human expertise and AI assistance, wherein ChatGPT's cognitive abilities enhance the design and development of potential pharmaceutical solutions. This paper not only explores the integration of advanced AI in drug discovery but also reimagines the landscape by advocating for AI-powered chatbots as trailblazers in revolutionizing therapeutic innovation.
Affiliation(s)
- Rui Wang
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Hongsong Feng
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA
- Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA
- Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
|
20
|
Parrot M, Tajmouati H, da Silva VBR, Atwood BR, Fourcade R, Gaston-Mathé Y, Do Huu N, Perron Q. Integrating synthetic accessibility with AI-based generative drug design. J Cheminform 2023; 15:83. [PMID: 37726842 PMCID: PMC10507964 DOI: 10.1186/s13321-023-00742-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/27/2023] [Accepted: 08/03/2023] [Indexed: 09/21/2023] Open
Abstract
Generative models are frequently used for de novo design in drug discovery projects to propose new molecules. However, the question of whether or not the generated molecules can be synthesized is not systematically taken into account during generation, even though being able to synthesize the generated molecules is a fundamental requirement for such methods to be useful in practice. Methods have been developed to estimate molecule "synthesizability", but, so far, there is no consensus on whether or not a molecule is synthesizable. In this paper we introduce the Retro-Score (RScore), which computes a synthetic accessibility score of molecules by performing a full retrosynthetic analysis through our data-driven synthetic planning software Spaya, and its dedicated API: Spaya-API (https://spaya.ai). We start by comparing several synthetic accessibility scores to a binary "chemist score" as estimated by chemists on a bench of generated molecules, as a first experimental validation that the RScore is a reliable synthetic accessibility score. We then describe a pipeline to generate molecules that validate a list of targets while still being easy to synthesize. We further this idea by performing experiments comparing molecular generator outputs across a range of constraints and conditions. We show that the RScore can be learned by a Neural Network, which leads to a new score: RSPred. We demonstrate that using the RScore or RSPred as a constraint during molecular generation enables our molecular generators to produce more synthesizable solutions, with higher diversity. The open-source Python code containing all the scores and the experiments can be found on ( https://github.com/iktos/generation-under-synthetic-constraint ).
Affiliation(s)
- Maud Parrot
- Iktos, 65 rue de Prony, 75017, Paris, France
|
21
|
Feng H, Wang R, Zhan CG, Wei GW. Multiobjective Molecular Optimization for Opioid Use Disorder Treatment Using Generative Network Complex. J Med Chem 2023; 66:12479-12498. [PMID: 37623046 PMCID: PMC11037444 DOI: 10.1021/acs.jmedchem.3c01053] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 08/26/2023]
Abstract
Opioid use disorder (OUD) has emerged as a significant global public health issue, necessitating the discovery of new medications. In this study, we propose a deep generative model that combines a stochastic differential equation (SDE)-based diffusion model with a pretrained autoencoder. The molecular generator enables efficient generation of molecules that target multiple opioid receptors, including mu, kappa, and delta. Additionally, we assess the ADMET (absorption, distribution, metabolism, excretion, and toxicity) properties of the generated molecules to identify druglike compounds. We develop a molecular optimization approach to enhance the pharmacokinetic properties of some lead compounds. Advanced binding affinity predictors were built using molecular fingerprints, including autoencoder embeddings, transformer embeddings, and topological Laplacians. Our process yields druglike molecules that can be used in highly focused experimental studies to further evaluate their pharmacological effects. Our machine learning platform serves as a valuable tool for designing effective molecules to address OUD.
Affiliation(s)
- Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Chang-Guo Zhan
- Department of Pharmaceutical Sciences, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
|
22
|
Yamanaka C, Uki S, Kaitoh K, Iwata M, Yamanishi Y. De novo drug design based on patient gene expression profiles via deep learning. Mol Inform 2023; 42:e2300064. [PMID: 37475603 DOI: 10.1002/minf.202300064] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/16/2023] [Revised: 05/25/2023] [Accepted: 07/20/2023] [Indexed: 07/22/2023]
Abstract
Computational de novo drug design is a challenging issue in medicine, and it is desirable to consider all of the relevant information of the biological systems in a disease state. Here, we propose a novel computational method to generate drug candidate molecular structures from patient gene expression profiles via deep learning, which we call DRAGONET. Our model can generate new molecules that are likely to counteract disease-specific gene expression patterns in patients, which is made possible by exploring the latent space constructed by a transformer-based variational autoencoder and integrating the substructures of disease-correlated molecules. We applied DRAGONET to generate drug candidate molecules for gastric cancer, atopic dermatitis, and Alzheimer's disease, and demonstrated that the newly generated molecules were chemically similar to registered drugs for each disease. This approach is applicable to diseases with unknown therapeutic target proteins and will make a significant contribution to the field of precision medicine.
Affiliation(s)
- Chikashige Yamanaka
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
| | - Shunya Uki
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
| | - Kazuma Kaitoh
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
- Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, 464-8602, Japan
| | - Michio Iwata
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
| | - Yoshihiro Yamanishi
- Department of Bioscience and Bioinformatics, Faculty of Computer Science and Systems Engineering, Kyushu Institute of Technology, 680-4 Kawazu, Iizuka, Fukuoka, 820-8502, Japan
- Graduate School of Informatics, Nagoya University, Chikusa, Nagoya, 464-8602, Japan
|
23
|
Dou B, Zhu Z, Merkurjev E, Ke L, Chen L, Jiang J, Zhu Y, Liu J, Zhang B, Wei GW. Machine Learning Methods for Small Data Challenges in Molecular Science. Chem Rev 2023; 123:8736-8780. [PMID: 37384816 PMCID: PMC10999174 DOI: 10.1021/acs.chemrev.3c00189] [Citation(s) in RCA: 79] [Impact Index Per Article: 39.5] [Indexed: 07/01/2023]
Abstract
Small data are often used in scientific and engineering research due to the presence of various constraints, such as time, cost, ethics, privacy, security, and technical limitations in data acquisition. However, because big data have been the focus for the past decade, small data and their challenges have received little attention, even though they are technically more severe in machine learning (ML) and deep learning (DL) studies. Overall, the small data challenge is often compounded by issues such as data diversity, imputation, noise, imbalance, and high dimensionality. Fortunately, the current big data era is characterized by technological breakthroughs in ML, DL, and artificial intelligence (AI), which enable data-driven scientific discovery, and many advanced ML and DL technologies developed for big data have inadvertently provided solutions for small data problems. As a result, significant progress has been made in ML and DL for small data challenges in the past decade. In this review, we summarize and analyze several emerging potential solutions to small data challenges in molecular science, including chemical and biological sciences. We review both basic machine learning algorithms, such as linear regression, logistic regression (LR), k-nearest neighbor (KNN), support vector machine (SVM), kernel learning (KL), random forest (RF), and gradient boosting trees (GBT), and more advanced techniques, including artificial neural network (ANN), convolutional neural network (CNN), U-Net, graph neural network (GNN), Generative Adversarial Network (GAN), long short-term memory (LSTM), autoencoder, transformer, transfer learning, active learning, graph-based semi-supervised learning, combining deep learning with traditional machine learning, and physical model-based data augmentation. We also briefly discuss the latest advances in these methods. Finally, we conclude the survey with a discussion of promising trends in small data challenges in molecular science.
Affiliation(s)
- Bozheng Dou
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Zailiang Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Lu Ke
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Long Chen
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yueying Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Jie Liu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan 430200, P. R. China
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
|
24
|
Kao PY, Yang YC, Chiang WY, Hsiao JY, Cao Y, Aliper A, Ren F, Aspuru-Guzik A, Zhavoronkov A, Hsieh MH, Lin YC. Exploring the Advantages of Quantum Generative Adversarial Networks in Generative Chemistry. J Chem Inf Model 2023. [PMID: 37171372 DOI: 10.1021/acs.jcim.3c00562] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Indexed: 05/13/2023]
Abstract
De novo drug design with desired biological activities is crucial for developing novel therapeutics for patients. The drug development process is time- and resource-consuming, and it has a low probability of success. Recent advances in machine learning and deep learning technology have reduced the time and cost of the discovery process and, therefore, improved pharmaceutical research and development. In this paper, we explore the combination of two rapidly developing fields with lead candidate discovery in the drug development process. First, artificial intelligence has already been demonstrated to successfully accelerate conventional drug design approaches. Second, quantum computing has demonstrated promising potential in different applications, such as quantum chemistry, combinatorial optimizations, and machine learning. This article explores hybrid quantum-classical generative adversarial networks (GAN) for small molecule discovery. We substituted each element of GAN with a variational quantum circuit (VQC) and demonstrated the quantum advantages in small-molecule drug discovery. Utilizing a VQC in the noise generator of a GAN to generate small molecules achieves better physicochemical properties and performance in the goal-directed benchmark than the classical counterpart. Moreover, we demonstrate the potential of a VQC with only tens of learnable parameters in the generator of GAN to generate small molecules. We also demonstrate the quantum advantage of a VQC in the discriminator of GAN. In this hybrid model, the number of learnable parameters is significantly smaller than in the classical ones, and it can still generate valid molecules. The hybrid model with only tens of training parameters in the quantum discriminator outperforms the MLP-based one in terms of both generated molecule properties and the achieved KL divergence. However, the hybrid quantum-classical GANs still face challenges in generating unique and valid molecules compared to their classical counterparts.
Affiliation(s)
- Po-Yu Kao
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
| | - Ya-Chu Yang
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
| | - Wei-Yin Chiang
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
| | - Jen-Yueh Hsiao
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
| | - Yudong Cao
- Zapata Computing, Inc., Boston, Massachusetts 02110, United States
| | - Alex Aliper
- Insilico Medicine AI Limited, Masdar City, Abu Dhabi 145748, UAE
| | - Feng Ren
- Insilico Medicine Shanghai Ltd., Shanghai 201203, China
| | - Alán Aspuru-Guzik
- Department of Chemistry, University of Toronto, Toronto, ON M5S 3H6, Canada
- Department of Computer Science, University of Toronto, Toronto, ON M5S 2E4, Canada
- Vector Institute for Artificial Intelligence, Toronto, ON M5S 1M1, Canada
- Lebovic Fellow, Canadian Institute for Advanced Research, Toronto, ON M5S 1M1, Canada
| | - Min-Hsiu Hsieh
- Hon Hai (Foxconn) Research Institute, Taipei 114699, Taiwan
| | - Yen-Chu Lin
- Insilico Medicine Taiwan Ltd., Taipei 110208, Taiwan
- Department of Pharmacy, National Yang Ming Chiao Tung University, Taipei 112304, Taiwan
|
25
|
Choo HY, Wee J, Shen C, Xia K. Fingerprint-Enhanced Graph Attention Network (FinGAT) Model for Antibiotic Discovery. J Chem Inf Model 2023; 63:2928-2935. [PMID: 37167016 DOI: 10.1021/acs.jcim.3c00045] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 05/13/2023]
Abstract
Artificial Intelligence (AI) techniques are of great potential to fundamentally change antibiotic discovery industries. Efficient and effective molecular featurization is key to all highly accurate learning models for antibiotic discovery. In this paper, we propose a fingerprint-enhanced graph attention network (FinGAT) model by the combination of sequence-based 2D fingerprints and structure-based graph representation. In our feature learning process, sequence information is transformed into a fingerprint vector, and structural information is encoded through a GAT module into another vector. These two vectors are concatenated and input into a multilayer perceptron (MLP) for antibiotic activity classification. Our model is extensively tested and compared with existing models. It has been found that our FinGAT can outperform various state-of-the-art GNN models in antibiotic discovery.
Affiliation(s)
- Hou Yee Choo
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
| | - JunJie Wee
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
| | - Cong Shen
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- College of Computer Science and Electronic Engineering, Hunan University, Changsha 410083, China
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
| |
|
26
|
Feng H, Jiang J, Wei GW. Machine-learning repurposing of DrugBank compounds for opioid use disorder. Comput Biol Med 2023; 160:106921. [PMID: 37178605 DOI: 10.1016/j.compbiomed.2023.106921] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Received: 02/28/2023] [Revised: 03/30/2023] [Accepted: 04/13/2023] [Indexed: 05/15/2023]
Abstract
Opioid use disorder (OUD) is a chronic and relapsing condition that involves the continued and compulsive use of opioids despite harmful consequences. The development of medications with improved efficacy and safety profiles for OUD treatment is urgently needed. Drug repurposing is a promising option for drug discovery due to its reduced cost and expedited approval procedures. Computational approaches based on machine learning enable the rapid screening of DrugBank compounds, identifying those with the potential to be repurposed for OUD treatment. We collected inhibitor data for four major opioid receptors and used advanced machine learning predictors of binding affinity that fuse the gradient boosting decision tree algorithm with two natural language processing (NLP)-based molecular fingerprints and one traditional 2D fingerprint. Using these predictors, we systematically analyzed the binding affinities of DrugBank compounds on four opioid receptors. Based on our machine learning predictions, we were able to discriminate DrugBank compounds with various binding affinity thresholds and selectivities for different receptors. The prediction results were further analyzed for ADMET (absorption, distribution, metabolism, excretion, and toxicity), which provided guidance on repurposing DrugBank compounds for the inhibition of selected opioid receptors. The pharmacological effects of these compounds for OUD treatment need to be tested in further experimental studies and clinical trials. Our machine learning studies provide a valuable platform for drug discovery in the context of OUD treatment.
Affiliation(s)
- Hongsong Feng
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Jian Jiang
- Department of Mathematics, Michigan State University, MI 48824, USA; Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, PR China
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA.
| |
|
27
|
Mucllari E, Zadorozhnyy V, Ye Q, Nguyen DD. Novel Molecular Representations Using Neumann-Cayley Orthogonal Gated Recurrent Unit. J Chem Inf Model 2023; 63:2656-2666. [PMID: 37075324 DOI: 10.1021/acs.jcim.2c01526] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/21/2023]
Abstract
Advances in deep neural networks (DNNs) have made a very powerful machine learning method available to researchers across many fields of study, including the biomedical and cheminformatics communities, where DNNs help to improve tasks such as protein performance, molecular design, drug discovery, etc. Many of those tasks rely on molecular descriptors for representing molecular characteristics in cheminformatics. Despite significant efforts and the introduction of numerous methods that derive molecular descriptors, the quantitative prediction of molecular properties remains challenging. One widely used method of encoding molecule features into bit strings is the molecular fingerprint. In this work, we propose using new Neumann-Cayley Gated Recurrent Units (NC-GRU) inside the Neural Nets encoder (AutoEncoder) to create neural molecular fingerprints (NC-GRU fingerprints). The NC-GRU AutoEncoder introduces orthogonal weights into widely used GRU architecture, resulting in faster, more stable training, and more reliable molecular fingerprints. Integrating novel NC-GRU fingerprints and Multi-Task DNN schematics improves the performance of various molecular-related tasks such as toxicity, partition coefficient, lipophilicity, and solvation-free energy, producing state-of-the-art results on several benchmarks.
Affiliation(s)
- Edison Mucllari
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Vasily Zadorozhnyy
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Qiang Ye
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| | - Duc Duy Nguyen
- Department of Mathematics, University of Kentucky, Lexington, Kentucky 40506, United States
| |
|
28
|
Luukkonen S, van den Maagdenberg HW, Emmerich MTM, van Westen GJP. Artificial intelligence in multi-objective drug design. Curr Opin Struct Biol 2023; 79:102537. [PMID: 36774727 DOI: 10.1016/j.sbi.2023.102537] [Citation(s) in RCA: 26] [Impact Index Per Article: 13.0] [Received: 10/19/2022] [Revised: 12/21/2022] [Accepted: 01/03/2023] [Indexed: 02/12/2023]
Abstract
The factors determining a drug's success are manifold, making de novo drug design an inherently multi-objective optimisation (MOO) problem. With the advent of machine learning and optimisation methods, the field of multi-objective compound design has seen a rapid increase in developments and applications. Population-based metaheuristics and deep reinforcement learning are the most commonly used artificial intelligence methods in the field, but recently conditional learning methods are gaining popularity. The former approaches are coupled with a MOO strategy which is most commonly an aggregation function, but Pareto-based strategies are widespread too. Besides these and conditional learning, various innovative approaches to tackle MOO in drug design have been proposed. Here we provide a brief overview of the field and the latest innovations.
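The two MOO strategies named in this abstract, an aggregation function and a Pareto-based selection, can be contrasted in a minimal sketch. It is not taken from the cited review: the candidate scores, the equal weights, and the helper names `weighted_sum`, `dominates`, and `pareto_front` are all hypothetical, and every objective is assumed to be "higher is better".

```python
from typing import List, Sequence, Tuple

def weighted_sum(scores: Sequence[float], weights: Sequence[float]) -> float:
    """Aggregation strategy: collapse several objective scores into one number."""
    return sum(w * s for w, s in zip(weights, scores))

def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))

def pareto_front(points: List[Tuple[float, ...]]) -> List[Tuple[float, ...]]:
    """Pareto strategy: keep every candidate that no other candidate dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]

# Hypothetical (potency, safety) scores for four candidate molecules.
candidates = [(0.9, 0.2), (0.6, 0.6), (0.2, 0.9), (0.5, 0.5)]

# Aggregation returns a single winner for the chosen weights.
best_by_sum = max(candidates, key=lambda c: weighted_sum(c, (0.5, 0.5)))

# Pareto selection returns the whole set of non-dominated trade-offs.
front = pareto_front(candidates)
```

Aggregation fixes the trade-off in advance through the weights, while the Pareto front defers that choice and returns all non-dominated candidates.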
Affiliation(s)
- Sohvi Luukkonen
- Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands. https://twitter.com/sohvi_luukkonen
| | - Helle W van den Maagdenberg
- Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands
| | - Michael T M Emmerich
- Leiden Institute of Advanced Computer Science, Leiden University, Niels Bohrweg 1, Leiden, 2333 CC, the Netherlands
| | - Gerard J P van Westen
- Leiden Academic Centre for Drug Research, Leiden University, Einsteinweg 55, Leiden, 2333 CC, the Netherlands.
| |
|
29
|
Zhu Z, Dou B, Cao Y, Jiang J, Zhu Y, Chen D, Feng H, Liu J, Zhang B, Zhou T, Wei GW. TIDAL: Topology-Inferred Drug Addiction Learning. J Chem Inf Model 2023; 63:1472-1489. [PMID: 36826415 DOI: 10.1021/acs.jcim.3c00046] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 02/25/2023]
Abstract
Drug addiction is a global public health crisis, and the design of antiaddiction drugs remains a major challenge due to intricate mechanisms. Since experimental drug screening and optimization are too time-consuming and expensive, there is an urgent need to develop innovative artificial intelligence (AI) methods for addressing the challenge. We tackle this challenge by topology-inferred drug addiction learning (TIDAL) built from integrating multiscale topological Laplacians, deep bidirectional transformer, and ensemble-assisted neural networks (EANNs). Multiscale topological Laplacians are a novel class of algebraic topology tools that embed molecular topological invariants and algebraic invariants into their harmonic spectra and nonharmonic spectra, respectively. These invariants complement sequence information extracted from a bidirectional transformer. We validate the proposed TIDAL framework on 22 drug addiction related, 4 hERG, and 12 DAT data sets, which suggests that the proposed TIDAL is a state-of-the-art framework for the modeling and analysis of drug addiction data. We carry out cross-target analysis of the current drug addiction candidates to alert their side effects and identify their repurposing potentials. Our analysis reveals drug-mediated linear and bilinear target correlations. Finally, TIDAL is applied to shed light on relative efficacy, repurposing potential, and potential side effects of 12 existing antiaddiction medications. Our results suggest that TIDAL provides a new computational strategy for pressingly needed antisubstance addiction drug development.
Affiliation(s)
- Zailiang Zhu
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Bozheng Dou
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Yukang Cao
- School of Computer Science and Artificial Intelligence, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, P. R. China; Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yueying Zhu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Dong Chen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Jie Liu
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Bengong Zhang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, P. R. China
| | - Tianshou Zhou
- Key Laboratory of Computational Mathematics, Guangdong Province, and School of Mathematics, Sun Yat-sen University, Guangzhou, 510006, P. R. China
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States; Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States; Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
| |
|
30
|
Feng H, Elladki R, Jiang J, Wei GW. Machine-learning analysis of opioid use disorder informed by MOR, DOR, KOR, NOR and ZOR-based interactome networks. Comput Biol Med 2023; 157:106745. [PMID: 36924727 DOI: 10.1016/j.compbiomed.2023.106745] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 01/12/2023] [Revised: 02/11/2023] [Accepted: 03/04/2023] [Indexed: 03/17/2023]
Abstract
Opioid use disorder (OUD) continuously poses major public health challenges and social implications worldwide, with a dramatic rise of opioid dependence leading to potential abuse. Although a few pharmacological agents have been approved for OUD treatment, the efficacy of said agents for OUD requires further improvement in order to provide safer and more effective pharmacological and psychosocial treatments. Proteins including mu, delta, kappa, nociceptin, and zeta opioid receptors are the direct targets of opioids and play critical roles in therapeutic treatments. The protein-protein interaction (PPI) networks of these receptors increase the complexity in the drug development process for an effective opioid addiction treatment. The report below presents a PPI-network informed machine-learning study of OUD. We have examined more than 500 proteins in the five opioid receptor networks and subsequently collected 74 inhibitor datasets. Machine learning models were constructed by pairing gradient boosting decision tree (GBDT) algorithm with two advanced natural language processing (NLP)-based autoencoder and Transformer fingerprints for molecules. With these models, we systematically carried out evaluations of screening and repurposing potential of more than 120,000 drug candidates for four opioid receptors. In addition, absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties were also considered in the screening of potential drug candidates. Our machine-learning tools determined a few inhibitor compounds with desired potency and ADMET properties for nociceptin opioid receptors. Our approach offers a valuable and promising tool for the pharmacological development of OUD treatments.
Affiliation(s)
- Hongsong Feng
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Rana Elladki
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Jian Jiang
- Research Center of Nonlinear Science, School of Mathematical and Physical Sciences, Wuhan Textile University, Wuhan, 430200, PR China
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA.
| |
|
31
|
Zhang Y, Li S, Xing M, Yuan Q, He H, Sun S. Universal Approach to De Novo Drug Design for Target Proteins Using Deep Reinforcement Learning. ACS OMEGA 2023; 8:5464-5474. [PMID: 36816653 PMCID: PMC9933084 DOI: 10.1021/acsomega.2c06653] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 10/15/2022] [Accepted: 01/05/2023] [Indexed: 05/28/2023]
Abstract
In drug design, the design and manufacture of safe and effective compounds is a long-term, complex, and complicated process. Therefore, developing a new rapid and generalizable drug design method is of great value. This study aimed to propose a general model based on reinforcement learning combined with drug-target interaction, which could be used to design new molecules according to different protein targets. The method adopted recurrent neural network molecular modeling and took the drug-target affinity model as the reward function of optimal molecular generation. It did not need to know the three-dimensional structure and active sites of protein targets but only required the information of a one-dimensional amino acid sequence. This approach was demonstrated to produce drugs highly similar to marketed drugs and design molecules with a better binding energy.
Affiliation(s)
- Yunjiang Zhang
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Shuyuan Li
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Miaojuan Xing
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Qing Yuan
- Department of Chemistry and Chemical Engineering, Beijing University of Technology, Beijing 100124, China
| | - Hong He
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| | - Shaorui Sun
- Beijing Key Laboratory for Green Catalysis and Separation, The Faculty of Environment and Life, Beijing University of Technology, Beijing 100124, PR China
| |
|
32
|
Hayes N, Merkurjev E, Wei GW. Integrating transformer and autoencoder techniques with spectral graph algorithms for the prediction of scarcely labeled molecular data. Comput Biol Med 2023; 153:106479. [PMID: 36610214 PMCID: PMC9868114 DOI: 10.1016/j.compbiomed.2022.106479] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 09/03/2022] [Revised: 10/25/2022] [Accepted: 12/21/2022] [Indexed: 12/24/2022]
Abstract
In molecular and biological sciences, experiments are expensive, time-consuming, and often subject to ethical constraints. Consequently, one often faces the challenging task of predicting desirable properties from small data sets or scarcely-labeled data sets. Although transfer learning can be advantageous, it requires the existence of a related large data set. This work introduces three graph-based models incorporating Merriman-Bence-Osher (MBO) techniques to tackle this challenge. Specifically, graph-based modifications of the MBO scheme are integrated with state-of-the-art techniques, including a home-made transformer and an autoencoder, in order to deal with scarcely-labeled data sets. In addition, a consensus technique is detailed. The proposed models are validated using five benchmark data sets. We also provide a thorough comparison to other competing methods, such as support vector machines, random forests, and gradient boosting decision trees, which are known for their good performance on small data sets. The performances of various methods are analyzed using residue-similarity (R-S) scores and R-S indices. Extensive computational experiments and theoretical analysis show that the new models perform very well even when as little as 1% of the data set is used as labeled data.
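The general idea behind a graph-based MBO scheme for scarcely labeled data (alternate a short graph diffusion with a thresholding step while keeping the known labels fixed) can be illustrated with a toy sketch. This is not the authors' implementation: the function `graph_mbo`, the six-node graph, the step size, and the hard clamping of labeled nodes (a simplification of a fidelity term) are all assumptions made for illustration.

```python
from typing import Dict, List

def graph_mbo(weights: List[List[float]], labels: Dict[int, int], n_classes: int,
              dt: float = 0.2, diffusion_steps: int = 3, iterations: int = 10) -> List[int]:
    """Toy MBO-style label propagation on a weighted graph (illustrative only)."""
    n = len(weights)
    # Labeled nodes start as one-hot rows; unlabeled nodes start uniform.
    u = [[1.0 if labels[i] == k else 0.0 for k in range(n_classes)] if i in labels
         else [1.0 / n_classes] * n_classes for i in range(n)]
    for _ in range(iterations):
        # Diffusion: a few explicit Euler steps of du/dt = -L u, with L the graph Laplacian.
        for _ in range(diffusion_steps):
            new_u = []
            for i in range(n):
                if i in labels:
                    new_u.append(u[i][:])  # known labels stay clamped
                    continue
                new_u.append([
                    u[i][k] - dt * sum(weights[i][j] * (u[i][k] - u[j][k]) for j in range(n))
                    for k in range(n_classes)
                ])
            u = new_u
        # Thresholding: snap each unlabeled row to its nearest one-hot vector.
        for i in range(n):
            if i not in labels:
                best = max(range(n_classes), key=lambda k: u[i][k])
                u[i] = [1.0 if k == best else 0.0 for k in range(n_classes)]
    return [max(range(n_classes), key=lambda k: u[i][k]) for i in range(n)]

# Hypothetical graph: two tight triangles (0-1-2 and 3-4-5) joined by a weak edge 2-3.
n_nodes = 6
W = [[0.0] * n_nodes for _ in range(n_nodes)]
for a, b, w in [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 1.0),
                (3, 4, 1.0), (4, 5, 1.0), (3, 5, 1.0), (2, 3, 0.1)]:
    W[a][b] = W[b][a] = w

# Only one node per class is labeled.
predicted = graph_mbo(W, labels={0: 0, 5: 1}, n_classes=2)
```

With these assumed weights, the two labeled nodes are enough to separate the clusters, because diffusion across the weak edge is much slower than within each triangle.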
Affiliation(s)
- Nicole Hayes
- Department of Mathematics, Michigan State University, MI 48824, USA
| | - Ekaterina Merkurjev
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Computational Mathematics, Science and Engineering, Michigan State University, MI 48824, USA.
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, MI 48824, USA; Department of Electrical and Computer Engineering, Michigan State University, MI 48824, USA; Department of Biochemistry and Molecular Biology, Michigan State University, MI 48824, USA
| |
|
33
|
Noguchi S, Inoue J. Exploration of Chemical Space Guided by PixelCNN for Fragment-Based De Novo Drug Discovery. J Chem Inf Model 2022; 62:5988-6001. [PMID: 36454646 DOI: 10.1021/acs.jcim.2c01345] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/05/2022]
Abstract
We report a novel framework for achieving fragment-based molecular design using pixel convolutional neural network (PixelCNN) combined with the simplified molecular input line entry system (SMILES) as molecular representation. While a widely used recurrent neural network (RNN) assumes monotonically decaying correlations in strings, PixelCNN captures a periodicity among characters of SMILES. Thus, PixelCNN provides us with a novel solution for the analysis of chemical space by extracting the periodicity of molecular structures that will be buried in SMILES. Moreover, this characteristic enables us to generate molecules by combining several simple building blocks, such as a benzene ring and side-chain structures, which contributes to the effective exploration of chemical space by step-by-step searching for molecules from a target fragment. In conclusion, PixelCNN could be a powerful approach focusing on the periodicity of molecules to explore chemical space for the fragment-based molecular design.
Affiliation(s)
- Satoshi Noguchi
- Department of Advanced Interdisciplinary Studies, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| | - Junya Inoue
- Institute for Industrial Science, The University of Tokyo, 5-1-5 Kashiwanoha, Kashiwa, Chiba 277-0082, Japan; Department of Materials Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo, Tokyo 113-8656, Japan; Research Center for Advanced Science and Technology, The University of Tokyo, 4-6-1 Komaba, Meguro, Tokyo 153-8904, Japan
| |
|
34
|
Hu F, Wang D, Huang H, Hu Y, Yin P. Bridging the Gap between Target-Based and Cell-Based Drug Discovery with a Graph Generative Multitask Model. J Chem Inf Model 2022; 62:6046-6056. [PMID: 36401569 DOI: 10.1021/acs.jcim.2c01180] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Abstract
The development of new drugs is crucial for protecting humans from disease. In the past several decades, target-based screening has been one of the most popular methods for developing new drugs. This method efficiently screens potential inhibitors of a target protein in vitro, but it frequently fails in vivo due to insufficient activity of the selected drugs. There is a need for accurate computational methods to bridge this gap. Here, we present a novel graph multi-task deep learning model to identify compounds with both target inhibitory and cell active (MATIC) properties. On a carefully curated SARS-CoV-2 data set, the proposed MATIC model shows advantages compared with the traditional method in screening effective compounds in vivo. Following this, we investigated the interpretability of the model and discovered that the learned features for target inhibition (in vitro) or cell active (in vivo) tasks are different with molecular property correlations and atom functional attention. Based on these findings, we utilized a Monte Carlo-based reinforcement learning generative model to generate novel multiproperty compounds with both in vitro and in vivo efficacy, thus bridging the gap between target-based and cell-based drug discovery. The tool is freely accessible at https://github.com/SIAT-code/MATIC.
Affiliation(s)
- Fan Hu
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Dongqi Wang
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Huazhen Huang
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Yishen Hu
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| | - Peng Yin
- Guangdong-Hong Kong-Macao Joint Laboratory of Human-Machine Intelligence-Synergy Systems, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China
| |
|
35
|
Tan Y, Dai L, Huang W, Guo Y, Zheng S, Lei J, Chen H, Yang Y. DRlinker: Deep Reinforcement Learning for Optimization in Fragment Linking Design. J Chem Inf Model 2022; 62:5907-5917. [PMID: 36404642 DOI: 10.1021/acs.jcim.2c00982] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Indexed: 11/22/2022]
Abstract
Fragment-based drug discovery is a widely used strategy for drug design in both academic and pharmaceutical industries. Although fragments can be linked to generate candidate compounds by the latest deep generative models, generating linkers with specified attributes remains underdeveloped. In this study, we presented a novel framework, DRlinker, to control fragment linking toward compounds with given attributes through reinforcement learning. The method has been shown to be effective for many tasks from controlling the linker length and log P, optimizing predicted bioactivity of compounds, to various multiobjective tasks. Specifically, our model successfully generated 91.0% and 93.9% of compounds complying with the desired linker length and log P and improved the 7.5 pChEMBL value in bioactivity optimization. Finally, a quasi-scaffold-hopping study revealed that DRlinker could generate nearly 30% molecules with high 3D similarity but low 2D similarity to the lead inhibitor, demonstrating the benefits and applicability of DRlinker in actual fragment-based drug design.
Affiliation(s)
- Youhai Tan
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Lingxue Dai
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| | - Weifeng Huang
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Yinfeng Guo
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Shuangjia Zheng
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China; Galixir Technologies, Beijing 100083, China
| | - Jinping Lei
- School of Pharmaceutical Science, Sun Yat-sen University, Guangzhou 510006, China
| | - Hongming Chen
- Guangzhou Laboratory, No. 9 XinDaoHuanBei Road, Guangzhou International Bio Island, Guangzhou 510005, China
| | - Yuedong Yang
- School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou 510006, China
| |
|
36
|
Urbina F, Ekins S. The Commoditization of AI for Molecule Design. ARTIFICIAL INTELLIGENCE IN THE LIFE SCIENCES 2022; 2:100031. [PMID: 36211981 PMCID: PMC9541920 DOI: 10.1016/j.ailsci.2022.100031] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 02/03/2023]
Abstract
Anyone involved in designing or finding molecules in the life sciences over the past few years has witnessed a dramatic change in how we now work due to the COVID-19 pandemic. Computational technologies like artificial intelligence (AI) seemed to become ubiquitous in 2020 and have been increasingly applied as scientists worked from home and were separated from the laboratory and their colleagues. This shift may be more permanent as the future of molecule design across different industries will increasingly require machine learning models for design and optimization of molecules as they become "designed by AI". AI and machine learning have essentially become a commodity within the pharmaceutical industry. This perspective will briefly describe our personal opinions of how machine learning has evolved and is being applied to model different molecule properties that cross industries in their utility and ultimately suggests the potential for tight integration of AI into equipment and automated experimental pipelines. It will also describe how many groups have implemented generative models covering different architectures, for de novo design of molecules. We also highlight some of the companies at the forefront of using AI to demonstrate how machine learning has impacted and influenced our work. Finally, we will peer into the future and suggest some of the areas that represent the most interesting technologies that may shape the future of molecule design, highlighting how we can help increase the efficiency of the design-make-test cycle which is currently a major focus across industries.
Affiliation(s)
- Fabio Urbina
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
| | - Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
| |
|
37
|
Liu X, Feng H, Wu J, Xia K. Hom-Complex-Based Machine Learning (HCML) for the Prediction of Protein-Protein Binding Affinity Changes upon Mutation. J Chem Inf Model 2022; 62:3961-3969. [PMID: 36040839 DOI: 10.1021/acs.jcim.2c00580] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
Abstract
Protein-protein interactions (PPIs) are involved in almost all biological processes in the cell. Understanding protein-protein interactions holds the key for the understanding of biological functions, diseases and the development of therapeutics. Recently, artificial intelligence (AI) models have demonstrated great power in PPIs. However, a key issue for all AI-based PPI models is efficient molecular representations and featurization. Here, we propose Hom-complex-based PPI representation, and Hom-complex-based machine learning models for the prediction of PPI binding affinity changes upon mutation, for the first time. In our model, various Hom complexes Hom(G1, G) can be generated for the graph representation G of protein-protein complex by using different graphs G1, which reveal G1-related inner connections within the graph representation G of protein-protein complex. Further, for a specific graph G1, a series of nested Hom complexes are generated to give a multiscale characterization of the PPIs. Its persistent homology and persistent Euler characteristic are used as molecular descriptors and further combined with the machine learning model, in particular, gradient boosting tree (GBT). We systematically test our model on the two most-commonly used data sets, that is, SKEMPI and AB-Bind. It has been found that our model outperforms all the existing models as far as we know, which demonstrates the great potential of our model for the analysis of PPIs. Our model can be used for the analysis and design of efficient antibodies for SARS-CoV-2.
Affiliation(s)
- Xiang Liu
- Chern Institute of Mathematics and LPMC, Nankai University, Tianjin, China, 300071; Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
| | - Huitao Feng
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371; Mathematical Science Research Center, Chongqing University of Technology, Chongqing, China, 400054
| | - Jie Wu
- Yanqi Lake Beijing Institute of Mathematical Sciences and Applications (BIMSA), Beijing, China, 101408
| | - Kelin Xia
- Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
| |
|
38
|
Li C, Wang C, Sun M, Zeng Y, Yuan Y, Gou Q, Wang G, Guo Y, Pu X. Correlated RNN Framework to Quickly Generate Molecules with Desired Properties for Energetic Materials in the Low Data Regime. J Chem Inf Model 2022; 62:4873-4887. [PMID: 35998331 DOI: 10.1021/acs.jcim.2c00997] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 01/10/2023]
Abstract
Motivated by the challenges of deep learning in the low-data regime and the urgent demand for intelligent design of highly energetic materials, we explore a correlated deep learning framework, which consists of three recurrent neural networks (RNNs) correlated by a transfer learning strategy, to efficiently generate new energetic molecules with a high detonation velocity when very limited data are available. To avoid dependence on an external big data set, data augmentation by fragment shuffling of 303 energetic compounds is used to produce 500,000 molecules to pretrain the RNN, through which the model can learn sufficient structural knowledge. The pretrained RNN is then fine-tuned on the 303 energetic compounds to generate 7153 molecules similar to the energetic compounds. To screen the molecules with a high detonation velocity more reliably, SMILES enumeration augmentation coupled with the pretrained knowledge is used to build an RNN-based prediction model, through which R² is boosted from 0.4446 to 0.9572. Performance comparable to that of a transfer learning strategy based on an existing big database (ChEMBL), for producing both energetic and drug-like molecules, further supports the effectiveness and generality of our strategy in the low-data regime. High-precision quantum mechanics calculations further confirm that 35 new molecules present a higher detonation velocity and lower synthetic accessibility than the classic explosive RDX, along with good thermal stability. In particular, three new molecules are comparable to caged CL-20 in detonation velocity. All the source code and the data set are freely available at https://github.com/wangchenghuidream/RNNMGM.
Affiliation(s)
- Chuan Li
- College of Computer Science, Sichuan University, Chengdu 610064, China
| | - Chenghui Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
| | - Ming Sun
- College of Chemistry, Sichuan University, Chengdu 610064, China
| | - Yan Zeng
- College of Computer Science, Sichuan University, Chengdu 610064, China
| | - Yuan Yuan
- College of Management, Southwest University for Nationalities, Chengdu 610041, China
| | - Qiaolin Gou
- College of Chemistry, Sichuan University, Chengdu 610064, China
| | - Guangchuan Wang
- College of Computer Science, Sichuan University, Chengdu 610064, China
| | - Yanzhi Guo
- College of Chemistry, Sichuan University, Chengdu 610064, China
| | - Xuemei Pu
- College of Chemistry, Sichuan University, Chengdu 610064, China
| |
|
39
|
Zhang J, Chen H. De Novo Molecule Design Using Molecular Generative Models Constrained by Ligand-Protein Interactions. J Chem Inf Model 2022; 62:3291-3306. [PMID: 35793555 DOI: 10.1021/acs.jcim.2c00177] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Indexed: 11/28/2022]
Abstract
In recent years, molecular deep generative models have attracted much attention for their application in de novo drug design. A data-driven molecular deep generative model approximates the high-dimensional distribution of chemical space by learning from a large amount of molecular structural data. So far, most molecular generative models rely on purely 2D ligand information in structure generation. Here, we propose a novel molecular deep generative model that adopts a recurrent neural network architecture coupled with a ligand-protein interaction fingerprint as a constraint. The fingerprint is constructed from ligand docking poses and represents the 3D binding mode of ligands in the protein pocket. In the current work, generative models constrained with interaction fingerprints were trained and compared with normal RNN models. Models trained with ligand-protein interaction fingerprint constraints show a clear tendency to generate compounds that maintain similar binding modes. Our results demonstrate the potential of the interaction fingerprint-constrained generative model for targeted molecule generation and guided exploration of drug-like chemical space.
Affiliation(s)
- Jie Zhang
- Guangdong Provincial Key Laboratory of Laboratory Animals, Guangdong Laboratory Animals Monitoring Institute, Guangzhou 510663, P. R. China; State Key Laboratory of Respiratory Disease, Guangzhou Institutes of Biomedicine and Health, Chinese Academy of Sciences, Guangzhou 510530, P. R. China; Bioland Laboratory (Guangzhou Regenerative Medicine and Health-Guangdong Laboratory), Guangzhou 510530, P. R. China
| | - Hongming Chen
- Bioland Laboratory (Guangzhou Regenerative Medicine and Health-Guangdong Laboratory), Guangzhou 510530, P. R. China; Guangzhou International Bio Island, Guangzhou Laboratory, No. 9 XinDaoHuanBei Road, Guangzhou 510005, China
| |
|
40
|
Gao K, Wang R, Chen J, Cheng L, Frishcosy J, Huzumi Y, Qiu Y, Schluckbier T, Wei X, Wei GW. Methodology-Centered Review of Molecular Modeling, Simulation, and Prediction of SARS-CoV-2. Chem Rev 2022; 122:11287-11368. [PMID: 35594413 PMCID: PMC9159519 DOI: 10.1021/acs.chemrev.1c00965] [Citation(s) in RCA: 37] [Impact Index Per Article: 12.3] [Indexed: 02/06/2023]
Abstract
Despite tremendous efforts in the past two years, our understanding of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), virus-host interactions, immune response, virulence, transmission, and evolution is still very limited. This limitation calls for further in-depth investigation. Computational studies have become an indispensable component in combating coronavirus disease 2019 (COVID-19) due to their low cost, their efficiency, and the fact that they are free from safety and ethical constraints. Additionally, the mechanism that governs the global evolution and transmission of SARS-CoV-2 cannot be revealed from individual experiments and was discovered by integrating genotyping of massive viral sequences, biophysical modeling of protein-protein interactions, deep mutational data, deep learning, and advanced mathematics. There exists a tsunami of literature on the molecular modeling, simulations, and predictions of SARS-CoV-2 and related developments of drugs, vaccines, antibodies, and diagnostics. To provide readers with a quick update about this literature, we present a comprehensive and systematic methodology-centered review. Aspects such as molecular biophysics, bioinformatics, cheminformatics, machine learning, and mathematics are discussed. This review will be beneficial to researchers who are looking for ways to contribute to SARS-CoV-2 studies and those who are interested in the status of the field.
Affiliation(s)
- Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Rui Wang
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Jiahui Chen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Limei Cheng
- Clinical Pharmacology and Pharmacometrics, Bristol Myers Squibb, Princeton, New Jersey 08536, United States
| | - Jaclyn Frishcosy
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yuta Huzumi
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Yuchi Qiu
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Tom Schluckbier
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Xiaoqi Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
| |
|
41
|
Yang Y, Wu Z, Yao X, Kang Y, Hou T, Hsieh CY, Liu H. Exploring Low-Toxicity Chemical Space with Deep Learning for Molecular Generation. J Chem Inf Model 2022; 62:3191-3199. [PMID: 35713712 DOI: 10.1021/acs.jcim.2c00671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Creating a wide range of new compounds that not only have ideal pharmacological properties but also easily pass long-term toxicity evaluation is still a challenging task in current drug discovery. In this study, we developed a conditional generative model by combining a semisupervised variational autoencoder (SSVAE) with an MGA toxicity predictor. Our aim is to generate molecules with low toxicity, good drug-like properties, and structural diversity. For multiobjective optimization, we have developed a method that places hierarchical constraints on the toxicity space of small molecules to generate drug-like small molecules while minimizing the effect on the diversity of the generated results. The evaluation metrics indicate that the developed model has good effectiveness, novelty, and diversity. The molecules generated by this model are mainly distributed in low-toxicity regions, which suggests that our model can efficiently constrain the generation of toxic structures. In contrast to simply filtering out toxic molecules after generation, the low-toxicity molecular generative model can generate molecules with structural diversity. Our strategy can be used in target-based drug discovery to improve the quality of generated molecules with low-toxicity, drug-like, and highly active properties.
Affiliation(s)
- Yuwei Yang
- School of Pharmacy, Lanzhou University, Lanzhou 730000, China
| | - Zhenxing Wu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
| | - Xiaojun Yao
- College of Chemistry and Chemical Engineering, Lanzhou University, Lanzhou 730000, China
| | - Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou 310058, P. R. China
| | - Chang-Yu Hsieh
- Tencent Quantum Laboratory, Tencent, Shenzhen 518000, China
| | - Huanxiang Liu
- School of Pharmacy, Lanzhou University, Lanzhou 730000, China; Faculty of Applied Science, Macao Polytechnic University, Macao, SAR 999078, China
| |
|
42
|
Jackson IM, Webb EW, Scott PJ, James ML. In Silico Approaches for Addressing Challenges in CNS Radiopharmaceutical Design. ACS Chem Neurosci 2022; 13:1675-1683. [PMID: 35606334 PMCID: PMC9945852 DOI: 10.1021/acschemneuro.2c00269] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 02/08/2023] Open
Abstract
Positron emission tomography (PET) is a highly sensitive and versatile molecular imaging modality that leverages radiolabeled molecules, known as radiotracers, to interrogate biochemical processes such as metabolism, enzymatic activity, and receptor expression. The ability to probe specific molecular and cellular events longitudinally in a noninvasive manner makes PET imaging a particularly powerful technique for studying the central nervous system (CNS) in both health and disease. Unfortunately, developing and translating a single CNS PET tracer for clinical use is typically an extremely resource-intensive endeavor, often requiring synthesis and evaluation of numerous candidate molecules. While existing in vitro methods are beginning to address the challenge of derisking molecules prior to costly in vivo PET studies, most require a significant investment of resources and possess substantial limitations. In the context of CNS drug development, significant time and resources have been invested into the development and optimization of computational methods, particularly involving machine learning, to streamline the design of better CNS therapeutics. However, analogous efforts developed and validated for CNS radiotracer design are conspicuously limited. In this Perspective, we overview the requirements and challenges of CNS PET tracer design, survey the most promising computational methods for in silico CNS drug design, and bridge these two areas by discussing the potential applications and impact of computational design tools in CNS radiotracer design.
Affiliation(s)
- Isaac M. Jackson
- Department of Radiology, Stanford University, Stanford, CA 94305
| | - E. William Webb
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109
| | - Peter J.H. Scott
- Department of Radiology, University of Michigan, Ann Arbor, MI 48109, United States (corresponding author)
| | - Michelle L. James
- Department of Radiology, Stanford University, Stanford, CA 94305; Department of Neurology & Neurological Sciences, Stanford University, Stanford, CA 94304 (corresponding author: Departments of Radiology, and Neurology & Neurological Sciences, 1201 Welch Rd., P-206, Stanford, CA 94305-5484, United States)
| |
|
43
|
Lee M, Min K. MGCVAE: Multi-Objective Inverse Design via Molecular Graph Conditional Variational Autoencoder. J Chem Inf Model 2022; 62:2943-2950. [PMID: 35666276 DOI: 10.1021/acs.jcim.2c00487] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 11/29/2022]
Abstract
The ultimate goal of various fields is to directly generate molecules with desired properties, such as water-soluble molecules in drug development and molecules suitable for organic light-emitting diodes or photosensitizers in the field of development of new organic materials. This study proposes a molecular graph generative model based on an autoencoder for the de novo design. The performance of the molecular graph conditional variational autoencoder (MGCVAE) for generating molecules with specific desired properties was investigated by comparing it to a molecular graph variational autoencoder (MGVAE). Furthermore, multi-objective optimization for MGCVAE was applied to satisfy the two selected properties simultaneously. In this study, two physical properties, calculated logP and molar refractivity, were used as optimization targets for designing de novo molecules. Consequently, it was confirmed that among the generated molecules, 25.89% of the optimized molecules were generated in MGCVAE compared to 0.66% in MGVAE. This demonstrates that MGCVAE effectively produced drug-like molecules with two target properties. The results of this study suggest that these graph-based data-driven models are an effective method for designing new molecules that fulfill various physical properties.
Affiliation(s)
- Myeonghun Lee
- School of Systems Biomedical Science, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
| | - Kyoungmin Min
- School of Mechanical Engineering, Soongsil University, 369 Sangdo-ro, Dongjak-gu, Seoul 06978, Republic of Korea
| |
|
44
|
Tan Z, Li Y, Zhang Z, Wu X, Penfold T, Shi W, Yang S. Efficient Adversarial Generation of Thermally Activated Delayed Fluorescence Molecules. ACS Omega 2022; 7:18179-18188. [PMID: 35664624 PMCID: PMC9161419 DOI: 10.1021/acsomega.2c02253] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 04/11/2022] [Accepted: 05/11/2022] [Indexed: 06/15/2023]
Abstract
Adversarial generative models are becoming an essential tool in molecular design and discovery due to their efficiency in exploring the desired chemical space with the assistance of deep learning. In this article, we introduce an integrated framework by combining the modules of algorithmic synthesis, deep prediction, adversarial generation, and fine screening for the purpose of effective design of the thermally activated delayed fluorescence (TADF) molecules that can be used in the organic light-emitting diode devices. The retrosynthetic rules are employed to algorithmically synthesize the D-A complex based on the empirically defined donor and acceptor moieties, which is followed by the high-throughput labeling and prediction with the deep neural network. The new D-A molecules are subsequently generated via the adversarial autoencoder, with the excited-state property distributions perfectly matching those of the original samples. Fine screening of the generated molecules, including the spin-orbital coupling calculation and the excited-state optimization, is eventually implemented to select the qualified TADF candidates within the novel chemical space. Further investigation shows that the created structures fully mimic the original D-A samples by maintaining a significant charge transfer characteristic, a minimal adiabatic singlet-triplet gap, and a moderate spin-orbital coupling that are desirable for the delayed fluorescence.
Affiliation(s)
- Zheng Tan
- Chengdu Polytechnic, 83 Tianyi Street, Chengdu, Sichuan 610000, P. R. China
| | - Yan Li
- Xiyuan Quantitative Technology, 388 Yizhou Road, Chengdu, Sichuan 610000, P. R. China
| | - Ziying Zhang
- Guangzhou Yinfo Information Technology, 2 Ruyi Road, Panyu District, Guangzhou 511431, P. R. China
| | - Xin Wu
- Xiyuan Quantitative Technology, 388 Yizhou Road, Chengdu, Sichuan 610000, P. R. China
| | - Thomas Penfold
- Chemistry-School of Natural and Environmental Sciences, Newcastle University, Newcastle Upon Tyne NE1 7RU, U.K.
| | - Weimei Shi
- Chengdu Polytechnic, 83 Tianyi Street, Chengdu, Sichuan 610000, P. R. China
| | - Shiqing Yang
- Chengdu Polytechnic, 83 Tianyi Street, Chengdu, Sichuan 610000, P. R. China
| |
|
45
|
Panapitiya G, Girard M, Hollas A, Sepulveda J, Murugesan V, Wang W, Saldanha E. Evaluation of Deep Learning Architectures for Aqueous Solubility Prediction. ACS Omega 2022; 7:15695-15710. [PMID: 35571767 PMCID: PMC9096921 DOI: 10.1021/acsomega.2c00642] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 01/31/2022] [Accepted: 04/11/2022] [Indexed: 05/17/2023]
Abstract
Determining the aqueous solubility of molecules is a vital step in many pharmaceutical, environmental, and energy storage applications. Despite efforts made over decades, there are still challenges associated with developing a solubility prediction model with satisfactory accuracy for many of these applications. The goals of this study are to assess current deep learning methods for solubility prediction, to develop a general model capable of predicting the solubility of a broad range of organic molecules, and to understand the impact of data properties, molecular representation, and modeling architecture on predictive performance. Using the largest currently available solubility data set, we implement deep learning-based models to predict solubility from the molecular structure and explore several different molecular representations, including molecular descriptors, simplified molecular-input line-entry system (SMILES) strings, molecular graphs, and three-dimensional atomic coordinates, using four different neural network architectures: fully connected neural networks, recurrent neural networks, graph neural networks (GNNs), and SchNet. We find that models using molecular descriptors achieve the best performance, with GNN models also achieving good performance. We perform extensive error analysis to understand the molecular properties that influence model performance, perform feature analysis to understand which information about the molecular structure is most valuable for prediction, and perform a transfer learning and data size study to understand the impact of data availability on model performance.
Affiliation(s)
- Gihan Panapitiya
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Michael Girard
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Aaron Hollas
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Jonathan Sepulveda
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | | | - Wei Wang
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| | - Emily Saldanha
- Pacific Northwest National Laboratory, Richland, Washington 99352, United States
| |
|
46
|
Hadfield TE, Imrie F, Merritt A, Birchall K, Deane CM. Incorporating Target-Specific Pharmacophoric Information into Deep Generative Models for Fragment Elaboration. J Chem Inf Model 2022; 62:2280-2292. [PMID: 35499971 PMCID: PMC9131447 DOI: 10.1021/acs.jcim.1c01311] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]
Abstract
Despite recent interest in deep generative models for scaffold elaboration, their applicability to fragment-to-lead campaigns has so far been limited. This is primarily due to their inability to account for local protein structure or a user's design hypothesis. We propose a novel method for fragment elaboration, STRIFE, that overcomes these issues. STRIFE takes as input fragment hotspot maps (FHMs) extracted from a protein target and processes them to provide meaningful and interpretable structural information to its generative model, which in turn is able to rapidly generate elaborations with complementary pharmacophores to the protein. In a large-scale evaluation, STRIFE outperforms existing, structure-unaware, fragment elaboration methods in proposing highly ligand-efficient elaborations. In addition to automatically extracting pharmacophoric information from a protein target's FHM, STRIFE optionally allows the user to specify their own design hypotheses.
Affiliation(s)
- Thomas E Hadfield
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
| | - Fergus Imrie
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
| | - Andy Merritt
- LifeArc, SBC Open Innovation Campus, Stevenage SG1 2FX, United Kingdom
| | - Kristian Birchall
- LifeArc, SBC Open Innovation Campus, Stevenage SG1 2FX, United Kingdom
| | - Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, United Kingdom
| |
|
47
|
Liu CH, Korablyov M, Jastrzębski S, Włodarczyk-Pruszyński P, Bengio Y, Segler M. RetroGNN: Fast Estimation of Synthesizability for Virtual Screening and De Novo Design by Learning from Slow Retrosynthesis Software. J Chem Inf Model 2022; 62:2293-2300. [PMID: 35452226 DOI: 10.1021/acs.jcim.1c01476] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 11/29/2022]
Abstract
De novo molecule design algorithms often result in chemically unfeasible or synthetically inaccessible molecules. A natural idea to mitigate this problem is to bias these algorithms toward more easily synthesizable molecules using a proxy score for synthetic accessibility. However, using currently available proxies can still result in highly unrealistic compounds. Here, we propose a novel approach, RetroGNN, to estimate synthesizability. First, we search for routes using synthesis planning software for a large number of random molecules. This information is then used to train a graph neural network to predict the outcome of the synthesis planner given the target molecule, so that the regression output can be used as a synthesizability scorer. We highlight how RetroGNN can be used in generative molecule-discovery pipelines together with other scoring functions. We evaluate our approach on several QSAR-based molecule design benchmarks, for which we find synthesizable molecules with state-of-the-art scores. Compared to the virtual screening of 5 million existing molecules from the ZINC database, using RetroGNNScore with a simple fragment-based de novo design algorithm finds molecules predicted to be more likely to possess the desired activity exponentially faster, while maintaining good druglike properties and being easier to synthesize. Importantly, our deep neural network can successfully filter out hard-to-synthesize molecules while achieving a 10⁵ times speedup over using retrosynthesis planning software.
Affiliation(s)
- Cheng-Hao Liu
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1; Department of Chemistry, McGill University, 801 Sherbooke Street W, Montreal, Canada H3A 0B8
| | - Maksym Korablyov
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1
| | - Stanisław Jastrzębski
- Molecule.one, Warsaw 00-815, Poland; Faculty of Mathematics and Computer Science, Jagiellonian University, 6 Łojasiewicza Street, 30-348 Kraków, Poland
| | | | - Yoshua Bengio
- Mila and Université de Montréal, 6666 St-Urbain Street, Montreal, Canada H2S 3H1
| | - Marwin Segler
- Institute of Organic Chemistry and Center for Multiscale Theory and Computation, Westfälische Wilhelms-Universität Münster, 48149 Münster, Germany; Microsoft Research, 21 Station Road, Cambridge, U.K. CB1 2FB
| |
|
48
|
Feng H, Gao K, Chen D, Shen L, Robison AJ, Ellsworth E, Wei GW. Machine Learning Analysis of Cocaine Addiction Informed by DAT, SERT, and NET-Based Interactome Networks. J Chem Theory Comput 2022; 18:2703-2719. [PMID: 35294204 DOI: 10.1021/acs.jctc.2c00002] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Indexed: 12/18/2022]
Abstract
Cocaine addiction is a psychosocial disorder induced by the chronic use of cocaine and causes a large number of deaths around the world. Despite decades of effort, no drugs have been approved by the Food and Drug Administration (FDA) for the treatment of cocaine dependence. Cocaine dependence is neurological and involves many interacting proteins in the interactome. Among them, the dopamine (DAT), serotonin (SERT), and norepinephrine (NET) transporters are three major targets. Each of these targets has a large protein-protein interaction (PPI) network, which must be considered in anticocaine addiction drug discovery. This work presents DAT, SERT, and NET interactome network-informed machine learning/deep learning (ML/DL) studies of cocaine addiction. We collected and analyzed 61 protein targets, out of 460 proteins in the DAT, SERT, and NET PPI networks, that have sufficiently large existing inhibitor datasets. Utilizing autoencoder (AE) and other ML/DL algorithms, including gradient boosting decision tree (GBDT) and multitask deep neural network (MT-DNN), we built predictive models for these targets with 115,407 inhibitors to predict drug repurposing potential and possible side effects. We further screened their absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties to search for leads having potential for developing treatments for cocaine addiction. Our approach offers a new systematic protocol for artificial intelligence (AI)-based anticocaine addiction lead discovery.
Affiliation(s)
- Hongsong Feng
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Kaifu Gao
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Dong Chen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Li Shen
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
| | - Alfred J Robison
- Department of Physiology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Edmund Ellsworth
- Department of Pharmacology & Toxicology, Michigan State University, East Lansing, Michigan 48824, United States
| | - Guo-Wei Wei
- Department of Mathematics, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824, United States
- Department of Electrical and Computer Engineering, Michigan State University, East Lansing, Michigan 48824, United States
| |
|
49
|
Creanza TM, Lamanna G, Delre P, Contino M, Corriero N, Saviano M, Mangiatordi GF, Ancona N. DeLA-Drug: A Deep Learning Algorithm for Automated Design of Druglike Analogues. J Chem Inf Model 2022; 62:1411-1424. [PMID: 35294184 DOI: 10.1021/acs.jcim.2c00205] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Indexed: 12/12/2022]
Abstract
In this paper, we present a deep learning algorithm for automated design of druglike analogues (DeLA-Drug), a recurrent neural network (RNN) model composed of two long short-term memory (LSTM) layers and conceived for data-driven generation of similar-to-bioactive compounds. DeLA-Drug captures the syntax of SMILES strings of more than 1 million compounds belonging to the ChEMBL28 database and, by employing a new strategy called sampling with substitutions (SWS), generates molecules starting from a single user-defined query compound. Remarkably, the algorithm preserves druglikeness and synthetic accessibility of the known bioactive compounds present in the ChEMBL28 repository. The absence of any time-demanding fine-tuning procedure enables DeLA-Drug to perform a fast generation of focused libraries for further high-throughput screening and makes it a suitable tool for performing de novo design even in low-data regimes. To provide a concrete idea of its applicability, DeLA-Drug was applied to the cannabinoid receptor subtype 2 (CB2R), a known target involved in different pathological conditions such as cancer and neurodegeneration. DeLA-Drug, available as a free web platform (http://www.ba.ic.cnr.it/softwareic/deladrugportal/), can help medicinal chemists interested in generating analogues of compounds already available in their laboratories and, for this reason, good candidates for an easy and low-cost synthesis.
Affiliation(s)
- Teresa Maria Creanza
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
| | - Giuseppe Lamanna
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Pietro Delre
- Chemistry Department, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy; CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Marialessandra Contino
- Department of Pharmacy-Pharmaceutical Sciences, University of Bari "Aldo Moro", via E. Orabona, 4, I-70125 Bari, Italy
| | - Nicola Corriero
- CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | - Michele Saviano
- CNR-Institute of Crystallography, Via Amendola 122/o, 70126 Bari, Italy
| | | | - Nicola Ancona
- CNR-Institute of Intelligent Industrial Technologies and Systems for Advanced Manufacturing, Via Amendola 122/o, 70126 Bari, Italy
|
50
|
Harada Y, Hatakeyama M, Maeda S, Gao Q, Koizumi K, Sakamoto Y, Ono Y, Nakamura S. Molecular Design Learned from the Natural Product Porphyra-334: Molecular Generation via Chemical Variational Autoencoder versus Database Mining via Similarity Search, A Comparative Study. ACS Omega 2022; 7:8581-8590. [PMID: 35309498 PMCID: PMC8928499 DOI: 10.1021/acsomega.1c06453] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/16/2021] [Accepted: 02/18/2022] [Indexed: 06/14/2023]
Abstract
This comparative study evaluates two approaches to designing new functional molecules, generation via a chemical variational autoencoder (VAE) and database mining via similarity search, focusing on their ability to produce new candidates. Using the natural product porphyra-334 as a model molecule, we generated three groups: mycosporine-like amino acids (MAAs) used as seeds (G_SEEDS), molecules generated via the chemical VAE (G_VAE), and molecules gathered via similarity search (G_SIM). The numbers of molecules that satisfy the condition for the light absorption ability of porphyra-334 in G_SEEDS, G_VAE, and G_SIM are 52, 138, and 6, respectively. The chemical VAE approach shows promising potential for future molecular design. By using quantum chemistry wave function properties for the chemical VAE, we find new molecules that are comparable to porphyra-334, including some with unexpected geometries. Finally, we show a group of molecules found with this method.
Affiliation(s)
- Yuki Harada
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Makoto Hatakeyama
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Sanyo-Onoda City University, 1-1-1 Daigakudori, Sanyo-Onoda, Yamaguchi 756-0884, Japan
- Shuichi Maeda
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Qi Gao
- Mitsubishi Chemical Corporation Science & Innovation Center, 1000 Kamoshida-cho, Yokohama, Kanagawa 227-8502, Japan
- Kenichi Koizumi
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Yuki Sakamoto
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
- Yuuki Ono
- Mitsubishi Chemical Corporation Science & Innovation Center, 1000 Kamoshida-cho, Yokohama, Kanagawa 227-8502, Japan
- Shinichiro Nakamura
- Cluster for Science, Technology, and Innovation Hub, Nakamura Laboratory, RIKEN, 2-1, Hirosawa, Wako, Saitama 351-0198, Japan
|