1
Dong Z, Chen H, Yang Y, Hao H. Research on the optimization model of anti-breast cancer candidate drugs based on machine learning. Front Genet 2025; 16:1523015. [PMID: 40276676 PMCID: PMC12018315 DOI: 10.3389/fgene.2025.1523015] [Received: 11/05/2024] [Accepted: 03/31/2025] [Indexed: 04/26/2025]
Abstract
Breast cancer is one of the most common malignancies among women globally, with its incidence rate continuously increasing, posing a serious threat to women's health. Although current treatments, such as drugs targeting estrogen receptor alpha (ERα), have extended patient survival, issues such as drug resistance and severe side effects remain widespread. This study proposes a machine learning-based optimization model for anti-breast cancer candidate drugs, aimed at enhancing biological activity and optimizing ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) properties through multi-objective optimization. Initially, grey relational analysis and Spearman correlation analysis were performed on the molecular descriptors of 1,974 compounds, identifying 91 key descriptors. A Random Forest model combined with Shapley Additive Explanations (SHAP) values was then used to further select the top 20 descriptors with the greatest impact on biological activity. The constructed Quantitative Structure-Activity Relationship (QSAR) model, using algorithms such as LightGBM, Random Forest, and XGBoost, achieved an R² value of 0.743 for biological activity prediction, demonstrating strong predictive performance. Additionally, a multi-model fusion strategy and Particle Swarm Optimization (PSO) algorithm were employed to optimize both biological activity and ADMET properties, thereby improving the prediction of Caco-2, CYP3A4, hERG, HOB, and MN properties. For example, the best model for predicting Caco-2 achieved an F1 score of 0.8905, while the model for predicting CYP3A4 reached an F1 score of 0.9733. This multi-objective optimization model provides a novel and efficient tool for drug development, offering significant improvements in both biological activity and pharmacokinetic properties, with practical implications for the optimization of future anti-breast cancer drugs.
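The Spearman screening step mentioned in this abstract can be illustrated with a small, self-contained sketch. It is not the authors' pipeline: the descriptor names, the toy values, and the 0.5 cutoff are all hypothetical, and the grey relational analysis and SHAP steps are left out. The sketch only shows how a rank correlation can be used to keep descriptors that vary monotonically with activity.

```python
from statistics import mean


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    # A constant column has no rank variance; treat it as uncorrelated.
    return cov / (sx * sy) if sx and sy else 0.0


def screen_descriptors(descriptors, activity, min_abs_rho=0.5):
    """Keep descriptors whose |rho| with activity meets the (assumed) cutoff."""
    scores = {name: spearman(col, activity) for name, col in descriptors.items()}
    return {name: rho for name, rho in scores.items() if abs(rho) >= min_abs_rho}


# Toy data: hypothetical descriptor names and values, not taken from the study.
descriptors = {
    "logp": [1.0, 2.0, 3.0, 4.0, 5.0],
    "n_rings": [3.0, 3.0, 3.0, 3.0, 3.0],
    "tpsa": [90.0, 70.0, 60.0, 40.0, 20.0],
}
activity = [5.1, 5.9, 6.4, 7.8, 8.0]
selected = screen_descriptors(descriptors, activity)
```

With this toy input the constant `n_rings` column is dropped, while `logp` and `tpsa` are kept because they are monotonically related to activity (positively and negatively, respectively).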
Affiliation(s)
- Zhou Dong
- School of Information Engineering, Xi’an Eurasia University, Xi’an, China
2
Hassen AK, Šícho M, van Aalst YJ, Huizenga MCW, Reynolds DNR, Luukkonen S, Bernatavicius A, Clevert DA, Janssen APA, van Westen GJP, Preuss M. Generate what you can make: achieving in-house synthesizability with readily available resources in de novo drug design. J Cheminform 2025; 17:41. [PMID: 40155970 PMCID: PMC11954305 DOI: 10.1186/s13321-024-00910-4] [Received: 02/28/2024] [Accepted: 09/28/2024] [Indexed: 04/01/2025]
Abstract
Computer-Aided Synthesis Planning (CASP) and CASP-based approximated synthesizability scores have rarely been used as generation objectives in Computer-Aided Drug Design despite facilitating the in-silico generation of synthesizable molecules. However, these synthesizability approaches are disconnected from the reality of small laboratory drug design, where building block resources are limited, thus making the notion of in-house synthesizability with already available resources highly desirable. In this work, we show a successful in-house de novo drug design workflow generating active and in-house synthesizable ligands of monoglyceride lipase (MGLL). First, we demonstrate the successful transfer of CASP from 17.4 million commercial building blocks to a small laboratory setting of roughly 6000 building blocks, with a decrease of only 12% in CASP success when accepting synthesis routes that are, on average, two reaction steps longer. Next, we present a rapidly retrainable in-house synthesizability score, successfully capturing our in-house synthesizability without relying on external building block resources. We show that including our in-house synthesizability score in a multi-objective de novo drug design workflow, alongside a simple QSAR model, provides thousands of potentially active and easily in-house synthesizable molecules. Finally, we experimentally evaluate the synthesis and biochemical activity of three de novo candidates using their CASP-suggested synthesis routes employing only in-house building blocks. We find one candidate with evident activity, suggesting potential new ligand ideas for MGLL inhibitors while showcasing the usefulness of our in-house synthesizability score for de novo drug design.
Scientific contribution: Our core scientific contribution is the introduction of in-house de novo drug design, which enables the practical application of generative methods in small laboratories by utilizing a limited stock of available building blocks. Our fast-to-adapt workflow for in-house synthesizability scoring requires minimal computational retraining costs while supporting a high diversity of generated structures. We highlight the practicality of our approach through a comprehensive in-vitro case study that relies entirely on in-house resources, including in-silico generation, synthesis planning, and activity evaluation.
Affiliation(s)
- Alan Kai Hassen
- Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands.
- Machine Learning Research, Pfizer Research and Development, Berlin, Germany.
- Martin Šícho
- Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Prague, Czech Republic
- Yorick J van Aalst
- Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Darcy N R Reynolds
- Leiden Institute of Chemistry, Leiden University, Leiden, The Netherlands
- Sohvi Luukkonen
- Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Andrius Bernatavicius
- Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands
- Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands
- Djork-Arné Clevert
- Machine Learning Research, Pfizer Research and Development, Berlin, Germany
- Gerard J P van Westen
- Leiden Academic Centre of Drug Research, Leiden University, Leiden, The Netherlands.
- Mike Preuss
- Leiden Institute of Advanced Computer Science, Leiden University, Leiden, The Netherlands.
3
Rui M, Su Y, Tang H, Li Y, Fang N, Ge Y, Feng Q, Feng C. Computational Design and Optimization of Multi-Compound Multivesicular Liposomes for Co-Delivery of Traditional Chinese Medicine Compounds. AAPS PharmSciTech 2025; 26:61. [PMID: 39934607 DOI: 10.1208/s12249-025-03042-6] [Received: 08/19/2024] [Accepted: 01/08/2025] [Indexed: 02/13/2025]
Abstract
This study explored the synergistic anti-tumor effects of a combination of compounds from Traditional Chinese Medicine, including rosmarinic acid (RA), chlorogenic acid (CA), and scoparone (SCO), in the formulation of multivesicular liposomes (MVLs). Optimization of formulations and process parameters was essential to achieve effective liposomal encapsulation and optimal release profiles for these three compounds with diverse properties. Traditional trial-and-error approaches are inefficient for the optimization of complex multi-compound MVLs. We developed a new formulation optimization model, which could address this issue by predicting the optimal multi-compound MVL formulation. Our machine learning model integrated the support vector machine regression (SVR) algorithm and the cuckoo search (CS) algorithm, resulting in three CS-SVR models to predict single-compound MVLs. The CS algorithm, with various weighting rules, was then applied to search for the best formulation parameters across the three CS-SVR models and to maximize the encapsulation efficiency for all three compounds. The multi-compound MVLs were subsequently prepared under the predicted conditions, achieving an optimized particle size of 15.12 µm, with encapsulation efficiencies of 82.93 ± 2.43% for CA, 82.22 ± 1.25% for RA, and 95.60 ± 0.18% for SCO. The predicted optimal multi-compound MVLs were further validated through in vitro characterization and in vivo anti-tumor experiments, showing a promising synergistic anti-tumor effect consistent with in vitro results. This model accurately predicted optimal encapsulation conditions, which were validated experimentally, demonstrating improved encapsulation efficiencies and reduced trial-and-error iterations. Collectively, our model provides a predictive pathway for multi-compound MVL formulation, indicating the ability of this model to significantly reduce experimental burden and accelerate formulation development.
Affiliation(s)
- Mengjie Rui
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Yali Su
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Haidan Tang
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Yinfeng Li
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Naying Fang
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Yingying Ge
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Qiuqi Feng
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China
- Chunlai Feng
- Department of Obstetrics, Affiliated Hospital of Jiangsu University, No.438 Jiefang Road, Zhenjiang, 212001, Jiangsu Province, China.
- School of Pharmacy, Jiangsu University, No.301 Xuefu Road, Zhenjiang, 212013, Jiangsu Province, China.
4
Zhang K, Yang X, Wang Y, Yu Y, Huang N, Li G, Li X, Wu JC, Yang S. Artificial intelligence in drug development. Nat Med 2025; 31:45-59. [PMID: 39833407 DOI: 10.1038/s41591-024-03434-4] [Received: 05/01/2024] [Accepted: 11/25/2024] [Indexed: 01/22/2025]
Abstract
Drug development is a complex and time-consuming endeavor that traditionally relies on the experience of drug developers and trial-and-error experimentation. The advent of artificial intelligence (AI) technologies, particularly emerging large language models and generative AI, is poised to redefine this paradigm. The integration of AI-driven methodologies into the drug development pipeline has already heralded subtle yet meaningful enhancements in both the efficiency and effectiveness of this process. Here we present an overview of recent advancements in AI applications across the entire drug development workflow, encompassing the identification of disease targets, drug discovery, preclinical and clinical studies, and post-market surveillance. Lastly, we critically examine the prevailing challenges to highlight promising future research directions in AI-augmented drug development.
Affiliation(s)
- Kang Zhang
- Eye Hospital and Institute for Advanced Study on Eye Health and Diseases, Institute for clinical Data Science, Wenzhou Medical University, Wenzhou, China.
- State Key Laboratory of Macromolecular Drugs and Large-Scale Preparation, Wenzhou Medical University, Wenzhou, China.
- Xin Yang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Yifei Wang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China
- Yunfang Yu
- Sun Yat-Sen Memorial Hospital, Sun Yat-Sen University, Guangzhou, China
- Institute for AI in Medicine and Faculty of Medicine, Macau University of Science and Technology, Macau, China
- Guangzhou National Laboratory, Guangzhou, China
- Niu Huang
- National Institute of Biological Sciences, Beijing, China
- Gen Li
- Eye Hospital and Institute for Advanced Study on Eye Health and Diseases, Institute for clinical Data Science, Wenzhou Medical University, Wenzhou, China
- Guangzhou National Laboratory, Guangzhou, China
- Eye and Vision Innovation Center, Eye Valley, Wenzhou, China
- Xiaokun Li
- State Key Laboratory of Macromolecular Drugs and Large-Scale Preparation, Wenzhou Medical University, Wenzhou, China
- Joseph C Wu
- Cardiovascular Research Institute, Stanford University, Stanford, CA, USA
- Shengyong Yang
- Department of Biotherapy, Cancer Center and State Key Laboratory of Biotherapy, West China Hospital, Sichuan University, Chengdu, China.
5
Yang CH, Chen YL, Cheung TH, Chuang LY. Multi-Objective Optimization Accelerates the De Novo Design of Antimicrobial Peptide for Staphylococcus aureus. Int J Mol Sci 2024; 25:13688. [PMID: 39769451 PMCID: PMC11728188 DOI: 10.3390/ijms252413688] [Received: 11/06/2024] [Revised: 12/03/2024] [Accepted: 12/18/2024] [Indexed: 01/04/2025]
Abstract
Humans have long used antibiotics to fight bacteria, but increasing drug resistance has reduced their effectiveness. Antimicrobial peptides (AMPs) are a promising alternative with natural broad-spectrum activity against bacteria and viruses. However, their instability and hemolysis limit their medical use, making the design and improvement of AMPs a key research focus. Designing antimicrobial peptides with multiple desired properties using machine learning is still challenging, especially with limited data. This study utilized a multi-objective optimization method, the non-dominated sorting genetic algorithm II (NSGA-II), to enhance the physicochemical properties of peptide sequences and identify those with improved antimicrobial activity. Combining NSGA-II with neural networks, the approach efficiently identified promising AMP candidates and accurately predicted their antibacterial effectiveness. The method advances AMP design by optimizing factors such as hydrophobicity, instability index, and aliphatic index to improve peptide stability. It offers a more efficient way to address the limitations of AMPs, paving the way for the development of safer and more effective antimicrobial treatments.
Affiliation(s)
- Cheng-Hong Yang
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 807618, Taiwan
- Department of Information Management, Tainan University of Technology, Tainan 710302, Taiwan
- Ph.D. Program in Biomedical Engineering, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
- Drug Development and Value Creation Research Centre, Kaohsiung Medical University, Kaohsiung 807378, Taiwan
- Yi-Ling Chen
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 807618, Taiwan
- Tin-Ho Cheung
- Department of Electronic Engineering, National Kaohsiung University of Science and Technology, Kaohsiung 807618, Taiwan
- Li-Yeh Chuang
- Department of Chemical Engineering & Institute of Biotechnology Engineering and Chemical Engineering, I-Shou University, Kaohsiung 824005, Taiwan
6
Wang J, Zhu F. Multi-objective molecular generation via clustered Pareto-based reinforcement learning. Neural Netw 2024; 179:106596. [PMID: 39163823 DOI: 10.1016/j.neunet.2024.106596] [Received: 02/15/2024] [Revised: 06/16/2024] [Accepted: 08/01/2024] [Indexed: 08/22/2024]
Abstract
De novo molecular design is the process of learning knowledge from existing data to propose new chemical structures that satisfy the desired properties. By using de novo design to generate compounds in a directed manner, better solutions can be obtained in large chemical libraries with less comparison cost. But drug design needs to take multiple factors into consideration. For example, in polypharmacology, molecules that activate or inhibit multiple target proteins produce multiple pharmacological activities and are less susceptible to drug resistance. However, most existing molecular generation methods either focus only on affinity for a single target or fail to effectively balance the relationship between multiple targets, resulting in insufficient validity and desirability of the generated molecules. To address these problems, an approach called clustered Pareto-based reinforcement learning (CPRL) is proposed. In CPRL, a pre-trained model is constructed to grasp existing molecular knowledge in a supervised learning manner. In addition, the clustered Pareto optimization algorithm is presented to find the best solution between different objectives. The algorithm first extracts an update set from the sampled molecules through the designed aggregation-based molecular clustering. Then, the final reward is computed by constructing the Pareto frontier ranking of the molecules from the update set. To explore the vast chemical space, a reinforcement learning agent is designed in CPRL that can be updated under the guidance of the final reward to balance multiple properties. Furthermore, to increase the internal diversity of the molecules, a fixed-parameter exploration model is used for sampling in conjunction with the agent. The experimental results demonstrate that CPRL is capable of balancing multiple properties of the molecule and has higher desirability and validity, reaching 0.9551 and 0.9923, respectively.
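The Pareto frontier ranking that this abstract uses to compute rewards can be sketched generically as follows. This is a plain non-dominated sorting routine, not the CPRL implementation: the clustering step is omitted, all objectives are assumed to be maximized, and the linear rank-to-reward mapping in `rank_rewards` is a hypothetical choice made only for illustration.

```python
def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_ranks(points):
    """Assign front index 0 to non-dominated points, 1 to the next front, and so on."""
    remaining = set(range(len(points)))
    rank = [None] * len(points)
    front = 0
    while remaining:
        # A point belongs to the current front if nothing still unranked dominates it.
        current = {
            i for i in remaining
            if not any(dominates(points[j], points[i]) for j in remaining if j != i)
        }
        for i in current:
            rank[i] = front
        remaining -= current
        front += 1
    return rank


def rank_rewards(points):
    """Map front indices to rewards in [0, 1]; the linear scaling is an assumption."""
    ranks = pareto_ranks(points)
    if not ranks:
        return []
    worst = max(ranks)
    return [1.0 if worst == 0 else 1.0 - r / worst for r in ranks]


# Hypothetical two-objective scores (for example, affinity for two targets).
scores = [(0.9, 0.2), (0.5, 0.5), (0.2, 0.9), (0.4, 0.4), (0.1, 0.1)]
fronts = pareto_ranks(scores)
rewards = rank_rewards(scores)
```

In the toy data the first three molecules trade off the two objectives and share the first front, while the remaining two are dominated and receive lower rewards.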
Affiliation(s)
- Jing Wang
- School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
- Fei Zhu
- School of Computer Science and Technology, Soochow University, Suzhou, 215006, China.
7
Sridharan B, Sinha A, Bardhan J, Modee R, Ehara M, Priyakumar UD. Deep reinforcement learning in chemistry: A review. J Comput Chem 2024; 45:1886-1898. [PMID: 38698628 DOI: 10.1002/jcc.27354] [Received: 12/11/2023] [Revised: 03/17/2024] [Accepted: 03/20/2024] [Indexed: 05/05/2024]
Abstract
Reinforcement learning (RL) has been applied to various domains in computational chemistry and has found widespread success. In this review, we first motivate the application of RL to chemistry and list some broad application domains, for example, molecule generation, geometry optimization, and retrosynthetic pathway search. We set up some of the formalism associated with reinforcement learning that should help the reader translate their chemistry problems into a form where RL can be used to solve them. We then discuss the solution formulations and algorithms proposed in recent literature for these problems, the advantages of one over the other, together with the necessary details of the RL algorithms they employ. This article should help the reader understand the state of RL applications in chemistry, learn about some relevant actively-researched open problems, gain insight into how RL can be used to approach them, and hopefully inspire innovative RL applications in chemistry.
Affiliation(s)
- Bhuvanesh Sridharan
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
- Animesh Sinha
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
- Jai Bardhan
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
- Rohit Modee
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
- Masahiro Ehara
- Research Center for Computational Science, Institute for Molecular Science, Okazaki, Japan
- U Deva Priyakumar
- Center for Computational Natural Sciences and Bioinformatics, International Institute of Information Technology, Hyderabad, India
8
Renz P, Luukkonen S, Klambauer G. Diverse Hits in De Novo Molecule Design: Diversity-Based Comparison of Goal-Directed Generators. J Chem Inf Model 2024; 64:5756-5761. [PMID: 39029090 PMCID: PMC11323242 DOI: 10.1021/acs.jcim.4c00519] [Received: 03/28/2024] [Revised: 07/10/2024] [Accepted: 07/11/2024] [Indexed: 07/21/2024]
Abstract
Since the rise of generative AI models, many goal-directed molecule generators have been proposed as tools for discovering novel drug candidates. However, molecule generators often produce highly similar molecules and tend to overemphasize conformity to an imperfect scoring function rather than capturing the true underlying properties sought. We rectify these two shortcomings by offering diversity-based evaluations using the #Circles metric and considering constraints on scoring function calls or computation time. Our findings highlight the superior performance of SMILES-based autoregressive models in generating diverse sets of desired molecules compared to graph-based models or genetic algorithms.
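The abstract does not define the #Circles metric. One common reading, assumed here, is that it counts the largest subset of molecules whose pairwise distances all exceed a threshold. The sketch below computes a greedy lower bound on such a count. The feature sets, the set-based Tanimoto distance, and the 0.7 threshold are all hypothetical stand-ins, and real fingerprints would normally come from a cheminformatics toolkit.

```python
def tanimoto_distance(a, b):
    """One minus the ratio of shared features to all features in two sets."""
    union = len(a | b)
    return 0.0 if union == 0 else 1.0 - len(a & b) / union


def greedy_circles(fingerprints, threshold):
    """Greedily keep items farther than `threshold` from every item kept so far.

    The result is a lower bound on the size of the largest mutually distant
    subset, because a greedy pass is order-dependent and not guaranteed optimal.
    """
    centers = []
    for fp in fingerprints:
        if all(tanimoto_distance(fp, c) > threshold for c in centers):
            centers.append(fp)
    return len(centers)


# Hypothetical feature sets standing in for molecular fingerprints.
hits = [
    frozenset({1, 2, 3, 4}),
    frozenset({1, 2, 3, 5}),   # close to the first hit
    frozenset({6, 7, 8}),
    frozenset({6, 7, 9}),      # close to the third hit
    frozenset({10, 11}),
]
n_diverse = greedy_circles(hits, threshold=0.7)
```

Here five hits collapse to three mutually distant groups, which is the kind of distinction a diversity-aware evaluation draws between many near-duplicate hits and genuinely different ones.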
Affiliation(s)
- Philipp Renz
- Johannes Kepler University Linz, Altenbergerstraße 69, Linz, AT 4040, Austria
- Sohvi Luukkonen
- Johannes Kepler University Linz, ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Altenbergerstraße 69, Linz, AT 4040, Austria
- Günter Klambauer
- Johannes Kepler University Linz, ELLIS Unit Linz, LIT AI Lab, Institute for Machine Learning, Altenbergerstraße 69, Linz, AT 4040, Austria
9
Bhushan R, Grover V. The Advent of Artificial Intelligence into Cardiac Surgery: A Systematic Review of Our Understanding. Braz J Cardiovasc Surg 2024; 39:e20230308. [PMID: 39038236 PMCID: PMC11262144 DOI: 10.21470/1678-9741-2023-0308] [Received: 08/15/2023] [Accepted: 11/06/2023] [Indexed: 07/24/2024]
Abstract
When faced with questions about artificial intelligence (AI), many surgeons respond with scepticism and rejection. However, in the realm of cardiac surgery, it is imperative that we embrace the potential of AI and adopt a proactive mindset. This systematic review utilizes PubMed® to explore the intersection of AI and cardiac surgery since 2017. AI has found applications in various aspects of cardiac surgery, including teaching aids, diagnostics, predictive outcomes, surgical assistance, and expertise. Nevertheless, challenges such as data computation errors, vulnerabilities to malware, and privacy concerns persist. While AI has limitations, its restricted capabilities without cognitive and emotional intelligence should lead us to cautiously and partially embrace this advancing technology to enhance patient care.
Affiliation(s)
- Rahul Bhushan
- Department of Cardiovascular and Thoracic Surgery, All India Institute of Medical Sciences (AIIMS), Patna, India
- Vijay Grover
- Department of Cardiac surgery, Atal Bihari Vajpayee Institute of Medical Sciences (ABVIMS) and Dr Ram Manohar Lohia (RML) Hospital, New Delhi, India
10
López-López E, Medina-Franco JL. Toward structure-multiple activity relationships (SMARts) using computational approaches: A polypharmacological perspective. Drug Discov Today 2024; 29:104046. [PMID: 38810721 DOI: 10.1016/j.drudis.2024.104046] [Received: 04/06/2024] [Revised: 05/13/2024] [Accepted: 05/22/2024] [Indexed: 05/31/2024]
Abstract
In the current era of biological big data, which are rapidly populating the biological chemical space, in silico polypharmacology drug design approaches help to decode structure-multiple activity relationships (SMARts). Current computational methods can predict or categorize multiple properties simultaneously, which aids the generation, identification, curation, prioritization, optimization, and repurposing of molecules. Computational methods have generated opportunities and challenges in medicinal chemistry, pharmacology, food chemistry, toxicology, bioinformatics, and chemoinformatics. It is anticipated that computer-guided SMARts could contribute to the full automatization of drug design and drug repurposing campaigns, facilitating the prediction of new biological targets, side and off-target effects, and drug-drug interactions.
Affiliation(s)
- Edgar López-López
- Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Section 14-740, Mexico City 07000, Mexico; DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
- José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico.
11
Retchin M, Wang Y, Takaba K, Chodera JD. DrugGym: A testbed for the economics of autonomous drug discovery. bioRxiv 2024:2024.05.28.596296. [PMID: 38854082 PMCID: PMC11160604 DOI: 10.1101/2024.05.28.596296] [Indexed: 06/11/2024]
Abstract
Drug discovery is stochastic. The effectiveness of candidate compounds in satisfying design objectives is unknown ahead of time, and the tools used for prioritization (predictive models and assays) are inaccurate and noisy. In a typical discovery campaign, thousands of compounds may be synthesized and tested before design objectives are achieved, with many others ideated but deprioritized. These challenges are well-documented, but assessing potential remedies has been difficult. We introduce DrugGym, a framework for modeling the stochastic process of drug discovery. Emulating biochemical assays with realistic surrogate models, we simulate the progression from weak hits to sub-micromolar leads with viable ADME. We use this testbed to examine how different ideation, scoring, and decision-making strategies impact statistical measures of utility, such as the probability of program success within predefined budgets and the expected costs to achieve target candidate profile (TCP) goals. We also assess the influence of affinity model inaccuracy, chemical creativity, batch size, and multi-step reasoning. Our findings suggest that reducing affinity model inaccuracy from 2 to 0.5 pIC50 units improves budget-constrained success rates tenfold. DrugGym represents a realistic testbed for machine learning methods applied to the hit-to-lead phase. Source code is available at www.drug-gym.org.
Affiliation(s)
- Michael Retchin
- Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, NY 10065
- Yuanqing Wang
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- Simons Center for Computational Chemistry and Center for Data Science, New York University, New York, NY 10004
- Kenichiro Takaba
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
- Pharmaceutical Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation, Shizuoka 410-2321, Japan
- John D. Chodera
- Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, NY 10065
- Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
12
Cheng Z, Aitha M, Thomas CA, Sturgill A, Fairweather M, Hu A, Bethel CR, Rivera DD, Dranchak P, Thomas PW, Li H, Feng Q, Tao K, Song M, Sun N, Wang S, Silwal SB, Page RC, Fast W, Bonomo RA, Weese M, Martinez W, Inglese J, Crowder MW. Machine Learning Models Identify Inhibitors of New Delhi Metallo-β-lactamase. J Chem Inf Model 2024; 64:3977-3991. [PMID: 38727192 PMCID: PMC11129921 DOI: 10.1021/acs.jcim.3c02015] [Indexed: 05/13/2024]
Abstract
The worldwide spread of the metallo-β-lactamases (MBL), especially New Delhi metallo-β-lactamase-1 (NDM-1), is threatening the efficacy of β-lactams, which are the most potent and prescribed class of antibiotics in the clinic. Currently, FDA-approved MBL inhibitors are lacking in the clinic even though many strategies have been used in inhibitor development, including quantitative high-throughput screening (qHTS), fragment-based drug discovery (FBDD), and molecular docking. Herein, a machine learning-based prediction tool is described, which was generated using results from HTS of a large chemical library and previously published inhibition data. The prediction tool was then used for virtual screening of the NIH Genesis library, which was subsequently screened using qHTS. A novel MBL inhibitor was identified and shown to lower minimum inhibitory concentrations (MICs) of Meropenem for a panel of E. coli and K. pneumoniae clinical isolates expressing NDM-1. The mechanism of inhibition of this novel scaffold was probed utilizing equilibrium dialyses with metal analyses, native state electrospray ionization mass spectrometry, UV-vis spectrophotometry, and molecular docking. The uncovered inhibitor, compound 72922413, was shown to be 9-hydroxy-3-[(5-hydroxy-1-oxa-9-azaspiro[5.5]undec-9-yl)carbonyl]-4H-pyrido[1,2-a]pyrimidin-4-one.
Affiliation(s)
- Zishuo Cheng
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Mahesh Aitha
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA
- Caitlyn A. Thomas
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Aidan Sturgill
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Mitch Fairweather
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Amy Hu
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Christopher R. Bethel
- Research Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, OH 44106, USA
- Dann D. Rivera
- Division of Chemical Biology and Medicinal Chemistry, College of Pharmacy, University of Texas, Austin, TX 78712, USA
- Patricia Dranchak
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA
- Pei W. Thomas
- Division of Chemical Biology and Medicinal Chemistry, College of Pharmacy, University of Texas, Austin, TX 78712, USA
- Han Li
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Qi Feng
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Kaicheng Tao
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Minshuai Song
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Na Sun
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Shuo Wang
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Richard C. Page
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
- Walt Fast
- Division of Chemical Biology and Medicinal Chemistry, College of Pharmacy, University of Texas, Austin, TX 78712, USA
- Robert A. Bonomo
- Research Service, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, OH 44106, USA
- Departments of Medicine, Biochemistry, Molecular Biology and Microbiology, Pharmacology, and Proteomics and Bioinformatics, Case Western Reserve University School of Medicine, Cleveland, OH 44106, USA
- Clinician Scientist Investigator, Louis Stokes Cleveland Department of Veterans Affairs Medical Center, Cleveland, OH 44106, USA
- CWRU-Cleveland VAMC Center for Antimicrobial Resistance and Epidemiology (Case VA CARES) Cleveland, OH 44106, USA
| | - Maria Weese
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
| | - Waldyn Martinez
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
| | - James Inglese
- Division of Preclinical Innovation, National Center for Advancing Translational Sciences, National Institutes of Health, Rockville, MD 20850, USA
- Metabolic Medicine Branch, National Human Genome Research Institute, National Institutes of Health, Bethesda, MD 20817, USA
| | - Michael W. Crowder
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA
| |
Collapse
|
13
|
Xu Y, Jiang WJ, Bai YY, Yang YJ, Zhang ZL. Artificial Intelligence-Assisted Multiparameter Size Discrimination of Silver Nanoparticles through Electrochemical Collision. Anal Chem 2024; 96:6195-6201. [PMID: 38607805 DOI: 10.1021/acs.analchem.3c05115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2024]
Abstract
Single particle collision is an important tool for size analysis at the individual particle level; however, due to the complex dynamic behaviors of nanoparticles on the surface of an electrode, the accuracy of size discrimination is limited. Silver (Ag) nanoparticles (NPs) were chosen as the research target, and the dynamic behavior of Ag NPs was simplified by enhancing adsorption between the Ag NPs and an Au ultramicroelectrode (UME) in alkaline media. Dynamic and thermodynamic information on single Ag NPs, including current intensity, transferred charge, and duration time, was then accurately extracted from collision events. Because these parameters differ between Ag NPs of different sizes, multiparameter size discrimination was proposed, which improved the accuracy compared to single-parameter discrimination. Multiparameter analysis was also combined, for the first time, with artificial intelligence, a tool adept at processing multidimensional data. Finally, artificial intelligence-assisted multiparameter size discrimination was successfully used to distinguish mixed Ag NPs, with an optimal accuracy of more than 95%. In summary, the artificial intelligence-assisted multiparameter method showed an excellent ability to quickly and accurately discriminate the size of nanoparticles at the individual particle level and provides effective guidance for the application of nanoparticles.
Affiliation(s)
- Ying Xu
- College of Chemistry and Molecular Sciences, Wuhan University, Wuhan 430072, People's Republic of China
- Wei-Jian Jiang
- College of Life Science and Technology, Huazhong University of Science and Technology, Wuhan 430074, People's Republic of China
- Yi-Yan Bai
- College of Chemistry and Molecular Sciences, Wuhan University, Wuhan 430072, People's Republic of China
- Department of Chemistry, Yuncheng University, Yuncheng 04400, People's Republic of China
- Yan-Ju Yang
- College of Chemistry and Molecular Sciences, Wuhan University, Wuhan 430072, People's Republic of China
- Zhi-Ling Zhang
- College of Chemistry and Molecular Sciences, Wuhan University, Wuhan 430072, People's Republic of China
14
Loeffler HH, He J, Tibo A, Janet JP, Voronov A, Mervin LH, Engkvist O. Reinvent 4: Modern AI-driven generative molecule design. J Cheminform 2024; 16:20. [PMID: 38383444 PMCID: PMC10882833 DOI: 10.1186/s13321-024-00812-5] [Citation(s) in RCA: 20] [Impact Index Per Article: 20.0] [Received: 11/23/2023] [Accepted: 02/09/2024] [Indexed: 02/23/2024] Open
Abstract
REINVENT 4 is a modern open-source generative AI framework for the design of small molecules. The software utilizes recurrent neural networks and transformer architectures to drive molecule generation. These generators are seamlessly embedded within the general machine learning optimization algorithms, transfer learning, reinforcement learning and curriculum learning. REINVENT 4 enables and facilitates de novo design, R-group replacement, library design, linker design, scaffold hopping and molecule optimization. This contribution gives an overview of the software and describes its design. Algorithms and their applications are discussed in detail. REINVENT 4 is a command line tool which reads a user configuration in either TOML or JSON format. The aim of this release is to provide reference implementations for some of the most common algorithms in AI based molecule generation. An additional goal with the release is to create a framework for education and future innovation in AI based molecular design. The software is available from https://github.com/MolecularAI/REINVENT4 and released under the permissive Apache 2.0 license. Scientific contribution. The software provides an open-source reference implementation for generative molecular design where the software is also being used in production to support in-house drug discovery projects. The publication of the most common machine learning algorithms in one code and full documentation thereof will increase transparency of AI and foster innovation, collaboration and education.
Affiliation(s)
- Hannes H Loeffler
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jiazhen He
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Alessandro Tibo
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Jon Paul Janet
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Alexey Voronov
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
- Lewis H Mervin
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Cambridge, UK
- Ola Engkvist
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, Gothenburg, Sweden
15
Garg V. Generative AI for graph-based drug design: Recent advances and the way forward. Curr Opin Struct Biol 2024; 84:102769. [PMID: 38199072 DOI: 10.1016/j.sbi.2023.102769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2023] [Revised: 12/17/2023] [Accepted: 12/19/2023] [Indexed: 01/12/2024]
Abstract
Discovering new promising molecule candidates that could translate into effective drugs is a key scientific pursuit. However, factors such as the vastness and discreteness of the molecular search space pose a formidable technical challenge in this quest. AI-driven generative models can effectively learn from data, and offer hope to streamline drug design. In this article, we review state of the art in generative models that operate on molecular graphs. We also shed light on some limitations of the existing methodology and sketch directions to harness the potential of AI for drug design tasks going forward.
Affiliation(s)
- Vikas Garg
- Aalto University and YaiYai Ltd, Finland.
16
Haghir Ebrahim Abadi MH, Ghasemlou A, Bayani F, Sefidbakht Y, Vosough M, Mozaffari-Jovin S, Uversky VN. AI-driven covalent drug design strategies targeting main protease (Mpro) against SARS-CoV-2: structural insights and molecular mechanisms. J Biomol Struct Dyn 2024:1-29. [PMID: 38287509 DOI: 10.1080/07391102.2024.2308769] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/09/2023] [Accepted: 01/17/2024] [Indexed: 01/31/2024]
Abstract
The emergence of new SARS-CoV-2 variants has raised concerns about the effectiveness of COVID-19 vaccines. To address this challenge, small-molecule antivirals have been proposed as a crucial therapeutic option. Among potential targets for anti-COVID-19 therapy, the main protease (Mpro) of SARS-CoV-2 is important due to its essential role in the virus's life cycle and high conservation. The substrate-binding region of the core proteases of various coronaviruses, including SARS-CoV-2, SARS-CoV, and Middle East respiratory syndrome coronavirus (MERS-CoV), could be used for the generation of new protease inhibitors. Various drug discovery methods have employed a diverse range of strategies, targeting both monomeric and dimeric forms, including drug repurposing, integrating virtual screening with high-throughput screening (HTS), and structure-based drug design, each demonstrating varying levels of efficiency. Covalent inhibitors, such as Nirmatrelvir and MG-101, showcase robust and high-affinity binding to Mpro, exhibiting stable interactions confirmed by molecular docking studies. Development of effective antiviral drugs is imperative to address potential pandemic situations. This review explores recent advances in the search for Mpro inhibitors and the application of artificial intelligence (AI) in drug design. AI leverages vast datasets and advanced algorithms to streamline the design and identification of promising Mpro inhibitors. AI-driven drug discovery methods, including molecular docking, predictive modeling, and structure-based drug repurposing, are at the forefront of identifying potential candidates for effective antiviral therapy. At a time when COVID-19 potentially threatens global health, the quest for potent antiviral solutions targeting Mpro could be critical for inhibiting the virus.
Affiliation(s)
- Fatemeh Bayani
- Protein Research Center, Shahid Beheshti University, Tehran, Iran
- Yahya Sefidbakht
- Protein Research Center, Shahid Beheshti University, Tehran, Iran
- Massoud Vosough
- Department of Regenerative Medicine, Cell Science Research Center, Royan Institute for Stem Cell Biology and Technology, ACECR, Tehran, Iran
- Sina Mozaffari-Jovin
- Department of Medical Genetics, School of Medicine, Mashhad University of Medical Sciences, Mashhad, Iran
- Medical Genetics Research Center, Mashhad University of Medical Sciences, Mashhad, Iran
- Vladimir N Uversky
- Department of Molecular Medicine, University of South Florida, Tampa, FL, USA
17
Xu W, Yang X, Guan Y, Cheng X, Wang Y. Integrative approach for predicting drug-target interactions via matrix factorization and broad learning systems. Math Biosci Eng 2024; 21:2608-2625. [PMID: 38454698 DOI: 10.3934/mbe.2024115] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/09/2024]
Abstract
In the drug discovery process, time and cost are the most typical problems resulting from the experimental screening of drug-target interactions (DTIs). To address these limitations, many computational methods have been developed to achieve more accurate predictions. However, identifying DTIs mostly relies on separate learning tasks with drug and target features that neglect the interaction representation between drugs and targets. In addition, the lack of these relationships may greatly impair performance in the prediction of DTIs. Aiming to capture comprehensive drug-target representations and simplify the network structure, we propose an integrative approach with a convolution broad learning system for DTI prediction (ConvBLS-DTI) to reduce the impact of data sparsity and incompleteness. First, given the lack of known interactions for the drug and target, the weighted K-nearest known neighbors (WKNKN) method was used as a preprocessing strategy for unknown drug-target pairs. Second, a neighborhood regularized logistic matrix factorization (NRLMF) was applied to extract features of the updated drug-target interaction information, which focused more on the known interaction pairs. Then, a broad learning network incorporating a convolutional neural network was established to predict DTIs, which can make classification more effective using a different perspective. Finally, on four benchmark datasets in three scenarios, the overall performance of ConvBLS-DTI exceeded that of some mainstream methods. The test results demonstrate that our model achieves improved performance in terms of the area under the receiver operating characteristic curve and the area under the precision-recall curve.
Affiliation(s)
- Wanying Xu
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Xixin Yang
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- School of Automation, Qingdao University, Qingdao 266071, China
- Yuanlin Guan
- Key Lab of Industrial Fluid Energy Conservation and Pollution Control, Ministry of Education, Qingdao University of Technology, Qingdao 266520, China
- School of Mechanical & Automotive Engineering, Qingdao University of Technology, Qingdao 266520, China
- Xiaoqing Cheng
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
- Yu Wang
- College of Computer Science & Technology, Qingdao University, Qingdao 266071, China
18
Panwar U, Murali A, Khan MA, Selvaraj C, Singh SK. Virtual Screening Process: A Guide in Modern Drug Designing. Methods Mol Biol 2024; 2714:21-31. [PMID: 37676591 DOI: 10.1007/978-1-0716-3441-7_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Due to its capacity to drastically cut the cost and time necessary for experimental screening of compounds, virtual screening (VS) has grown to be a crucial component of drug discovery and development. VS is a computational method used in drug design to identify potential drugs from enormous libraries of chemicals. This approach makes use of molecular modeling and docking simulations to assess the small molecule's ability to bind to the desired protein. Virtual screening has a bright future, as high computational power and modern techniques are likely to further enhance the accuracy and speed of the process.
Affiliation(s)
- Umesh Panwar
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
- Aarthy Murali
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
- Mohammad Aqueel Khan
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
- Chandrabose Selvaraj
- Center for Transdisciplinary Research, Department of Pharmacology, Saveetha Dental College and Hospitals, Saveetha Institute of Medical and Technical Sciences (SIMATS), Saveetha University, Chennai, Tamil Nadu, India
- Sanjeev Kumar Singh
- Computer Aided Drug Design and Molecular Modelling Lab, Department of Bioinformatics, Science Block, Alagappa University, Karaikudi, Tamil Nadu, India
- Department of Data Sciences, Centre of Biomedical Research, SGPGIMS Campus, Lucknow, Uttar Pradesh, India
19
Angelo JS, Guedes IA, Barbosa HJC, Dardenne LE. Multi- and many-objective optimization: present and future in de novo drug design. Front Chem 2023; 11:1288626. [PMID: 38192501 PMCID: PMC10773868 DOI: 10.3389/fchem.2023.1288626] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/04/2023] [Accepted: 11/27/2023] [Indexed: 01/10/2024] Open
Abstract
de novo Drug Design (dnDD) aims to create new molecules that satisfy multiple conflicting objectives. Since several desired properties can be considered in the optimization process, dnDD is naturally categorized as a many-objective optimization problem (ManyOOP), where more than three objectives must be simultaneously optimized. However, a large number of objectives typically pose several challenges that affect the choice and the design of optimization methodologies. Herein, we cover the application of multi- and many-objective optimization methods, particularly those based on Evolutionary Computation and Machine Learning techniques, to enlighten their potential application in dnDD. Additionally, we comprehensively analyze how molecular properties used in the optimization process are applied as either objectives or constraints to the problem. Finally, we discuss future research in many-objective optimization for dnDD, highlighting two important possible impacts: i) its integration with the development of multi-target approaches to accelerate the discovery of innovative and more efficacious drug therapies and ii) its role as a catalyst for new developments in more fundamental and general methodological frameworks in the field.
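The idea of optimizing several conflicting objectives at once can be illustrated with Pareto dominance, the ordering that most multi- and many-objective methods build on. The sketch below is not taken from the review: the four objective names and all scores are hypothetical, and every objective is assumed to be scaled so that higher is better.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if `a` Pareto-dominates `b` (all objectives maximized)."""
    no_worse = all(x >= y for x, y in zip(a, b))
    strictly_better = any(x > y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Sequence[float]]) -> List[int]:
    """Return the indices of points that no other point dominates."""
    front = []
    for i, p in enumerate(points):
        if not any(dominates(q, p) for j, q in enumerate(points) if j != i):
            front.append(i)
    return front


# Hypothetical scores in [0, 1], higher is better:
# (potency, solubility, safety, synthesizability)
CANDIDATES = [
    (0.9, 0.2, 0.5, 0.6),
    (0.7, 0.8, 0.6, 0.4),
    (0.6, 0.7, 0.5, 0.3),  # no better than the previous candidate on any objective
    (0.5, 0.9, 0.9, 0.9),
]

print(pareto_front(CANDIDATES))
```

With four or more objectives, most candidates tend to become mutually non-dominated, which is one reason many-objective problems call for different selection strategies than two- or three-objective ones.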
Affiliation(s)
- Laurent E. Dardenne
- Coordenação de Modelagem Computacional, Laboratório Nacional de Computação Científica, Petrópolis, Brazil
20
Blaudin de Thé FX, Baudier C, Andrade Pereira R, Lefebvre C, Moingeon P. Transforming drug discovery with a high-throughput AI-powered platform: A 5-year experience with Patrimony. Drug Discov Today 2023; 28:103772. [PMID: 37717933 DOI: 10.1016/j.drudis.2023.103772] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/26/2023] [Revised: 09/01/2023] [Accepted: 09/12/2023] [Indexed: 09/19/2023]
Abstract
High-throughput computational platforms are being established to accelerate drug discovery. Servier launched the Patrimony platform to harness computational sciences and artificial intelligence (AI) to integrate massive multimodal data from internal and external sources. Patrimony has enabled researchers to prioritize therapeutic targets based on a deep understanding of the pathophysiology of immuno-inflammatory diseases. Herein, we share our experience regarding main challenges and critical success factors faced when industrializing the platform and broadening its applications to neurological diseases. We emphasize the importance of integrating such platforms in an end-to-end drug discovery process and engaging human experts early on to ensure a transforming impact.
21
Tautermann CS, Borghardt JM, Pfau R, Zentgraf M, Weskamp N, Sauer A. Towards holistic Compound Quality Scores: Extending ligand efficiency indices with compound pharmacokinetic characteristics. Drug Discov Today 2023; 28:103758. [PMID: 37660984 DOI: 10.1016/j.drudis.2023.103758] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 05/09/2023] [Revised: 08/17/2023] [Accepted: 08/28/2023] [Indexed: 09/05/2023]
Abstract
The suitability of small molecules as oral drugs is often assessed by simple physicochemical rules, the application of ligand efficiency scores or by composite scores based on physicochemical compound properties. These rules and scores are empirical and typically lack mechanistic background, such as information on pharmacokinetics (PK). We introduce new types of Compound Quality Scores (CQS, specifically called dose scores and cmax scores), which explicitly include predicted or, when available, experimental PK parameters and combine these with on-target potency. These CQS scores are surrogates for an estimated dose and corresponding cmax and allow prioritizing of compounds within test cascades as well as before synthesis. We demonstrate the complementarity and, in most cases, superior performance relative to existing efficiency metrics by project examples.
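For context, the conventional efficiency metrics that the abstract contrasts with can be computed in a few lines. This is a sketch of the widely used ligand efficiency (LE) and lipophilic ligand efficiency (LLE) definitions, not of the authors' dose or cmax scores, whose formulas the abstract does not give. The function names and example values are illustrative, and pIC50 is assumed to be an acceptable stand-in for the binding free energy.

```python
import math

# Gas constant in kcal/(mol*K)
R_KCAL = 1.987204e-3


def ligand_efficiency(pic50: float, heavy_atoms: int,
                      temperature_k: float = 298.15) -> float:
    """Approximate LE in kcal/mol per heavy atom.

    Uses -dG ~ RT * ln(10) * pIC50, which treats IC50 as a proxy for Kd.
    """
    if heavy_atoms <= 0:
        raise ValueError("heavy_atoms must be positive")
    return R_KCAL * temperature_k * math.log(10) * pic50 / heavy_atoms


def lipophilic_ligand_efficiency(pic50: float, logp: float) -> float:
    """LLE (also called LipE): potency corrected for lipophilicity."""
    return pic50 - logp


# Illustrative compound: 10 nM potency (pIC50 = 8), 30 heavy atoms, logP = 3
print(round(ligand_efficiency(8.0, 30), 3))
print(lipophilic_ligand_efficiency(8.0, 3.0))
```

Neither metric contains any pharmacokinetic term, which is the gap the Compound Quality Scores described above are meant to close.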
Affiliation(s)
- Christofer S Tautermann
- Boehringer Ingelheim Pharma GmbH & Co. KG, Medicinal Chemistry, Birkendorfer Strasse 65, Biberach 88397, Germany; Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck 6020, Austria.
- Jens M Borghardt
- Boehringer Ingelheim Pharma GmbH & Co. KG, Drug Discovery Sciences, Birkendorfer Strasse 65, Biberach 88397, Germany.
- Roland Pfau
- Boehringer Ingelheim Pharma GmbH & Co. KG, Medicinal Chemistry, Birkendorfer Strasse 65, Biberach 88397, Germany; Boehringer Ingelheim Pharma GmbH & Co. KG, CNS Research, Birkendorfer Strasse 65, Biberach 88397, Germany.
- Matthias Zentgraf
- Boehringer Ingelheim Pharma GmbH & Co. KG, Discovery Research Coordination Germany, Birkendorfer Strasse 65, Biberach 88397, Germany.
- Nils Weskamp
- Boehringer Ingelheim Pharma GmbH & Co. KG, Medicinal Chemistry, Birkendorfer Strasse 65, Biberach 88397, Germany.
- Achim Sauer
- Boehringer Ingelheim Pharma GmbH & Co. KG, Drug Discovery Sciences, Birkendorfer Strasse 65, Biberach 88397, Germany.
22
Stanley M, Segler M. Fake it until you make it? Generative de novo design and virtual screening of synthesizable molecules. Curr Opin Struct Biol 2023; 82:102658. [PMID: 37473637 DOI: 10.1016/j.sbi.2023.102658] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 02/27/2023] [Revised: 06/21/2023] [Accepted: 06/22/2023] [Indexed: 07/22/2023]
Abstract
Computational techniques, including virtual screening, de novo design, and generative models, play an increasing role in expediting DMTA cycles for modern molecular discovery. However, computationally proposed molecules must be synthetically feasible for laboratory testing. In this perspective, we offer a succinct introduction to the subject, and showcase typical workflows to integrate synthesis planning, synthesizability scoring, and molecule generation. Finally, we address limitations and opportunities for future research.
Affiliation(s)
- Megan Stanley
- Microsoft Research AI4Science, UK. https://twitter.com/@megjanestanley
23
Monteiro NRC, Pereira TO, Machado ACD, Oliveira JL, Abbasi M, Arrais JP. FSM-DDTR: End-to-end feedback strategy for multi-objective De Novo drug design using transformers. Comput Biol Med 2023; 164:107285. [PMID: 37557054 DOI: 10.1016/j.compbiomed.2023.107285] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/07/2023] [Revised: 07/05/2023] [Accepted: 07/28/2023] [Indexed: 08/11/2023]
Abstract
The design of compounds that target specific biological functions with relevant selectivity is critical in the context of drug discovery, especially due to the polypharmacological nature of most existing drug molecules. In recent years, in silico-based methods combined with deep learning have shown promising results in the de novo drug design challenge, leading to potential leads for biologically interesting targets. However, several of these methods overlook the importance of certain properties, such as validity rate and target selectivity, or simplify the generative process by neglecting the multi-objective nature of the pharmacological space. In this study, we propose a multi-objective Transformer-based architecture to generate drug candidates with desired molecular properties and increased selectivity toward a specific biological target. The framework consists of a Transformer-Decoder Generator that generates novel and valid compounds in the SMILES format notation, a Transformer-Encoder Predictor that estimates the binding affinity toward the biological target, and a feedback loop combined with a multi-objective optimization strategy to rank the generated molecules and condition the generating distribution around the targeted properties. The results demonstrate that the proposed architecture can generate novel and synthesizable small compounds with desired pharmacological properties toward a biologically relevant target. The unbiased Transformer-based Generator achieved superior performance in the novelty rate (97.38%) and comparable performance in terms of internal diversity, uniqueness, and validity against state-of-the-art baselines. The optimization of the unbiased Transformer-based Generator resulted in the generation of molecules exhibiting high binding affinity toward the Adenosine A2A Receptor (AA2AR) and possessing desirable physicochemical properties, where 99.36% of the generated molecules follow Lipinski's rule of five. 
Furthermore, the implementation of a feedback strategy, in conjunction with a multi-objective algorithm, effectively shifted the distribution of the generated molecules toward optimal values of molecular weight, molecular lipophilicity, topological polar surface area, synthetic accessibility score, and quantitative estimate of drug-likeness, without the necessity of prior training sets comprising molecules endowed with pharmacological properties of interest. Overall, this research study validates the applicability of a Transformer-based architecture in the context of drug design, capable of exploring the vast chemical representation space to generate novel molecules with improved pharmacological properties and target selectivity. The data and source code used in this study are available at: https://github.com/larngroup/FSM-DDTR.
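The Lipinski's rule of five filter mentioned in the abstract can be sketched as a simple threshold check. The code below is an illustration under stated assumptions, not the FSM-DDTR implementation: it takes precomputed descriptors as input (in practice these would come from a cheminformatics toolkit), uses the commonly cited thresholds, and by default tolerates one violation, a convention the abstract neither confirms nor rules out. The `Descriptors` class and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container)."""
    mol_weight: float        # molecular weight in Da
    logp: float              # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Apply the filter; max_violations=1 is an assumed convention."""
    return lipinski_violations(d) <= max_violations


# Illustrative values only, not taken from the study
small = Descriptors(mol_weight=180.2, logp=1.2, h_bond_donors=1, h_bond_acceptors=4)
large = Descriptors(mol_weight=650.0, logp=6.1, h_bond_donors=2, h_bond_acceptors=8)

print(lipinski_violations(small), passes_rule_of_five(small))
print(lipinski_violations(large), passes_rule_of_five(large))
```

Setting `max_violations=0` gives the strict reading of the rule; a reported compliance rate such as the 99.36% above depends on which reading is used.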
Affiliation(s)
- Nelson R C Monteiro
- University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal.
- Tiago O Pereira
- University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal.
- Ana Catarina D Machado
- University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal.
- José L Oliveira
- IEETA, Department of Electronics, Telecommunications and Informatics, University of Aveiro, Aveiro, Portugal.
- Maryam Abbasi
- University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal; Polytechnic Institute of Coimbra, Applied Research Institute, Coimbra, Portugal.
- Joel P Arrais
- University of Coimbra, Centre for Informatics and Systems of the University of Coimbra, Department of Informatics Engineering, Coimbra, Portugal.
24
Jakšić Z, Devi S, Jakšić O, Guha K. A Comprehensive Review of Bio-Inspired Optimization Algorithms Including Applications in Microelectronics and Nanophotonics. Biomimetics (Basel) 2023; 8:278. [PMID: 37504166 PMCID: PMC10807478 DOI: 10.3390/biomimetics8030278] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/05/2023] [Revised: 06/25/2023] [Accepted: 06/26/2023] [Indexed: 07/29/2023] Open
Abstract
The application of artificial intelligence in everyday life is becoming all-pervasive and unavoidable. Within that vast field, a special place belongs to biomimetic/bio-inspired algorithms for multiparameter optimization, which find their use in a large number of areas. Novel methods and advances are being published at an accelerated pace. Because of that, in spite of the fact that there are a lot of surveys and reviews in the field, they quickly become dated. Thus, it is of importance to keep pace with the current developments. In this review, we first consider a possible classification of bio-inspired multiparameter optimization methods because papers dedicated to that area are relatively scarce and often contradictory. We proceed by describing in some detail some more prominent approaches, as well as those most recently published. Finally, we consider the use of biomimetic algorithms in two related wide fields, namely microelectronics (including circuit design optimization) and nanophotonics (including inverse design of structures such as photonic crystals, nanoplasmonic configurations and metamaterials). We attempted to keep this broad survey self-contained so it can be of use not only to scholars in the related fields, but also to all those interested in the latest developments in this attractive area.
Affiliation(s)
- Zoran Jakšić
- Center of Microelectronic Technologies, Institute of Chemistry, Technology and Metallurgy, National Institute of the Republic of Serbia University of Belgrade, 11000 Belgrade, Serbia
- Swagata Devi
- Department of Electronics and Communication Engineering, B V Raju Institute of Technology Narasapur, Narasapur 502313, India
- Olga Jakšić
- Center of Microelectronic Technologies, Institute of Chemistry, Technology and Metallurgy, National Institute of the Republic of Serbia University of Belgrade, 11000 Belgrade, Serbia
- Koushik Guha
- Department of Electronics and Communication Engineering, National Institute of Technology Silchar, Silchar 788010, India
25
Šícho M, Luukkonen S, van den Maagdenberg HW, Schoenmaker L, Béquignon OJM, van Westen GJP. DrugEx: Deep Learning Models and Tools for Exploration of Drug-Like Chemical Space. J Chem Inf Model 2023; 63:3629-3636. [PMID: 37272707 PMCID: PMC10306259 DOI: 10.1021/acs.jcim.3c00434] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 03/23/2023] [Indexed: 06/06/2023]
Abstract
The discovery of novel molecules with desirable properties is a classic challenge in medicinal chemistry. With the recent advancements of machine learning, there has been a surge of de novo drug design tools. However, few resources exist that are user-friendly as well as easily customizable. In this application note, we present the new versatile open-source software package DrugEx for multiobjective reinforcement learning. This package contains the consolidated and redesigned scripts from the prior DrugEx papers including multiple generator architectures, a variety of scoring tools, and multiobjective optimization methods. It has a flexible application programming interface and can readily be used via the command line interface or the graphical user interface GenUI. The DrugEx package is publicly available at https://github.com/CDDLeiden/DrugEx.
Affiliation(s)
- Martin Šícho
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
- CZ-OPENSCREEN: National Infrastructure for Chemical Biology, Department of Informatics and Chemistry, Faculty of Chemical Technology, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague, Czech Republic
- Sohvi Luukkonen
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
- Linde Schoenmaker
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
- Olivier J. M. Béquignon
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
- Gerard J. P. van Westen
- Leiden Academic Centre for Drug Research, Leiden University, 55 Einsteinweg, 2333 CC, Leiden, The Netherlands
|
26
|
Schoenmaker L, Béquignon OJM, Jespers W, van Westen GJP. UnCorrupt SMILES: a novel approach to de novo design. J Cheminform 2023; 15:22. [PMID: 36788579 PMCID: PMC9926805 DOI: 10.1186/s13321-023-00696-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/13/2022] [Accepted: 02/06/2023] [Indexed: 02/16/2023] [Open Access]
Abstract
Generative deep learning models have emerged as a powerful approach for de novo drug design as they aid researchers in finding new molecules with desired properties. Despite continuous improvements in the field, a subset of the outputs that sequence-based de novo generators produce cannot be progressed due to errors. Here, we propose to fix these invalid outputs post hoc. In similar tasks, transformer models from the field of natural language processing have been shown to be very effective. Therefore, here this type of model was trained to translate invalid Simplified Molecular-Input Line-Entry System (SMILES) into valid representations. The performance of this SMILES corrector was evaluated on four representative methods of de novo generation: a recurrent neural network (RNN), a target-directed RNN, a generative adversarial network (GAN), and a variational autoencoder (VAE). This study has found that the percentage of invalid outputs from these specific generative models ranges between 4 and 89%, with different models having different error-type distributions. Post hoc correction of SMILES was shown to increase model validity. The SMILES corrector trained with one error per input alters 60-90% of invalid generator outputs and fixes 35-80% of them. However, a higher error detection and performance was obtained for transformer models trained with multiple errors per input. In this case, the best model was able to correct 60-95% of invalid generator outputs. Further analysis showed that these fixed molecules are comparable to the correct molecules from the de novo generators based on novelty and similarity. Additionally, the SMILES corrector can be used to expand the amount of interesting new molecules within the targeted chemical space. Introducing different errors into existing molecules yields novel analogs with a uniqueness of 39% and a novelty of approximately 20%. 
The results of this research demonstrate that SMILES correction is a viable post hoc extension and can enhance the search for better drug candidates.
Affiliation(s)
- Linde Schoenmaker
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Olivier J. M. Béquignon
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Willem Jespers
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
- Gerard J. P. van Westen
- Computational Drug Discovery, Drug Discovery and Safety, Leiden Academic Centre for Drug Research, Einsteinweg 55, Leiden, The Netherlands
|