1
Retchin M, Wang Y, Takaba K, Chodera JD. DrugGym: A testbed for the economics of autonomous drug discovery. bioRxiv: the preprint server for biology 2024:2024.05.28.596296. [PMID: 38854082 PMCID: PMC11160604 DOI: 10.1101/2024.05.28.596296] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/11/2024]
Abstract
Drug discovery is stochastic. The effectiveness of candidate compounds in satisfying design objectives is unknown ahead of time, and the tools used for prioritization (predictive models and assays) are inaccurate and noisy. In a typical discovery campaign, thousands of compounds may be synthesized and tested before design objectives are achieved, with many others ideated but deprioritized. These challenges are well-documented, but assessing potential remedies has been difficult. We introduce DrugGym, a framework for modeling the stochastic process of drug discovery. Emulating biochemical assays with realistic surrogate models, we simulate the progression from weak hits to sub-micromolar leads with viable ADME. We use this testbed to examine how different ideation, scoring, and decision-making strategies impact statistical measures of utility, such as the probability of program success within predefined budgets and the expected costs to achieve target candidate profile (TCP) goals. We also assess the influence of affinity model inaccuracy, chemical creativity, batch size, and multi-step reasoning. Our findings suggest that reducing affinity model inaccuracy from 2 to 0.5 pIC50 units improves budget-constrained success rates tenfold. DrugGym represents a realistic testbed for machine learning methods applied to the hit-to-lead phase. Source code is available at www.drug-gym.org.
Affiliation(s)
- Michael Retchin
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, NY 10065
- Yuanqing Wang
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
  - Simons Center for Computational Chemistry and Center for Data Science, New York University, New York, NY 10004
- Kenichiro Takaba
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
  - Pharmaceutical Research Center, Advanced Drug Discovery, Asahi Kasei Pharma Corporation, Shizuoka 410-2321, Japan
- John D. Chodera
  - Tri-Institutional PhD Program in Computational Biology and Medicine, Weill Cornell Medical College, Cornell University, New York, NY 10065
  - Computational and Systems Biology Program, Sloan Kettering Institute, Memorial Sloan Kettering Cancer Center, New York, NY 10065
2
Gosavi AA, Nandgude TD, Mishra RK, Puri DB. Exploring the Potential of Artificial Intelligence as a Facilitating Tool for Formulation Development in Fluidized Bed Processor: a Comprehensive Review. AAPS PharmSciTech 2024; 25:111. [PMID: 38740666 DOI: 10.1208/s12249-024-02816-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Accepted: 04/23/2024] [Indexed: 05/16/2024]
Abstract
This in-depth study looks into how artificial intelligence (AI) could be used to make formulation development easier in fluidized bed processes (FBP). FBP is complex and involves numerous variables, making optimization challenging. Various AI techniques have addressed this challenge, including machine learning, neural networks, genetic algorithms, and fuzzy logic. By integrating AI with experimental design, process modeling, and optimization strategies, intelligent systems for FBP can be developed. The advantages of AI in this context include improved process understanding, reduced time and cost, enhanced product quality, and robust formulation optimization. However, data availability, model interpretability, and regulatory compliance challenges must be addressed. Case studies demonstrate successful applications of AI in decision-making, process outcome prediction, and scale-up. AI can improve efficiency, quality, and cost-effectiveness in significant ways. Still, it is important to think carefully about data quality, model interpretability, and regulatory compliance. Future research should focus on fully harnessing the potential of AI to advance formulation development in FBP.
Affiliation(s)
- Aachal A Gosavi
  - Department of Pharmaceutics, Dr. D. Y. Patil Institute of Pharmaceutical Sciences and Research, Pimpri, Pune, India
- Tanaji D Nandgude
  - Department of Pharmaceutics, JSPM University's School of Pharmaceutical Sciences, Wagholi, Pune, India
- Rakesh K Mishra
  - Department of Pharmaceutics, Dr. D. Y. Patil Institute of Pharmaceutical Sciences and Research, Pimpri, Pune, India
- Dhiraj B Puri
  - Department of Mechanical Engineering, Birla Institute of Technology and Science-Pilani, K K Birla Goa Campus, Zuarinagar, Sancoale, Goa, India
3
Umemori Y, Handa K, Yoshimura S, Kageyama M, Iijima T. Development of a Novel In Silico Classification Model to Assess Reactive Metabolite Formation in the Cysteine Trapping Assay and Investigation of Important Substructures. Biomolecules 2024; 14:535. [PMID: 38785942 PMCID: PMC11117661 DOI: 10.3390/biom14050535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Revised: 04/25/2024] [Accepted: 04/26/2024] [Indexed: 05/25/2024]
Abstract
Predicting whether a compound can cause drug-induced liver injury (DILI) is difficult due to the complexity of drug mechanisms. The cysteine trapping assay is a method for detecting reactive metabolites that bind to microsomes covalently. However, it is cumbersome to use 35S isotope-labeled cysteine for this assay. Therefore, we constructed an in silico classification model for predicting a positive/negative outcome in the cysteine trapping assay. We collected 475 compounds (436 in-house compounds and 39 publicly available drugs) based on experimental data performed in this study, and the composition of the results showed 248 positives and 227 negatives. Using a Message Passing Neural Network (MPNN) and Random Forest (RF) with extended connectivity fingerprint (ECFP) 4, we built machine learning models to predict the covalent binding risk of compounds. In the time-split dataset, AUC-ROC of MPNN and RF were 0.625 and 0.559 in the hold-out test, respectively. This result suggests that the MPNN model has a higher predictivity than RF in the time-split dataset. Hence, we conclude that the in silico MPNN classification model for the cysteine trapping assay has a better predictive power. Furthermore, most of the substructures that contributed positively to the cysteine trapping assay were consistent with previous results.
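For readers less familiar with the AUC-ROC figures quoted in this abstract, the following is a minimal sketch of how the metric can be computed from labels and model scores, using the pairwise-ranking definition (the probability that a randomly chosen positive is scored above a randomly chosen negative, with ties counting as one half). The function name `roc_auc` and the example labels and scores are illustrative assumptions and are not taken from the paper.

```python
def roc_auc(labels, scores):
    """AUC-ROC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 assay outcomes (1 = positive)
    scores: iterable of model scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Each (positive, negative) pair scores 1 if ranked correctly, 0.5 on a tie.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical hold-out predictions, for illustration only.
example_labels = [1, 1, 0, 0, 1, 0]
example_scores = [0.9, 0.4, 0.5, 0.3, 0.7, 0.2]
example_auc = roc_auc(example_labels, example_scores)
print(f"AUC-ROC = {example_auc:.3f}")
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.625 and 0.559 should be read.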
Affiliation(s)
- Koichi Handa
  - DMPK Research Department, Teijin Institute for Bio-Medical Research, TEIJIN PHARMA LIMITED, 4-3-2 Asahigaoka, Hino-shi, Tokyo 191-8512, Japan; (Y.U.); (S.Y.); (M.K.); (T.I.)
4
Leniak A, Pietruś W, Kurczab R. From NMR to AI: Designing a Novel Chemical Representation to Enhance Machine Learning Predictions of Physicochemical Properties. J Chem Inf Model 2024; 64:3302-3321. [PMID: 38529877 DOI: 10.1021/acs.jcim.3c02039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/27/2024]
Abstract
A novel approach to the utilization of nuclear magnetic resonance (NMR) spectroscopy data in the prediction of logD through machine learning algorithms is shown. In the analysis, a data set of 754 chemical compounds, organized into 30 clusters, was evaluated using advanced machine learning models, such as Support Vector Regression (SVR), Gradient Boosting, and AdaBoost, and comprehensive validation and testing methods were employed, including 10-fold cross-validation, bootstrapping, and leave-one-out. The study revealed the superior performance of the Bucket Integration method for dimensionality reduction, consistently yielding the lowest root mean square error (RMSE) across all data sets and normalization schemes. The SVR prediction models demonstrated remarkable computational efficiency and low cost, with the best RMSE value reaching 0.66. Our best model outperformed existing tools like JChem Suite's logD Predictor (0.91) and CplogD (1.27), and a comparison with traditional molecular representations yielded a comparable RMSE (0.50), emphasizing the robustness of our NMR data integration. The widespread availability of NMR data in pharmaceutical and industrial research presents an untapped resource for predictive modeling, highlighting the need for accessible methodologies like ours that complement the analytical toolbox beyond conventional 2D approaches. Our approach, designed to leverage the rich spatial data from NMR spectroscopy, provides additional insights and enriches drug discovery and computational chemistry with a freely accessible tool.
Affiliation(s)
- Arkadiusz Leniak
  - Department of Medicinal Chemistry, Celon Pharma S.A., ul. Marymoncka 15, 05-152 Kazuń Nowy, Poland
- Wojciech Pietruś
  - Department of Medicinal Chemistry, Celon Pharma S.A., ul. Marymoncka 15, 05-152 Kazuń Nowy, Poland
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology, Polish Academy of Sciences, Smetna 12, 31-343 Kraków, Poland
- Rafał Kurczab
  - Department of Medicinal Chemistry, Maj Institute of Pharmacology, Polish Academy of Sciences, Smetna 12, 31-343 Kraków, Poland
5
Heyndrickx W, Mervin L, Morawietz T, Sturm N, Friedrich L, Zalewski A, Pentina A, Humbeck L, Oldenhof M, Niwayama R, Schmidtke P, Fechner N, Simm J, Arany A, Drizard N, Jabal R, Afanasyeva A, Loeb R, Verma S, Harnqvist S, Holmes M, Pejo B, Telenczuk M, Holway N, Dieckmann A, Rieke N, Zumsande F, Clevert DA, Krug M, Luscombe C, Green D, Ertl P, Antal P, Marcus D, Do Huu N, Fuji H, Pickett S, Acs G, Boniface E, Beck B, Sun Y, Gohier A, Rippmann F, Engkvist O, Göller AH, Moreau Y, Galtier MN, Schuffenhauer A, Ceulemans H. MELLODDY: Cross-pharma Federated Learning at Unprecedented Scale Unlocks Benefits in QSAR without Compromising Proprietary Information. J Chem Inf Model 2024; 64:2331-2344. [PMID: 37642660 PMCID: PMC11005050 DOI: 10.1021/acs.jcim.3c00799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Indexed: 08/31/2023]
Abstract
Federated multipartner machine learning has been touted as an appealing and efficient method to increase the effective training data volume and thereby the predictivity of models, particularly when the generation of training data is resource-intensive. In the landmark MELLODDY project, indeed, each of ten pharmaceutical companies realized aggregated improvements on its own classification or regression models through federated learning. To this end, they leveraged a novel implementation extending multitask learning across partners, on a platform audited for privacy and security. The experiments involved an unprecedented cross-pharma data set of 2.6+ billion confidential experimental activity data points, documenting 21+ million physical small molecules and 40+ thousand assays in on-target and secondary pharmacodynamics and pharmacokinetics. Appropriate complementary metrics were developed to evaluate the predictive performance in the federated setting. In addition to predictive performance increases in labeled space, the results point toward an extended applicability domain in federated learning. Increases in collective training data volume, including by means of auxiliary data resulting from single concentration high-throughput and imaging assays, continued to boost predictive performance, albeit with a saturating return. Markedly higher improvements were observed for the pharmacokinetics and safety panel assay-based task subsets.
Affiliation(s)
- Lewis Mervin
  - AstraZeneca R&D, Biomedical Campus, 1 Francis Crick Ave, Cambridge CB2 0SL, U.K.
- Tobias Morawietz
  - Bayer Pharma AG, Global Drug Discovery, Chemical Research, Computational Chemistry, Aprather Weg 18 a, Wuppertal 42096, Germany
- Noé Sturm
  - Novartis Institutes for BioMedical Research, Novartis Campus, Basel 4002, Switzerland
- Lukas Friedrich
  - Merck KGaA, Global Research & Development, Frankfurter Strasse 250, Darmstadt 64293, Germany
- Adam Zalewski
  - Amgen Research (Munich) GmbH, Staffelseestraße 2, Munich 81477, Germany
- Anastasia Pentina
  - Bayer AG, Machine Learning Research, Research & Development, Pharmaceuticals, Berlin 10117, Germany
- Lina Humbeck
  - BI Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, Biberach an der Riss 88397, Germany
- Martijn Oldenhof
  - KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, Heverlee 3001, Belgium
- Ritsuya Niwayama
  - Institut de recherches Servier, 125 chemin de ronde Croissy-sur-Seine, Île-de-France 78290, France
- Nikolas Fechner
  - Novartis Institutes for BioMedical Research, Novartis Campus, Basel 4002, Switzerland
- Jaak Simm
  - KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, Heverlee 3001, Belgium
- Adam Arany
  - KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, Heverlee 3001, Belgium
- Rama Jabal
  - Iktos, 65 rue de Prony, Paris 75017, France
- Arina Afanasyeva
  - Modality Informatics Group, Digital Research Solutions, Advanced Informatics & Analytics, Astellas Pharma Inc., 21 Miyukigaoka, Tsukuba-shi, Ibaraki 305-8585, Japan
- Regis Loeb
  - KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, Heverlee 3001, Belgium
- Shlok Verma
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Simon Harnqvist
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Matthew Holmes
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Balazs Pejo
  - Budapest University of Technology and Economics, Department of Networked Systems and Services, Műegyetem rkp. 3, Budapest 1111, Hungary
- Nicholas Holway
  - Novartis Institutes for BioMedical Research, Novartis Campus, Basel 4002, Switzerland
- Arne Dieckmann
  - Bayer AG, API Production, Product Supply, Pharmaceuticals, Ernst-Schering-Straße 14, Bergkamen 59192, Germany
- Nicola Rieke
  - NVIDIA GmbH, Floessergasse 2, Munich 81369, Germany
- Djork-Arné Clevert
  - Bayer AG, Machine Learning Research, Research & Development, Pharmaceuticals, Berlin 10117, Germany
- Michael Krug
  - Merck KGaA, Global Research & Development, Frankfurter Strasse 250, Darmstadt 64293, Germany
- Christopher Luscombe
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Darren Green
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Peter Ertl
  - Novartis Institutes for BioMedical Research, Novartis Campus, Basel 4002, Switzerland
- Peter Antal
  - Budapest University of Technology and Economics, Department of Measurement and Information Systems, Műegyetem rkp. 3, Budapest 1111, Hungary
- David Marcus
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Hideyoshi Fuji
  - Modality Informatics Group, Digital Research Solutions, Advanced Informatics & Analytics, Astellas Pharma Inc., 21 Miyukigaoka, Tsukuba-shi, Ibaraki 305-8585, Japan
- Stephen Pickett
  - GlaxoSmithKline, Computational Sciences, Gunnels Wood Road Stevenage, Herts SG1 2NY, U.K.
- Gergely Acs
  - Budapest University of Technology and Economics, Department of Networked Systems and Services, Műegyetem rkp. 3, Budapest 1111, Hungary
- Eric Boniface
  - Substra Foundation - Labelia Labs, 4 rue Voltaire, Nantes 44000, France
- Bernd Beck
  - BI Medicinal Chemistry Department, Boehringer Ingelheim Pharma GmbH & Co. KG, Birkendorfer Str. 65, Biberach an der Riss 88397, Germany
- Yax Sun
  - Amgen Research, 1 Amgen Center Drive, Thousand Oaks, California 92130, United States
- Arnaud Gohier
  - Institut de recherches Servier, 125 chemin de ronde Croissy-sur-Seine, Île-de-France 78290, France
- Friedrich Rippmann
  - Merck KGaA, Global Research & Development, Frankfurter Strasse 250, Darmstadt 64293, Germany
- Ola Engkvist
  - AstraZeneca, Molecular AI, Discovery Sciences, R&D, Pepparedsleden 1, Mölndal 431 50, Sweden
- Andreas H. Göller
  - Bayer Pharma AG, Global Drug Discovery, Chemical Research, Computational Chemistry, Aprather Weg 18 a, Wuppertal 42096, Germany
- Yves Moreau
  - KU Leuven, ESAT-STADIUS, Kasteelpark Arenberg 10, Heverlee 3001, Belgium
- Ansgar Schuffenhauer
  - Novartis Institutes for BioMedical Research, Novartis Campus, Basel 4002, Switzerland
- Hugo Ceulemans
  - Janssen Pharmaceutica NV, Turnhoutseweg 30, Beerse 2340, Belgium
6
Fluetsch A, Di Lascio E, Gerebtzoff G, Rodríguez-Pérez R. Adapting Deep Learning QSPR Models to Specific Drug Discovery Projects. Mol Pharm 2024; 21:1817-1826. [PMID: 38373038 DOI: 10.1021/acs.molpharmaceut.3c01124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
Medicinal chemistry and drug design efforts can be assisted by machine learning (ML) models that relate the molecular structure to compound properties. Such quantitative structure-property relationship models are generally trained on large data sets that include diverse chemical series (global models). In the pharmaceutical industry, these ML global models are available across discovery projects as an "out-of-the-box" solution to assist in drug design, synthesis prioritization, and experiment selection. However, drug discovery projects typically focus on confined parts of the chemical space (e.g., chemical series), where global models might not be applicable. Local ML models are sometimes generated to focus on specific projects or series. Herein, ML-based global models, local models, and hybrid global-local strategies were benchmarked. Analyses were done for more than 300 drug discovery projects at Novartis and ten absorption, distribution, metabolism, and excretion (ADME) assays. In this work, hybrid global-local strategies based on transfer learning approaches were proposed to leverage both historical ADME data (global) and project-specific data (local) to adapt model predictions. Fine-tuning a pretrained global ML model (used for weights' initialization, WI) was the top-performing method. Average improvements of mean absolute errors across all assays were 16% and 27% compared with global and local models, respectively. Interestingly, when the effect of training set size was analyzed, WI fine-tuning was found to be successful even in low-data scenarios (e.g., ∼10 molecules per project). Taken together, this work highlights the potential of domain adaptation in the field of molecular property predictions to refine existing pretrained models on a new compound data distribution.
Affiliation(s)
- Andrin Fluetsch
  - Novartis Biomedical Research, Novartis Campus, Basel 4002, Switzerland
- Elena Di Lascio
  - Novartis Biomedical Research, Novartis Campus, Basel 4002, Switzerland
7
Nandi S, Bhaduri S, Das D, Ghosh P, Mandal M, Mitra P. Deciphering the Lexicon of Protein Targets: A Review on Multifaceted Drug Discovery in the Era of Artificial Intelligence. Mol Pharm 2024; 21:1563-1590. [PMID: 38466810 DOI: 10.1021/acs.molpharmaceut.3c01161] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/13/2024]
Abstract
Understanding protein sequence and structure is essential for understanding protein-protein interactions (PPIs), which are essential for many biological processes and diseases. Targeting protein binding hot spots, which regulate signaling and growth, with rational drug design is promising. Rational drug design uses structural data and computational tools to study protein binding sites and protein interfaces to design inhibitors that can change these interactions, thereby potentially leading to therapeutic approaches. Artificial intelligence (AI), such as machine learning (ML) and deep learning (DL), has advanced drug discovery and design by providing computational resources and methods. Quantum chemistry is essential for drug reactivity, toxicology, drug screening, and quantitative structure-activity relationship (QSAR) properties. This review discusses the methodologies and challenges of identifying and characterizing hot spots and binding sites. It also explores the strategies and applications of artificial-intelligence-based rational drug design technologies that target proteins and protein-protein interaction (PPI) binding hot spots. It provides valuable insights for drug design with therapeutic implications. We have also demonstrated the pathological conditions of heat shock protein 27 (HSP27) and matrix metalloproteinases (MMP2 and MMP9) and designed inhibitors of these proteins using the drug discovery paradigm in a case study on the discovery of drug molecules for cancer treatment. Additionally, the implications of benzothiazole derivatives for anticancer drug design and discovery are deliberated.
Affiliation(s)
- Suvendu Nandi
  - School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Soumyadeep Bhaduri
  - Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Debraj Das
  - Centre for Computational and Data Sciences, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Priya Ghosh
  - School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Mahitosh Mandal
  - School of Medical Science and Technology, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
- Pralay Mitra
  - Department of Computer Science and Engineering, Indian Institute of Technology Kharagpur, Kharagpur, West Bengal 721302, India
8
Biehn SE, Goncalves LM, Lehmann J, Marty JD, Mueller C, Ramirez SA, Tillier F, Sage CR. BioPrint meets the AI age: development of artificial intelligence-based ADMET models for the drug-discovery platform SAFIRE. Future Med Chem 2024; 16:587-599. [PMID: 38372202 DOI: 10.4155/fmc-2024-0007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/08/2024] [Accepted: 02/08/2024] [Indexed: 02/20/2024]
Abstract
Background: To prioritize compounds with a higher likelihood of success, artificial intelligence models can be used to predict absorption, distribution, metabolism, excretion and toxicity (ADMET) properties of molecules quickly and efficiently. Methods: Models were trained with BioPrint database proprietary data along with public datasets to predict various ADMET end points for the SAFIRE platform. Results: SAFIRE models performed at or above 75% accuracy and a Matthews correlation coefficient of 0.4 on validation sets. Training with both proprietary and public data improved model performance and expanded the chemical space on which the models were trained. The platform features scoring functionality to guide user decision-making. Conclusion: High-quality datasets along with chemical space considerations yielded ADMET models performing favorably with utility in the drug discovery process.
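The two validation metrics quoted in the Results can be illustrated with a short sketch that computes accuracy and the Matthews correlation coefficient (MCC) from a binary confusion matrix. The function names and the confusion-matrix counts below are made-up illustrative values, not SAFIRE results.

```python
import math


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + tn + fp + fn)


def matthews_corrcoef(tp, tn, fp, fn):
    """MCC for a binary classifier; ranges from -1 to +1, 0 = chance level."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Conventional fallback when a row or column of the matrix is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical validation-set counts, for illustration only.
tp, tn, fp, fn = 40, 35, 15, 10
example_accuracy = accuracy(tp, tn, fp, fn)
example_mcc = matthews_corrcoef(tp, tn, fp, fn)
print(f"accuracy = {example_accuracy:.2f}, MCC = {example_mcc:.3f}")
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, so it stays informative when the active and inactive classes are imbalanced.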
Affiliation(s)
- Sarah E Biehn
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Juerg Lehmann
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Jessica D Marty
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Christoph Mueller
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Samuel A Ramirez
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Fabien Tillier
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
- Carleton R Sage
  - Eurofins DiscoveryAI, Eurofins Panlabs, Inc., Saint Charles, MO 63304, USA
9
Margiotta-Casaluci L, Owen SF, Winter MJ. Cross-Species Extrapolation of Biological Data to Guide the Environmental Safety Assessment of Pharmaceuticals: The State of the Art and Future Priorities. Environmental Toxicology and Chemistry 2024; 43:513-525. [PMID: 37067359 DOI: 10.1002/etc.5634] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/21/2022] [Revised: 03/23/2023] [Accepted: 04/13/2023] [Indexed: 05/27/2023]
Abstract
The extrapolation of biological data across species is a key aspect of biomedical research and drug development. In this context, comparative biology considerations are applied with the goal of understanding human disease and guiding the development of effective and safe medicines. However, the widespread occurrence of pharmaceuticals in the environment and the need to assess the risk posed to wildlife have prompted a renewed interest in the extrapolation of pharmacological and toxicological data across the entire tree of life. To address this challenge, a biological "read-across" approach, based on the use of mammalian data to inform toxicity predictions in wildlife species, has been proposed as an effective way to streamline the environmental safety assessment of pharmaceuticals. Yet, how effective has this approach been, and are we any closer to being able to accurately predict environmental risk based on known human risk? We discuss the main theoretical and experimental advancements achieved in the last 10 years of research in this field. We propose that a better understanding of the functional conservation of drug targets across species and of the quantitative relationship between target modulation and adverse effects should be considered as future research priorities. This pharmacodynamic focus should be complemented with the application of higher-throughput experimental and computational approaches to accelerate the prediction of internal exposure dynamics. The translation of comparative (eco)toxicology research into real-world applications, however, relies on the (limited) availability of experts with the skill set needed to navigate the complexity of the problem; hence, we also call for synergistic multistakeholder efforts to support and strengthen comparative toxicology research and education at a global level. Environ Toxicol Chem 2024;43:513-525. © 2023 The Authors. Environmental Toxicology and Chemistry published by Wiley Periodicals LLC on behalf of SETAC.
Affiliation(s)
- Luigi Margiotta-Casaluci
  - Institute of Pharmaceutical Science, Faculty of Life Sciences & Medicine, King's College London, London, United Kingdom
- Stewart F Owen
  - Global Sustainability, AstraZeneca, Macclesfield, Cheshire, United Kingdom
- Matthew J Winter
  - Biosciences, Faculty of Health and Life Sciences, University of Exeter, Exeter, Devon, United Kingdom
10
Führer F, Gruber A, Diedam H, Göller AH, Menz S, Schneckener S. A deep neural network: mechanistic hybrid model to predict pharmacokinetics in rat. J Comput Aided Mol Des 2024; 38:7. [PMID: 38294570 DOI: 10.1007/s10822-023-00547-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/13/2023] [Accepted: 12/21/2023] [Indexed: 02/01/2024]
Abstract
An important aspect in the development of small molecules as drugs or agrochemicals is their systemic availability after intravenous and oral administration. The prediction of the systemic availability from the chemical structure of a potential candidate is highly desirable, as it allows drug or agrochemical development to focus on compounds with a favorable kinetic profile. However, such predictions are challenging, as the availability is the result of the complex interplay between molecular properties, biology and physiology, and training data are scarce. In this work we improve the hybrid model developed earlier (Schneckener in J Chem Inf Model 59:4893-4905, 2019). We reduce the median fold change error for the total oral exposure from 2.85 to 2.35 and for intravenous administration from 1.95 to 1.62. This is achieved by training on a larger data set, improving the neural network architecture as well as the parametrization of the mechanistic model. Further, we extend our approach to predict additional endpoints and to handle different covariates, like sex and dosage form. In contrast to a pure machine learning model, our model is able to predict new end points on which it has not been trained. We demonstrate this feature by predicting the exposure over the first 24 h, while the model has only been trained on the total exposure.
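As an aid to reading the error figures above, here is a minimal sketch of a median fold change error under one common definition: for each compound the fold error is the larger of predicted/observed and observed/predicted, and the summary is the median over compounds. The abstract does not state the exact formula used in the paper, so this definition, the function names, and the example exposure values are assumptions for illustration.

```python
import statistics


def fold_errors(predicted, observed):
    """Per-compound fold error, always >= 1 (1 means a perfect prediction).

    Assumes all values are strictly positive, as exposures are.
    """
    return [max(p / o, o / p) for p, o in zip(predicted, observed)]


def median_fold_error(predicted, observed):
    """Median of the per-compound fold errors."""
    return statistics.median(fold_errors(predicted, observed))


# Hypothetical exposure values in arbitrary units, for illustration only.
observed_exposure = [100.0, 200.0, 50.0]
predicted_exposure = [200.0, 100.0, 150.0]
example_mfe = median_fold_error(predicted_exposure, observed_exposure)
print(f"median fold error = {example_mfe:.2f}")
```

Under this definition, a value of 2.35 would mean that half of the predictions fall within a factor of 2.35 of the measured exposure, whether the model over- or under-predicts.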
Affiliation(s)
- Florian Führer
  - Engineering & Technology, Applied Mathematics, Bayer AG, 51368, Leverkusen, Germany
- Andrea Gruber
  - Pharmaceuticals, R&D, Preclinical Modeling & Simulation, Bayer AG, 13353, Berlin, Germany
- Holger Diedam
  - Crop Science, Product Supply, SC Simulation & Analysis, Bayer AG, 40789, Monheim, Germany
- Andreas H Göller
  - Pharmaceuticals, R&D, Molecular Design, Bayer AG, 42096, Wuppertal, Germany
- Stephan Menz
  - Pharmaceuticals, R&D, Preclinical Modeling & Simulation, Bayer AG, 13353, Berlin, Germany
11
Bothe U, Günther J, Nubbemeyer R, Siebeneicher H, Ring S, Bömer U, Peters M, Rausch A, Denner K, Himmel H, Sutter A, Terebesi I, Lange M, Wengner AM, Guimond N, Thaler T, Platzek J, Eberspächer U, Schäfer M, Steuber H, Zollner TM, Steinmeyer A, Schmidt N. Discovery of IRAK4 Inhibitors BAY1834845 (Zabedosertib) and BAY1830839. J Med Chem 2024; 67:1225-1242. [PMID: 38228402 PMCID: PMC10823478 DOI: 10.1021/acs.jmedchem.3c01714] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 12/01/2023] [Accepted: 12/04/2023] [Indexed: 01/18/2024]
Abstract
Interleukin-1 receptor-associated kinase 4 (IRAK4) plays a critical role in innate inflammatory processes. Here, we describe the discovery of two clinical candidate IRAK4 inhibitors, BAY1834845 (zabedosertib) and BAY1830839, starting from a high-throughput screening hit derived from Bayer's compound library. By exploiting binding site features distinct to IRAK4 using an in-house docking model, liabilities of the original hit could surprisingly be overcome to confer both candidates with a unique combination of good potency and selectivity. Favorable DMPK profiles and activity in animal inflammation models led to the selection of these two compounds for clinical development in patients.
Affiliation(s)
- Ulrich Bothe
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Judith Günther
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Sven Ring
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Michaele Peters
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Karsten Denner
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Herbert Himmel
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Andreas Sutter
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Ildiko Terebesi
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Antje M. Wengner
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Nicolas Guimond
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Tobias Thaler
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Johannes Platzek
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Uwe Eberspächer
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Thomas M. Zollner
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Andreas Steinmeyer
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
- Nicole Schmidt
- Bayer AG, Research & Development, Pharmaceuticals, 13353 Berlin, Germany
|
12
|
Hasselgren C, Oprea TI. Artificial Intelligence for Drug Discovery: Are We There Yet? Annu Rev Pharmacol Toxicol 2024; 64:527-550. [PMID: 37738505 DOI: 10.1146/annurev-pharmtox-040323-040828] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/24/2023]
Abstract
Drug discovery is adapting to novel technologies such as data science, informatics, and artificial intelligence (AI) to accelerate effective treatment development while reducing costs and animal experiments. AI is transforming drug discovery, as indicated by increasing interest from investors, industrial and academic scientists, and legislators. Successful drug discovery requires optimizing properties related to pharmacodynamics, pharmacokinetics, and clinical outcomes. This review discusses the use of AI in the three pillars of drug discovery: diseases, targets, and therapeutic modalities, with a focus on small-molecule drugs. AI technologies, such as generative chemistry, machine learning, and multiproperty optimization, have enabled several compounds to enter clinical trials. The scientific community must carefully vet known information to address the reproducibility crisis. The full potential of AI in drug discovery can only be realized with sufficient ground truth and appropriate human intervention at later pipeline stages.
Affiliation(s)
- Catrin Hasselgren
- Safety Assessment, Genentech, Inc., South San Francisco, California, USA
- Tudor I Oprea
- Expert Systems Inc., San Diego, California, USA;
- Department of Internal Medicine, University of New Mexico Health Sciences Center, Albuquerque, New Mexico, USA
|
13
|
Siramshetty VB, Xu X, Shah P. Artificial Intelligence in ADME Property Prediction. Methods Mol Biol 2024; 2714:307-327. [PMID: 37676606 DOI: 10.1007/978-1-0716-3441-7_17] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/08/2023]
Abstract
Absorption, distribution, metabolism, excretion (ADME) are key properties of a small molecule that govern pharmacokinetic profiles and impact its efficacy and safety. Computational methods such as machine learning and artificial intelligence have gained significant interest in both academic and industrial settings to predict pharmacokinetic properties of small molecules. These methods are applied in drug discovery to optimize chemical libraries, prioritize hits from biological screens, and optimize ADME properties of lead molecules. In recent years, the drug discovery community has witnessed the use of a range of neural network architectures such as deep neural networks, recurrent neural networks, graph neural networks, and transformer neural networks, which marked a paradigm shift in computer-aided drug design and development. This chapter discusses recent developments with an emphasis on their application to predict ADME properties.
Affiliation(s)
- Vishal B Siramshetty
- National Center for Advancing Translational Sciences, Rockville, MD, USA
- Department of Safety Assessment, Genentech, Inc., South San Francisco, CA, USA
- Xin Xu
- National Center for Advancing Translational Sciences, Rockville, MD, USA
- Pranav Shah
- National Center for Advancing Translational Sciences, Rockville, MD, USA.
|
14
|
Hong RS, Rojas AV, Bhardwaj RM, Wang L, Mattei A, Abraham NS, Cusack KP, Pierce MO, Mondal S, Mehio N, Bordawekar S, Kym PR, Abel R, Sheikh AY. Free Energy Perturbation Approach for Accurate Crystalline Aqueous Solubility Predictions. J Med Chem 2023; 66:15883-15893. [PMID: 38016916 DOI: 10.1021/acs.jmedchem.3c01339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2023]
Abstract
Early assessment of crystalline thermodynamic solubility continues to be elusive for drug discovery and development despite its critical importance, especially for the ever-increasing fraction of poorly soluble drug candidates. Here we present a detailed evaluation of a physics-based free energy perturbation (FEP+) approach for computing the thermodynamic aqueous solubility. The predictive power of this approach is assessed across diverse chemical spaces, spanning pharmaceutically relevant literature compounds and more complex AbbVie compounds. Our approach achieves predictive (RMSE = 0.86) and differentiating power (R2 = 0.69) and therefore provides notably improved correlations to experimental solubility compared to state-of-the-art machine learning approaches that utilize quantum mechanics-based descriptors. The importance of explicit considerations of crystalline packing in predicting solubility by the FEP+ approach is also highlighted in this study. Finally, we show how computed energetics, including hydration and sublimation free energies, can provide further insights into molecule design to feed the medicinal chemistry DMTA cycle.
Affiliation(s)
- Richard S Hong
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Ana V Rojas
- Schrödinger Inc., 1540 Broadway 24th Floor, New York, New York 10036, United States
- Rajni Miglani Bhardwaj
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Lingle Wang
- Schrödinger Inc., 1540 Broadway 24th Floor, New York, New York 10036, United States
- Alessandra Mattei
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Nathan S Abraham
- Ventus Therapeutics, 100 Beaver St, Waltham, Massachusetts 02453, United States
- Kevin P Cusack
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- M Olivia Pierce
- Bristol Myers Squibb, 100 Binney Street, Cambridge, Massachusetts 02142, United States
- Sayan Mondal
- Schrödinger Inc., 1540 Broadway 24th Floor, New York, New York 10036, United States
- Nada Mehio
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Shailendra Bordawekar
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Philip R Kym
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
- Robert Abel
- Schrödinger Inc., 1540 Broadway 24th Floor, New York, New York 10036, United States
- Ahmad Y Sheikh
- AbbVie Inc., Research & Development, 1 N Waukegan Road, North Chicago, Illinois 60064, United States
|
15
|
Gheta SKO, Bonin A, Gerlach T, Göller AH. Predicting absolute aqueous solubility by applying a machine learning model for an artificially liquid-state as proxy for the solid-state. J Comput Aided Mol Des 2023; 37:765-789. [PMID: 37878216 DOI: 10.1007/s10822-023-00538-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 10/02/2023] [Indexed: 10/26/2023]
Abstract
In this study, we use machine learning algorithms with QM-derived COSMO-RS descriptors, along with Morgan fingerprints, to predict the absolute solubility of drug-like compounds. The QM-derived descriptors account for the molecular properties of the solute, i.e., the solute-solute interactions in an artificial-liquid-state (super-cooled liquid), and the solute-solvent interactions in solution. We employ two main approaches to predict solubility: (i) a hypothetical pathway that involves melting the solute at room temperature T = T¯ ([Formula: see text]) and mixing the artificially liquid solute into the solvent ([Formula: see text]). In this approach [Formula: see text] is predicted using machine learning models, and the [Formula: see text] is obtained from COSMO-RS calculations; (ii) direct solubility prediction using machine learning algorithms. The models were trained on a large number of Bayer in-house compounds for which water solubility data is available at physiological pH of 6.5 and ambient temperature. We also evaluated our models using external datasets from a solubility challenge. Our models present great improvements compared to the absolute solubility prediction with the QSAR model for the artificial liquid state as implemented in the COSMOtherm software, for both in-house and external datasets. We are furthermore able to demonstrate the superiority of QM-derived descriptors compared to cheminformatics descriptors. We finally present low-cost alternative models using fragment-based COSMOquick calculations with only marginal reduction in the quality of predicted solubility.
Affiliation(s)
- Sadra Kashef Ol Gheta
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany
- Anne Bonin
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany
- Thomas Gerlach
- Bayer AG, Crop Science, R&D, Digital Transformation, 40789, Monheim, Germany
- Bayer AG, Engineering & Technology, Thermal Separation Technologies, 51368, Leverkusen, Germany
- Andreas H Göller
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany.
|
16
|
Tran-Nguyen VK, Junaid M, Simeon S, Ballester PJ. A practical guide to machine-learning scoring for structure-based virtual screening. Nat Protoc 2023; 18:3460-3511. [PMID: 37845361 DOI: 10.1038/s41596-023-00885-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 02/08/2022] [Accepted: 07/03/2023] [Indexed: 10/18/2023]
Abstract
Structure-based virtual screening (SBVS) via docking has been used to discover active molecules for a range of therapeutic targets. Chemical and protein data sets that contain integrated bioactivity information have increased both in number and in size. Artificial intelligence and, more concretely, its machine-learning (ML) branch, including deep learning, have effectively exploited these data sets to build scoring functions (SFs) for SBVS against targets with an atomic-resolution 3D model (e.g., generated by X-ray crystallography or predicted by AlphaFold2). Often outperforming their generic and non-ML counterparts, target-specific ML-based SFs represent the state of the art for SBVS. Here, we present a comprehensive and user-friendly protocol to build and rigorously evaluate these new SFs for SBVS. This protocol is organized into four sections: (i) using a public benchmark of a given target to evaluate an existing generic SF; (ii) preparing experimental data for a target from public repositories; (iii) partitioning data into a training set and a test set for subsequent target-specific ML modeling; and (iv) generating and evaluating target-specific ML SFs by using the prepared training-test partitions. All necessary code and input/output data related to three example targets (acetylcholinesterase, HMG-CoA reductase, and peroxisome proliferator-activated receptor-α) are available at https://github.com/vktrannguyen/MLSF-protocol , can be run by using a single computer within 1 week and make use of easily accessible software/programs (e.g., Smina, CNN-Score, RF-Score-VS and DeepCoy) and web resources. Our aim is to provide practical guidance on how to augment training data to enhance SBVS performance, how to identify the most suitable supervised learning algorithm for a data set, and how to build an SF with the highest likelihood of discovering target-active molecules within a given compound library.
Affiliation(s)
- Muhammad Junaid
- Centre de Recherche en Cancérologie de Marseille, Marseille, France
- Saw Simeon
- Centre de Recherche en Cancérologie de Marseille, Marseille, France
|
17
|
Fralish Z, Chen A, Skaluba P, Reker D. DeepDelta: predicting ADMET improvements of molecular derivatives with deep learning. J Cheminform 2023; 15:101. [PMID: 37885017 PMCID: PMC10605784 DOI: 10.1186/s13321-023-00769-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/23/2023] [Accepted: 10/12/2023] [Indexed: 10/28/2023] Open
Abstract
Established molecular machine learning models process individual molecules as inputs to predict their biological, chemical, or physical properties. However, such algorithms require large datasets and have not been optimized to predict property differences between molecules, limiting their ability to learn from smaller datasets and to directly compare the anticipated properties of two molecules. Many drug and material development tasks would benefit from an algorithm that can directly compare two molecules to guide molecular optimization and prioritization, especially for tasks with limited available data. Here, we develop DeepDelta, a pairwise deep learning approach that processes two molecules simultaneously and learns to predict property differences between two molecules from small datasets. On 10 ADMET benchmark tasks, our DeepDelta approach significantly outperforms two established molecular machine learning algorithms, the directed message passing neural network (D-MPNN) ChemProp and Random Forest using radial fingerprints, for 70% of benchmarks in terms of Pearson's r, 60% of benchmarks in terms of mean absolute error (MAE), and all external test sets for both Pearson's r and MAE. We further analyze our performance and find that DeepDelta particularly outperforms established approaches at predicting large differences in molecular properties and can perform scaffold hopping. Furthermore, we derive mathematically fundamental computational tests of our models based on mathematical invariants and show that compliance with these tests correlates with overall model performance - providing an innovative, unsupervised, and easily computable measure of expected model performance and applicability.
Taken together, DeepDelta provides an accurate approach to predict molecular property differences by directly training on molecular pairs and their property differences to further support fidelity and transparency in molecular optimization for drug development and the chemical sciences.
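The pairwise training setup described in this abstract can be illustrated with a minimal sketch. It is not the DeepDelta implementation: the SMILES strings, property values, and the helper name `make_pairs` are hypothetical, and the two checks at the end are only examples of the kind of invariant a property-difference predictor should satisfy (the abstract does not say which invariants the authors use).

```python
from itertools import product


def make_pairs(molecules):
    """Cross every molecule with every molecule (including itself) and
    label each ordered pair with the property difference y_b - y_a."""
    return [
        (smi_a, smi_b, y_b - y_a)
        for (smi_a, y_a), (smi_b, y_b) in product(molecules, repeat=2)
    ]


# Hypothetical toy data: (SMILES, measured property value).
molecules = [("CCO", 1.5), ("CCN", 2.0), ("CCC", 0.5)]
pairs = make_pairs(molecules)

# Look-up of the true differences, used here in place of a trained model.
delta = {(a, b): d for a, b, d in pairs}

# Illustrative invariants that the true differences satisfy:
# a molecule paired with itself has zero difference, and swapping
# the order of a pair flips the sign of the difference.
same_molecule_is_zero = all(delta[(s, s)] == 0 for s, _ in molecules)
swap_flips_sign = all(delta[(a, b)] == -delta[(b, a)] for a, b, _ in pairs)
```

Pairing n molecules this way yields n² ordered training examples, which is one plausible reading of how a pairwise approach can extract more signal from a small dataset.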
Affiliation(s)
- Zachary Fralish
- Department of Biomedical Engineering, Duke University, Durham, NC, 27708, USA
- Ashley Chen
- Department of Computer Science, Duke University, Durham, NC, 27708, USA
- Paul Skaluba
- Department of Biomedical Engineering, Duke University, Durham, NC, 27708, USA
- Daniel Reker
- Department of Biomedical Engineering, Duke University, Durham, NC, 27708, USA.
|
18
|
Tran TTV, Tayara H, Chong KT. Recent Studies of Artificial Intelligence on In Silico Drug Absorption. J Chem Inf Model 2023; 63:6198-6211. [PMID: 37819031 DOI: 10.1021/acs.jcim.3c00960] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/13/2023]
Abstract
Absorption is an important area of research in pharmacochemistry and drug development, because the drug has to be absorbed before any drug effects can occur. Furthermore, the ADMET (Absorption, Distribution, Metabolism, Excretion, and Toxicity) profile of drugs can be directly and considerably altered by modulating factors affecting absorption. Many drugs in development fail because of poor absorption. The research and continuous efforts of researchers in recent years have brought many successes and promises in drug absorption property prediction, especially in silico, which helps to reduce the time and cost significantly for screening undesirable drug candidates. In this report, we explicitly provide an overview of recent in silico studies on predicting absorption properties, especially from 2019 to the present, using artificial intelligence. Additionally, we have collected and investigated public databases that support absorption prediction research. On those grounds, we also proposed the challenges and development directions of absorption prediction in the future. We hope this review can provide researchers with valuable guidelines on absorption prediction to facilitate the development of newer approaches in drug discovery.
Affiliation(s)
- Thi Tuyet Van Tran
- Department of Electronics and Information Engineering, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Faculty of Information Technology, An Giang University, Long Xuyen 880000, Vietnam
- Vietnam National University, Ho Chi Minh City, Ho Chi Minh 700000, Vietnam
- Hilal Tayara
- School of International Engineering and Science, Jeonbuk National University, Jeonju 54896, Republic of Korea
- Kil To Chong
- Advances Electronics and Information Research Center, Jeonbuk National University, Jeonju 54896, Republic of Korea
|
19
|
Mamada H, Takahashi M, Ogino M, Nomura Y, Uesawa Y. Predictive Models Based on Molecular Images and Molecular Descriptors for Drug Screening. ACS Omega 2023; 8:37186-37195. [PMID: 37841172 PMCID: PMC10568689 DOI: 10.1021/acsomega.3c04073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/06/2023] [Accepted: 08/30/2023] [Indexed: 10/17/2023]
Abstract
Various toxicity and pharmacokinetic evaluations as screening experiments are needed at the drug discovery stage. Currently, to reduce the use of animal experiments and developmental expenses, the development of high-performance predictive models based on quantitative structure-activity relationship analysis is desired. From these evaluation targets, we selected 50% lethal dose (LD50), blood-brain barrier penetration (BBBP), and the clearance (CL) pathway for this investigation and constructed predictive models for each target using 636-11,886 compounds. First, we constructed predictive models using the DeepSnap-deep learning (DL) method and images of compounds as features. The calculated area under the curve (AUC) and balanced accuracy (BAC) were, respectively, 0.887 and 0.818 for LD50, 0.893 and 0.824 for BBBP, and 0.883 and 0.763 for the CL pathway. Next, molecular descriptors (MDs) of compounds were calculated using Molecular Operating Environment, alvaDesc, and ADMET Predictor to construct predictive models using the MD-based method. Using these MDs, we constructed predictive models using DataRobot. The calculated AUC and BAC were, respectively, 0.931 and 0.805 for LD50, 0.919 and 0.849 for BBBP, and 0.900 and 0.807 for the CL pathway. In this investigation, we constructed predictive models combining the DeepSnap-DL and MD-based methods. In ensemble models using the mean predictive probability of the DeepSnap-DL and MD-based methods, the calculated AUC and BAC were, respectively, 0.942 and 0.842 for LD50, 0.936 and 0.853 for BBBP, and 0.908 and 0.832 for the CL pathway, with improved predictive performance observed for all variables compared with either single method alone. 
Moreover, in consensus models that adopted only compounds for which the results of the two methods agreed, the calculated BAC for LD50, BBBP, and the CL pathway were 0.916, 0.918, and 0.847, respectively, indicating higher predictive performance than the ensemble models for all three variables. The predictive models combining the DeepSnap-DL and MD-based methods displayed high predictive performance for LD50, BBBP, and the CL pathway. Therefore, the application of this approach to prediction targets in various drug discovery screenings is expected to accelerate drug discovery.
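The difference between the ensemble and consensus models described in this abstract can be sketched as follows. This is a hedged illustration only: the probabilities, labels, and the 0.5 decision threshold are hypothetical, and `p_image` and `p_desc` merely stand in for the outputs of an image-based and a descriptor-based model.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of sensitivity and specificity for binary labels (0/1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(y_true)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


# Hypothetical labels and predicted probabilities for six compounds.
y_true = [1, 1, 1, 0, 0, 0]
p_image = [0.9, 0.6, 0.3, 0.2, 0.6, 0.1]  # stand-in for the image-based model
p_desc = [0.8, 0.3, 0.6, 0.1, 0.7, 0.4]   # stand-in for the descriptor-based model
THRESHOLD = 0.5  # assumed cut-off, not taken from the paper

# Ensemble: average the two probabilities, then threshold.
p_mean = [(a + b) / 2 for a, b in zip(p_image, p_desc)]
ensemble_pred = [int(p >= THRESHOLD) for p in p_mean]

# Consensus: keep only compounds on which both models predict the same class.
agree = [
    i for i, (a, b) in enumerate(zip(p_image, p_desc))
    if (a >= THRESHOLD) == (b >= THRESHOLD)
]
consensus_true = [y_true[i] for i in agree]
consensus_pred = [int(p_image[i] >= THRESHOLD) for i in agree]

ensemble_bac = balanced_accuracy(y_true, ensemble_pred)
consensus_bac = balanced_accuracy(consensus_true, consensus_pred)
coverage = len(agree) / len(y_true)  # fraction of compounds the consensus covers
```

Because the consensus score is computed only on the compounds where the two models agree, it is worth reporting alongside the coverage; a higher balanced accuracy on a subset is not directly comparable to a score on the full set.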
Affiliation(s)
- Hideaki Mamada
- Drug Metabolism and Pharmacokinetics Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1 Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
- Mari Takahashi
- Drug Metabolism and Pharmacokinetics Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1 Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
- Mizuki Ogino
- Drug Metabolism and Pharmacokinetics Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1 Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
- Yukihiro Nomura
- Drug Metabolism and Pharmacokinetics Research Laboratories, Central Pharmaceutical Research Institute, Japan Tobacco Inc., 1-1 Murasaki-cho, Takatsuki, Osaka 569-1125, Japan
- Yoshihiro Uesawa
- Department of Medical Molecular Informatics, Meiji Pharmaceutical University, 2-522-1 Noshio, Kiyose, Tokyo 204-858, Japan
|
20
|
Wang Y, Xiong J, Xiao F, Zhang W, Cheng K, Rao J, Niu B, Tong X, Qu N, Zhang R, Wang D, Chen K, Li X, Zheng M. LogD7.4 prediction enhanced by transferring knowledge from chromatographic retention time, microscopic pKa and logP. J Cheminform 2023; 15:76. [PMID: 37670374 PMCID: PMC10478446 DOI: 10.1186/s13321-023-00754-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/12/2023] [Accepted: 08/25/2023] [Indexed: 09/07/2023] Open
Abstract
Lipophilicity is a fundamental physical property that significantly affects various aspects of drug behavior, including solubility, permeability, metabolism, distribution, protein binding, and toxicity. Accurate prediction of lipophilicity, measured by the logD7.4 value (the distribution coefficient between n-octanol and buffer at physiological pH 7.4), is crucial for successful drug discovery and design. However, the limited availability of data for logD modeling poses a significant challenge to achieving satisfactory generalization capability. To address this challenge, we have developed a novel logD7.4 prediction model called RTlogD, which leverages knowledge from multiple sources. RTlogD combines pre-training on a chromatographic retention time (RT) dataset since the RT is influenced by lipophilicity. Additionally, microscopic pKa values are incorporated as atomic features, providing valuable insights into ionizable sites and ionization capacity. Furthermore, logP is integrated as an auxiliary task within a multitask learning framework. We conducted ablation studies and presented a detailed analysis, showcasing the effectiveness and interpretability of RT, pKa, and logP in the RTlogD model. Notably, our RTlogD model demonstrated superior performance compared to commonly used algorithms and prediction tools. These results underscore the potential of the RTlogD model to improve the accuracy and generalization of logD prediction in drug discovery and design. In summary, the RTlogD model addresses the challenge of limited data availability in logD modeling by leveraging knowledge from RT, microscopic pKa, and logP. Incorporating these factors enhances the predictive capabilities of our model, and it holds promise for real-world applications in drug discovery and design scenarios.
Affiliation(s)
- Yitian Wang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Jiacheng Xiong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Fu Xiao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
- Wei Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Kaiyang Cheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
- Jingxin Rao
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Buying Niu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Xiaochu Tong
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Ning Qu
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Runze Zhang
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Kaixian Chen
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China
- Xutong Li
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- Mingyue Zheng
- Drug Discovery and Design Center, State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Shanghai, 201203, China.
- University of Chinese Academy of Sciences, No. 19A Yuquan Road, Beijing, 100049, China.
- Nanjing University of Chinese Medicine, 138 Xianlin Road, Nanjing, 210023, China.
|
21
|
Liu S, Kosugi Y. Human Brain Penetration Prediction Using Scaling Approach from Animal Machine Learning Models. AAPS J 2023; 25:86. [PMID: 37667061 DOI: 10.1208/s12248-023-00850-1] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 04/26/2023] [Accepted: 08/14/2023] [Indexed: 09/06/2023] Open
Abstract
Machine learning (ML) approaches have been applied to predicting drug pharmacokinetic properties. Previously, we predicted rat unbound brain-to-plasma ratio (Kpuu,brain) by ML models. In this study, we aimed to predict human Kpuu,brain through animal ML models. First, we re-evaluated ML models for rat Kpuu,brain prediction by using trendy open-source packages. We then developed ML models for monkey Kpuu,brain prediction. Leave-one-out cross validation was utilized to rationally build models using a relatively small dataset. After establishing the monkey and rat ML models, human Kpuu,brain prediction was achieved by implementing the animal models considering appropriate scaling methods. Mechanistic NeuroPK models for the identical monkey and human dataset were treated as the criteria for comparison. Results showed that rat Kpuu,brain predictivity was successfully replicated. The optimal ML model for monkey Kpuu,brain prediction was superior to the NeuroPK model, where accuracy within 2-fold error was 78% (R2 = 0.76). For human Kpuu,brain prediction, rat model using relative expression factor (REF), scaled transporter efflux ratios (ERs), and monkey model using in vitro ERs can provide comparable predictivity to the NeuroPK model, where accuracy within 2-fold error was 71% and 64% (R2 = 0.30 and 0.52), respectively. We demonstrated that ML models can deliver promising Kpuu,brain prediction with several advantages: (1) predict reasonable animal Kpuu,brain; (2) prospectively predict human Kpuu,brain from animal models; and (3) can skip expensive monkey studies for human prediction by using the rat model. As a result, ML models can be a powerful tool for drug Kpuu,brain prediction in the discovery stage.
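The "accuracy within 2-fold error" metric quoted in this abstract is commonly computed as the fraction of predictions whose ratio to the observed value lies between 0.5 and 2. The sketch below assumes that definition; the function name `fraction_within_fold` and the Kpuu,brain values are hypothetical and are not taken from the paper.

```python
def fraction_within_fold(observed, predicted, fold=2.0):
    """Fraction of predictions whose ratio to the observed value lies
    between 1/fold and fold (inclusive). All values must be positive."""
    hits = sum(
        1
        for obs, pred in zip(observed, predicted)
        if 1.0 / fold <= pred / obs <= fold
    )
    return hits / len(observed)


# Hypothetical Kpuu,brain values, for illustration only.
observed = [0.10, 0.50, 1.00, 0.05]
predicted = [0.15, 0.20, 1.80, 0.04]

# Ratios are about 1.5, 0.4, 1.8 and 0.8, so three of four fall within 2-fold.
accuracy_2fold = fraction_within_fold(observed, predicted)
```

A fold-error metric of this kind treats over- and under-prediction symmetrically on a ratio scale, which suits quantities such as Kpuu,brain that span orders of magnitude.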
Affiliation(s)
- Siyu Liu
- Drug Metabolism & Pharmacokinetics Research Laboratories, Preclinical & Translational Sciences, Research, Takeda Pharmaceutical Company Limited, Shonan Health Innovation Park, 26-1, Muraoka-Higashi 2-Chome, Fujisawa, Kanagawa, 251-8555, Japan.
- Yohei Kosugi
- Drug Metabolism & Pharmacokinetics Research Laboratories, Preclinical & Translational Sciences, Research, Takeda Pharmaceutical Company Limited, Shonan Health Innovation Park, 26-1, Muraoka-Higashi 2-Chome, Fujisawa, Kanagawa, 251-8555, Japan
|
22
|
Boldini D, Grisoni F, Kuhn D, Friedrich L, Sieber SA. Practical guidelines for the use of gradient boosting for molecular property prediction. J Cheminform 2023; 15:73. [PMID: 37641120 PMCID: PMC10464382 DOI: 10.1186/s13321-023-00743-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2023] [Accepted: 08/09/2023] [Indexed: 08/31/2023] Open
Abstract
Decision tree ensembles are among the most robust, high-performing and computationally efficient machine learning approaches for quantitative structure-activity relationship (QSAR) modeling. Among them, gradient boosting has recently garnered particular attention, for its performance in data science competitions, virtual screening campaigns, and bioactivity prediction. However, different variants of gradient boosting exist, the most popular being XGBoost, LightGBM and CatBoost. Our study provides the first comprehensive comparison of these approaches for QSAR. To this end, we trained 157,590 gradient boosting models, which were evaluated on 16 datasets and 94 endpoints, comprising 1.4 million compounds in total. Our results show that XGBoost generally achieves the best predictive performance, while LightGBM requires the least training time, especially for larger datasets. In terms of feature importance, the models surprisingly rank molecular features differently, reflecting differences in regularization techniques and decision tree structures. Thus, expert knowledge must always be employed when evaluating data-driven explanations of bioactivity. Furthermore, our results show that the relevance of each hyperparameter varies greatly across datasets and that it is crucial to optimize as many hyperparameters as possible to maximize the predictive performance. In conclusion, our study provides the first set of guidelines for cheminformatics practitioners to effectively train, optimize and evaluate gradient boosting models for virtual screening and QSAR applications.
Affiliation(s)
- Davide Boldini
- Department of Bioscience, Center for Functional Protein Assemblies (CPA), Technical University of Munich, Garching bei Munich, Germany
- Francesca Grisoni
- Department of Biomedical Engineering, Institute for Complex Molecular Sciences, Eindhoven University of Technology, Eindhoven, The Netherlands
- Centre for Living Technologies, Alliance TU/E, WUR, UU, UMC Utrecht, Utrecht, The Netherlands
- Stephan A Sieber
- Department of Bioscience, Center for Functional Protein Assemblies (CPA), Technical University of Munich, Garching bei Munich, Germany

23
Ekins S, Lane TR, Urbina F, Puhl AC. In silico ADME/tox comes of age: twenty years later. Xenobiotica 2023:1-7. [PMID: 37539466 PMCID: PMC10850432 DOI: 10.1080/00498254.2023.2245049] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/22/2023] [Revised: 08/01/2023] [Accepted: 08/02/2023] [Indexed: 08/05/2023]
Abstract
In the early 2000s pharmaceutical drug discovery was beginning to use computational approaches for absorption, distribution, metabolism, excretion and toxicity (ADME/Tox, also known as ADMET) prediction. This emphasis on prediction was an effort to reduce the risk of later stage failures from ADME/Tox. Much has been written in the intervening twenty plus years and significant expenditure has occurred in companies developing these in silico capabilities which can be gleaned from publications. It is therefore an appropriate time to briefly reflect on what was proposed then and what the reality is today. 20 years ago, we tended to optimise bioactivity and perhaps one ADME/Tox property at a time. Previously pharmaceutical companies needed a whole infrastructure for models - in silico and in vitro experts, IT, champions on a project team, educators and management support. Now we are in the age of generative de novo design where bioactivity and many ADME/Tox properties can be optimised and large language model technologies are available. There are also some challenges such as the focus on very large molecules which may be outside of current ADME/Tox models. We provide an opportunity to look forward with the increasing public data for ADME/Tox as well as expanded types of algorithms available.
Affiliation(s)
- Sean Ekins
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Thomas R. Lane
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Fabio Urbina
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Ana C. Puhl
- Collaborations Pharmaceuticals, Inc., 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA

24
Yong F, Yan M, Zhang L, Ji W, Zhao S, Gao Y. Analysis of Functional Promoter of Camel FGF21 Gene and Identification of Small Compounds Targeting FGF21 Protein. Vet Sci 2023; 10:452. [PMID: 37505857 PMCID: PMC10383868 DOI: 10.3390/vetsci10070452] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/26/2023] [Revised: 06/27/2023] [Accepted: 07/06/2023] [Indexed: 07/29/2023] Open
Abstract
The fibroblast growth factor 21 (FGF21) gene plays an important role in glucose and lipid metabolism and is a promising therapeutic target for metabolic disease. Camels display a unique regulation of glucose and lipid metabolism, which enables them to survive drought and chronic hunger. However, knowledge about camel FGF21 gene regulation and how it differs from that in humans and mice is still limited. In this study, the camel FGF21 gene promoter was obtained, covering ~2000 bp upstream of the transcriptional start site (TSS). Bioinformatics analysis showed that the proximal promoter region sequences near the TSS are highly similar between humans and camels. Two potential core active regions are located in the -445-612 bp region. In addition, the camel FGF21 promoter contains three CpG islands (CGIs), located in the -435~-1168 bp regions, significantly more and longer than in humans and mice. Transcription factor binding prediction showed that most transcription factors, including the major functional ones, are the same across species, although the binding site positions in the promoter differ. These results indicate that the signaling pathways involved in FGF21 gene transcription regulation are conserved in mammals. Truncated-fragment recombinant vectors and luciferase reporter assays showed that the camel FGF21 core promoter is located within the 800 bp region upstream of the TSS and that an enhancer may exist between the -1000 and -2000 bp region. By combining molecular docking and in silico ADMET druggability prediction, two compounds were identified as the most promising candidate drugs specifically targeting FGF21. This study expanded the functions of these small molecules and provided a foundation for drug development targeting FGF21.
Affiliation(s)
- Fang Yong
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Meilin Yan
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Lili Zhang
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Wangye Ji
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Shuqin Zhao
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Gansu Key Laboratory of Animal Generational Physiology and Reproductive Regulation, Lanzhou 730070, China
- Yuan Gao
- College of Life Science and Technology, Gansu Agricultural University, Lanzhou 730070, China
- Gansu Key Laboratory of Animal Generational Physiology and Reproductive Regulation, Lanzhou 730070, China

25
Zhang J, Gao LX, Chen W, Zhong JJ, Qian C, Zhou WW. Rational Design of Daunorubicin C-14 Hydroxylase Based on the Understanding of Its Substrate-Binding Mechanism. Int J Mol Sci 2023; 24:ijms24098337. [PMID: 37176043 PMCID: PMC10179135 DOI: 10.3390/ijms24098337] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/04/2023] [Revised: 04/26/2023] [Accepted: 05/03/2023] [Indexed: 05/15/2023] Open
Abstract
Doxorubicin is one of the most widely used antitumor drugs and is currently produced via the chemical conversion method, which suffers from high production costs, complex product separation processes, and serious environmental pollution. Biocatalysis is considered a more efficient and environment-friendly method for drug production. The cytochrome daunorubicin C-14 hydroxylase (DoxA) is the essential enzyme catalyzing the conversion of daunorubicin to doxorubicin. Herein, the DoxA from Streptomyces peucetius subsp. caesius ATCC 27952 was expressed in Escherichia coli, and the rational design strategy was further applied to improve the enzyme activity. Eight amino acid residues were identified as the key sites via molecular docking. Using a constructed screening library, we obtained the mutant DoxA(P88Y) with a more rational protein conformation, and a 56% increase in bioconversion efficiency was achieved by the mutant compared to the wild-type DoxA. Molecular dynamics simulation was applied to understand the relationship between the enzyme's structural property and its substrate-binding efficiency. It was demonstrated that the mutant DoxA(P88Y) formed a new hydrophobic interaction with the substrate daunorubicin, which might have enhanced the binding stability and thus improved the catalytic activity. Our work lays a foundation for further exploration of DoxA and facilitates the industrial process of bio-production of doxorubicin.
Affiliation(s)
- Jing Zhang
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- School of Chemical and Biomolecular Engineering, The University of Sydney, Sydney, NSW 2006, Australia
- Ling-Xiao Gao
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- Wei Chen
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China
- Jian-Jiang Zhong
- State Key Laboratory of Microbial Metabolism, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, China
- Chao Qian
- College of Chemical and Biological Engineering, Zhejiang Provincial Key Laboratory of Advanced Chemical Engineering Manufacture Technology, Zhejiang University, Hangzhou 310027, China
- Wen-Wen Zhou
- College of Biosystems Engineering and Food Science, Ningbo Research Institute, Zhejiang University, Hangzhou 310058, China

26
Di Lascio E, Gerebtzoff G, Rodríguez-Pérez R. Systematic Evaluation of Local and Global Machine Learning Models for the Prediction of ADME Properties. Mol Pharm 2023; 20:1758-1767. [PMID: 36745394 DOI: 10.1021/acs.molpharmaceut.2c00962] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2023]
Abstract
Machine learning (ML) has become an indispensable tool to predict absorption, distribution, metabolism, and excretion (ADME) properties in pharmaceutical research. ML algorithms are trained on molecular structures and corresponding ADME assay data to develop quantitative structure-property relationship (QSPR) models. Traditional QSPR models were trained on compound sets of limited size. With the advent of more complex ML algorithms and greater data availability, training sets have become larger and more diverse. The most common training approaches consist of training a model either with a small set of similar compounds, namely, compounds designed for the same drug discovery project or chemical series (local model approach), or with a larger set of diverse compounds (global model approach). Global models are built with all experimental data available for an assay, combining compound data from different projects and disease areas. Despite the ML progress made so far, the choice of the appropriate data composition for building ML models is still unclear. Herein, a systematic evaluation of local and global ML models was performed for 10 different experimental assays and 112 drug discovery projects. Results show a consistently superior performance of global models for ADME property predictions. Diagnostic analyses were also carried out to investigate the influence of training set size, structural diversity, and data shift on the relative performance of local and global ML models. Training set size and structural diversity did not affect the relative performance of the methods. Instead, data shift helped to identify the projects with larger performance differences between local and global models. Results presented in this work can be leveraged to improve ML-based ADME property predictions and thus decision-making in drug discovery projects.
Affiliation(s)
- Elena Di Lascio
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland

27
O'Donovan DH, De Fusco C, Kuhnke L, Reichel A. Trends in Molecular Properties, Bioavailability, and Permeability across the Bayer Compound Collection. J Med Chem 2023; 66:2347-2360. [PMID: 36752336 DOI: 10.1021/acs.jmedchem.2c01577] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Indexed: 02/09/2023]
Abstract
For oral drugs, medicinal chemists aim to design compounds with high oral bioavailability, of which permeability is a key determinant. Taking advantage of >2000 compounds tested in rat bioavailability studies and >20,000 compounds tested in Caco2 assays at Bayer, we have examined the molecular properties governing bioavailability and permeability. In addition to classical parameters such as logD and molecular weight, we also investigated the relationship between calculated pKa and permeability. We find that neutral compounds retain permeability up to a molecular weight limit of 700, while stronger acids and bases are restricted to weights of 400-500. We also investigate trends for common properties such as hydrogen bond donors and acceptors, polar surface area, aromatic ring count, and rotatable bonds, including compounds which exceed Lipinski's rule of five (Ro5). These property-structure relationships are combined to provide design guidelines for bioavailable drugs in both traditional and "beyond rule of 5" (bRo5) chemical space.
Affiliation(s)
- Lara Kuhnke
- Drug Discovery Sciences, Bayer AG, 13342 Berlin, Germany

28
Chen J, Yuan Z, Tu Y, Hu W, Xie C, Ye L. Experimental and computational models to investigate intestinal drug permeability and metabolism. Xenobiotica 2023; 53:25-45. [PMID: 36779684 DOI: 10.1080/00498254.2023.2180454] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 02/14/2023]
Abstract
Oral administration is the preferred route of drug administration and leads to better therapy compliance. The intestine plays a key role in the absorption and metabolism of oral drugs; therefore, new intestinal models are continuously being proposed, contributing to the study of intestinal physiology, drug screening, drug side effects, and drug-drug interactions. Advances in pharmaceutical processes have produced more drug formulations, creating challenges for intestinal models. To adapt to the rapid evolution of pharmaceuticals, more intestinal models have been created. However, because of the complexity of the intestine, few models can take all aspects of the intestine into account, and some functions must be sacrificed to investigate other areas. Therefore, investigators need to choose appropriate models according to the experimental stage and other requirements to obtain the desired results. To help researchers achieve this goal, this review summarises the advantages and disadvantages of commonly used intestinal models and discusses possible future directions, providing a better understanding of intestinal models.
Affiliation(s)
- Jinyuan Chen
- Institute of Scientific Research, Southern Medical University, Guangzhou, P.R. China; TCM-Integrated Hospital, Southern Medical University, Guangzhou, P.R. China
- Ziyun Yuan
- NMPA Key Laboratory for Research and Evaluation of Drug Metabolism, Guangdong Provincial Key Laboratory of New Drug Screening, School of Pharmaceutical Sciences, Southern Medical University, Guangzhou, P.R. China
- Yifan Tu
- Boehringer-Ingelheim, Connecticut, USA
- Wanyu Hu
- NMPA Key Laboratory for Research and Evaluation of Drug Metabolism, Guangdong Provincial Key Laboratory of New Drug Screening, School of Pharmaceutical Sciences, Southern Medical University, Guangzhou, P.R. China
- Cong Xie
- Clinical Pharmacy Center, Nanfang Hospital, Southern Medical University, Guangzhou, P.R. China
- Ling Ye
- TCM-Integrated Hospital, Southern Medical University, Guangzhou, P.R. China

29
Stegemann S, Moreton C, Svanbäck S, Box K, Motte G, Paudel A. Trends in oral small-molecule drug discovery and product development based on product launches before and after the Rule of Five. Drug Discov Today 2023; 28:103344. [PMID: 36442594 DOI: 10.1016/j.drudis.2022.103344] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 04/30/2022] [Revised: 07/28/2022] [Accepted: 09/01/2022] [Indexed: 11/26/2022]
Abstract
In 1997, the 'Rule of Five' (Ro5) suggested physicochemical limitations for orally administered drugs, based on the analysis of chemical libraries from the early 1990s. In this review, we report on the trends in oral drug product development by analyzing products launched between 1994 and 1997 and between 2013 and 2019. Our analysis confirmed that most new oral drugs are within the Ro5 descriptors; however, the number of new drug products containing drugs with molecular weight (MW) and calculated partition coefficient (clogP) beyond the Ro5 has slightly increased. Analysis revealed that there is no single scientific or technological reason for this trend, but that it likely results from incremental advances being made in molecular biology, target diversity, drug design, medicinal chemistry, predictive modeling, drug metabolism and pharmacokinetics, and drug delivery.
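The Ro5 descriptors mentioned above can be checked mechanically. The sketch below counts violations using the commonly cited Ro5 thresholds (MW over 500, clogP over 5, more than 5 hydrogen bond donors, more than 10 hydrogen bond acceptors); the `Descriptors` class, the function name, and the example values are hypothetical and are not taken from the review.

```python
from dataclasses import dataclass

@dataclass
class Descriptors:
    """Precomputed molecular descriptors (hypothetical container for illustration)."""
    mol_weight: float       # molecular weight, Da
    clogp: float            # calculated partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int

def ro5_violations(d):
    """Return the list of Rule of Five criteria that the compound exceeds."""
    checks = {
        "MW > 500": d.mol_weight > 500,
        "clogP > 5": d.clogp > 5,
        "H-bond donors > 5": d.h_bond_donors > 5,
        "H-bond acceptors > 10": d.h_bond_acceptors > 10,
    }
    return [name for name, violated in checks.items() if violated]

# Hypothetical example compounds, not real drugs.
small = Descriptors(mol_weight=350.0, clogp=2.1, h_bond_donors=2, h_bond_acceptors=5)
large = Descriptors(mol_weight=812.0, clogp=6.3, h_bond_donors=3, h_bond_acceptors=12)

print(ro5_violations(small))
print(ro5_violations(large))
```

Returning the violated criteria by name, instead of a single pass/fail flag, makes it easy to see whether a compound exceeds the Ro5 specifically on MW and clogP, the two descriptors the review highlights.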
Affiliation(s)
- Sven Stegemann
- Institute for Process and Particle Engineering, Graz University of Technology, Inffeldgasse 13, 8010 Graz, Austria
- Sami Svanbäck
- The Solubility Company Ltd, Viikinkaari 4, 00790 Helsinki, Finland
- Karl Box
- Pion Inc. (UK) Ltd, Forest Row, UK
- Geneviève Motte
- JEN Pharma Consulting, 182 Rue Henri Latour, 1450 Chastre, Belgium
- Amrit Paudel
- Institute for Process and Particle Engineering, Graz University of Technology, Inffeldgasse 13, 8010 Graz, Austria; Research Center Pharmaceutical Engineering GmbH, Inffeldgasse 13, 8010 Graz, Austria

30
Fine-tuning BERT for automatic ADME semantic labeling in FDA drug labeling to enhance product-specific guidance assessment. J Biomed Inform 2023; 138:104285. [PMID: 36632860 DOI: 10.1016/j.jbi.2023.104285] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2022] [Revised: 10/25/2022] [Accepted: 01/07/2023] [Indexed: 01/11/2023]
Abstract
Product-specific guidances (PSGs) recommended by the United States Food and Drug Administration (FDA) are instrumental in promoting and guiding generic drug product development. To assess a PSG, the FDA assessor must spend considerable time and effort manually retrieving supportive drug information on absorption, distribution, metabolism, and excretion (ADME) from the reference listed drug labeling. In this work, we leveraged state-of-the-art pre-trained language models to automatically label the ADME paragraphs in the pharmacokinetics section of FDA-approved drug labeling to facilitate PSG assessment. We applied a transfer learning approach by fine-tuning the pre-trained Bidirectional Encoder Representations from Transformers (BERT) model to develop a novel application of ADME semantic labeling, which can automatically retrieve ADME paragraphs from drug labeling instead of manual work. We demonstrate that fine-tuning the pre-trained BERT model can outperform conventional machine learning techniques, achieving up to 12.5% absolute F1 improvement. To our knowledge, we were the first to successfully apply BERT to solve the ADME semantic labeling task. We further assessed the relative contribution of pre-training and fine-tuning to the overall performance of the BERT model in the ADME semantic labeling task using a series of analysis methods, such as attention similarity and layer-based ablations. Our analysis revealed that the information learned via fine-tuning is focused on task-specific knowledge in the top layers of BERT, whereas the benefit from the pre-trained BERT model comes from the bottom layers.

31
Gao Y, Guo L, Han Y, Zhang J, Dai Z, Ma S. A Combination of In Silico ADMET Prediction, In Vivo Toxicity Evaluation, and Potential Mechanism Exploration of Brucine and Brucine N-oxide-A Comparative Study. Molecules 2023; 28:molecules28031341. [PMID: 36771007 PMCID: PMC9919335 DOI: 10.3390/molecules28031341] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2022] [Revised: 01/15/2023] [Accepted: 01/18/2023] [Indexed: 02/04/2023] Open
Abstract
Brucine (BRU) and brucine N-oxide (BNO) are prominent, bioactive, and toxic alkaloids in crude and processed Semen Strychni. Studies have demonstrated that BRU and BNO possess a broad range of pharmacological activities, such as anti-inflammatory and analgesic effects. In this context, a comparative study of BRU and BNO was performed by combining in silico ADMET prediction, in vivo toxicity evaluation, and exploration of potential mechanisms of action. ADMET prediction showed that BRU and BNO might induce liver injury, and that BRU may have a stronger hepatotoxic effect. The prediction was experimentally verified using the zebrafish model. BRU-induced hepatotoxicity in zebrafish larvae showed a dose-response relationship. The mechanism of BRU-induced hepatotoxicity might relate to phosphorylation, kinase activity, and signal transduction. By comparison, signal transduction and gap junctions might be involved in BNO-induced hepatotoxicity. Our results provide a better understanding of BRU- and BNO-induced hepatotoxicity. We also built a foundation to elucidate the material basis of the hepatotoxicity of the traditional Chinese medicine Semen Strychni.
Affiliation(s)
- Yan Gao
- National Institutes for Food and Drug Control, Beijing 100050, China
- Lin Guo
- National Institutes for Food and Drug Control, Beijing 100050, China
- Ying Han
- Department of Pharmacology, Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Jingpu Zhang
- Department of Pharmacology, Institute of Medicinal Biotechnology, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing 100050, China
- Zhong Dai
- National Institutes for Food and Drug Control, Beijing 100050, China
- Correspondence: (Z.D.); (S.M.)
- Shuangcheng Ma
- National Institutes for Food and Drug Control, Beijing 100050, China
- Correspondence: (Z.D.); (S.M.)

32
Dorahy G, Chen JZ, Balle T. Computer-Aided Drug Design towards New Psychotropic and Neurological Drugs. Molecules 2023; 28:molecules28031324. [PMID: 36770990 PMCID: PMC9921936 DOI: 10.3390/molecules28031324] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Received: 12/15/2022] [Revised: 01/23/2023] [Accepted: 01/26/2023] [Indexed: 01/31/2023] Open
Abstract
Central nervous system (CNS) disorders are a therapeutic area in drug discovery where demand for new treatments greatly exceeds approved treatment options. This is complicated by the high failure rate in late-stage clinical trials, resulting in exorbitant costs associated with bringing new CNS drugs to market. Computer-aided drug design (CADD) techniques minimise the time and cost burdens associated with drug research and development by ensuring an advantageous starting point for pre-clinical and clinical assessments. The key elements of CADD are divided into ligand-based and structure-based methods. Ligand-based methods encompass techniques including pharmacophore modelling and quantitative structure activity relationships (QSARs), which use the relationship between biological activity and chemical structure to ascertain suitable lead molecules. In contrast, structure-based methods use information about the binding site architecture from an established protein structure to select suitable molecules for further investigation. In recent years, deep learning techniques have been applied in drug design and present an exciting addition to CADD workflows. Despite the difficulties associated with CNS drug discovery, advances towards new pharmaceutical treatments continue to be made, and CADD has supported these findings. This review explores various CADD techniques and discusses applications in CNS drug discovery from 2018 to November 2022.
Affiliation(s)
- Georgia Dorahy
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW 2050, Australia
- Jake Zheng Chen
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW 2050, Australia
- Thomas Balle
- Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, Sydney, NSW 2006, Australia
- Brain and Mind Centre, The University of Sydney, Camperdown, NSW 2050, Australia
- Correspondence:

33
Stoyanova R, Katzberger PM, Komissarov L, Khadhraoui A, Sach-Peltason L, Groebke Zbinden K, Schindler T, Manevski N. Computational Predictions of Nonclinical Pharmacokinetics at the Drug Design Stage. J Chem Inf Model 2023; 63:442-458. [PMID: 36595708 DOI: 10.1021/acs.jcim.2c01134] [Citation(s) in RCA: 8] [Impact Index Per Article: 8.0] [Indexed: 01/05/2023]
Abstract
Although computational predictions of pharmacokinetics (PK) are desirable at the drug design stage, existing approaches are often limited by prediction accuracy and human interpretability. Using a discovery data set of mouse and rat PK studies at Roche (9,685 unique compounds), we performed a proof-of-concept study to predict key PK properties from chemical structure alone, including plasma clearance (CLp), volume of distribution at steady-state (Vss), and oral bioavailability (F). Ten machine learning (ML) models were evaluated, including Single-Task, Multitask, and transfer learning approaches (i.e., pretraining with in vitro data). In addition to prediction accuracy, we emphasized human interpretability of outcomes, especially the quantification of uncertainty, applicability domains, and explanations of predictions in terms of molecular features. Results show that intravenous (IV) PK properties (CLp and Vss) can be predicted with good precision (average absolute fold error, AAFE of 1.96-2.84 depending on data split) and low bias (average fold error, AFE of 0.98-1.36), with AutoGluon, Gaussian Process Regressor (GP), and ChemProp displaying the best performance. Driven by higher complexity of oral PK studies, predictions of F were more challenging, with the best AAFE values of 2.35-2.60 and higher overprediction bias (AFE of 1.45-1.62). Multi-Task approaches and pretraining of ChemProp neural networks with in vitro data showed similar precision to Single-Task models but helped reduce the bias and increase correlations between observations and predictions. A combination of GP-computed prediction variance, molecular clustering, and dimensionality-reduction provided valuable quantitative insights into prediction uncertainty and applicability domains. SHAPley Additive exPlanations (SHAPs) highlighted molecular features contributing to prediction outcomes of Vss, providing explanations that could aid drug design. 
Combined results show that computational predictions of PK are feasible at the drug design stage, with several ML technologies converging to successfully leverage historical PK data sets. Further studies are needed to unlock the full potential of this approach, especially with respect to data set sizes and quality, transfer learning between in vitro and in vivo data sets, model-independent quantification of uncertainty, and explainability of predictions.
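The abstract reports precision as average absolute fold error (AAFE) and bias as average fold error (AFE). As a rough sketch, assuming the usual definitions (AFE as the geometric mean of predicted/observed ratios, AAFE as the geometric mean of the absolute fold deviation), the two metrics can be computed as follows; the function name and the example values are illustrative and are not taken from the paper.

```python
import math

def fold_error_metrics(predicted, observed):
    """Return (AFE, AAFE) for paired positive predicted and observed values.

    AFE  = 10 ** mean(log10(pred / obs))      -> bias (1.0 means unbiased)
    AAFE = 10 ** mean(|log10(pred / obs)|)    -> precision (1.0 means perfect)
    """
    logs = [math.log10(p / o) for p, o in zip(predicted, observed)]
    afe = 10 ** (sum(logs) / len(logs))
    aafe = 10 ** (sum(abs(v) for v in logs) / len(logs))
    return afe, aafe

# Hypothetical clearance values: one 2-fold overprediction, one 2-fold underprediction.
pred_cl = [20.0, 5.0]
obs_cl = [10.0, 10.0]

afe, aafe = fold_error_metrics(pred_cl, obs_cl)
print(f"AFE = {afe:.2f}, AAFE = {aafe:.2f}")
```

In this toy case the over- and underprediction cancel in AFE but not in AAFE, which illustrates why the abstract reports both: a model can be unbiased on average while still being off by a typical factor of two.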
Affiliation(s)
- Raya Stoyanova
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Paul Maximilian Katzberger
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Leonid Komissarov
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Aous Khadhraoui
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Lisa Sach-Peltason
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Katrin Groebke Zbinden
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Torsten Schindler
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland
- Nenad Manevski
- Roche Pharmaceutical Research and Early Development, Roche Innovation Center Basel, 4070 Basel, Switzerland

34
Rodríguez-Pérez R, Trunzer M, Schneider N, Faller B, Gerebtzoff G. Multispecies Machine Learning Predictions of In Vitro Intrinsic Clearance with Uncertainty Quantification Analyses. Mol Pharm 2023; 20:383-394. [PMID: 36437712 DOI: 10.1021/acs.molpharmaceut.2c00680] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Abstract
In pharmaceutical research, compounds are optimized for metabolic stability to avoid a too fast elimination of the drug. Intrinsic clearance (CLint) measured in liver microsomes or hepatocytes is an important parameter during lead optimization. In this work, machine learning models were developed to relate the compound structure to microsomal metabolic stability and predict CLint for new compounds. A multitask (MT) learning architecture was introduced to model the CLint of six species simultaneously, giving as a result a multispecies machine learning model. MT graph neural network (MT-GNN) regression was identified as the top-performing method, and an ensemble of 10 MT-GNN models was evaluated prospectively. Geometric mean fold errors were consistently smaller than 2-fold. Moreover, high precision values were obtained in the prediction of "high" (>300 μL/min/mg) and "low" (<100 μL/min/mg) CLint compounds. Precision values ranged from 80 to 94% for low CLint predictions and from 75 to 97% for high CLint predictions, depending on the species. Uncertainty on experimental values and model predictions was systematically quantified. Experimental variability (aleatoric uncertainty) of all historical Novartis in vitro clearance experiments was analyzed. Interestingly, MT-GNN models' performance approached assays' experimental variability. Moreover, uncertainty estimation in predictions (epistemic uncertainty) enabled identifying predictions associated with lower and higher error. Taken together, our manuscript combines a multispecies deep learning model and large-scale uncertainty analyses to improve CLint predictions and facilitate early informed decisions for compound prioritization.
Affiliation(s)
- Markus Trunzer
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Nadine Schneider
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Bernard Faller
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
- Grégori Gerebtzoff
- Novartis Institutes for Biomedical Research, Novartis Campus, Basel CH-4002, Switzerland
|
35
|
Lim MA, Yang S, Mai H, Cheng AC. Exploring Deep Learning of Quantum Chemical Properties for Absorption, Distribution, Metabolism, and Excretion Predictions. J Chem Inf Model 2022; 62:6336-6341. [PMID: 35758421 DOI: 10.1021/acs.jcim.2c00245] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Indexed: 01/07/2023]
Abstract
Quantum mechanical (QM) descriptors of small molecules have wide applicability in understanding organic reactivity and molecular properties, but the substantial compute cost required for ab initio QM calculations limits their broad usage. Here, we investigate the use of deep learning for predicting QM descriptors, with the goal of enabling usage of near-QM accuracy electronic properties on large molecular data sets such as those seen in drug discovery. Several deep learning approaches have previously been benchmarked on a published data set called QM9, where 12 ground-state properties have been calculated for molecules with up to nine heavy atoms, limited to C, H, N, O, and F elements. To advance the work beyond the QM9 chemical space and enable application to molecules encountered in drug discovery, we extend the QM9 data set by creating a QM9-extended data set covering an additional ∼20,000 molecules containing S and Cl atoms. Using this extended set, we generate new deep learning models as well as leverage ANI-2x models to provide predictions on larger, more diverse molecules common in drug discovery, and we find the models estimate 11 of 12 ground-state properties reasonably. We use the predicted QM descriptors to augment graph convolutional neural network (GCNN) models for selected ADME end points (rat microsomal clearance, hepatic clearance, total clearance, and P-glycoprotein efflux) and found varying degrees of performance improvement compared to nonaugmented GCNN models, including pronounced improvement in P-glycoprotein efflux prediction.
Affiliation(s)
- Megan A Lim
- Computational and Structural Chemistry, Merck & Co., Inc., South San Francisco, California 94080, United States
- Song Yang
- Computational and Structural Chemistry, Merck & Co., Inc., South San Francisco, California 94080, United States
- Huanghao Mai
- Computational and Structural Chemistry, Merck & Co., Inc., South San Francisco, California 94080, United States
- Alan C Cheng
- Computational and Structural Chemistry, Merck & Co., Inc., South San Francisco, California 94080, United States
|
36
|
Kadela-Tomanek M, Jastrzębska M, Chrobak E, Bębenek E. Lipophilicity and ADMET Analysis of Quinoline-1,4-quinone Hybrids. Pharmaceutics 2022; 15:34. [PMID: 36678664 PMCID: PMC9867208 DOI: 10.3390/pharmaceutics15010034] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/15/2022] [Revised: 12/14/2022] [Accepted: 12/20/2022] [Indexed: 12/25/2022] Open
Abstract
Lipophilicity is one of the basic properties of a potential drug, determining its solubility in non-polar solvents and, consequently, its ability to passively penetrate the cell membrane, as well as the course of various pharmacokinetic processes, including absorption, distribution, metabolism, excretion, and toxicity (ADMET). Heterocyclic compounds containing a nitrogen atom play a significant role in the search for new drugs. In this study, lipophilicity as well as other physicochemical, pharmacokinetic, and toxicity properties affecting the bioavailability of the quinoline-1,4-quinone hybrids are presented. Lipophilicity was determined experimentally as well as theoretically using various computer programs. The tested compounds showed low values of experimental lipophilicity, which was related to the type of 1,4-quinone moiety. Introduction of the nitrogen atom reduced the lipophilicity, depending on its position at the 5,8-quinolinedione moiety. The bioavailability of the tested compounds was determined in silico using the ADMET parameters. The obtained parameters showed that most of the hybrids can be used orally and do not exhibit neurotoxic effects. Similarity analysis was used to examine the relationship between the ADMET parameters and experimental lipophilicity. The ability of hybrids to interact with biological targets was characterized by global reactivity descriptors. The molecular docking study showed that the hybrids can inhibit the BCL-2 protein.
Affiliation(s)
- Monika Kadela-Tomanek
- Department of Organic Chemistry, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia, 4 Jagiellońska Str., 41-200 Sosnowiec, Poland
- Correspondence: ; Tel.: +48-32-3641666
- Maria Jastrzębska
- Silesian Center for Education and Interdisciplinary Research, Institute of Physics, University of Silesia, 75 Pułku Piechoty 1a, 41-500 Chorzów, Poland
- Elwira Chrobak
- Department of Organic Chemistry, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia, 4 Jagiellońska Str., 41-200 Sosnowiec, Poland
- Ewa Bębenek
- Department of Organic Chemistry, Faculty of Pharmaceutical Sciences in Sosnowiec, Medical University of Silesia, 4 Jagiellońska Str., 41-200 Sosnowiec, Poland
|
37
|
Physicochemical QSAR analysis of hERG inhibition revisited: towards a quantitative potency prediction. J Comput Aided Mol Des 2022; 36:837-849. [PMID: 36305984 DOI: 10.1007/s10822-022-00483-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/28/2022] [Accepted: 10/04/2022] [Indexed: 01/07/2023]
Abstract
In an earlier study (Didziapetris R & Lanevskij K (2016). J Comput Aided Mol Des. 30:1175-1188) we collected a database of publicly available hERG inhibition data for almost 6700 drug-like molecules and built a probabilistic Gradient Boosting classifier with a minimal set of physicochemical descriptors (log P, pKa, molecular size and topology parameters). This approach favored interpretability over statistical performance but still achieved an overall classification accuracy of 75%. In the current follow-up work we expanded the database (provided in Supplementary Information) to almost 9400 molecules and performed temporal validation of the model on a set of novel chemicals from recently published lead optimization projects. Validation results showed almost no performance degradation compared to the original study. Additionally, we rebuilt the model using AFT (Accelerated Failure Time) learning objective in XGBoost, which accepts both quantitative and censored data often reported in protein inhibition studies. The new model achieved a similar level of accuracy of discerning hERG blockers from non-blockers at 10 µM threshold, which can be conceived as close to the performance ceiling for methods aiming to describe only non-specific ligand interactions with hERG. Yet, this model outputs quantitative potency values (IC50) and is not tied to a particular classification cut-off. pIC50 from patch-clamp measurements can be predicted with R2 ≈ 0.4 and MAE < 0.5, which enables ligand ranking according to their expected potency levels. The employed approach can be valuable for quantitative modeling of various ADME and drug safety endpoints with a high prevalence of censored data.
|
38
|
Tian H, Ketkar R, Tao P. ADMETboost: a web server for accurate ADMET prediction. J Mol Model 2022; 28:408. [PMID: 36454321 PMCID: PMC9903341 DOI: 10.1007/s00894-022-05373-8] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Received: 09/13/2022] [Accepted: 10/31/2022] [Indexed: 12/03/2022]
Abstract
The absorption, distribution, metabolism, excretion, and toxicity (ADMET) properties are important in drug discovery as they define efficacy and safety. In this work, we applied an ensemble of features, including fingerprints and descriptors, and a tree-based machine learning model, extreme gradient boosting, for accurate ADMET prediction. Our model performs well in the Therapeutics Data Commons ADMET benchmark group. For 22 tasks, our model is ranked first in 18 tasks and top 3 in 21 tasks. The trained machine learning models are integrated in ADMETboost, a web server that is publicly available at https://ai-druglab.smu.edu/admet .
Affiliation(s)
- Hao Tian
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, 75205, TX, USA
- Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, 75205, TX, USA.
|
39
|
Kuroda M, Watanabe R, Esaki T, Kawashima H, Ohashi R, Sato T, Honma T, Komura H, Mizuguchi K. Utilizing public and private sector data to build better machine learning models for the prediction of pharmacokinetic parameters. Drug Discov Today 2022; 27:103339. [PMID: 35973660 DOI: 10.1016/j.drudis.2022.103339] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/23/2022] [Revised: 07/11/2022] [Accepted: 08/11/2022] [Indexed: 11/20/2022]
Abstract
One solution to compensate for the shortage of publicly available data is to collect more quality-controlled data from the private sector through public-private partnerships. However, several issues must be resolved before implementing such a system. Here, we review the technical aspects of public-private partnerships using our initiative in Japan as an example. In particular, we focus on the procedure for collecting data from multiple private sector companies and building prediction models and discuss how merging public and private sector datasets will help to improve the chemical space coverage and prediction performance. Teaser: Japan's first public-private consortium in pharmacokinetics has incorporated data from multiple pharmaceutical companies to create useful predictive models.
Affiliation(s)
- Masataka Kuroda
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Discovery Technology Laboratories, Mitsubishi Tanabe Pharma Corporation, 1000, Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan.
- Reiko Watanabe
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan
- Tsuyoshi Esaki
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; The Centre for Data Science Education and Research, Shiga University, 1-1-1, Banba, Hikone, Shiga 522-8522, Japan
- Hitoshi Kawashima
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan
- Rikiya Ohashi
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Discovery Technology Laboratories, Mitsubishi Tanabe Pharma Corporation, 1000, Kamoshida-cho, Aoba-ku, Yokohama, Kanagawa 227-0033, Japan
- Tomohiro Sato
- RIKEN Center for Biosystems Dynamics Research, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Teruki Honma
- RIKEN Center for Biosystems Dynamics Research, 1-7-22, Suehiro-cho, Tsurumi-ku, Yokohama, Kanagawa 230-0045, Japan
- Hiroshi Komura
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; University Research Administration Centre, Osaka Metropolitan University, 1-2-7, Asahi, Abeno-ku, Osaka 545-0051, Japan
- Kenji Mizuguchi
- Artificial Intelligence Centre for Health and Biomedical Research, National Institutes of Biomedical Innovation, Health and Nutrition (NIBIOHN), 7-6-8, Saito-Asagi, Ibaraki, Osaka 567-0085, Japan; Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita-shi, Osaka 565-0871, Japan.
|
40
|
Blay V, Li X, Gerlach J, Urbina F, Ekins S. Combining DELs and machine learning for toxicology prediction. Drug Discov Today 2022; 27:103351. [PMID: 36096360 PMCID: PMC9995617 DOI: 10.1016/j.drudis.2022.103351] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 07/31/2022] [Accepted: 09/06/2022] [Indexed: 01/12/2023]
Abstract
DNA-encoded libraries (DELs) allow starting chemical matter to be identified in drug discovery. The volume of experimental data generated also makes DELs an attractive resource for machine learning (ML). ML allows modeling complex relationships between compounds and numerical endpoints, such as the binding to a target measured by DELs. DELs could also empower other areas of drug discovery. Here, we propose that DELs and ML could be combined to model binding to off-targets, enabling better predictive toxicology. With enough data, ML models can make accurate predictions across a vast chemical space, and they can be reused and expanded across projects. Although there are limitations, more general toxicology models could be applied earlier during drug discovery, illuminating safety liabilities at a lower cost.
Affiliation(s)
- Vincent Blay
- Department of Microbiology and Environmental Toxicology, University of California at Santa Cruz, Santa Cruz, CA 95064, USA.
- Xiaoyu Li
- Department of Chemistry and State Key Laboratory of Synthetic Chemistry, The University of Hong Kong, Hong Kong Special Administrative Region
- Jacob Gerlach
- Collaborations Pharmaceuticals, Inc, 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Fabio Urbina
- Collaborations Pharmaceuticals, Inc, 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA
- Sean Ekins
- Collaborations Pharmaceuticals, Inc, 840 Main Campus Drive, Lab 3510, Raleigh, NC 27606, USA.
|
41
|
Sauer S, Matter H, Hessler G, Grebner C. Optimizing interactions to protein binding sites by integrating docking-scoring strategies into generative AI methods. Front Chem 2022; 10:1012507. [PMID: 36339033 PMCID: PMC9629386 DOI: 10.3389/fchem.2022.1012507] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/05/2022] [Accepted: 09/20/2022] [Indexed: 11/14/2022] Open
Abstract
The identification and optimization of promising lead molecules is essential for drug discovery. Recently, artificial intelligence (AI) based generative methods provided complementary approaches for generating molecules under specific design constraints of relevance in drug design. The goal of our study is to incorporate protein 3D information directly into generative design by flexible docking plus an adapted protein-ligand scoring function, thereby moving towards automated structure-based design. First, the protein-ligand scoring function RFXscore integrating individual scoring terms, ligand descriptors, and combined terms was derived using the PDBbind database and internal data. Next, design results for different workflows are compared to solely ligand-based reward schemes. Our newly proposed, optimal workflow for structure-based generative design is shown to produce promising results, especially for those exploration scenarios, where diverse structures fitting to a protein binding site are requested. Best results are obtained using docking followed by RFXscore, while, depending on the exact application scenario, it was also found useful to combine this approach with other metrics that bias structure generation into “drug-like” chemical space, such as target-activity machine learning models, respectively.
|
42
|
Nie YY, Zhou LJ, Li YM, Yang WC, Liu YY, Yang ZY, Ma XX, Zhang YP, Hong PZ, Zhang Y. Hizikia fusiforme functional oil (HFFO) prevents neuroinflammation and memory deficits evoked by lipopolysaccharide/aluminum trichloride in zebrafish. Front Aging Neurosci 2022; 14:941994. [PMID: 36158548 PMCID: PMC9500236 DOI: 10.3389/fnagi.2022.941994] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Accepted: 08/05/2022] [Indexed: 11/13/2022] Open
Abstract
Background: Oxidative stress, cholinergic deficiency, and neuroinflammation are hallmarks of most neurodegenerative disorders (NDs). Lipids play an important role in brain development and proper functioning. Marine-derived lipids, especially those from fish and microalgae, have shown good memory-improving potential. The cultivated macroalga Hizikia fusiforme is a healthy food and shows benefits to memory, but studies of the brain-health value of its oil are rare. Previously, we reported that the Hizikia fusiforme functional oil (HFFO) contains arachidonic acid, 11,14,17-eicosatrienoic acid, phytol, and other molecules displaying in vitro acetylcholinesterase inhibitory and nitroxide scavenging activity; however, the in vivo effect remains unclear. In this study, we further investigated its potential effects against lipopolysaccharide (LPS)- or aluminum trichloride (AlCl3)-induced memory deficiency in zebrafish and its drug-related properties in silico. Methods: We established memory deficit models in zebrafish by intraperitoneal (i.p.) injection of LPS (75 ng) or AlCl3 (21 μg), and assessed their behaviors in the T-maze test. The interleukin-1β (IL-1β), tumor necrosis factor-α (TNF-α), acetylcholine (ACh), and malondialdehyde (MDA) levels were measured 24 h after the LPS/AlCl3 injection as markers of inflammation, cholinergic activity, and oxidative stress. Furthermore, the interaction of two main components, 11,14,17-eicosatrienoic acid and phytol, with the important anti-inflammatory targets nuclear factor kappa B (NF-κB) and cyclooxygenase 2 (COX-2) was investigated by molecular docking. In addition, the absorption, distribution, metabolism, excretion, and toxicity (ADMET) and drug-likeness properties of HFFO were studied with ADMETlab. Results: HFFO reduced the cognitive deficits induced by LPS/AlCl3 in the zebrafish T-maze test. The LPS/AlCl3 treatment increased MDA content, lowered ACh levels in the zebrafish brain, and elevated levels of central and peripheral proinflammatory cytokines; these effects, except for MDA, were reversed by 100 mg/kg HFFO. Moreover, 11,14,17-eicosatrienoic acid and phytol showed good affinity for NF-κB and COX-2, and HFFO exhibited generally acceptable drug-likeness and ADMET profiles. Conclusion: Collectively, this study's findings suggest HFFO as a potent neuroprotectant, potentially valuable for the prevention of memory impairment caused by cholinergic deficiency and neuroinflammation.
Affiliation(s)
- Ying-Ying Nie
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
- Long-Jian Zhou
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
- Yan-Mei Li
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Wen-Cong Yang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Ya-Yue Liu
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
- Zhi-You Yang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Xiao-Xiang Ma
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Yong-Ping Zhang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Peng-Zhi Hong
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
- Yi Zhang
- Guangdong Provincial Key Laboratory of Aquatic Product Processing and Safety, Guangdong Provincial Engineering Laboratory for Marine Biological Products, Guangdong Provincial Engineering Technology Research Center of Seafood, Key Laboratory of Advanced Processing of Aquatic Product of Guangdong Higher Education Institution, Shenzhen Institute of Guangdong Ocean University, Zhanjiang Municipal Key Laboratory of Marine Drugs and Nutrition for Brain Health, Research Institute for Marine Drugs and Nutrition, College of Food Science and Technology, Guangdong Ocean University, Zhanjiang, China
- Collaborative Innovation Center of Seafood Deep Processing, Dalian Polytechnic University, Dalian, China
- *Correspondence: Yi Zhang ;
|
43
|
Veríssimo GC, Serafim MSM, Kronenberger T, Ferreira RS, Honorio KM, Maltarollo VG. Designing drugs when there is low data availability: one-shot learning and other approaches to face the issues of a long-term concern. Expert Opin Drug Discov 2022; 17:929-947. [PMID: 35983695 DOI: 10.1080/17460441.2022.2114451] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 11/04/2022]
Abstract
INTRODUCTION Modern drug discovery generally is accessed by useful information from previous large databases or uncovering novel data. The lack of biological and/or chemical data tends to slow the development of scientific research and innovation. Here, approaches that may help provide solutions to generate or obtain enough relevant data or improve/accelerate existing methods within the last five years were reviewed. AREAS COVERED One-shot learning (OSL) approaches, structural modeling, molecular docking, scoring function space (SFS), molecular dynamics (MD), and quantum mechanics (QM) may be used to amplify the amount of available data to drug design and discovery campaigns, presenting methods, their perspectives, and discussions to be employed in the near future. EXPERT OPINION Recent works have successfully used these techniques to solve a range of issues in the face of data scarcity, including complex problems such as the challenging scenario of drug design aimed at intrinsically disordered proteins and the evaluation of potential adverse effects in a clinical scenario. These examples show that it is possible to improve and kickstart research from scarce available data to design and discover new potential drugs.
Affiliation(s)
- Gabriel C Veríssimo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Mateus Sá M Serafim
- Departamento de Microbiologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Thales Kronenberger
- Department of Medical Oncology and Pneumology, Internal Medicine VIII, University Hospital of Tübingen, Tübingen, Germany; School of Pharmacy, Faculty of Health Sciences, University of Eastern Finland, Kuopio, Finland
- Rafaela S Ferreira
- Departamento de Bioquímica e Imunologia, Instituto de Ciências Biológicas, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
- Kathia M Honorio
- Escola de Artes, Ciências e Humanidades, Universidade de São Paulo (USP), São Paulo, Brazil; Centro de Ciências Naturais e Humanas, Universidade Federal do ABC (UFABC), Santo André, Brazil
- Vinícius G Maltarollo
- Departamento de Produtos Farmacêuticos, Faculdade de Farmácia, Universidade Federal de Minas Gerais (UFMG), Belo Horizonte, Brazil
|
44
|
Gorgulla C, Jayaraj A, Fackeldey K, Arthanari H. Emerging frontiers in virtual drug discovery: From quantum mechanical methods to deep learning approaches. Curr Opin Chem Biol 2022; 69:102156. [PMID: 35576813 PMCID: PMC9990419 DOI: 10.1016/j.cbpa.2022.102156] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/05/2021] [Revised: 03/16/2022] [Accepted: 04/07/2022] [Indexed: 11/19/2022]
Abstract
Virtual screening-based approaches to discover initial hit and lead compounds have the potential to reduce both the cost and time of early drug discovery stages, as well as to find inhibitors even for challenging target sites such as protein-protein interfaces. In this review, we provide an overview of the progress that has been made in virtual screening methodology and technology on multiple fronts in recent years. The advent of ultra-large virtual screens, in which hundreds of millions to billions of compounds are screened, has proven to be a powerful approach to discover highly potent hit compounds. However, these developments are just the tip of the iceberg, with new technologies and methods emerging to propel the field forward. Examples include novel machine-learning approaches, which can reduce the computational costs of virtual screening dramatically, and progress in quantum-mechanical approaches, which can increase the accuracy of predictions of various small molecule properties.
Affiliation(s)
- Christoph Gorgulla
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Physics, Faculty of Arts and Sciences, Harvard University, Cambridge, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
- Konstantin Fackeldey
- Institute of Mathematics, Technical University Berlin, Berlin, Germany; Zuse Institute Berlin, Berlin, Germany
- Haribabu Arthanari
- Department of Biological Chemistry and Molecular Pharmacology, Blavatnik Institute, Harvard Medical School (HMS), Boston, MA, USA; Department of Cancer Biology, Dana-Farber Cancer Institute (DFCI), Boston, MA, USA
|
45
|
Sheridan RP. Stability of Prediction in Production ADMET Models as a Function of Version: Why and When Predictions Change. J Chem Inf Model 2022; 62:3477-3485. [PMID: 35849796 DOI: 10.1021/acs.jcim.2c00803] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Abstract
As with other pharma companies, we maintain production QSAR models of ADMET end points and update them regularly. Here, for six ADMET end points, we examine the predictions of test set molecules on multiple versions of random forest models spanning a period of 10 years. For any given end point, the predictions for the majority of molecules are similar for all model versions. However, for a small minority of molecules, the prediction shifts substantially over the span of a few versions. For most molecules that shift, the prediction becomes more accurate at later times. This Perspective investigates metrics that can help indicate which molecules will shift substantially in prediction and when the shift will occur.
Affiliation(s)
- Robert P Sheridan
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
|
46
|
Sheridan RP, Culberson JC, Joshi E, Tudor M, Karnachi P. Prediction Accuracy of Production ADMET Models as a Function of Version: Activity Cliffs Rule. J Chem Inf Model 2022; 62:3275-3280. [PMID: 35796226 DOI: 10.1021/acs.jcim.2c00699] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 12/23/2022]
Abstract
As with many other institutions, our company maintains many quantitative structure-activity relationship (QSAR) models of absorption, distribution, metabolism, excretion, and toxicity (ADMET) end points and updates the models regularly. We recently examined version-to-version predictivity for these models over a period of 10 years. In this approach we monitor the goodness of prediction of new molecules relative to the training set of model version V before they are incorporated in the updated model V+1. Using a cell-based permeability assay (Papp) as an example, we illustrate how the QSAR models made from this data are generally predictive and can be utilized to enrich chemical designs and synthesis. Despite the obvious utility of these models, we turned up unexpected behavior in Papp and other ADMET activities for which the explanation is not obvious. One such behavior is that the apparent predictivity of the models as measured by root-mean-square-error can vary greatly from version to version and is sometimes very poor. One intuitively appealing explanation is that the observed activities of the new molecules fall outside the bulk of activities in the training set. Alternatively, one may think that the new molecules are exploring different regions of chemical space than the training set. However, the real explanation has to do with activity cliffs. If the observed activities of the new molecules are different than expected based on similar molecules in the training set, the predictions will be less accurate. This is true for all our ADMET end points.
Affiliation(s)
- Robert P Sheridan
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- J Chris Culberson
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Elizabeth Joshi
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Matthew Tudor
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
- Prabha Karnachi
- Computational and Structural Chemistry, Merck & Co., Inc., Kenilworth, New Jersey 07033, United States
|
47
|
Smith AME, Lanevskij K, Sazonovas A, Harris J. Impact of Established and Emerging Software Tools on the Metabolite Identification Landscape. Front Toxicol 2022; 4:932445. [PMID: 35800176 PMCID: PMC9253584 DOI: 10.3389/ftox.2022.932445] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/29/2022] [Accepted: 05/30/2022] [Indexed: 11/25/2022]
Abstract
Scientists’ ability to detect drug-related metabolites at trace concentrations has improved over recent decades. High-resolution instruments enable collection of large amounts of raw experimental data. In fact, the quantity of data produced has become a challenge because of the effort required to convert raw data into useful insights. Various cheminformatics tools have been developed to address these metabolite identification challenges. This article describes the current state of these tools. They can be split into two categories: pre-experimental metabolite generation and post-experimental data analysis. The former can be subdivided into rule-based, machine learning-based, and docking-based approaches. Post-experimental tools help scientists automatically perform chromatographic deconvolution of LC/MS data and identify metabolites. They can use pre-experimental predictions to improve metabolite identification, but they are not limited to these predictions: unexpected metabolites can also be discovered through fractional mass filtering. In addition to a review of available software tools, we present a description of pre-experimental and post-experimental metabolite structure generation using MetaSense. These software tools improve upon manual techniques, increasing scientist productivity and enabling efficient handling of large datasets. However, the trend of increasingly large datasets and highly data-driven workflows requires a more sophisticated informatics transition in metabolite identification labs. Experimental work has traditionally been separated from the information technology tools that handle our data. We argue that these IT tools can help scientists draw connections via data visualizations and preserve and share results via searchable centralized databases. In addition, data marshalling and homogenization techniques enable future data mining and machine learning.
|
48
|
López-López E, Fernández-de Gortari E, Medina-Franco JL. Yes SIR! On the structure-inactivity relationships in drug discovery. Drug Discov Today 2022; 27:2353-2362. [PMID: 35561964 DOI: 10.1016/j.drudis.2022.05.005] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 02/02/2022] [Revised: 04/09/2022] [Accepted: 05/05/2022] [Indexed: 12/12/2022]
Abstract
In analogy with structure-activity relationships (SARs), which are at the core of medicinal chemistry, studying structure-inactivity relationships (SIRs) is essential to understanding and predicting biological activity. Current computational methods should predict or distinguish 'activity' and 'inactivity' with the same confidence because both concepts are complementary. However, the lack of inactivity data, particularly in the public domain, limits the development of predictive models and their broad application. In this review, we encourage the scientific community to disclose and analyze high-confidence activity data, considering both the compounds labeled 'active' and those labeled 'inactive'.
Affiliation(s)
- Edgar López-López
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico; Department of Chemistry and Graduate Program in Pharmacology, Center for Research and Advanced Studies of the National Polytechnic Institute, Mexico City 07000, Mexico
- Eli Fernández-de Gortari
- Department of Nanosafety, International Iberian Nanotechnology Laboratory, Braga 4715-330, Portugal
- José L Medina-Franco
- DIFACQUIM Research Group, Department of Pharmacy, School of Chemistry, Universidad Nacional Autónoma de México, Mexico City 04510, Mexico
|
49
|
A Brief Review of Machine Learning-Based Bioactive Compound Research. Appl Sci (Basel) 2022. [DOI: 10.3390/app12062906] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
Bioactive compounds are often used as initial substances for many therapeutic agents. In recent years, both theoretical and practical innovations in hardware-assisted and fast-evolving machine learning (ML) have made it possible to identify desired bioactive compounds in chemical spaces, such as those in natural products (NPs). This review introduces how machine learning approaches can be used for the identification and evaluation of bioactive compounds. It also provides an overview of recent research trends in machine learning-based prediction and the evaluation of bioactive compounds by listing real-world examples along with various input data. In addition, several ML-based approaches to identify specific bioactive compounds for cardiovascular and metabolic diseases are described. Overall, these approaches are important for the discovery of novel bioactive compounds and provide new insights into the machine learning basis for various traditional applications of bioactive compound-related research.
|
50
|
Baltrukevich H, Podlewska S. From Data to Knowledge: Systematic Review of Tools for Automatic Analysis of Molecular Dynamics Output. Front Pharmacol 2022; 13:844293. [PMID: 35359865 PMCID: PMC8960308 DOI: 10.3389/fphar.2022.844293] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/27/2021] [Accepted: 01/26/2022] [Indexed: 12/02/2022]
Abstract
The growing number of available crystal structures, together with the increase in computational power available for computer-aided drug design tasks, has led to intensive use of structure-based drug design tools in drug development pipelines. Docking and molecular dynamics simulations, key representatives of the structure-based approaches, provide detailed information about the potential interaction of a ligand with a target receptor. However, they require a three-dimensional structure of the protein and a relatively large amount of computational resources. As both docking and molecular dynamics are now used much more extensively, the amount of data output from these procedures is also growing, and so is the number of approaches that facilitate the analysis and interpretation of results from structure-based tools. In this review, we comprehensively summarize approaches for handling molecular dynamics simulation output, covering both statistical and machine-learning-based tools as well as various forms of depicting molecular dynamics output.
Affiliation(s)
- Hanna Baltrukevich
- Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
- Faculty of Pharmacy, Chair of Technology and Biotechnology of Medical Remedies, Jagiellonian University Medical College in Krakow, Kraków, Poland
- Sabina Podlewska
- Maj Institute of Pharmacology, Polish Academy of Sciences, Kraków, Poland
|