1
Batool M, Azam NA, Zhu J, Haraguchi K, Zhao L, Akutsu T. A unified approach to inferring chemical compounds with the desired aqueous solubility. J Cheminform 2025; 17:37. [PMID: 40140978 PMCID: PMC11938699 DOI: 10.1186/s13321-025-00966-w] [Received: 10/22/2024] [Accepted: 02/02/2025] [Indexed: 03/28/2025] Open
Abstract
Aqueous solubility (AS) is a key physicochemical property that plays a crucial role in drug discovery and material design. We report a novel unified approach to predict and infer chemical compounds with the desired AS based on simple deterministic graph-theoretic descriptors, multiple linear regression (MLR), and mixed integer linear programming (MILP). Descriptors selected by a forward stepwise procedure enabled the simplest regression model, MLR, to achieve notably good prediction accuracy compared to existing approaches, with accuracy in the range [0.7191, 0.9377] across 29 diverse datasets. By simulating these descriptors and learning models as MILPs, we inferred mathematically exact and optimal compounds with the desired AS, prescribed structures, and up to 50 non-hydrogen atoms in a reasonable running time of 6 to 1166 seconds. These findings indicate a strong correlation between the simple graph-theoretic descriptors and the AS of compounds, potentially leading to a deeper understanding of their AS without relying on widely used complicated chemical descriptors and complex machine learning models that are computationally expensive and therefore difficult to use for inference. An implementation of the proposed approach is available at https://github.com/ku-dml/mol-infer/tree/master/AqSol.
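The combination of forward stepwise descriptor selection and MLR mentioned in the abstract can be illustrated with a minimal sketch. It assumes training R² as the selection criterion and uses synthetic data; the function names, the stopping rule, and the data are illustrative assumptions, not the authors' implementation or descriptors.

```python
import numpy as np

def fit_mlr(X, y):
    """Least-squares fit of y on X with an intercept; returns the coefficients."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def r2_score(X, y, coef):
    """Coefficient of determination of the fitted linear model."""
    A = np.column_stack([np.ones(len(X)), X])
    residual = y - A @ coef
    return 1.0 - residual @ residual / np.sum((y - y.mean()) ** 2)

def forward_stepwise(X, y, max_features):
    """Greedily add the descriptor that most improves training R^2 (assumed criterion)."""
    selected, best_score = [], -np.inf
    while len(selected) < max_features:
        candidates = [j for j in range(X.shape[1]) if j not in selected]
        scores = []
        for j in candidates:
            cols = selected + [j]
            scores.append(r2_score(X[:, cols], y, fit_mlr(X[:, cols], y)))
        best = int(np.argmax(scores))
        if scores[best] <= best_score + 1e-12:
            break  # no remaining descriptor improves the fit
        best_score = scores[best]
        selected.append(candidates[best])
    return selected, best_score

# Synthetic data (hypothetical): only columns 1 and 4 carry signal.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + 0.01 * rng.normal(size=200)
selected, score = forward_stepwise(X, y, max_features=2)
```

In practice the criterion would more likely be evaluated on held-out or cross-validated data; training R² is used here only to keep the sketch short.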
Affiliation(s)
- Muniba Batool
  - Discrete Mathematics and Computational Intelligence Laboratory, Department of Mathematics, Quaid-i-Azam University, Islamabad, Pakistan
- Naveed Ahmed Azam
  - Discrete Mathematics and Computational Intelligence Laboratory, Department of Mathematics, Quaid-i-Azam University, Islamabad, Pakistan
- Jianshen Zhu
  - Discrete Mathematics Laboratory, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, 606-8501, Kyoto, Japan
- Kazuya Haraguchi
  - Discrete Mathematics Laboratory, Department of Applied Mathematics and Physics, Graduate School of Informatics, Kyoto University, 606-8501, Kyoto, Japan
- Liang Zhao
  - Graduate School of Advanced Integrated Studies in Human Survivability (Shishu-Kan), Kyoto University, 606-8306, Kyoto, Japan
- Tatsuya Akutsu
  - Bioinformatics Center, Institute for Chemical Research, Kyoto University, 611-0011, Uji, Japan
2
Islam MT, Aktaruzzaman M, Saif A, Hasan AR, Sourov MMH, Sikdar B, Rehman S, Tabassum A, Abeed-Ul-Haque S, Sakib MH, Muhib MMA, Setu MAA, Tasnim F, Rayhan R, Abdel-Daim MM, Raihan MO. Identification of acetylcholinesterase inhibitors from traditional medicinal plants for Alzheimer's disease using in silico and machine learning approaches. RSC Adv 2024; 14:34620-34636. [PMID: 39483377 PMCID: PMC11526779 DOI: 10.1039/d4ra05073h] [Received: 07/13/2024] [Accepted: 10/18/2024] [Indexed: 11/03/2024] Open
Abstract
Acetylcholinesterase (AChE) is significant in Alzheimer's disease (AD), where cognitive impairment correlates with insufficient acetylcholine levels. AChE breaks down acetylcholine, terminating the signal and moderating cholinergic neuron activity to prevent overstimulation. Hence, inhibiting AChE emerges as a potential treatment avenue for AD. A library of 2500 compounds, derived from 25 traditionally used medicinal plants, was constructed using the IMPPAT database of traditional medicinal plants. The canonical SMILES of these compounds were collected and virtually screened based on physicochemical properties; IC50 values were then determined for the screened compounds, followed by analysis using machine learning (ML). Subsequently, a molecular docking study elucidated both binding affinity and interactions between these compounds and AChE. The top three compounds, which exhibited robust binding affinities, underwent MM-GBSA analysis to validate the docking results, followed by pharmacokinetics and toxicity evaluations to gauge safety and efficacy. These three compounds then underwent MD simulation studies to assess the conformational stability of the protein-ligand complexes. Additionally, Density Functional Theory (DFT) was employed to ascertain HOMO, LUMO, energy gap, and molecular electrostatic potential. Among the 2500 compounds, physicochemical property-based virtual screening identified 80 with good properties, of which 32 showed promising IC50 values. Molecular docking studies of these 32 compounds revealed various binding energies with AChE, and the best three compounds (CID 102267534, CID 15161648, CID 12441) were selected for further analysis. MM-GBSA studies confirmed the promising binding energies of these three compounds, validating the molecular docking study. Further, the MD simulation studies confirmed the structural and conformational stability of these three protein-ligand complexes. Finally, DFT calculations revealed favorable chemical features of these compounds. Thus, we conclude that these three compounds (CID 102267534, CID 15161648, CID 12441) may inhibit the activity of AChE and could be useful as a treatment for Alzheimer's disease.
Affiliation(s)
- Md Tarikul Islam
  - Department of Genetic Engineering and Biotechnology, Faculty of Biological Science and Technology, Jashore University of Science and Technology Jashore 7408 Bangladesh
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
- Md Aktaruzzaman
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Pharmacy, Faculty of Biological Science and Technology, Jashore University of Science and Technology Jashore 7408 Bangladesh
- Ahmed Saif
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Pharmacy, Faculty of Science, University of Rajshahi Rajshahi 6205 Bangladesh
- Al Riyad Hasan
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Pharmacy, Faculty of Biological Science and Technology, Jashore University of Science and Technology Jashore 7408 Bangladesh
- Md Mehedi Hasan Sourov
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Microbiology, Faculty of Biological Science, University of Rajshahi Rajshahi 6205 Bangladesh
- Bratati Sikdar
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Biological Sciences, Bose Institute Unified Academic Campus, EN-80, Salt Lake, Sector V, Bidhannagar Kolkata 700091 West Bengal India
- Saira Rehman
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Faculty of Pharmaceutical Sciences, Lahore University of Biological and Applied Sciences Lahore Punjab Pakistan
- Afrida Tabassum
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Genetic Engineering and Biotechnology, Faculty of Life and Earth Sciences, Jagannath University Dhaka 1100 Bangladesh
- Syed Abeed-Ul-Haque
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Microbiology, Faculty of Biological Science, University of Rajshahi Rajshahi 6205 Bangladesh
- Mehedi Hasan Sakib
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Microbiology, Faculty of Biological Science, University of Rajshahi Rajshahi 6205 Bangladesh
- Md Muntasir Alam Muhib
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Microbiology, Faculty of Biological Science, University of Rajshahi Rajshahi 6205 Bangladesh
- Md Ali Ahasan Setu
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Microbiology, Faculty of Biological Science and Technology, Jashore University of Science and Technology Jashore 7408 Bangladesh
- Faria Tasnim
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Genetic Engineering and Biotechnology, University of Rajshahi Rajshahi 6205 Bangladesh
- Rifat Rayhan
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Biomedical Engineering, Jashore University of Science and Technology Jashore 7408 Bangladesh
- Mohamed M Abdel-Daim
  - Department of Pharmaceutical Sciences, Pharmacy Program, Batterjee Medical College P. O. Box 6231 Jeddah 21442 Saudi Arabia
  - Pharmacology Department, Faculty of Veterinary Medicine, Suez Canal University Ismailia 41522 Egypt
- Md Obayed Raihan
  - Laboratory of Advanced Computational Neuroscience, Biological Research on the Brain (BRB) Jashore 7408 Bangladesh
  - Department of Pharmaceutical Sciences, College of Health Sciences and Pharmacy, Chicago State University Chicago IL USA
3
Bao Z, Tom G, Cheng A, Watchorn J, Aspuru-Guzik A, Allen C. Towards the prediction of drug solubility in binary solvent mixtures at various temperatures using machine learning. J Cheminform 2024; 16:117. [PMID: 39468626 PMCID: PMC11520512 DOI: 10.1186/s13321-024-00911-3] [Received: 03/26/2024] [Accepted: 09/28/2024] [Indexed: 10/30/2024] Open
Abstract
Drug solubility is an important parameter in the drug development process, yet it is often tedious and challenging to measure, especially for expensive drugs or those available in small quantities. To alleviate these challenges, machine learning (ML) has been applied to predict drug solubility as an alternative approach. However, the majority of existing ML research has focused on predicting aqueous solubility and/or solubility at specific temperatures, which restricts model applicability in pharmaceutical development. To bridge this gap, we compiled a dataset of 27,000 solubility datapoints, including the solubility of small molecules measured in a range of binary solvent mixtures at various temperatures. Next, a panel of ML models was trained on this dataset, with their hyperparameters tuned using Bayesian optimization. The resulting top-performing models, both gradient boosted decision trees (light gradient boosting machine and extreme gradient boosting), achieved mean absolute errors (MAE) of 0.33 for LogS (S in g/100 g) on the holdout set. These models were further validated through a prospective study, in which the solubility of four drug molecules was predicted by the models and then validated with in-house solubility experiments. This prospective study demonstrated that the models accurately predicted the solubility of solutes in specific binary solvent mixtures at different temperatures, especially for drugs whose features closely align with those of the solutes in the dataset (MAE < 0.5 for LogS). To support future research and facilitate advancements in the field, we have made the dataset and code openly available. Scientific contribution: Our research advances the state of the art in predicting solubility for small molecules by leveraging ML and a uniquely comprehensive dataset. Unlike existing ML studies that predominantly focus on solubility in aqueous solvents at fixed temperatures, our work enables prediction of drug solubility in a variety of binary solvent mixtures over a broad temperature range, providing practical insights into the modeling of solubility for realistic pharmaceutical applications. These advancements, along with the open access dataset and code, support significant steps in the drug development process, including new molecule discovery, drug analysis, and formulation.
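To make the reported metric concrete, the sketch below computes a mean absolute error on LogS values. It assumes LogS denotes the base-10 logarithm of solubility in g/100 g; the solubility values are hypothetical and are not taken from the paper's dataset.

```python
import numpy as np

def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between measured and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical solubilities in g/100 g (illustrative only, not from the paper).
measured = np.array([0.10, 1.0, 10.0, 50.0])
predicted = np.array([0.20, 1.0, 5.0, 50.0])

# The error is computed on log10(S), so it measures multiplicative deviation.
log_s_mae = mean_absolute_error(np.log10(measured), np.log10(predicted))
```

Under the base-10 assumption, an MAE of 0.33 in LogS corresponds to predictions that are off by a factor of roughly 10^0.33, or about 2.1, on average.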
Affiliation(s)
- Zeqing Bao
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
- Gary Tom
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
- Austin Cheng
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON, M5S 1M1, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
  - Department of Materials Science and Engineering, University of Toronto, Toronto, ON, M5S 3E4, Canada
  - CIFAR Artificial Intelligence Research Chair, Vector Institute, Toronto, ON, M5S 1M1, Canada
- Christine Allen
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
4
Jing Y, Luo L, Zeng Z, Zhao X, Huang R, Song C, Chen G, Wei S, Yang H, Tang Y, Jin S. Targeted Screening of Curcumin Derivatives as Pancreatic Lipase Inhibitors Using Computer-Aided Drug Design. ACS Omega 2024; 9:27669-27679. [PMID: 38947805 PMCID: PMC11209693 DOI: 10.1021/acsomega.4c03596] [Received: 04/14/2024] [Revised: 05/30/2024] [Accepted: 06/03/2024] [Indexed: 07/02/2024]
Abstract
Curcumin has demonstrated promising preclinical antiobesity effects, but its low bioavailability makes it difficult to exert its full effect at a suitable dose. The objective of this study was to screen curcumin derivatives with enhanced bioavailability and lipid-lowering activity under the guidance of computer-aided drug design (CADD). CADD was used to perform virtual assays on curcumin derivatives to assess their pharmacokinetic properties and effects on pancreatic lipase activity. Subsequently, 19 curcumin derivatives containing 5 skeletons were synthesized to confirm the virtual assay results. An in vitro pancreatic lipase inhibition assay was employed to determine the half-maximal inhibitory concentration (IC50) of these 19 curcumin derivatives. Based on CADD analysis and in vitro pancreatic lipase inhibition, 2 curcumin derivatives outperformed curcumin in both aspects. Microscale thermophoresis (MST) experiments were employed to assess the binding equilibrium constants (Kd) of these 2 curcumin derivatives, curcumin, and the positive control drug with pancreatic lipase. Through virtual screening utilizing a chemoinformatics database and molecular docking, 6 derivatives of curcumin demonstrated superior solubility, absorption, and pancreatic lipase inhibitory activity compared to curcumin. The IC50 value for 1,7-bis(4-hydroxyphenyl)heptane-3,5-dione (C4), which displayed the most effective inhibitory effect, was 42.83 μM, while the IC50 value for 1,7-bis(4-hydroxy-3-methoxyphenyl)heptane-3,5-dione (C6) was 98.62 μM. By comparison, the IC50 value for curcumin was 142.24 μM. The MST experiment results indicated that the Kd values of C4, C6, and curcumin were 2.91, 18.20, and 23.53 μM, respectively. The results of the activity assays showed a relatively high degree of concordance with the outcomes of CADD screening. Under the guidance of CADD, the targeted screening of curcumin derivatives with excellent properties in this study proved to be efficient and low-cost.
Affiliation(s)
- Yuxuan Jing
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Laichun Luo
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Zhaoxiang Zeng
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Xueyan Zhao
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Rongzeng Huang
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Chengwu Song
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
  - Center of Traditional Chinese Medicine Modernization for Liver Diseases, 430065 Wuhan, Hubei, China
  - Hubei Shizhen Laboratory, 430065 Wuhan, Hubei, China
- Guiying Chen
  - Wuhan Hongren Biopharmaceutical Inc, 430065 Wuhan, Hubei, China
- Sha Wei
  - School of Basic Medical Sciences, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Haijun Yang
  - School of Basic Medical Sciences, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Yinping Tang
  - School of Pharmacy, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
- Shuna Jin
  - Hubei Shizhen Laboratory, 430065 Wuhan, Hubei, China
  - School of Basic Medical Sciences, Hubei University of Chinese Medicine, 430065 Wuhan, Hubei, China
5
Wei W, Mengshan L, Yan W, Lixin G. Cluster energy prediction based on multiple strategy fusion whale optimization algorithm and light gradient boosting machine. BMC Chem 2024; 18:24. [PMID: 38291518 PMCID: PMC11367823 DOI: 10.1186/s13065-024-01127-0] [Received: 11/20/2023] [Accepted: 01/15/2024] [Indexed: 02/01/2024] Open
Abstract
BACKGROUND: Clusters, a novel hierarchical material structure that emerges from atoms or molecules, possess unique reactivity and catalytic properties that are crucial in catalysis, biomedicine, and optoelectronics. Predicting cluster energy provides insights into electronic structure, magnetism, and stability. However, the structures of clusters and their potential energy surfaces are exceptionally intricate. Searching for the globally optimal (lowest-energy) structure among the many isomers poses a significant challenge. Currently, modelling cluster energy with traditional machine learning methods has several issues, including reliance on manual expertise, slow computation, heavy computational resource demands, and inefficient parameter tuning.
RESULTS: This paper introduces a predictive model for the energy of a gold cluster comprising twenty atoms (the Au20 cluster). The model integrates the Multiple Strategy Fusion Whale Optimization Algorithm (MSFWOA) with the Light Gradient Boosting Machine (LightGBM), resulting in the MSFWOA-LightGBM model. This model employs the Coulomb matrix representation and eigenvalue solution methods for feature extraction. Additionally, it incorporates Tent chaotic mapping, a cosine convergence factor, and an inertia weight updating strategy to optimize the Whale Optimization Algorithm (WOA), leading to the development of MSFWOA. Subsequently, MSFWOA is employed to optimize the parameters of LightGBM to support the energy prediction of the Au20 cluster.
CONCLUSIONS: The experimental results show that the most stable Au20 cluster structure is a regular tetrahedron with the lowest energy, displaying a tight and uniform atom distribution and high geometric symmetry. Compared to other models, the MSFWOA-LightGBM model excels in accuracy and correlation, with MSE, RMSE, and R² values of 0.897, 0.947, and 0.879, respectively. Additionally, the MSFWOA-LightGBM model possesses outstanding scalability, offering valuable insights for material design, energy storage, sensing technology, and biomedical imaging, with the potential to drive research and development in these areas.
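The Coulomb matrix and eigenvalue feature extraction named in the abstract can be sketched as follows, using the commonly cited definition (diagonal 0.5·Z^2.4, off-diagonal Z_i·Z_j / distance). The function names, the descending sort, the coordinate units, and the four-atom test geometry are assumptions for illustration; the abstract does not specify the authors' exact preprocessing.

```python
import numpy as np

def coulomb_matrix(charges, coords):
    """Coulomb matrix: 0.5 * Z_i**2.4 on the diagonal, Z_i * Z_j / |R_i - R_j| elsewhere."""
    charges = np.asarray(charges, dtype=float)
    coords = np.asarray(coords, dtype=float)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, 1.0)  # placeholder so the diagonal does not divide by zero
    matrix = np.outer(charges, charges) / dist
    np.fill_diagonal(matrix, 0.5 * charges ** 2.4)
    return matrix

def eigenvalue_features(charges, coords):
    """Eigenvalues in descending order; unchanged by rotation or atom reordering."""
    eigvals = np.linalg.eigvalsh(coulomb_matrix(charges, coords))
    return eigvals[::-1]

# Hypothetical geometry: four gold atoms (Z = 79) on a regular tetrahedron.
charges = np.full(4, 79.0)
coords = np.array([[1.0, 1.0, 1.0],
                   [1.0, -1.0, -1.0],
                   [-1.0, 1.0, -1.0],
                   [-1.0, -1.0, 1.0]])
features = eigenvalue_features(charges, coords)
```

For an Au20 cluster the same functions would yield a 20-element feature vector per structure, which could then be passed to a regressor such as LightGBM.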
Affiliation(s)
- Wu Wei
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou, 341000, Jiangxi, China
- Li Mengshan
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou, 341000, Jiangxi, China
- Wu Yan
  - School of Mathematics and Computer Science, Gannan Normal University, Ganzhou, 341000, Jiangxi, China
- Guan Lixin
  - School of Physics and Electronic Information, Gannan Normal University, Ganzhou, 341000, Jiangxi, China