1
Song Y, Ding Y, Su J, Li J, Ji Y. Unlocking the Potential of Machine Learning in Co-crystal Prediction by a Novel Approach Integrating Molecular Thermodynamics. Angew Chem Int Ed Engl 2025; 64:e202502410. [PMID: 40072272 DOI: 10.1002/anie.202502410] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/28/2025] [Revised: 03/11/2025] [Accepted: 03/12/2025] [Indexed: 03/25/2025]
Abstract
Co-crystal engineering is of interest for many applications in pharmaceutical, chemical, and materials fields, but rational design of co-crystals is still challenging. Although artificial intelligence has revolutionized decision-making processes in material design, limitations in generalization and mechanistic understanding remain. Herein, we sought to improve prediction of co-crystals by combining mechanistic thermodynamic modeling with machine learning. We constructed a new co-crystal database, integrating drug, coformer, and reaction solvent information. By incorporating various thermodynamic models, the predictive performance was significantly enhanced. Benefiting from the complementarity of thermodynamic mechanisms and structural descriptors, the model coupling three thermodynamic models achieved optimal predictive performance in coformer and solvent screening. The model was rigorously validated against benchmark models using challenging independent test sets, showing superior performance in both coformer and solvent prediction, with accuracy over 90%. Further, we employed SHAP analysis for model interpretation, which suggested that thermodynamic mechanisms are prominent in the model's decision-making. Proof-of-concept studies on ketoconazole validated the model's efficacy in identifying coformers/solvents, demonstrating its potential in practical application. Overall, our work enhanced the understanding of co-crystallization and highlighted a strategy that integrates mechanistic insights with data-driven models to accelerate the rational design and synthesis of co-crystals, as well as various other functional materials.
Affiliation(s)
- Yutong Song
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, School of Chemistry and Chemical Engineering, Southeast University, Nanjing, 211198, P.R. China
- Yewei Ding
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, School of Chemistry and Chemical Engineering, Southeast University, Nanjing, 211198, P.R. China
- Junyi Su
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, School of Chemistry and Chemical Engineering, Southeast University, Nanjing, 211198, P.R. China
- Jian Li
  - Jinling Pharmaceutical Co., Ltd., Nanjing, 210009, P.R. China
- Yuanhui Ji
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, School of Chemistry and Chemical Engineering, Southeast University, Nanjing, 211198, P.R. China
2
Fan Z, Chen L, Wu X, Huang Z, Deng L. Enhancing Predictions of Drug Solubility Through Multidimensional Structural Characterization Exploitation. IEEE J Biomed Health Inform 2025; 29:1828-1837. [PMID: 38713567 DOI: 10.1109/jbhi.2024.3397493] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/09/2024]
Abstract
Solubility is not only a significant physical property of molecules but also a vital factor in small-molecule drug development. Determining drug solubility demands stringent equipment, controlled environments, and substantial human and material resources. The accurate prediction of drug solubility using computational methods has long been a goal for researchers. In this study, we introduce MSCSol, a solubility prediction model that integrates multidimensional molecular structure information. We incorporate a graph neural network with geometric vector perceptrons (GVP-GNN) to encode 3D molecular structures, representing spatial arrangement and orientation of atoms, as well as atomic sequences and interactions. We also employ Selective Kernel Convolution combined with Global and Local attention mechanisms to capture molecular features context at different scales. Additionally, various descriptors are calculated to enrich the molecular representation. For the 2D and 3D structural data of molecules, we design different data augmentation strategies to enhance generalization ability and prevent the model from learning irrelevant information. Extensive experiments on benchmark and independent datasets demonstrate MSCSol's superior performance. Ablation studies further confirm the effectiveness of different modules. Interpretability analysis highlights the importance of various atomic groups and substructures for solubility and verifies that our model effectively captures functional molecular structures and higher-order knowledge.
3
Alanazi M, Alanazi J, Alharby TN, Huwaimel B. Correlation of rivaroxaban solubility in mixed solvents for optimization of solubility using machine learning analysis and validation. Sci Rep 2025; 15:4725. [PMID: 39922955 PMCID: PMC11807219 DOI: 10.1038/s41598-025-89093-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2024] [Accepted: 02/03/2025] [Indexed: 02/10/2025] Open
Abstract
In this study, the solubility of rivaroxaban, a poorly water-soluble drug, was investigated in mixed solvent systems to address challenges in pharmaceutical formulation and bioavailability enhancement. Solubility optimization is essential for the effective delivery and therapeutic performance of rivaroxaban, as its low aqueous solubility limits oral bioavailability and necessitates innovative approaches for drug formulation. The study explored the role of primary alcohols combined with dichloromethane in improving solubility, emphasizing their industrial relevance in crystallization, purification, and drug manufacturing processes. To complement experimental insights, machine learning models were employed to predict rivaroxaban solubility based on temperature, solvent type, and mass fraction of dichloromethane. Three models, AdaBoost Gaussian process regression (ADAGPR), AdaBoost multilayer perceptron (ADAMLP), and AdaBoost LASSO regression (ADALASSO), were evaluated using [Formula: see text], RMSE, and MAPE metrics. Among these, ADAGPR demonstrated superior performance with an R² score of [Formula: see text], outperforming ADAMLP [Formula: see text] and [Formula: see text]. It also achieved the lowest total RMSE [Formula: see text] and MAPE [Formula: see text], confirming its predictive precision and reliability. Optimal solubility conditions were identified at [Formula: see text] with a mass fraction of 0.8190 in a dichloromethane-methanol mixture, yielding a predicted solubility of [Formula: see text]. These findings highlight the potential of combining chemical engineering principles with advanced predictive modeling to optimize solubility in complex solvent systems, offering significant value to pharmaceutical development and process optimization.
Affiliation(s)
- Muteb Alanazi
  - Department of Clinical Pharmacy, College of Pharmacy, University of Ha'il, Ha'il, 81442, Saudi Arabia
- Jowaher Alanazi
  - Department of Pharmacology and Toxicology, College of Pharmacy, University of Ha'il, Ha'il, 81442, Saudi Arabia
- Tareq Nafea Alharby
  - Department of Clinical Pharmacy, College of Pharmacy, University of Ha'il, Ha'il, 81442, Saudi Arabia
- Bader Huwaimel
  - Department of Pharmaceutical Chemistry, College of Pharmacy, University of Ha'il, Hail, 81442, Saudi Arabia
  - Medical and Diagnostic Research Center, University of Ha'il, Hail, 55473, Saudi Arabia
4
Bao Z, Tom G, Cheng A, Watchorn J, Aspuru-Guzik A, Allen C. Towards the prediction of drug solubility in binary solvent mixtures at various temperatures using machine learning. J Cheminform 2024; 16:117. [PMID: 39468626 PMCID: PMC11520512 DOI: 10.1186/s13321-024-00911-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2024] [Accepted: 09/28/2024] [Indexed: 10/30/2024] Open
Abstract
Drug solubility is an important parameter in the drug development process, yet it is often tedious and challenging to measure, especially for expensive drugs or those available in small quantities. To alleviate these challenges, machine learning (ML) has been applied to predict drug solubility as an alternative approach. However, the majority of existing ML research has focused on the prediction of aqueous solubility and/or solubility at specific temperatures, which restricts model applicability in pharmaceutical development. To bridge this gap, we compiled a dataset of 27,000 solubility datapoints, including solubility of small molecules measured in a range of binary solvent mixtures under various temperatures. Next, a panel of ML models was trained on this dataset with their hyperparameters tuned using Bayesian optimization. The resulting top-performing models, both gradient boosted decision trees (light gradient boosting machine and extreme gradient boosting), achieved mean absolute errors (MAE) of 0.33 for LogS (S in g/100 g) on the holdout set. These models were further validated through a prospective study, wherein the solubility of four drug molecules was predicted by the models and then validated with in-house solubility experiments. This prospective study demonstrated that the models accurately predicted the solubility of solutes in specific binary solvent mixtures under different temperatures, especially for drugs whose features closely align with those of the solutes in the dataset (MAE < 0.5 for LogS). To support future research and facilitate advancements in the field, we have made the dataset and code openly available.
Scientific contribution
Our research advances the state-of-the-art in predicting solubility for small molecules by leveraging ML and a uniquely comprehensive dataset. Unlike existing ML studies that predominantly focus on solubility in aqueous solvents at fixed temperatures, our work enables prediction of drug solubility in a variety of binary solvent mixtures over a broad temperature range, providing practical insights on the modeling of solubility for realistic pharmaceutical applications. These advancements, along with the open access dataset and code, support significant steps in the drug development process including new molecule discovery, drug analysis and formulation.
Affiliation(s)
- Zeqing Bao
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
- Gary Tom
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
- Austin Cheng
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
- Alán Aspuru-Guzik
  - Department of Chemistry, University of Toronto, Toronto, ON, M5S 3H6, Canada
  - Department of Computer Science, University of Toronto, Toronto, ON, M5S 2E4, Canada
  - Vector Institute for Artificial Intelligence, Toronto, ON, M5S 1M1, Canada
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada
  - Lebovic Fellow, Canadian Institute for Advanced Research (CIFAR), Toronto, ON, M5S 1M1, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
  - Department of Materials Science and Engineering, University of Toronto, Toronto, ON, M5S 3E4, Canada
  - CIFAR Artificial Intelligence Research Chair, Vector Institute, Toronto, ON, M5S 1M1, Canada
- Christine Allen
  - Leslie Dan Faculty of Pharmacy, University of Toronto, Toronto, ON, M5S 3M2, Canada
  - Acceleration Consortium, Toronto, ON, M5S 3H6, Canada
  - Department of Chemical Engineering and Applied Chemistry, University of Toronto, Toronto, ON, M5S 3E5, Canada
5
Alamoudi JA. Recent advancements toward the increment of drug solubility using environmentally-friendly supercritical CO2: a machine learning perspective. Front Med (Lausanne) 2024; 11:1467289. [PMID: 39286644 PMCID: PMC11402729 DOI: 10.3389/fmed.2024.1467289] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/19/2024] [Accepted: 08/20/2024] [Indexed: 09/19/2024] Open
Abstract
Inadequate bioavailability of therapeutic drugs, often the consequence of poor solubility and dissolution rates, is a persistent operational challenge for pharmaceutical companies because of its detrimental effect on therapeutic efficacy. Over recent decades, supercritical fluids (SCFs), mainly supercritical CO2 (SCCO2), have attracted the attention of many scientists as a promising alternative to toxic and environmentally hazardous organic solvents, owing to advantages such as low flammability, availability, high performance, eco-friendliness, and safe, simple operation. Machine learning (ML) has likewise attracted great attention as a versatile, robust, and accurate approach for predicting important parameters such as solubility and bioavailability, given the cost and time demands of experimental investigations. The main goal of this article is to review the role of different ML-based tools for predicting the solubility/bioavailability of drugs using SCCO2. Moreover, the importance of solubility in the pharmaceutical industry and the possible techniques for increasing it in poorly soluble drugs are comprehensively discussed. Finally, the efficiency of SCCO2 for improving the manufacturing process of drug nanocrystals is discussed.
Affiliation(s)
- Jawaher Abdullah Alamoudi
  - Department of Pharmaceutical Sciences, College of Pharmacy, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia
6
Liang Z, Lin C, Tan G, Li J, He Y, Cai S. A low-cost machine learning framework for predicting drug-drug interactions based on fusion of multiple features and a parameter self-tuning strategy. Phys Chem Chem Phys 2024; 26:6300-6315. [PMID: 38305788 DOI: 10.1039/d4cp00039k] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/03/2024]
Abstract
Poly-drug therapy is now recognized as a crucial treatment, and the analysis of drug-drug interactions (DDIs) offers substantial theoretical support and guidance for its implementation. Predicting potential DDIs using intelligent algorithms is an emerging approach in pharmacological research. However, the existing supervised models and deep learning-based techniques still have several limitations. This paper proposes a novel DDI analysis and prediction framework called the Multi-View Semi-supervised Graph-based (MVSG) framework, which provides a comprehensive judgment by integrating multiple DDI features and functions without any time-consuming training process. Unlike conventional approaches, MVSG can search for the most suitable similarity (or distance) measurement among DDI data and construct graph structures for each feature. By employing a parameter self-tuning strategy, MVSG fuses multiple graphs according to the contributions of features' information. The actual anticancer drug data are extracted from the authoritative public database for evaluating the effectiveness of our framework, including 904 drugs, 7730 DDI records and 19 types of drug interactions. Validation results indicate that the prediction is more accurate when multiple features are adopted by our framework. In comparison to conventional machine learning techniques, MVSG can achieve higher performance even with less labeled data and without a training process. Finally, MVSG is employed to narrow down the search for potential valuable combinations.
Affiliation(s)
- Zexiao Liang
  - School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Canxin Lin
  - School of Computer Science and Technology, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Guoliang Tan
  - School of Automation, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Jianzhong Li
  - School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Yan He
  - School of Biomedical and Pharmaceutical Sciences, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
- Shuting Cai
  - School of Integrated Circuits, Guangdong University of Technology, 100 Waihuan Xi Road, Panyu District, Guangzhou, 510006, Guangdong, China
7
Jovic O, Mouras R. Extreme Gradient Boosting Combined with Conformal Predictors for Informative Solubility Estimation. Molecules 2023; 29:19. [PMID: 38202602 PMCID: PMC10779886 DOI: 10.3390/molecules29010019] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/30/2023] [Revised: 12/15/2023] [Accepted: 12/17/2023] [Indexed: 01/12/2024] Open
Abstract
We used the extreme gradient boosting (XGB) algorithm to predict the experimental solubility of chemical compounds in water and organic solvents and to select significant molecular descriptors. The accuracy of prediction of our forward stepwise top-importance XGB (FSTI-XGB) on curated solubility data sets in terms of RMSE was found to be 0.59-0.76 Log(S) for two water data sets, while for organic solvent data sets it was 0.69-0.79 Log(S) for the Methanol data set, 0.65-0.79 for the Ethanol data set, and 0.62-0.70 Log(S) for the Acetone data set. That was the first step. In the second step, we used uncurated and curated AquaSolDB data sets for applicability domain (AD) tests of Drugbank, PubChem, and COCONUT databases and determined that more than 95% of studied ca. 500,000 compounds were within the AD. In the third step, we applied conformal prediction to obtain narrow prediction intervals and we successfully validated them using test sets' true solubility values. With prediction intervals obtained in the last fourth step, we were able to estimate individual error margins and the accuracy class of the solubility prediction for molecules within the AD of three public databases. All that was possible without the knowledge of experimental database solubilities. We find these four steps novel because usually, solubility-related works only study the first step or the first two steps.
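The third step of this abstract, turning point predictions into prediction intervals, can be illustrated with a minimal sketch. The abstract does not say which conformal variant the authors used, so split (inductive) conformal prediction with absolute residuals is assumed here; the calibration values are hypothetical and the point predictions stand in for the output of any fitted regressor.

```python
import math

def conformal_quantile(abs_residuals, alpha):
    """Split-conformal quantile of absolute calibration residuals.

    Uses the finite-sample rank ceil((n + 1) * (1 - alpha)).
    """
    n = len(abs_residuals)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points to guarantee this coverage level.
        return math.inf
    return sorted(abs_residuals)[rank - 1]

def prediction_interval(y_pred, q):
    """Symmetric interval around a point prediction."""
    return (y_pred - q, y_pred + q)

# Hypothetical calibration set: measured vs. predicted Log(S) values.
cal_true = [-2.1, -3.4, -0.8, -4.0, -1.5, -2.9, -3.1, -0.2, -1.9]
cal_pred = [-2.4, -3.0, -1.1, -3.2, -1.6, -2.5, -3.6, -0.9, -2.0]
residuals = [abs(t - p) for t, p in zip(cal_true, cal_pred)]

# alpha = 0.2 targets roughly 80% coverage on exchangeable data.
q = conformal_quantile(residuals, alpha=0.2)
low, high = prediction_interval(-2.7, q)
print(f"interval half-width: {q:.2f}, interval: [{low:.2f}, {high:.2f}]")
```

The interval width here is the same for every molecule; the individual error margins mentioned in the abstract would need a normalized or otherwise adaptive nonconformity score, which this sketch does not attempt.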
Affiliation(s)
- Rabah Mouras
  - Pharmaceutical Manufacturing Technology Centre, Bernal Institute, Department of Chemical Sciences, University of Limerick, V94 T9PX Limerick, Ireland
8
Wang C, Cheng Y, Ma Y, Ji Y, Huang D, Qian H. Prediction of enhanced drug solubility related to clathrate compositions and operating conditions: Machine learning study. Int J Pharm 2023; 646:123458. [PMID: 37776964 DOI: 10.1016/j.ijpharm.2023.123458] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/18/2023] [Revised: 09/14/2023] [Accepted: 09/27/2023] [Indexed: 10/02/2023]
Abstract
Although the complexation technique has been documented as a promising strategy to enhance the dissolution rate and bioavailability of water-insoluble drugs, predicting the enhanced drug solubility from clathrate compositions and operating conditions is still a challenge. Herein, clathrate compositions (drug content (DC), drug molecular weight (M) and molar ratio (Ratio)) and operating conditions (drug concentration (C), pH, pressure (P), temperature (T) and dissolution time (t)), for clathrates with different excipients (PEG, PVP, HPMC and cyclodextrin) as the main solubilizer, were used as input parameters to predict two indexes (drug dissolved percentage and dissolution efficiency) simultaneously through a machine learning method for the first time. The results show that prediction accuracy for the drug dissolved percentage was higher with PVP as the main solubilizer, and prediction accuracy for the dissolution efficiency was higher with HPMC as the main solubilizer. In addition, the influence of various factors and their interactions on the target variables was analyzed. This study offers practical guidance for the quantitative prediction of drug solubility as affected by composition and operating conditions.
Affiliation(s)
- Cong Wang
  - Department of Pharmaceutical Engineering, China Pharmaceutical University, Nanjing 211198, PR China
- Yuan Cheng
  - Department of Pharmaceutical Engineering, China Pharmaceutical University, Nanjing 211198, PR China
- Yuhong Ma
  - Department of Pharmaceutical Engineering, China Pharmaceutical University, Nanjing 211198, PR China
- Yuanhui Ji
  - Jiangsu Province Hi-Tech Key Laboratory for Biomedical Research, School of Chemistry and Chemical Engineering, Southeast University, Nanjing 211189, PR China
- Dechun Huang
  - Department of Pharmaceutical Engineering, China Pharmaceutical University, Nanjing 211198, PR China
- Hongliang Qian
  - Department of Pharmaceutical Engineering, China Pharmaceutical University, Nanjing 211198, PR China
9
Zhang R, Chen Y, Fan D, Liu T, Ma Z, Dai Y, Wang Y, Zhu Z. Modelling enzyme inhibition toxicity of ionic liquid from molecular structure via convolutional neural network model. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2023; 34:789-803. [PMID: 37722394 DOI: 10.1080/1062936x.2023.2255517] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/02/2023] [Accepted: 08/30/2023] [Indexed: 09/20/2023]
Abstract
Deep learning (DL) methods further promote the development of quantitative structure-activity/property relationship (QSAR/QSPR) models by dealing with complex relationships between data. An acetylcholinesterase inhibitory toxicity model of ionic liquids (ILs) was established using a convolution neural network (CNN) combined with support vector machine (SVM), random forest (RF) and multilayer perceptron (MLP). A CNN model was proposed for feature self-learning and extraction of ILs. By comparing with the model results through feature engineering (FE), the model regression results based on the CNN model for feature extraction have been substantially improved. The results showed that all six models (FE-SVM, FE-RF, FE-MLP, CNN-SVM, CNN-RF, and CNN-MLP) had good prediction accuracy, but the results based on the CNN model were better. The hyperparameters of six models were optimized by grid search and the 10-fold cross validation. Compared with the existing models in the literature, the model performance has been further improved. The model could be used as an intelligent tool to guide the design or screening of low-toxicity ILs.
Affiliation(s)
- R Zhang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- Y Chen
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- D Fan
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- T Liu
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- Z Ma
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- Y Dai
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- Y Wang
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
- Z Zhu
  - College of Chemical Engineering, Qingdao University of Science and Technology, Qingdao, People's Republic of China
10
Abstract
Condensable gases are the sum of condensable and volatile steam or organic compounds, including water vapor, which are discharged into the atmosphere in gaseous form at atmospheric pressure and room temperature. Condensable toxic and harmful gases emitted from petrochemical, chemical, packaging and printing, industrial coatings, and mineral mining activities seriously pollute the atmospheric environment and endanger human health. Meanwhile, these gases are necessary chemical raw materials; therefore, developing green and efficient capture technology is significant for efficiently utilizing condensable gas resources. To overcome the problems of pollution and corrosion in traditional organic solvent and alkali absorption methods, ionic liquids (ILs), known as "liquid molecular sieves", have received unprecedented attention thanks to their excellent separation and regeneration performance and have gradually become green solvents used by scholars to replace traditional absorbents. This work reviews the research progress of ILs in separating condensable gases. Because predictive molecular thermodynamics is the basis of chemical engineering, this review first provides a detailed discussion of its origin and its broad application in theory and industry. Afterward, this review focuses on the latest research results of ILs in the capture of several important typical condensable gases, including water vapor, aromatic VOCs (i.e., BTEX), chlorinated VOCs, fluorinated refrigerant gases, low-carbon alcohols, ketones, ethers, ester vapors, etc. Beyond pure ILs, mixed ILs, and IL + organic solvent mixtures used as absorbents, the review also briefly covers reports of porous materials loaded with an IL as adsorbents. Finally, future development and research directions in this field are outlined.
Affiliation(s)
- Guoxuan Li
  - State Key Laboratory of Chemical Resource Engineering, Beijing Key Laboratory of Energy Environmental Catalysis, Beijing University of Chemical Technology, Box 266, Beijing 100029, China
- Kai Chen
  - School of Chemistry and Chemical Engineering/State Key Laboratory Incubation Base for Green Processing of Chemical Engineering, Shihezi University, Shihezi 832003, China
- Zhigang Lei
  - State Key Laboratory of Chemical Resource Engineering, Beijing Key Laboratory of Energy Environmental Catalysis, Beijing University of Chemical Technology, Box 266, Beijing 100029, China
  - School of Chemistry and Chemical Engineering/State Key Laboratory Incubation Base for Green Processing of Chemical Engineering, Shihezi University, Shihezi 832003, China
- Zhong Wei
  - School of Chemistry and Chemical Engineering/State Key Laboratory Incubation Base for Green Processing of Chemical Engineering, Shihezi University, Shihezi 832003, China
11
Pavliš J, Mathers A, Fulem M, Klajmon M. Can Pure Predictions of Activity Coefficients from PC-SAFT Assist Drug-Polymer Compatibility Screening? Mol Pharm 2023; 20:3960-3974. [PMID: 37386723 PMCID: PMC10410664 DOI: 10.1021/acs.molpharmaceut.3c00124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/08/2023] [Revised: 06/09/2023] [Accepted: 06/13/2023] [Indexed: 07/01/2023]
Abstract
The bioavailability of poorly water-soluble active pharmaceutical ingredients (APIs) can be improved via the formulation of an amorphous solid dispersion (ASD), where the API is incorporated into a suitable polymeric carrier. Optimal carriers that exhibit good compatibility (i.e., solubility and miscibility) with given APIs are typically identified through experimental means, which are routinely labor- and cost-inefficient. Therefore, the perturbed-chain statistical associating fluid theory (PC-SAFT) equation of state, a popular thermodynamic model in pharmaceutical applications, is examined in terms of its performance regarding the computational pure prediction of API-polymer compatibility based on activity coefficients (API fusion properties were taken from experiments) without any binary interaction parameters fitted to API-polymer experimental data (that is, kij = 0 in all cases). This kind of prediction does not need any experimental binary information and has been underreported in the literature so far, as the routine modeling strategy used in the majority of the existing PC-SAFT applications to ASDs comprised the use of nonzero kij values. The predictive performance of PC-SAFT was systematically and thoroughly evaluated against reliable experimental data for almost 40 API-polymer combinations. We also examined the effect of different sets of PC-SAFT parameters for APIs on compatibility predictions. Quantitatively, the total average error calculated over all systems was approximately 50% in the weight fraction solubility of APIs in polymers, regardless of the specific API parametrization. The magnitude of the error for individual systems was found to vary significantly from one system to another. Interestingly, the poorest results were obtained for systems with self-associating polymers such as poly(vinyl alcohol). 
Such polymers can form intramolecular hydrogen bonds, which are not accounted for in the PC-SAFT variant routinely applied to ASDs (i.e., that used in this work). However, the qualitative ranking of polymers with respect to their compatibility with a given API was reasonably predicted in many cases. It was also predicted correctly that some polymers always have better compatibility with the APIs than others. Finally, possible future routes to improve the cost-performance ratio of PC-SAFT in terms of parametrization are discussed.
Affiliation(s)
- Jáchym Pavliš
  - Department of Physical Chemistry, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, 166 28 Prague 6, Czech Republic
- Alex Mathers
  - Department of Physical Chemistry, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, 166 28 Prague 6, Czech Republic
- Michal Fulem
  - Department of Physical Chemistry, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, 166 28 Prague 6, Czech Republic
- Martin Klajmon
  - Department of Physical Chemistry, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, 166 28 Prague 6, Czech Republic
12
Syed TA, Ansari KB, Banerjee A, Wood DA, Khan MS, Al Mesfer MK. Machine‐learning predictions of caffeine co‐crystal formation accompanying experimental and molecular validations. J FOOD PROCESS ENG 2022. [DOI: 10.1111/jfpe.14230] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/13/2022]
Affiliation(s)
- Tanweer A. Syed
  - Department of Chemical Engineering, Institute of Chemical Technology, Mumbai, Maharashtra, India
- Khursheed B. Ansari
  - Department of Chemical Engineering, Zakir Husain College of Engineering and Technology, Aligarh Muslim University, Aligarh, Uttar Pradesh, India
- Arghya Banerjee
  - Department of Chemical Engineering, Indian Institute of Technology Ropar, Punjab, India
- Mohd Shariq Khan
  - Department of Chemical Engineering, College of Engineering, Dhofar University, Salalah, Oman
13
|
Li M, Chen H, Zhang H, Zeng M, Chen B, Guan L. Prediction of the Aqueous Solubility of Compounds Based on Light Gradient Boosting Machines with Molecular Fingerprints and the Cuckoo Search Algorithm. ACS Omega 2022; 7:42027-42035. [PMID: 36440111 PMCID: PMC9685740 DOI: 10.1021/acsomega.2c03885] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2022] [Accepted: 10/18/2022] [Indexed: 06/16/2023]
Abstract
Aqueous solubility is one of the most important physicochemical properties in drug discovery. At present, the prediction of aqueous solubility of compounds is still a challenging problem. Machine learning has shown great potential in solubility prediction. Most machine learning models largely rely on the setting of hyperparameters, and their performance can be improved by setting the hyperparameters in a better way. In this paper, we used MACCS fingerprints to represent the structural features and optimized the hyperparameters of the light gradient boosting machine (LightGBM) with the cuckoo search algorithm (CS). Based on the above representation and optimization, the CS-LightGBM model was established to predict the aqueous solubility of 2446 organic compounds and the obtained prediction results were compared with those obtained with the other six different machine learning models (RF, GBDT, XGBoost, LightGBM, SVR, and BO-LightGBM). The comparison results showed that the CS-LightGBM model had a better prediction performance than the other six different models. RMSE, MAE, and R² of the CS-LightGBM model were, respectively, 0.7785, 0.5117, and 0.8575. In addition, this model has good scalability and can be used to solve solubility prediction problems in other fields such as solvent selection and drug screening.
14
|
Klajmon M. Purely Predicting the Pharmaceutical Solubility: What to Expect from PC-SAFT and COSMO-RS? Mol Pharm 2022; 19:4212-4232. [PMID: 36136040 DOI: 10.1021/acs.molpharmaceut.2c00573] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
A pair of popular thermodynamic models for pharmaceutical applications, namely, the perturbed-chain statistical associating fluid theory (PC-SAFT) equation of state and the conductor-like screening model for real solvents (COSMO-RS) are thoroughly benchmarked for their performance in predicting the solubility of active pharmaceutical ingredients (APIs) in pure solvents. The ultimate goal is to provide an illustration of what to expect from these progressive frameworks when applied to the thermodynamic solubility of APIs based on activity coefficients in a purely predictive regime without specific experimental solubility data (the fusion properties of pure APIs were taken from experiments). While this kind of prediction represents the typical modus operandi of the first-principles-aided COSMO-RS, PC-SAFT is a relatively highly parametrized model that relies on experimental data, against which its pure-substance and binary interaction parameters (kij) are fitted. Therefore, to make this benchmark as fair as possible, we omitted any binary parameters of PC-SAFT (i.e., kij = 0 in all cases) and preferred pure-substance parameter sets for APIs not trained to experimental solubility data. This computational approach, together with a detailed assessment of the obtained solubility predictions against a large experimental data set, revealed that COSMO-RS convincingly outperformed PC-SAFT both qualitatively (i.e., COSMO-RS was better in solvent ranking) and quantitatively, even though the former is independent of both substance- and mixture-specific experimental data. Regarding quantitative comparison, COSMO-RS outperformed PC-SAFT for 9 of the 10 APIs and for 63% of the API-solvent systems, with root-mean-square deviations of the predicted data from the entire experimental data set being 0.82 and 1.44 log units, respectively. The results were further analyzed to expand the picture of the performance of both models with respect to the individual APIs and solvents. 
Interestingly, in many cases, both models were found to qualitatively incorrectly predict the direction of deviations from ideality. Furthermore, we examined how the solubility predictions from both models are sensitive to different API parametrizations.
Affiliation(s)
- Martin Klajmon
  - Department of Physical Chemistry, Faculty of Chemical Engineering, University of Chemistry and Technology, Prague, Technická 5, 166 28 Prague 6, Czech Republic
15
|
de Hemptinne JC, Kontogeorgis GM, Dohrn R, Economou IG, ten Kate A, Kuitunen S, Fele Žilnik L, De Angelis MG, Vesovic V. A View on the Future of Applied Thermodynamics. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.2c01906] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
Affiliation(s)
- Georgios M. Kontogeorgis
  - Center for Energy Resources Engineering (CERE), Department of Chemical and Biochemical Engineering, Technical University of Denmark, Lyngby DK-2800, Denmark
- Ralf Dohrn
  - Bayer AG, Process Technologies, Building E41, Leverkusen 51368, Germany
- Ioannis G. Economou
  - Chemical Engineering Program, Texas A&M University at Qatar, Doha P.O. Box 23874, Qatar
- Susanna Kuitunen
  - Neste Engineering Solutions Oy, P.O. Box 310, Porvoo FI-06101, Finland
- Ljudmila Fele Žilnik
  - Department of Catalysis and Chemical Reaction Engineering, National Institute of Chemistry, Hajdrihova 19, Ljubljana 1001, Slovenia
- Maria Grazia De Angelis
  - Institute for Materials and Processes, School of Engineering, University of Edinburgh, Sanderson Building, Edinburgh EH9 3FB, UK
  - Department of Civil, Chemical, Environmental and Materials Engineering, University of Bologna, Bologna 40131, Italy
- Velisa Vesovic
  - Department of Earth Science and Engineering, Imperial College London, South Kensington Campus, London SW7 2AZ, United Kingdom
16
|
Ge K, Ji Y. A thermodynamic approach for predicting thermodynamic phase behaviors of pharmaceuticals in biorelevant media. Chem Eng Sci 2022. [DOI: 10.1016/j.ces.2022.117973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2022]
17
|
Priya S, Tripathi G, Singh DB, Jain P, Kumar A. Machine learning approaches and their applications in drug discovery and design. Chem Biol Drug Des 2022; 100:136-153. [PMID: 35426249 DOI: 10.1111/cbdd.14057] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 01/30/2022] [Revised: 03/30/2022] [Accepted: 04/10/2022] [Indexed: 01/04/2023]
Abstract
This review is focused on several machine learning approaches used in chemoinformatics. Machine learning approaches provide tools and algorithms to improve drug discovery. Many physicochemical properties of drugs like toxicity, absorption, drug-drug interaction, carcinogenesis, and distribution have been effectively modeled by QSAR techniques. Machine learning is a subset of artificial intelligence, and this technique has shown tremendous potential in the field of drug discovery. Techniques discussed in this review are capable of modeling non-linear datasets, as well as big data of increasing depth and complexity. Various machine learning-based approaches are being used for drug target prediction, modeling the structure of drug target, binding site prediction, ligand-based similarity searching, de novo designing of ligands with desired properties, developing scoring functions for molecular docking, building QSAR model for biological activity prediction, and prediction of pharmacokinetic and pharmacodynamic properties of ligands. In recent years, these predictive tools and models have achieved good accuracy. By the use of more related input data, relevant parameters, and appropriate algorithms, the accuracy of these predictions can be further improved.
Affiliation(s)
- Sonal Priya
  - Department of Chemistry, T. N. B. College, TMBU, Bhagalpur, India
- Garima Tripathi
  - Department of Chemistry, T. N. B. College, TMBU, Bhagalpur, India
- Dev Bukhsh Singh
  - Department of Biotechnology, Siddharth University, Siddharth Nagar, India
- Priyanka Jain
  - National Institute of Plant Genome Research, New Delhi, India
- Abhijeet Kumar
  - Department of Chemistry, Mahatma Gandhi Central University, Motihari, India
18
|
Shi Y, Wang J, Wang Q, Jia Q, Yan F, Luo ZH, Zhou YN. Supervised Machine Learning Algorithms for Predicting Rate Constants of Ozone Reaction with Micropollutants. Ind Eng Chem Res 2022. [DOI: 10.1021/acs.iecr.1c04697] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 12/22/2022]
Affiliation(s)
- Yajuan Shi
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Jiang Wang
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Qiang Wang
  - School of Chemical Engineering and Material Science, Tianjin University of Science and Technology, Tianjin, 300457, P. R. China
- Qingzhu Jia
  - School of Marine and Environmental Science, Tianjin University of Science and Technology, Tianjin, 300457, P. R. China
- Fangyou Yan
  - School of Chemical Engineering and Material Science, Tianjin University of Science and Technology, Tianjin, 300457, P. R. China
- Zheng-Hong Luo
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China
- Yin-Ning Zhou
  - Department of Chemical Engineering, School of Chemistry and Chemical Engineering, State Key Laboratory of Metal Matrix Composites, Shanghai Jiao Tong University, Shanghai, 200240, P. R. China