1
Gheta SKO, Bonin A, Gerlach T, Göller AH. Predicting absolute aqueous solubility by applying a machine learning model for an artificially liquid-state as proxy for the solid-state. J Comput Aided Mol Des 2023; 37:765-789. [PMID: 37878216 DOI: 10.1007/s10822-023-00538-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 10/02/2023] [Indexed: 10/26/2023]
Abstract
In this study, we use machine learning algorithms with QM-derived COSMO-RS descriptors, along with Morgan fingerprints, to predict the absolute solubility of drug-like compounds. The QM-derived descriptors account for the molecular properties of the solute, i.e., the solute-solute interactions in an artificial-liquid-state (super-cooled liquid), and the solute-solvent interactions in solution. We employ two main approaches to predict solubility: (i) a hypothetical pathway that involves melting the solute at room temperature T = T¯ ([Formula: see text]) and mixing the artificially liquid solute into the solvent ([Formula: see text]). In this approach [Formula: see text] is predicted using machine learning models, and the [Formula: see text] is obtained from COSMO-RS calculations; (ii) direct solubility prediction using machine learning algorithms. The models were trained on a large number of Bayer in-house compounds for which water solubility data is available at physiological pH of 6.5 and ambient temperature. We also evaluated our models using external datasets from a solubility challenge. Our models present great improvements compared to the absolute solubility prediction with the QSAR model for the artificial liquid state as implemented in the COSMOtherm software, for both in-house and external datasets. We are furthermore able to demonstrate the superiority of QM-derived descriptors compared to cheminformatics descriptors. We finally present low-cost alternative models using fragment-based COSMOquick calculations with only marginal reduction in the quality of predicted solubility.
Affiliation(s)
- Sadra Kashef Ol Gheta
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany
- Anne Bonin
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany
- Thomas Gerlach
- Bayer AG, Crop Science, R&D, Digital Transformation, 40789, Monheim, Germany
- Bayer AG, Engineering & Technology, Thermal Separation Technologies, 51368, Leverkusen, Germany
- Andreas H Göller
- Bayer AG, Pharmaceuticals, R&D, Computational Molecular Design, 42096, Wuppertal, Germany.
2
Díaz Mirón JEZ, Stein M. A benchmark for non-covalent interactions in organometallic crystals. Phys Chem Chem Phys 2022; 24:29338-29349. [PMID: 36448535 DOI: 10.1039/d2cp04160j] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/24/2022]
Abstract
Organometallic complexes are the basis for homogeneous catalysis, have applications in materials science and are also active pharmaceutical ingredients. The interaction between transition metal complexes in the solid state determines their thermodynamics and bioavailability. Non-covalent interactions such as hydrogen bonding and van der Waals forces stabilize crystals of transition metal complexes. The variation of ligand field, central metal atoms and their oxidation and spin states determines the magnitude of their intermolecular interactions. A comparison of a set of 43 manually curated experimental heats of sublimation (the new XTMC43 set) with results from periodic DFT calculations shows that agreement to within 9% can be achieved using GGA or mGGA functionals with atom-centred Gaussian-type basis functions. The need for careful assessments of consistency, calibration and reproducibility of experimental and computational data is discussed. Results for the new XTMC43 benchmark set are suggested as a starting point for further method development, systematic screening and crystal engineering.
Affiliation(s)
- José Eduardo Zamudio Díaz Mirón
- Molecular Simulations and Design Group, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106 Magdeburg, Germany.
- Matthias Stein
- Molecular Simulations and Design Group, Max Planck Institute for Dynamics of Complex Technical Systems, Sandtorstrasse 1, 39106 Magdeburg, Germany.
3
Abstract
Sublimation is an effective and ‘green’ method to prepare and identify new polymorphs, cocrystals, ionic cocrystals and molecular salts.
Affiliation(s)
- Patrick McArdle
- School of Chemistry, National University of Ireland, Galway, Ireland
- Andrea Erxleben
- School of Chemistry, National University of Ireland, Galway, Ireland
- Synthesis and Solid State Pharmaceutical Centre (SSPC), Limerick, Ireland
4
Červinka C, Štejfa V. Sublimation Properties of α,ω-Diamines Revisited from First-Principles Calculations. Chemphyschem 2020; 21:1184-1194. [PMID: 32243713 DOI: 10.1002/cphc.202000108] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 02/11/2020] [Revised: 04/02/2020] [Indexed: 11/06/2022]
Abstract
Sublimation enthalpies of alkane-α,ω-diamines exhibit an odd-even pattern within their homologous series. First-principles calculations coupled with the quasi-harmonic approximation for crystals and with the conformation mixing model for the ideal gas are used to explain this phenomenon from the theoretical point of view. Crystals of the odd and even alkane-α,ω-diamines distinctly differ in their packing motifs. However, first-principles calculations indicate that it is a delicate interplay of the cohesive forces, phonons, molecular vibrations and conformational equilibrium which governs the odd-even pattern of the sublimation enthalpies within the homologous series. High molecular flexibility of the alkane-α,ω-diamines predetermines higher sensitivity of the computational model to the quality of the optimized geometries and relative conformational energies. Performance of high-throughput computational methods, such as the density functional tight binding (DFTB, GFN2-xTB) and the explicitly correlated dispersion-corrected Møller-Plesset perturbative method (MP2C-F12), are benchmarked against the consistent state-of-the-art calculations of conformational energies and interaction energies, respectively.
Affiliation(s)
- Ctirad Červinka
- Department of Physical Chemistry, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague 6, Czech Republic
- Vojtěch Štejfa
- Department of Physical Chemistry, University of Chemistry and Technology Prague, Technická 5, 166 28, Prague 6, Czech Republic
5
Voronin AP, Surov AO, Churakov AV, Parashchuk OD, Rykounov AA, Vener MV. Combined X-ray Crystallographic, IR/Raman Spectroscopic, and Periodic DFT Investigations of New Multicomponent Crystalline Forms of Anthelmintic Drugs: A Case Study of Carbendazim Maleate. Molecules 2020; 25:E2386. [PMID: 32455564 PMCID: PMC7287603 DOI: 10.3390/molecules25102386] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 05/06/2020] [Revised: 05/17/2020] [Accepted: 05/18/2020] [Indexed: 12/14/2022] Open
Abstract
Synthesis of multicomponent solid forms is an important method of modifying and fine-tuning the most critical physicochemical properties of drug compounds. The design of new multicomponent pharmaceutical materials requires reliable information about the supramolecular arrangement of molecules and detailed description of the intermolecular interactions in the crystal structure. It implies the use of a combination of different experimental and theoretical investigation methods. Organic salts present new challenges for those who develop theoretical approaches describing the structure, spectral properties, and lattice energy E_latt. These crystals consist of closed-shell organic ions interacting through relatively strong hydrogen bonds, which leads to E_latt > 200 kJ/mol. Some technical problems that a user of periodic (solid-state) density functional theory (DFT) programs encounters when calculating the properties of these crystals still remain unsolved, for example, the influence of cell parameter optimization on the E_latt value, wave numbers, relative intensity of Raman-active vibrations in the low-frequency region, etc. In this work, various properties of a new two-component carbendazim maleate crystal were experimentally investigated, and the applicability of different DFT functionals and empirical Grimme corrections to the description of the obtained structural and spectroscopic properties was tested. Based on this, practical recommendations were developed for further theoretical studies of multicomponent organic pharmaceutical crystals.
Affiliation(s)
- Alexander P. Voronin
- Department of Physical Chemistry of Drugs, G.A. Krestov Institute of Solution Chemistry of RAS, 153045 Ivanovo, Russia
- Artem O. Surov
- Department of Physical Chemistry of Drugs, G.A. Krestov Institute of Solution Chemistry of RAS, 153045 Ivanovo, Russia
- Andrei V. Churakov
- Department of Crystal Chemistry and X-ray Diffraction, N.S. Kurnakov Institute of General and Inorganic Chemistry of RAS, 119991 Moscow, Russia
- Olga D. Parashchuk
- Faculty of Physics, Lomonosov Moscow State University, 119991 Moscow, Russia
- Alexey A. Rykounov
- Theoretical Department, FSUE “RFNC-VNIITF Named after Academ. E.I. Zababakhin”, 456770 Snezhinsk, Russia
- Mikhail V. Vener
- Department of Quantum Chemistry, D. Mendeleev University of Chemical Technology, 125047 Moscow, Russia
6
Marchese Robinson RL, Geatches D, Morris C, Mackenzie R, Maloney AGP, Roberts KJ, Moldovan A, Chow E, Pencheva K, Vatvani DRM. Evaluation of Force-Field Calculations of Lattice Energies on a Large Public Dataset, Assessment of Pharmaceutical Relevance, and Comparison to Density Functional Theory. J Chem Inf Model 2019; 59:4778-4792. [DOI: 10.1021/acs.jcim.9b00601] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Indexed: 01/29/2023]
Affiliation(s)
- Richard L. Marchese Robinson
- Centre for Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
- Dawn Geatches
- Science and Technology Facilities Council, Daresbury Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- Chris Morris
- Science and Technology Facilities Council, Daresbury Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- Rebecca Mackenzie
- Science and Technology Facilities Council, Daresbury Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- Andrew G. P. Maloney
- Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, United Kingdom
- Kevin J. Roberts
- Centre for Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
- Alexandru Moldovan
- Centre for Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
- Ernest Chow
- Pfizer Worldwide R&D, Ramsgate Road, Sandwich CT13 9NJ, United Kingdom
7
Lai T, Pencheva K, Chow E, Docherty R. De-Risking Early-Stage Drug Development With a Bespoke Lattice Energy Predictive Model: A Materials Science Informatics Approach to Address Challenges Associated With a Diverse Chemical Space. J Pharm Sci 2019; 108:3176-3186. [DOI: 10.1016/j.xphs.2019.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 02/22/2019] [Revised: 05/24/2019] [Accepted: 06/12/2019] [Indexed: 01/11/2023]
8
Geatches D, Rosbottom I, Marchese Robinson RL, Byrne P, Hasnip P, Probert MIJ, Jochym D, Maloney A, Roberts KJ. Off-the-shelf DFT-DISPersion methods: Are they now “on-trend” for organic molecular crystals? J Chem Phys 2019; 151:044106. [PMID: 31370509 DOI: 10.1063/1.5108829] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/29/2023] Open
Affiliation(s)
- Dawn Geatches
- Science and Technology Facilities Council, Daresbury Laboratory, Sci-Tech Daresbury, Warrington WA4 4AD, United Kingdom
- Ian Rosbottom
- Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
- Richard L. Marchese Robinson
- Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
- Peter Byrne
- Department of Physics, University of York, Heslington YO10 5DD, United Kingdom
- Phil Hasnip
- Department of Physics, University of York, Heslington YO10 5DD, United Kingdom
- Matt I. J. Probert
- Department of Physics, University of York, Heslington YO10 5DD, United Kingdom
- Dominik Jochym
- Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Campus, Didcot OX11 0QX, United Kingdom
- Andrew Maloney
- The Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, United Kingdom
- Kevin J. Roberts
- Centre for the Digital Design of Drug Products, School of Chemical and Process Engineering, University of Leeds, Leeds LS2 9JT, United Kingdom
9
Mathieu D. Accurate or Fast Prediction of Solid-State Formation Enthalpies Using Standard Sublimation Enthalpies Derived From Geometrical Fragments. Ind Eng Chem Res 2018. [DOI: 10.1021/acs.iecr.8b03001] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Indexed: 11/28/2022]
10
Marchese Robinson RL, Roberts KJ, Martin EB. The influence of solid state information and descriptor selection on statistical models of temperature dependent aqueous solubility. J Cheminform 2018; 10:44. [PMID: 30159699 PMCID: PMC6115327 DOI: 10.1186/s13321-018-0298-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 02/20/2018] [Accepted: 08/17/2018] [Indexed: 11/23/2022] Open
Abstract
Predicting the equilibrium solubility of organic, crystalline materials at all relevant temperatures is crucial to the digital design of manufacturing unit operations in the chemical industries. The work reported in our current publication builds upon the limited number of recently published quantitative structure-property relationship studies which modelled the temperature dependence of aqueous solubility. One set of models was built to directly predict temperature dependent solubility, including for materials with no solubility data at any temperature. We propose that a modified cross-validation protocol is required to evaluate these models. Another set of models was built to predict the related enthalpy of solution term, which can be used to estimate solubility at one temperature based upon solubility data for the same material at another temperature. We investigated whether various kinds of solid state descriptors improved the models obtained with a variety of molecular descriptor combinations: lattice energies or 3D descriptors calculated from crystal structures or melting point data. We found that none of these greatly improved the best direct predictions of temperature dependent solubility or the related enthalpy of solution endpoint. This finding is surprising because the importance of the solid state contribution to both endpoints is clear. We suggest our findings may, in part, reflect limitations in the descriptors calculated from crystal structures and, more generally, the limited availability of polymorph specific data. We present curated temperature dependent solubility and enthalpy of solution datasets, integrated with molecular and crystal structures, for future investigations.
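The abstract's remark that an enthalpy of solution lets one estimate solubility at one temperature from a measurement at another can be illustrated with the integrated van 't Hoff relation. The sketch below is not taken from the paper: it assumes an ideal dilute solution and a temperature-independent enthalpy of solution, and the function name, parameter names, and numerical inputs are hypothetical.

```python
import math

R = 8.314462618  # molar gas constant, J mol^-1 K^-1


def solubility_at_temperature(s_ref, t_ref, t_target, delta_h_sol):
    """Extrapolate solubility with the integrated van 't Hoff relation.

    Assumes (not from the source) ideal dilute behaviour and a constant
    enthalpy of solution over the temperature interval.

    s_ref       -- solubility at t_ref (mole fraction or any proportional unit)
    t_ref       -- reference temperature in kelvin
    t_target    -- target temperature in kelvin
    delta_h_sol -- enthalpy of solution in J mol^-1
    """
    # ln(s_target / s_ref) = -(delta_h_sol / R) * (1/t_target - 1/t_ref)
    exponent = -delta_h_sol / R * (1.0 / t_target - 1.0 / t_ref)
    return s_ref * math.exp(exponent)


if __name__ == "__main__":
    # Hypothetical inputs: relative solubility 1.0 at 298.15 K, 25 kJ/mol.
    print(solubility_at_temperature(1.0, 298.15, 310.15, 25_000.0))
```

For an endothermic dissolution (positive enthalpy of solution) the estimate rises with temperature, and extrapolating back to the reference temperature recovers the starting value.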
Affiliation(s)
- Kevin J Roberts
- School of Chemical and Process Engineering, University of Leeds, Leeds, LS2 9JT, UK
- Elaine B Martin
- School of Chemical and Process Engineering, University of Leeds, Leeds, LS2 9JT, UK.
11
Meftahi N, Walker ML, Enciso M, Smith BJ. Predicting the Enthalpy and Gibbs Energy of Sublimation by QSPR Modeling. Sci Rep 2018; 8:9779. [PMID: 29950681 PMCID: PMC6021403 DOI: 10.1038/s41598-018-28105-6] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 01/18/2018] [Accepted: 06/15/2018] [Indexed: 11/28/2022] Open
Abstract
The enthalpy and Gibbs energy of sublimation are predicted using quantitative structure property relationship (QSPR) models. In this study, we compare several approaches previously reported in the literature for predicting the enthalpy of sublimation. These models, which were reproduced successfully, exhibit high correlation coefficients, in the range 0.82 to 0.97. There are significantly fewer examples of QSPR models currently described in the literature that predict the Gibbs energy of sublimation; here we describe several models that build upon the previous models for predicting the enthalpy of sublimation. The most robust and predictive model constructed using multiple linear regression, with the fewest number of descriptors for estimating this property, was obtained with an R² of the training set of 0.71, an R² of the test set of 0.62, and a standard deviation of 9.1 kJ mol⁻¹. This model could be improved by training using a neural network, yielding an R² of the training and test sets of 0.80 and 0.63, respectively, and a standard deviation of 8.9 kJ mol⁻¹.
Affiliation(s)
- Nastaran Meftahi
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia
- Michael L Walker
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia
- Marta Enciso
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia
- Brian J Smith
- La Trobe Institute for Molecular Science, La Trobe University, Melbourne, Victoria, 3086, Australia.
12
McDonagh JL, Silva AF, Vincent MA, Popelier PLA. Machine Learning of Dynamic Electron Correlation Energies from Topological Atoms. J Chem Theory Comput 2017; 14:216-224. [PMID: 29211469 DOI: 10.1021/acs.jctc.7b01157] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Indexed: 01/26/2023]
Abstract
We present an innovative method for predicting the dynamic electron correlation energy of an atom or a bond in a molecule utilizing topological atoms. Our approach uses the machine learning method Kriging (Gaussian Process Regression with a non-zero mean function) to predict these dynamic electron correlation energy contributions. The true energy values are calculated by partitioning the MP2 two-particle density-matrix via the Interacting Quantum Atoms (IQA) procedure. To our knowledge, this is the first time such energies have been predicted by a machine learning technique. We present here three important proof-of-concept cases: the water monomer, the water dimer, and the van der Waals complex H2···He. These cases represent the final step toward the design of a full IQA potential for molecular simulation. This final piece will enable us to consider situations in which dispersion is the dominant intermolecular interaction. The results from these examples suggest a new method by which dispersion potentials for molecular simulation can be generated.
Affiliation(s)
- James L McDonagh
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
- Arnaldo F Silva
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain
- Mark A Vincent
- School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
- Paul L A Popelier
- Manchester Institute of Biotechnology, The University of Manchester, 131 Princess Street, Manchester M1 7DN, Great Britain; School of Chemistry, The University of Manchester, Oxford Road, Manchester M13 9PL, Great Britain
13
Červinka C, Fulem M. State-of-the-Art Calculations of Sublimation Enthalpies for Selected Molecular Crystals and Their Computational Uncertainty. J Chem Theory Comput 2017; 13:2840-2850. [PMID: 28437618 DOI: 10.1021/acs.jctc.7b00164] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.9] [Indexed: 11/30/2022]
Abstract
A computational methodology for calculation of sublimation enthalpies of molecular crystals from first principles is developed and validated by comparison to critically evaluated literature experimental data. Temperature-dependent sublimation enthalpies for a set of selected 22 molecular crystals in their low-temperature phases are calculated. The computational methodology consists of several building blocks based on high-level electronic structure methods of quantum chemistry and statistical thermodynamics. Ab initio methods up to the coupled clusters with iterative treatment of single and double excitations and perturbative triples correction with an estimated complete basis set description [CCSD(T)/CBS] are used to calculate the cohesive energies of crystalline phases within a fragment-based additive scheme. Density functional theory (DFT) calculations with periodic boundary conditions (PBC) coupled with the quasi-harmonic approximation are used to evaluate the thermal contributions to the enthalpy of the solid phase. The properties of the vapor phase are calculated within the ideal-gas model using the rigid-rotor harmonic-oscillator model with correction for internal rotation using a one-dimensional hindered rotor approximation and a proper treatment of the molecular rotational degrees of freedom in the vicinity of 0 K. All individual terms contributing to the sublimation enthalpy as a function of temperature are discussed and their uncertainties estimated by comparison to critically evaluated experimental data.
Affiliation(s)
- Ctirad Červinka
- Department of Physical Chemistry, University of Chemistry and Technology, Prague, Technická 5, CZ-166 28 Prague 6, Czech Republic
- Michal Fulem
- Department of Physical Chemistry, University of Chemistry and Technology, Prague, Technická 5, CZ-166 28 Prague 6, Czech Republic