1
|
Roy D, Patel C. Revisiting the Use of Quantum Chemical Calculations in LogP octanol-water Prediction. Molecules 2023; 28:801. [PMID: 36677858 PMCID: PMC9866719 DOI: 10.3390/molecules28020801] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 01/06/2023] [Accepted: 01/10/2023] [Indexed: 01/15/2023] Open
Abstract
The partition coefficients of drug and drug-like molecules between an aqueous and an organic phase are an important property for developing new therapeutics. Computational methods are used extensively to predict the partition coefficients of molecules. Quantum chemical calculations are used to develop structure-activity relationship models for such prediction, based either on molecular fragment methods or on direct calculation of the solvation free energy in a solvent continuum. The applicability, merits, and shortcomings of these developments are revisited here.
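The "direct calculation" route mentioned in this abstract rests on the standard thermodynamic relation between the transfer free energy and the partition coefficient, log P = (ΔG_solv,water − ΔG_solv,octanol) / (RT ln 10). The following is a minimal sketch of that conversion only. The function name `logp_from_solvation` and the numeric inputs are illustrative assumptions and are not taken from the cited paper.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def logp_from_solvation(dg_water, dg_octanol, temperature=298.15):
    """Estimate log P (octanol/water) from solvation free energies in kcal/mol.

    A solute that is more stabilised in octanol (more negative dg_octanol)
    yields a positive log P.
    """
    return (dg_water - dg_octanol) / (R_KCAL * temperature * math.log(10))


# Hypothetical solvation free energies, for illustration only.
example_logp = logp_from_solvation(dg_water=-4.0, dg_octanol=-6.0)
print(round(example_logp, 2))
```

In practice the two free energies would come from a continuum solvation calculation. The sketch assumes both values refer to the same standard state and temperature.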
Affiliation(s)
- Dipankar Roy
- Department of Biological Sciences, University of Alberta, Edmonton, AB T6G 2E9, Canada
- Chandan Patel
- Department of Applied Sciences, COEP Technological University, Wellesely Road, Shivajinagar, Pune 411005, Maharashtra, India
|
2
|
Delforce L, Duprat F, Ploix JL, Ontiveros JF, Goussard V, Nardello-Rataj V, Aubry JM. Fast Prediction of the Equivalent Alkane Carbon Number Using Graph Machines and Neural Networks. ACS Omega 2022; 7:38869-38881. [PMID: 36340160 PMCID: PMC9631404 DOI: 10.1021/acsomega.2c04592] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/20/2022] [Accepted: 08/09/2022] [Indexed: 06/16/2023]
Abstract
The hydrophobicity of oils is a key parameter to design surfactant/oil/water (SOW) macro-, micro-, or nano-dispersed systems with the desired features. This essential physicochemical characteristic is quantitatively expressed by the equivalent alkane carbon number (EACN) whose experimental determination is tedious since it requires knowledge of the phase behavior of the SOW systems at different temperatures and for different surfactant concentrations. In this work, two mathematical models are proposed for the rapid prediction of the EACN of oils. They have been designed using artificial intelligence (machine-learning) methods, namely, neural networks (NN) and graph machines (GM). While the GM model is implemented from the SMILES codes of a 111-molecule training set of known EACN values, the NN model is fed with some σ-moment descriptors computed with the COSMOtherm software for the 111-molecule set. In a preliminary step, the leave-one-out algorithm is used to select, given the available data, the appropriate complexity of the two models. A comparison of the EACNs of liquids of a fresh set of 10 complex cosmetic and perfumery molecules shows that the two approaches provide comparable results in terms of accuracy and reliability. Finally, the NN and GM models are applied to nine series of homologous compounds, for which the GM model results are in better agreement with the experimental EACN trends than the NN model predictions. The results obtained by the GMs and by the NN based on σ-moments can be duplicated with the demonstration tool available for download as detailed in the Supporting Information.
Affiliation(s)
- Lucie Delforce
- University of Lille, CNRS, Centrale Lille, Université d'Artois, UMR 8181 - UCCS - Unité de Catalyse et Chimie du Solide, F-59000 Lille, France
- François Duprat
- Laboratoire de Chimie Organique, CNRS, ESPCI Paris, PSL Research University, 10 rue Vauquelin, 75005 Paris, France
- Jean-Luc Ploix
- Laboratoire de Chimie Organique, CNRS, ESPCI Paris, PSL Research University, 10 rue Vauquelin, 75005 Paris, France
- Jesus Fermín Ontiveros
- University of Lille, CNRS, Centrale Lille, Université d'Artois, UMR 8181 - UCCS - Unité de Catalyse et Chimie du Solide, F-59000 Lille, France
- Valentin Goussard
- University of Lille, CNRS, Centrale Lille, Université d'Artois, UMR 8181 - UCCS - Unité de Catalyse et Chimie du Solide, F-59000 Lille, France
- Véronique Nardello-Rataj
- University of Lille, CNRS, Centrale Lille, Université d'Artois, UMR 8181 - UCCS - Unité de Catalyse et Chimie du Solide, F-59000 Lille, France
- Jean-Marie Aubry
- University of Lille, CNRS, Centrale Lille, Université d'Artois, UMR 8181 - UCCS - Unité de Catalyse et Chimie du Solide, F-59000 Lille, France
|
3
|
Bio-based alternatives to volatile silicones: Relationships between chemical structure, physicochemical properties and functional performances. Adv Colloid Interface Sci 2022; 304:102679. [PMID: 35512559 DOI: 10.1016/j.cis.2022.102679] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 01/03/2022] [Revised: 04/08/2022] [Accepted: 04/13/2022] [Indexed: 11/23/2022]
Abstract
Emollient oils are ubiquitous ingredients of personal care products, especially skin care and hair care formulations. They offer excellent spreading properties and give end-use products a soft, pleasant and non-sticky after-feel. Emollients belong to various petro- or bio-based chemical families among which silicone oils, hydrocarbons and esters are the most prominent. Silicones have exceptional physicochemical and sensory properties but their high chemical stability results in very low biodegradability and a high bioaccumulation potential. Nowadays, consumers are increasingly responsive to environmental issues and demand more environmentally friendly products. This awareness strongly encourages the cosmetics industry to develop bio-based alternatives to silicone oils. Finding effective silicone-free emollients requires understanding the molecular origin of emollience. This review details the relationships between the molecular structures of emollients and their physicochemical properties as well as the resulting functional performances in order to facilitate the design of alternative oils with suitable physicochemical and sensory properties. The molecular profile of an ideal emollient in terms of chemical function (alkane, ether, ester, carbonate, alcohol), optimal number of carbons and branching is established to obtain an odourless oil with good spreading on the skin. Since none of the carbon-based emollients alone can imitate the non-sticky and dry feel of silicone oils, it is judicious to blend alkanes and esters to significantly improve both the sensory properties and the solubilizing properties of the synergistic mixture towards polar ingredients (sun filters, antioxidants, fragrances). Finally, it is shown how modelling tools (QSPR, COSMO-RS and neural networks) can predict in silico the key properties of hundreds of virtual candidate molecules, so that only the most promising candidates, whose predicted properties are close to the specifications, need to be synthesized.
|
4
|
A Simple, Robust and Efficient Computational Method for n-Octanol/Water Partition Coefficients of Substituted Aromatic Drugs. Sci Rep 2017; 7:5760. [PMID: 28720783 PMCID: PMC5515958 DOI: 10.1038/s41598-017-05964-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 12/13/2016] [Accepted: 05/02/2017] [Indexed: 11/09/2022] Open
Abstract
In this paper, multiple linear regression (MLR) was used to build a quantitative structure-property relationship (QSPR) model for the n-octanol-water partition coefficient (logPo/w) of 195 substituted aromatic drugs. The molecular descriptors were calculated for each compound with VLifeMDS. By applying a genetic algorithm combined with multiple linear regression (GA/MLR), the most relevant descriptors were selected to build the QSPR model. The robustness of the model was characterized by statistical validation and by its applicability domain (AD). The predictions from MLR are in good agreement with the experimental values. The R2 and Q2LOO values for MLR are 0.9433 and 0.9341, respectively. The AD of the model was analyzed based on the Williams plot. The effects of the selected descriptors are described.
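The two statistics quoted in this abstract, R2 and Q2LOO, can be illustrated with a small sketch: an ordinary least-squares MLR fit scored on its own training data (R2), and the same model scored by leave-one-out cross-validation (Q2LOO). The descriptors and target values below are synthetic and all function names are hypothetical. This is not the authors' GA/MLR workflow or their data. Q2 conventions differ between authors, and this sketch uses the mean of the full data set in the denominator.

```python
import numpy as np


def fit_mlr(X, y):
    """Ordinary least-squares fit with an intercept; returns [b0, b1, ...]."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


def predict_mlr(coef, X):
    """Apply fitted coefficients to a 2-D descriptor array."""
    return coef[0] + X @ coef[1:]


def r_squared(y, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot."""
    ss_res = np.sum((y - y_pred) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


def q2_loo(X, y):
    """Leave-one-out Q2: refit without each compound, then predict it."""
    preds = np.empty(len(y), dtype=float)
    for i in range(len(y)):
        keep = np.arange(len(y)) != i
        coef = fit_mlr(X[keep], y[keep])
        preds[i] = predict_mlr(coef, X[i:i + 1])[0]
    return r_squared(y, preds)


# Synthetic stand-in for a descriptor matrix and measured logP values.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 3))
y = 1.5 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.1, size=40)

r2 = r_squared(y, predict_mlr(fit_mlr(X, y), X))
q2 = q2_loo(X, y)
print(f"R2 = {r2:.4f}, Q2_LOO = {q2:.4f}")
```

For a least-squares fit, Q2LOO cannot exceed R2 under this convention, so a small gap between the two (as in the values reported above) suggests the model is not strongly overfitted to individual compounds.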
|
5
|
Deeb O, Goodarzi M. QSAR of Antioxidants. Oncology 2017. [DOI: 10.4018/978-1-5225-0549-5.ch015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
Antioxidants are substances that protect cells from the damaging effects of oxygen radicals, which are chemicals that play a part in diseases such as cancer. Antioxidants are expected to be promising drugs in the management of these diseases because they relieve oxidative stress. Most of the modeling approaches involved in designing new antioxidants are based on Quantitative Structure-Activity Relationship (QSAR). A number of QSAR studies have been conducted to elucidate the structural requirements of antioxidants for their activities in order to predict the potency of these compounds with regard to the targeted activity and to direct the synthesis of more potent analogues. The main focus of this chapter is on the QSAR modeling of antioxidant compounds. The authors present different QSAR studies of antioxidant compounds and compare them in terms of the best models obtained and their use in designing potential new drugs.
|
6
|
Deeb O, Martínez-Pachecho H, Ramírez-Galicia G, Garduño-Juárez R. Application of Docking Methodologies in QSAR-Based Studies. PHARMACEUTICAL SCIENCES 2017. [DOI: 10.4018/978-1-5225-1762-7.ch033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022] Open
Abstract
Computational strategies, such as virtual screening techniques, permeate all aspects of drug discovery. Virtual screening methods can be classified as ligand-based or structure-based. Ligand-based methods such as Quantitative Structure-Activity Relationship (QSAR) are used when a set of active ligand compounds is known and little or no structural information is available for the receptors. In structure-based drug design, the most widespread method is molecular docking. It is widely accepted that drug activity is obtained through the molecular binding of a ligand to a receptor. In their binding conformations, the molecules exhibit geometric and chemical complementarity, both of which are essential for successful drug activity. The molecular docking approach can be used to model the interaction between a small drug molecule and a protein, which allows us to characterize the performance of small molecules in the binding site of target proteins as well as to clarify fundamental biochemical processes.
|
7
|
|
8
|
Lamrini B, Della Valle G, Trelea IC, Perrot N, Trystram G. A new method for dynamic modelling of bread dough kneading based on artificial neural network. Food Control 2012. [DOI: 10.1016/j.foodcont.2012.01.011] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 11/30/2022]
|
9
|
Garrido NM, Economou IG, Queimada AJ, Jorge M, Macedo EA. Prediction of the n-hexane/water and 1-octanol/water partition coefficients for environmentally relevant compounds using molecular simulation. AIChE J 2011. [DOI: 10.1002/aic.12718] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Indexed: 01/19/2023]
|
10
|
Garrido NM, Jorge M, Queimada AJ, Macedo EA, Economou IG. Using molecular simulation to predict solute solvation and partition coefficients in solvents of different polarity. Phys Chem Chem Phys 2011; 13:9155-64. [DOI: 10.1039/c1cp20110g] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Indexed: 11/21/2022]
|
11
|
Quantitative structure–property relationship study of n-octanol–water partition coefficients of some of diverse drugs using multiple linear regression. Anal Chim Acta 2007; 604:99-106. [DOI: 10.1016/j.aca.2007.10.004] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.9] [Received: 04/03/2007] [Revised: 10/04/2007] [Accepted: 10/04/2007] [Indexed: 11/23/2022]
|
12
|
Comparing evolutionary hybrid systems for design and optimization of multilayer perceptron structure along training parameters. Inf Sci (N Y) 2007. [DOI: 10.1016/j.ins.2007.02.021] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.2] [Indexed: 11/23/2022]
|
13
|
Di Fenza A, Alagona G, Ghio C, Leonardi R, Giolitti A, Madami A. Caco-2 cell permeability modelling: a neural network coupled genetic algorithm approach. J Comput Aided Mol Des 2007; 21:207-21. [PMID: 17265097 DOI: 10.1007/s10822-006-9098-3] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Received: 07/04/2006] [Accepted: 12/14/2006] [Indexed: 10/23/2022]
Abstract
The ability to cross the intestinal cell membrane is a fundamental prerequisite of a drug compound. However, the experimental measurement of such an important property is a costly and highly time-consuming step of the drug development process because it is necessary to synthesize the compound first. Therefore, in silico modelling of intestinal absorption, which can be carried out at very early stages of drug design, is an appealing alternative procedure which is based mainly on multivariate statistical analysis such as partial least squares (PLS) and neural networks (NN). Our implementation of neural network models for the prediction of intestinal absorption is based on the correlation of Caco-2 cell apparent permeability (P(app)) values, as a measure of intestinal absorption, to the structures of two different data sets of drug candidates. Several molecular descriptors of the compounds were calculated and the optimal subsets were selected using a genetic algorithm; the method is therefore referred to as Genetic Algorithm-Neural Network (GA-NN). A methodology combining a genetic algorithm search with neural network analysis applied to the modelling of Caco-2 P(app) has never been presented before, although the two procedures have already been employed separately. Moreover, we provide new Caco-2 cell permeability measurements for more than two hundred compounds. Interestingly, the selected descriptors are shown to have physico-chemical connotations which are in excellent accordance with the well-known relevant molecular properties involved in the cellular membrane permeation phenomenon: hydrophilicity, hydrogen bonding propensity, hydrophobicity and molecular size. The predictive ability of the models, although rather good for a preliminary study, is somewhat affected by the poor precision of the experimental Caco-2 measurements.
Finally, the generalization ability of one model was checked on an external test set not derived from the data sets used to build the models. The result obtained is of interesting practical application and underlines that the successful model construction is strictly dependent on the structural space representation of the data set used for model development.
Affiliation(s)
- Armida Di Fenza
- Molecular Modelling Lab, Institute for Physico-Chemical Processes (IPCF) CNR, Via G Moruzzi 1, Pisa, Italy.
|
14
|
Peterson KL. Artificial Neural Networks and Their use in Chemistry. REVIEWS IN COMPUTATIONAL CHEMISTRY 2007. [DOI: 10.1002/9780470125939.ch2] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 03/15/2023]
|
15
|
Molnár L, Keseru GM, Papp A, Lorincz Z, Ambrus G, Darvas F. A neural network based classification scheme for cytotoxicity predictions: Validation on 30,000 compounds. Bioorg Med Chem Lett 2006; 16:1037-9. [PMID: 16288868 DOI: 10.1016/j.bmcl.2005.10.079] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 09/14/2005] [Revised: 10/20/2005] [Accepted: 10/24/2005] [Indexed: 11/24/2022]
Abstract
Elimination of cytotoxic compounds in the early phases of drug discovery can save substantial amounts of research and development costs. An artificial neural network based approach using atomic fragmental descriptors has been developed to categorize compounds according to their in vitro human cytotoxicity. Fragmental descriptors were obtained from the Atomic7 linear logP calculation method implemented in the Pallas PrologP program. We used cytotoxicity values obtained from an in-house screening campaign of a diverse set of 30,000 drug-like molecules. The training set included only the most and least toxic 12,998 compounds; however, cytotoxicity data for all compounds were used for validation. The proposed approach can be safely used for filtering out potentially cytotoxic candidates from the development pipeline before synthesis or assays during lead development or lead optimisation. The trained neural network misclassified less than 5% of the non-toxic and 9% of the toxic compounds.
Affiliation(s)
- László Molnár
- Department of Chemical Information Technology, Budapest University of Technology and Economics, Szent Gellért tér 4., H-1111 Budapest, Hungary
|
16
|
Graph Machines and Their Applications to Computer-Aided Drug Design: A New Approach to Learning from Structured Data. ACTA ACUST UNITED AC 2006. [DOI: 10.1007/11839132_1] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3]
|
17
|
Molnár L, Keseru GM, Papp A, Gulyás Z, Darvas F. A neural network based prediction of octanol-water partition coefficients using atomic5 fragmental descriptors. Bioorg Med Chem Lett 2004; 14:851-3. [PMID: 15012980 DOI: 10.1016/j.bmcl.2003.12.024] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Received: 07/31/2003] [Revised: 11/19/2003] [Accepted: 12/04/2003] [Indexed: 11/30/2022]
Abstract
An artificial neural network based approach using Atomic5 fragmental descriptors has been developed to predict the octanol-water partition coefficient (logP). We used a pre-selected set of organic molecules from PHYSPROP database as training and test sets for a feedforward neural network. Results demonstrate the superiority of our non-linear model over the traditional linear method.
Affiliation(s)
- László Molnár
- Department of Chemical Information Technology, Budapest University of Technology and Economics, Szent Gellért tér 4., H-1111 Budapest, Hungary
|
18
|
Taskinen J, Yliruusi J. Prediction of physicochemical properties based on neural network modelling. Adv Drug Deliv Rev 2003; 55:1163-83. [PMID: 12954197 DOI: 10.1016/s0169-409x(03)00117-0] [Citation(s) in RCA: 115] [Impact Index Per Article: 5.5] [Indexed: 10/27/2022]
Abstract
The literature describing neural network modelling to predict physicochemical properties of organic compounds from the molecular structure is reviewed from the perspective of pharmaceutical research. The standard three-layer, feed-forward neural network is the technique most frequently used, although the use of other techniques is increasing. Various approaches to describe the molecular structure have been successfully used, including molecular fragments, topological indices, and descriptors calculated by semi-empirical quantum chemical methods. Some physicochemical properties, such as octanol-water partition coefficient, water solubility, boiling point and vapour pressure, have been modelled by several research groups over the years using different approaches and structurally diverse large training sets. The prediction accuracy of most models seems to be rather close to the performance of the experimental measurements, when the accuracy is assessed with a test set from the working database. Results with independent test sets have been less satisfactory. Implications of this problem are discussed.
Affiliation(s)
- Jyrki Taskinen
- Viikki Drug Discovery Technology Center, Department of Pharmacy, University of Helsinki, Helsinki, Finland.
|
19
|
Bianucci AM, Micheli A, Sperduti A, Starita A. A Novel Approach to QSPR/QSAR Based on Neural Networks for Structures. SOFT COMPUTING APPROACHES IN CHEMISTRY 2003. [DOI: 10.1007/978-3-540-36213-5_10] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 02/10/2023]
|
20
|
Giordanetto F, Fossa P, Menozzi G, Mosti L. In silico rationalization of the structural and physicochemical requirements for photobiological activity in angelicine derivatives and their heteroanalogues. J Comput Aided Mol Des 2003; 17:53-64. [PMID: 12926855 DOI: 10.1023/a:1024557113083] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
In PUVA (Psoralen plus UVA) chemotherapy, 8-methoxypsoralen is the most widely used compound, although its efficacy is accompanied by undesired side effects. In order to obtain clear anti-proliferative activity with reduced phototoxicity, many linear and angular derivatives have been synthesised. In this paper we describe a QSAR study in which, by means of the neural network methodology, a useful model for predicting biological activity, expressed as ID50 (the UVA dose that reduces DNA synthesis in Ehrlich cells to 50%), has been derived. A decision tree that is able to discriminate between active and inactive compounds has been built based on recursive partitioning. The study shows the key structural features responsible for the activity and could be a helpful tool in the rational design of new, less toxic photochemotherapeutic agents.
Affiliation(s)
- Fabrizio Giordanetto
- Centre for Computational Science, Department of Chemistry, Queen Mary, University of London, Mile End Road, London E1 4NS, United Kingdom
|
21
|
Xing L, Glen RC. Novel methods for the prediction of logP, pK(a), and logD. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2002; 42:796-805. [PMID: 12132880 DOI: 10.1021/ci010315d] [Citation(s) in RCA: 112] [Impact Index Per Article: 5.1] [Indexed: 11/29/2022]
Abstract
Novel methods for predicting logP, pK(a), and logD values have been developed using data sets (592 molecules for logP and 1029 for pK(a)) containing a wide range of molecular structures. An equation with three molecular properties (polarizability and partial atomic charges on nitrogen and oxygen) correlates highly with logP (r2 = 0.89). The pK(a)s are estimated for both acids and bases using a novel tree structured fingerprint describing the ionizing centers. The new models have been compared with existing models and also experimental measurements on test sets of common organic compounds and pharmaceutical molecules.
Affiliation(s)
- Li Xing
- Tripos, Inc., 1699 South Hanley Road, St. Louis, Missouri 63144, USA.
|
22
|
Yaffe D, Cohen Y, Espinosa G, Arenas A, Giralt F. Fuzzy ARTMAP and back-propagation neural networks based quantitative structure-property relationships (QSPRs) for octanol-water partition coefficient of organic compounds. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2002; 42:162-83. [PMID: 11911684 DOI: 10.1021/ci0103267] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.5] [Indexed: 11/30/2022]
Abstract
Quantitative structure-property relationships (QSPRs) for estimating the logarithm of the octanol/water partition coefficient, logK(ow), at 25 °C were developed based on fuzzy ARTMAP and back-propagation neural networks using a heterogeneous set of 442 organic compounds. The set of molecular descriptors was derived from molecular connectivity indices and quantum chemical descriptors calculated from PM3 semiempirical MO-theory. Quantum chemical input descriptors include average polarizability, dipole moments, exchange energy, total electrostatic interaction energy, total two-center energy, and ionization potential. The fuzzy ARTMAP/QSPR performed, for a logK(ow) range of -1.6 to 7.9, with average absolute errors of 0.03 and 0.14 logK(ow) for the overall data and test sets, respectively. The optimal 12-11-1 back-propagation/QSPR model, for the same range of logK(ow), exhibited larger average absolute errors of 0.23 and 0.27 logK(ow) for the test and validation data sets, respectively. The present results with the fuzzy ARTMAP-based QSPR are encouraging and suggest that a high-performance logK(ow) QSPR that encompasses a wider range of chemical groups could be developed, following the present approach, by training with a larger heterogeneous data set.
Affiliation(s)
- Denise Yaffe
- Department of Chemical Engineering, University of California, Los Angeles, Los Angeles, California 90095-1592, USA
|
23
|
Bruneau P. Search for predictive generic model of aqueous solubility using Bayesian neural nets. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2001; 41:1605-16. [PMID: 11749587 DOI: 10.1021/ci010363y] [Citation(s) in RCA: 99] [Impact Index Per Article: 4.3] [Indexed: 11/30/2022]
Abstract
Several predictive models of aqueous solubility have been published. They perform well on the data sets used to train them, but these data sets usually do not contain many structures similar to those of interest in drug research, so their applicability in drug hunting is questionable. A very diverse data set has been gathered with compounds drawn from literature reports and proprietary compounds. These compounds have been grouped in a so-called literature data set I, a proprietary data set II, and a mixed data set III formed by I and II. About 100 descriptors emphasizing surface properties were calculated for every compound. Bayesian learning of neural nets, which combines the advantages of neural nets without their weaknesses, was used to select the most parsimonious models and train them, from I, II, and III. The models were established either by selecting the most efficient descriptors one by one using a modified Gram-Schmidt procedure (GS) or by simplifying a most complete model using the automatic relevance procedure (ARD). The predictive ability of the models was assessed using validation data sets as unrelated to the training sets as possible, using two new parameters: NDD(x,ref), the normalized smallest descriptor distance of a compound x to a reference data set, and CD(x,mod), the combination of NDD(x,ref) with the dispersion of the Bayesian neural net calculations. The results show that it is possible to obtain a generic predictive model from database I but that the diversity of database II is too restricted to give a model with good generalization ability and that the ARD method applied to the mixed database III gives the best predictive model.
Affiliation(s)
- P Bruneau
- AstraZeneca Centre de Recherche, Parc Industriel Pompelle, BP 1050, 51689 Reims, France.
|
24
|
Mannhold R, van de Waterbeemd H. Substructure and whole molecule approaches for calculating log P. J Comput Aided Mol Des 2001; 15:337-54. [PMID: 11349816 DOI: 10.1023/a:1011107422318] [Citation(s) in RCA: 131] [Impact Index Per Article: 5.7] [Indexed: 11/12/2022]
Abstract
Lipophilicity is a major determinant of pharmacokinetic and pharmacodynamic properties of drug molecules. Correspondingly, there is great interest in medicinal chemistry in developing methods of deriving the quantitative descriptor of lipophilicity, the partition coefficient P, from molecular structure. Roughly, methods for calculating log P can be divided into two major classes: Substructure approaches have in common that molecules are cut into atoms (atom contribution methods) or groups (fragmental methods); summing the single-atom or fragmental contributions (supplemented by applying correction rules in the latter case) results in the final log P. Whole molecule approaches inspect the entire molecule; they use for instance molecular lipophilicity potentials (MLP), topological indices or molecular properties to quantify log P. In this review, representative members of substructure and whole molecule approaches for calculating log P are described; their advantages and shortcomings are discussed. Finally, the predictive power of some calculation methods is compared and a scheme for classifying calculation methods is proposed.
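The substructure approach described in this abstract amounts to an additive scheme: log P is estimated as the sum of per-fragment contributions plus any correction terms. The sketch below shows only that bookkeeping. The fragment names, contribution values and the function `fragment_logp` are hypothetical placeholders chosen for illustration, and they do not reproduce any published fragment system.

```python
def fragment_logp(fragment_counts, contributions, corrections=()):
    """Sum fragment contributions (count * value) and add correction terms."""
    total = sum(contributions[name] * count
                for name, count in fragment_counts.items())
    return total + sum(corrections)


# Hypothetical contribution table; the values are NOT from any real method.
contributions = {"CH3": 0.5, "CH2": 0.5, "OH": -1.5}

# A molecule described as a bag of fragments (here: 1 CH3, 3 CH2, 1 OH).
fragments = {"CH3": 1, "CH2": 3, "OH": 1}

estimate = fragment_logp(fragments, contributions)
corrected = fragment_logp(fragments, contributions, corrections=(0.2,))
print(estimate, corrected)
```

Whole-molecule approaches, by contrast, would replace the lookup table with properties computed for the entire structure, as the abstract notes.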
Affiliation(s)
- R Mannhold
- Molecular Drug Research Group, Heinrich-Heine-Universität, Düsseldorf, Germany.
|
25
|
Micheli A, Sperduti A, Starita A, Bianucci AM. Analysis of the internal representations developed by neural networks for structures applied to quantitative structure-activity relationship studies of benzodiazepines. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2001; 41:202-18. [PMID: 11206375 DOI: 10.1021/ci9903399] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.3] [Indexed: 11/30/2022]
Abstract
An application of recursive cascade correlation (CC) neural networks to quantitative structure-activity relationship (QSAR) studies is presented, with emphasis on the study of the internal representations developed by the neural networks. Recursive CC is a neural network model recently proposed for the processing of structured data. It allows the direct handling of chemical compounds as labeled ordered directed graphs, and constitutes a novel approach to QSAR. The adopted representation of molecular structure captures, in a quite general and flexible way, significant topological aspects and chemical functionalities for each specific class of molecules showing a particular chemical reactivity or biological activity. A class of 1,4-benzodiazepin-2-ones is analyzed by the proposed approach. It compares favorably versus the traditional QSAR treatment based on equations. To show the ability of the model in capturing most of the structural features that account for the biological activity, the internal representations developed by the networks are analyzed by principal component analysis. This analysis shows that the networks are able to discover relevant structural features just on the basis of the association between the molecular morphology and the target property (affinity).
Affiliation(s)
- A Micheli
- Dipartimento di Informatica, Università di Pisa, Italy.
|
26
|
Hitzel L, Watt AP, Locker KL. An increased throughput method for the determination of partition coefficients. Pharm Res 2000; 17:1389-95. [PMID: 11205732 DOI: 10.1023/a:1007546905874] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.2] [Indexed: 11/12/2022]
Abstract
PURPOSE: To present an increased-throughput automated shake-flask method for the direct determination of the partition coefficients of solutes between octan-1-ol and buffer. METHOD: The traditional shake-flask method has been transferred onto 96-well plate technology and a robotic liquid handler has been used for sample preparation. A custom-programmed Gilson autosampler samples the organic and aqueous phases directly from the plate, circumventing the need for any manual separation. Analyses are performed by reverse-phase high-performance liquid chromatography (RP-HPLC). Generic fast-gradient RP-HPLC conditions are used to eliminate chromatographic method development time and reduce analysis time. RESULTS: A full validation of the automated method is presented for a range of compounds with log D values between -2 and 4. CONCLUSIONS: The advantages and limitations of this direct measurement method are discussed. The use of this methodology provides a means to rapidly assess log D values for large compound arrays.
Affiliation(s)
- L Hitzel
- Merck Sharp and Dohme, Neuroscience Research Centre, Essex, United Kingdom
|
27
|
Nonparametric regression applied to quantitative structure-activity relationships. JOURNAL OF CHEMICAL INFORMATION AND COMPUTER SCIENCES 2000; 40:452-9. [PMID: 10761152 DOI: 10.1021/ci990082e] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.8] [Indexed: 11/29/2022]
Abstract
Several nonparametric regressors have been applied to modeling quantitative structure-activity relationship (QSAR) data. The simplest regressor, the Nadaraya-Watson, was assessed in a genuine multivariate setting. Other regressors, the local linear and the shifted Nadaraya-Watson, were implemented within additive models, a computationally more expedient approach that is better suited for low-density designs. Performances were benchmarked against the nonlinear method of smoothing splines. A linear reference point was provided by multilinear regression (MLR). Variable selection was explored using systematic combinations of different variables and combinations of principal components. For the data set examined, 47 inhibitors of dopamine beta-hydroxylase, the additive nonparametric regressors have greater predictive accuracy (as measured by the mean absolute error of the predictions or the Pearson correlation in cross-validation trials) than MLR. The use of principal components did not improve the performance of the nonparametric regressors over use of the original descriptors, since the original descriptors are not strongly correlated. It remains to be seen if the nonparametric regressors can be successfully coupled with better variable selection and dimensionality reduction in the context of high-dimensional QSARs.
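For readers unfamiliar with the Nadaraya-Watson regressor named in this abstract, the following is a minimal sketch of the basic multivariate form, together with a leave-one-out mean absolute error. It is not the authors' implementation: the Gaussian kernel, the single shared bandwidth, the Euclidean distance, the leave-one-out scheme, and the function names `nadaraya_watson` and `loo_mae` are all assumptions made for illustration, and the toy data are hypothetical.

```python
import math


def nadaraya_watson(x_train, y_train, x_query, bandwidth):
    """Predict y at x_query as a kernel-weighted average of y_train.

    Assumes a Gaussian kernel on the Euclidean distance between
    descriptor vectors and one bandwidth shared by all descriptors.
    """
    weights = []
    for x in x_train:
        dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_query)))
        u = dist / bandwidth
        weights.append(math.exp(-0.5 * u * u))
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights underflowed; increase bandwidth")
    return sum(w * y for w, y in zip(weights, y_train)) / total


def loo_mae(x, y, bandwidth):
    """Leave-one-out mean absolute error of the regressor above."""
    errors = []
    for i in range(len(x)):
        # Hold out compound i and predict it from the remaining compounds.
        x_rest = x[:i] + x[i + 1:]
        y_rest = y[:i] + y[i + 1:]
        pred = nadaraya_watson(x_rest, y_rest, x[i], bandwidth)
        errors.append(abs(pred - y[i]))
    return sum(errors) / len(errors)


if __name__ == "__main__":
    # Hypothetical two-descriptor toy data, not taken from the paper.
    descriptors = [(0.1, 1.0), (0.4, 0.8), (0.5, 0.3), (0.9, 0.2)]
    activity = [5.2, 5.9, 6.4, 7.1]
    print(nadaraya_watson(descriptors, activity, (0.45, 0.5), bandwidth=0.3))
    print(loo_mae(descriptors, activity, bandwidth=0.3))
```

The bandwidth controls how local the average is: small values follow individual training compounds closely, large values approach the global mean of the training activities.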
|
28
|
Le Bret C. A general 13C NMR spectrum predictor using data mining techniques. SAR AND QSAR IN ENVIRONMENTAL RESEARCH 2000; 11:211-234. [PMID: 10969872 DOI: 10.1080/10629360008033232] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 05/23/2023]
Abstract
A general-case neural network model for 13C NMR spectrum prediction (estimation) was built from more than 8,300 carbon atoms having various environments. Building the model from the data set required a few weeks' work using commercial software. Average deviation on test data is ca. 4 ppm. There is no limit on molecule complexity. Estimation error does not depend on molecule size or complexity. The emphasis is on the data, the method and the results, not on the processes that take place inside the modelling software. Advantages, disadvantages and peculiarities of neural network-based data modelling ("data mining") are described at length. The differences in data handling between the data mining approach and traditional statistical modelling techniques are discussed and illustrated in detail. The spectrum predictor is available from PMSI at no charge.
Affiliation(s)
- C Le Bret
- Department of Research, PMSI, Paris, France
|
29
|
Kövesdi I, Dominguez-Rodriguez MF, Orfi L, Náray-Szabó G, Varró A, Papp JG, Mátyus P. Application of neural networks in structure-activity relationships. Med Res Rev 1999; 19:249-69. [PMID: 10232652 DOI: 10.1002/(sici)1098-1128(199905)19:3<249::aid-med4>3.0.co;2-0] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
The methodology and application of artificial neural networks in structure-activity relationships are reviewed, focusing on the most frequently used three-layer feedforward back-propagation procedure. Two applications of neural networks are presented, and their performance is compared with that of CoMFA and a classical QSAR analysis.
Affiliation(s)
- I Kövesdi
- EGIS Pharmaceuticals Ltd., Budapest, Hungary
|