1
Guan D, Lui R, Matthews ST. Low-cost quantum mechanical descriptors for data efficient skin sensitization QSAR models. Curr Res Toxicol 2024; 7:100183. [PMID: 39021404 PMCID: PMC11253267 DOI: 10.1016/j.crtox.2024.100183] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2023] [Revised: 06/15/2024] [Accepted: 06/17/2024] [Indexed: 07/20/2024] [Open Access]
Abstract
Quantitative structure-activity relationship (QSAR) modelling methodologies need to incorporate relevant mechanistic information to have high predictive performance and validity. Electrophilic reactivity is a common mechanistic feature of skin sensitization endpoints that can be concisely characterized with electronic descriptors, which is key to enabling the modelling of small datasets in this domain. However, quantum mechanical methodologies have previously featured high computational costs, which would exclude the use of large datasets. Consequently, we investigate the use of electronic descriptors calculated using the Hartree-Fock with three corrections (HF-3c) method, a low-cost ab initio methodology that has higher chemical accuracy than previous semiempirical methodologies, for modelling in vitro skin sensitization assay outcomes. We also model the Ames assay as a surrogate for determining skin sensitization outcomes. The quantum chemical descriptors calculated using the HF-3c method with conductor-like polarizable continuum model (CPCM) implicit solvation improved QSAR model performance for the in vitro Ames (n = 6049, 0.770 AUC), KeratinoSens (n = 164, 0.763 AUC), and Direct Peptide Reactivity Assay (n = 122, 0.750 AUC) datasets, with their combination producing high predictive performance for unseen in vivo Local Lymph Node Assay (n = 86, 0.789 AUC) and Human Repeated Insult Patch Test (n = 86, 0.791 AUC) assay toxicant outcomes.
Affiliation(s)
- Davy Guan
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, NSW 2006, Australia
- Raymond Lui
- Computational Pharmacology & Toxicology Laboratory, Discipline of Pharmacology, Sydney Pharmacy School, Faculty of Medicine and Health, The University of Sydney, NSW 2006, Australia
2
Zukić S, Maran U. Modelling of antiproliferative activity measured in HeLa cervical cancer cells in a series of xanthene derivatives. SAR and QSAR in Environmental Research 2020; 31:905-921. [PMID: 33236957 DOI: 10.1080/1062936x.2020.1839131] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 08/13/2020] [Accepted: 10/15/2020] [Indexed: 06/11/2023]
Abstract
Cancer remains one of the leading causes of death in humans, and new drug substances are therefore being developed. Thus, the anti-cancer activity of xanthene derivatives has become an important topic in the development of new and potent anti-cancer drug substances. A previously published novel series of xanthen-3-one and xanthen-1,8-dione derivatives was synthesized in one of our laboratories and showed anti-proliferative activity in HeLa cancer cell lines. This series serves as a good basis for developing a quantitative structure-activity relationship (QSAR) to study the relations between anti-proliferative activity and chemical structures. A QSAR model has been derived that relies only on two-dimensional molecular descriptors, providing mechanistic insight into the anti-proliferative activity of xanthene derivatives. The model is validated internally and externally, and additionally with the set of inactive compounds of the original data, confirming model applicability for the design and discovery of novel xanthene derivatives. The QSAR model is available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.237).
Affiliation(s)
- S Zukić
- Department of Pharmaceutical Chemistry, University of Sarajevo, Sarajevo, Bosnia and Herzegovina
- U Maran
- Department of Chemistry, University of Tartu, Tartu, Estonia
3
Achary PGR. Applications of Quantitative Structure-Activity Relationships (QSAR) based Virtual Screening in Drug Design: A Review. Mini Rev Med Chem 2020; 20:1375-1388. [DOI: 10.2174/1389557520666200429102334] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 09/10/2019] [Revised: 11/07/2019] [Accepted: 11/08/2019] [Indexed: 12/18/2022]
Abstract
Scientists and researchers around the globe generate a tremendous amount of information every day; for instance, more than 74 million molecules are registered in the Chemical Abstracts Service so far. According to a recent study, there are around 10^60 molecules that are classified as new drug-like molecules. The library of such molecules is now considered as ‘dark chemical space’ or ‘dark chemistry.’ In order to explore such hidden molecules scientifically, a good number of live and updated databases (protein, cell, tissues, structure, drugs, etc.) are available today. The synchronization of three different sciences, ‘genomics’, ‘proteomics’ and ‘in-silico simulation’, will revolutionize the process of drug discovery. The screening of a sizable number of drug-like molecules is a challenge and it must be treated in an efficient manner. Virtual screening (VS) is an important computational tool in the drug discovery process; however, experimental verification of the drugs is equally important for the drug development process. Quantitative structure-activity relationship (QSAR) analysis is one of the machine learning techniques that is extensively used in VS. QSAR is well known for its high and fast throughput screening with a satisfactory hit rate. QSAR model building involves (i) chemo-genomics data collection from a database or the literature, (ii) calculation of the right descriptors from the molecular representation, (iii) establishing a relationship (model) between biological activity and the selected descriptors, and (iv) application of the QSAR model to predict the biological property of the molecules. All the hits obtained by the VS technique need to be experimentally verified. The present mini-review highlights the web-based machine learning tools, the role of QSAR in VS techniques, successful applications of QSAR-based VS leading to drug discovery, and the advantages and challenges of QSAR.
Affiliation(s)
- Patnala Ganga Raju Achary
- Department of Chemistry, Faculty of Engineering & Technology (ITER), Siksha ‘O’ Anusandhan, Deemed to be University, Khandagiri Square, Bhubaneswar- 751030, India
4
Funar-Timofei S, Ilia G. QSAR Modeling of Dye Ecotoxicity. Methods in Pharmacology and Toxicology 2020. [DOI: 10.1007/978-1-0716-0150-1_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/01/2022]
5
Viira B, García-Sosa AT, Maran U. Chemical structure and correlation analysis of HIV-1 NNRT and NRT inhibitors and database-curated, published inhibition constants with chemical structure in diverse datasets. J Mol Graph Model 2017; 76:205-223. [PMID: 28738270 DOI: 10.1016/j.jmgm.2017.06.019] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 03/24/2017] [Revised: 06/18/2017] [Accepted: 06/19/2017] [Indexed: 01/26/2023]
Abstract
Human immunodeficiency virus (HIV-1) reverse transcriptase is a major target for designing anti-HIV drugs. Developed inhibitors are divided into non-nucleoside analog reverse-transcriptase inhibitors (NNRTIs) and nucleoside analog reverse-transcriptase inhibitors (NRTIs) depending on their mechanism. Given that many inhibitors have been studied and for many of them binding affinity constants have been calculated, it is beneficial to analyze the chemical landscape of these families of inhibitors and correlate these inhibition constants with molecular structure descriptors. For this, the HIV-1 RT data was retrieved from the ChEMBL database, carefully curated, and original literature verified, grouped into NRTIs and NNRTIs, analyzed using a hierarchical scaffold classification method and modelled with best multi-linear regression approach. Analysis of the HIV-1 NNRTIs subset results in ten different common structural parent types of oxazepanone, piperazinone, pyrazine, oxazinanone, diazinanone, pyridine, pyrrole, diazepanone, thiazole, and triazine. The same analysis for HIV-1 NRTIs groups structures into four different parent types of uracil, pyrimide, pyrimidione, and imidazole. Each scaffold tree corresponding to the parent types has been carefully analyzed and examined, and changes in chemical structure favorable to potency and stability are highlighted. For both subsets, descriptive and predictive QSAR models are derived, discussed and externally validated, revealing general trends in relationships between molecular structure and binding affinity constants in structurally diverse datasets. Data and QSAR models are available at the QsarDB repository (http://dx.doi.org/10.15152/QDB.202).
Affiliation(s)
- Birgit Viira
- Institute of Chemistry, University of Tartu, Tartu 50411, Estonia
- Uko Maran
- Institute of Chemistry, University of Tartu, Tartu 50411, Estonia
6
Leelananda SP, Lindert S. Computational methods in drug discovery. Beilstein J Org Chem 2016; 12:2694-2718. [PMID: 28144341 PMCID: PMC5238551 DOI: 10.3762/bjoc.12.267] [Citation(s) in RCA: 285] [Impact Index Per Article: 35.6] [Received: 09/01/2016] [Accepted: 11/22/2016] [Indexed: 12/11/2022] [Open Access]
Abstract
The process for drug discovery and development is challenging, time consuming and expensive. Computer-aided drug discovery (CADD) tools can act as a virtual shortcut, assisting in the expedition of this long process and potentially reducing the cost of research and development. Today CADD has become an effective and indispensable tool in therapeutic development. The human genome project has made available a substantial amount of sequence data that can be used in various drug discovery projects. Additionally, increasing knowledge of biological structures, as well as increasing computer power have made it possible to use computational methods effectively in various phases of the drug discovery and development pipeline. The importance of in silico tools is greater than ever before and has advanced pharmaceutical research. Here we present an overview of computational methods used in different facets of drug discovery and highlight some of the recent successes. In this review, both structure-based and ligand-based drug discovery methods are discussed. Advances in virtual high-throughput screening, protein structure prediction methods, protein-ligand docking, pharmacophore modeling and QSAR techniques are reviewed.
Affiliation(s)
- Sumudu P Leelananda
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH 43210, USA
- Steffen Lindert
- Department of Chemistry and Biochemistry, Ohio State University, Columbus, OH 43210, USA
7
Aruoja V, Moosus M, Kahru A, Sihtmäe M, Maran U. Measurement of baseline toxicity and QSAR analysis of 50 non-polar and 58 polar narcotic chemicals for the alga Pseudokirchneriella subcapitata. Chemosphere 2014; 96:23-32. [PMID: 23895738 DOI: 10.1016/j.chemosphere.2013.06.088] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.0] [Received: 03/20/2013] [Revised: 06/28/2013] [Accepted: 06/30/2013] [Indexed: 06/02/2023]
Abstract
In this paper a set of homogenous experimental algal toxicity data was measured for 50 non-polar narcotic chemicals using the alga Pseudokirchneriella subcapitata in a closed test with a growth rate endpoint. Most of the tested compounds are high volume industrial chemicals that so far lacked published REACH-compliant algal growth inhibition values. The test protocol fulfilled the criteria set forth in the OECD guideline 201 and had the same sensitivity as the open test which allowed direct comparison of toxicity values. Baseline QSAR model for non-polar narcotic compounds was established and compared with previous analogous models. Multi-linear QSAR model was derived for the non-polar and 58 previously tested polar (anilines and phenols) narcotic compounds modulating hydrophobicity, molecular size, electronic and molecular stability effects coded in the molecular descriptors. Descriptors in the model were analyzed and applicability domain was assessed providing further guidelines for the in silico prediction purposes in decision support while performing risk assessment. QSAR models in the manuscript are available on-line through QsarDB repository for exploring and prediction services (http://hdl.handle.net/10967/106).
Affiliation(s)
- Villem Aruoja
- Laboratory of Environmental Toxicology, National Institute of Chemical Physics and Biophysics, Akadeemia tee 23, Tallinn 12618, Estonia.
8
Moosus M, Hiob R, Maran U. Quantitative relationship between rate constants and molecular structure descriptors for the gas phase hydrogen abstraction reactions. SAR and QSAR in Environmental Research 2013; 24:501-518. [PMID: 23724929 DOI: 10.1080/1062936x.2013.792869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/02/2023]
Abstract
The abstraction of hydrogen by general radicals has a wide role in environmental and also in technological processes because it results in reactive free radicals that play a vital role in atmospheric chemistry and also in biochemical processes. In addition to experimental studies, the theoretical modelling of this elementary reaction has been important for understanding and predicting respective rate constants. In this paper, molecular descriptors in the context of a QSAR approach are used to codify the relationship between molecular structure and rate constants. Unique experimental data is collected from the literature for the reaction R(i)• + R(j)H → R(i)H + R(j)•, where R(i)• = H• and R(j)• are diverse radicals. The four-parameter QSAR model (n = 34, r² = 0.81, r²(CV) = 0.74, r²(scr) = 0.12, s² = 0.19) is presented for the bimolecular rate constants, accompanied with model diagnostics and analysis of descriptors in the model.
Affiliation(s)
- Maikki Moosus
- Institute of Chemistry, University of Tartu, Tartu, Estonia
9
Piir G, Sild S, Maran U. Comparative analysis of local and consensus quantitative structure-activity relationship approaches for the prediction of bioconcentration factor. SAR and QSAR in Environmental Research 2013; 24:175-199. [PMID: 23410132 DOI: 10.1080/1062936x.2012.762426] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 06/01/2023]
Abstract
Quantitative structure-activity relationships (QSARs) are broadly classified as global or local, depending on their molecular constitution. Global models use large and diverse training sets covering a wide range of chemical space. Local models focus on smaller structurally or chemically similar subsets that are conventionally selected by human experts or alternatively using clustering analysis. The current study focuses on the comparative analysis of different clustering algorithms (expectation-maximization, K-means and hierarchical) for seven different descriptor sets as structural characteristics and two rule-based approaches to select subsets for designing local QSAR models. A total of 111 local QSAR models are developed for predicting bioconcentration factor. Predictions from local models were compared with corresponding predictions from the global model. The comparison of coefficients of determination (r²) and standard deviations for local models with similar subsets from the global model shows improved prediction quality in 97% of cases. The descriptor content of derived QSARs is discussed and analyzed. Local QSAR models were further consolidated within the framework of consensus approach. All different consensus approaches increased performance over the global and local models. The consensus approach reduced the number of strongly deviating predictions by evening out prediction errors, which were produced by some local QSARs.
Affiliation(s)
- G Piir
- Institute of Chemistry, University of Tartu, Tartu, Estonia
10
Katritzky AR, Kuanar M, Slavov S, Hall CD, Karelson M, Kahn I, Dobchev DA. Quantitative Correlation of Physical and Chemical Properties with Chemical Structure: Utility for Prediction. Chem Rev 2010; 110:5714-89. [DOI: 10.1021/cr900238d] [Citation(s) in RCA: 386] [Impact Index Per Article: 27.6] [Indexed: 11/28/2022]
Affiliation(s)
- Alan R. Katritzky
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Minati Kuanar
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Svetoslav Slavov
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- C. Dennis Hall
- Center for Heterocyclic Compounds, Department of Chemistry, University of Florida, Gainesville, Florida 32611
- Mati Karelson
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
- Iiris Kahn
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
- Dimitar A. Dobchev
- Institute of Chemistry, Tallinn University of Technology, Akadeemia tee 15, Tallinn 19086, Estonia, and MolCode, Ltd., Soola 8, Tartu 51013, Estonia
11
Tulp I, Sild S, Maran U. Relationship Between Structure and Permeability in Artificial Membranes: Theoretical Whole Molecule Descriptors in Development of QSAR Models. QSAR Comb Sci 2009. [DOI: 10.1002/qsar.200860160] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/09/2022]
12
Kahn I, Sild S, Maran U. Modeling the Toxicity of Chemicals to Tetrahymena pyriformis Using Heuristic Multilinear Regression and Heuristic Back-Propagation Neural Networks. J Chem Inf Model 2007; 47:2271-9. [DOI: 10.1021/ci700231c] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.8] [Indexed: 01/22/2023]
Affiliation(s)
- Iiris Kahn
- Institute of Chemistry, University of Tartu, 2 Jakobi Str., Tartu 51014, Estonia
- Sulev Sild
- Institute of Chemistry, University of Tartu, 2 Jakobi Str., Tartu 51014, Estonia
- Uko Maran
- Institute of Chemistry, University of Tartu, 2 Jakobi Str., Tartu 51014, Estonia
13
Torres-Cartas S, Martín-Biosca Y, Villanueva-Camañas RM, Sagrado S, Medina-Hernández MJ. Biopartitioning micellar chromatography to predict mutagenicity of aromatic amines. Eur J Med Chem 2007; 42:1396-402. [PMID: 17482318 DOI: 10.1016/j.ejmech.2007.02.022] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 01/29/2007] [Revised: 02/26/2007] [Accepted: 02/27/2007] [Indexed: 12/01/2022]
Abstract
Mutagenicity is a toxicity endpoint associated with chronic exposure to chemicals. Aromatic amines have considerable industrial and environmental importance due to their widespread use in industry and their mutagenic capacity. Biopartitioning micellar chromatography (BMC), a mode of micellar liquid chromatography that uses micellar mobile phases of Brij35 under adequate experimental conditions, has been shown to be useful in mimicking the drug partitioning process into biological systems. In this paper, the usefulness of BMC for predicting the mutagenicity of aromatic amines is demonstrated. A multiple linear regression (MLR) model based on BMC retention data is proposed and compared with others reported in the literature. The proposed model presents better or similar descriptive and predictive capability.
Affiliation(s)
- S Torres-Cartas
- Departamento de Química Analítica, Universidad de Valencia, C/Vicente Andrés Estellés s/n, 46100 Burjassot, Valencia, Spain
14
Karelson M, Dobchev DA, Kulshyn OV, Katritzky AR. Neural Networks Convergence Using Physicochemical Data. J Chem Inf Model 2006; 46:1891-7. [PMID: 16995718 DOI: 10.1021/ci0600206] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/28/2022]
Abstract
An investigation of the neural network convergence and prediction based on three optimization algorithms, namely, Levenberg-Marquardt, conjugate gradient, and delta rule, is described. Several simulated neural networks built using the above three algorithms indicated that the Levenberg-Marquardt optimizer implemented as a back-propagation neural network converged faster than the other two algorithms and provides in most of the cases better prediction. These conclusions are based on eight physicochemical data sets, each with a significant number of compounds comparable to that usually used in the QSAR/QSPR modeling. The superiority of the Levenberg-Marquardt algorithm is revealed in terms of functional dependence of the change of the neural network weights with respect to the gradient of the error propagation as well as distribution of the weight values. The prediction of the models is assessed by the error of the validation sets not used in the training process.
Affiliation(s)
- Mati Karelson
- Department of Chemistry, University of Tartu, 2 Jakobi Street, Tartu 51014, Estonia.
15
Dobchev DA, Karelson M. Reparameterized Austin Model 1 for quantitative structure–property relationships in liquid media. J Mol Model 2006; 12:503-12. [PMID: 16404615 DOI: 10.1007/s00894-005-0080-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/18/2005] [Accepted: 11/18/2005] [Indexed: 10/25/2022]
Abstract
A reparameterization of the quantum-chemical AM1 (Austin Model 1) model has been carried out using a nonlinear optimization based on a modification of the Levenberg-Marquardt technique. The optimum numerical values for the one-electron resonance integral parameters (β(s) and β(p)) and core-core repulsion atomic parameters α were obtained for the elements H, C, N, O, Cl and Br using the statistical fit of a two-parameter QSPR equation for the boiling points of organic compounds. A substantially improved two-parameter correlation (R² = 0.9685, s = 13.48 K) was obtained by using the new optimized parameters. The QSPR equation employs two molecular descriptors, a bulk cohesiveness descriptor, [Formula: see text] and the area-weighted surface charge of hydrogen-bonding donor atom(s) in the molecule. The model developed shows remarkably accurate predictions of the normal boiling points for nine additional simple inorganic compounds. The new parameters were tested on the critical temperatures of 165 organic compounds. The new QSPR model obtained for this property was found to be statistically significantly better than the original model.
Affiliation(s)
- Dimitar A Dobchev
- Department of Chemistry, Tallinn University of Technology, Ehitajate tee 5, Tallinn, 19086, Estonia
16
Modeling of structure–mutagenicity relationships: counter propagation neural network approach using calculated structural descriptors. Anal Chim Acta 2004. [DOI: 10.1016/j.aca.2003.12.035] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.3] [Indexed: 11/18/2022]
17
Vraćko M, Szymoszek A, Barbieri P. Structure-Mutagenicity Study of 12 Trimethylimidazopyridine Isomers Using Orbital Energies and “Spectrum-like Representation” As Descriptors. J Chem Inf Comput Sci 2004; 44:352-8. [PMID: 15032511 DOI: 10.1021/ci030420i] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/28/2022]
Abstract
The set of 12 trimethylimidazopyridine isomers with mutagenic potency toward two strains of Salmonella was treated in this study. Ten isomers with known mutagenic properties were taken to build the models. Fifteen molecular orbital energies, or a "spectrum-like" representation of 3D structures, were taken as descriptors. As modeling techniques the multiple linear regression and the counter propagation neural network were applied. Models were tested with the recall ability test and the leave-one-out cross-validation tests. For two isomers, which have not been synthesized yet, we report predicted values for both mutagenic potencies obtained with different models. The best models were found when unoccupied molecular orbital energies are among the descriptors.
Affiliation(s)
- M Vraćko
- National Institute of Chemistry, Hajdrihova 19, Ljubljana, Slovenia.
18
Mattioni BE, Kauffman GW, Jurs PC, Custer LL, Durham SK, Pearl GM. Predicting the genotoxicity of secondary and aromatic amines using data subsetting to generate a model ensemble. Journal of Chemical Information and Computer Sciences 2003; 43:949-63. [PMID: 12767154 DOI: 10.1021/ci034013i] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.8] [Indexed: 11/28/2022]
Abstract
Binary quantitative structure-activity relationship (QSAR) models are developed to classify a data set of 334 aromatic and secondary amine compounds as genotoxic or nongenotoxic based on information calculated solely from chemical structure. Genotoxic endpoints for each compound were determined using the SOS Chromotest in both the presence and absence of an S9 rat liver homogenate. Compounds were considered genotoxic if assay results indicated a positive genotoxicity hit for either the S9 inactivated or S9 activated assay. Each compound in the data set was encoded through the calculation of numerical descriptors that describe various aspects of chemical structure (e.g. topological, geometric, electronic, polar surface area). Furthermore, five additional descriptors that focused on the secondary and aromatic nitrogen atoms in each molecule were calculated specifically for this study. Descriptor subsets were examined using a genetic algorithm search engine interfaced with a k-Nearest Neighbor fitness evaluator to find the most information-rich subsets, which ultimately served as the final predictive models. Models were chosen for their ability to minimize the total number of misclassifications, with special attention given to those models that possessed fewer occurrences of positive toxicity hits being misclassified as nontoxic (false negatives). In addition, a subsetting procedure was used to form an ensemble of models using different combinations of compounds in the training and prediction sets. This was done to ensure that consistent results could be obtained regardless of training set composition. The procedure also allowed for each compound to be externally validated three times by different training set data with the resultant predictions being used in a "majority rules" voting scheme to produce a consensus prediction for each member of the data set. 
The individual models produced an average training set classification rate of 71.6% and an average prediction set classification rate of 67.7%. However, the model ensemble was able to correctly classify the genotoxicity of 72.2% of all prediction set compounds.
Affiliation(s)
- Brian E Mattioni
- Department of Chemistry, The Pennsylvania State University, 152 Davey Laboratory, University Park, Pennsylvania 16802, USA
19
Abstract
Recent developments in the prediction of toxicity from chemical structure have been reviewed. Attention has been drawn to some of the problems that can be encountered in the area of predictive toxicology, including the need for a multi-disciplinary approach and the need to address mechanisms of action. Progress has been hampered by the sparseness of good quality toxicological data. Perhaps too much effort has been devoted to exploring new statistical methods rather than to the creation of data sets for hitherto uninvestigated toxicological endpoints and/or classes of chemicals.
Affiliation(s)
- M D Barratt
- Marlin Consultancy, 10 Beeby Way, Carlton, Bedford MK43 7LW, UK.