1
|
González-Castañeda Y, Marrero-Ponce Y, Guerra JO, Echevarría-Díaz Y, Pérez N, Pérez-Giménez F, Simonet AM, Macías FA, Nogueiras CM, Olazabal E, Serrano H. Computational discovery of novel anthelmintic natural compounds from Agave Brittoniana trel. Spp. Brachypus. Bionatura 2022. [DOI: 10.21931/rb/2022.07.04.53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/15/2022] Open
Abstract
Helminth infections remain a medical problem worldwide. This report used bond-based 2D quadratic indices, a bond-level QuBiLs-MAS molecular descriptor family, and Linear Discriminant Analysis (LDA) to obtain a quantitative linear model that discriminates between anthelmintic and non-anthelmintic drug-like organic compounds. The model correctly classified 87.46% and 81.82% of the training and external data sets, respectively. It was then used in a virtual screening to predict the biological activity of all 19 chemicals previously obtained and chemically characterized by some authors of this report from Agave brittoniana Trel. spp. Brachypus. The model identified 12 metabolites as possible anthelmintics, and a group of 5 novel natural products was tested in an in vitro assay against Fasciola hepatica (100% efficacy at 500 µg/mL). Finally, the two best hits were evaluated in vivo in BALB/c mice against the same helminth parasite at a dose of 25 mg/kg. Compound 8 (Karatavinoside A) showed an efficacy of 92.2% in vivo. Notably, this natural compound exhibits activity similar or superior to that of triclabendazole, the best human fasciolicide on the market against Fasciola hepatica, making it a novel lead scaffold with anthelmintic activity.
Keywords: TOMOCOMD-CARDD Software; QuBiLs-MAS, nonstochastic and stochastic bond-based quadratic indices; LDA-based QSAR model; Computational Screening, Anthelmintic Agent; Agave brittoniana Trel. spp. Brachypus, Fasciola hepatica.
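The abstract above describes a two-class LDA model scored by the percentage of correctly classified compounds. The following is a minimal sketch of that idea, not the authors' model: it assumes only two hypothetical descriptors per compound, equal class priors (midpoint threshold), and invented toy values. The function names `fit_lda_2d`, `classify`, and `accuracy` are illustrative.

```python
from statistics import mean

def fit_lda_2d(class_a, class_b):
    """Fisher two-class LDA for 2-descriptor vectors (illustrative only)."""
    ma = [mean(r[j] for r in class_a) for j in range(2)]
    mb = [mean(r[j] for r in class_b) for j in range(2)]
    # Pooled within-class scatter matrix.
    s = [[0.0, 0.0], [0.0, 0.0]]
    for rows, m in ((class_a, ma), (class_b, mb)):
        for r in rows:
            d = [r[0] - m[0], r[1] - m[1]]
            for i in range(2):
                for j in range(2):
                    s[i][j] += d[i] * d[j]
    det = s[0][0] * s[1][1] - s[0][1] * s[1][0]
    inv = [[s[1][1] / det, -s[0][1] / det],
           [-s[1][0] / det, s[0][0] / det]]
    diff = [ma[0] - mb[0], ma[1] - mb[1]]
    # Discriminant direction w = S^-1 (mean_a - mean_b).
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    # Assumption: equal priors, so the cut is at the midpoint of the means.
    mid = [(ma[0] + mb[0]) / 2, (ma[1] + mb[1]) / 2]
    bias = -(w[0] * mid[0] + w[1] * mid[1])
    return w, bias

def classify(model, x):
    """Return True when x falls on the class_a side of the discriminant."""
    w, bias = model
    return w[0] * x[0] + w[1] * x[1] + bias > 0

def accuracy(model, class_a, class_b):
    """Percentage of correctly classified cases over both classes."""
    hits = sum(classify(model, x) for x in class_a)
    hits += sum(not classify(model, x) for x in class_b)
    return 100.0 * hits / (len(class_a) + len(class_b))

# Hypothetical descriptor values, not taken from the paper.
actives = [(2.0, 3.0), (2.5, 3.5), (3.0, 3.2), (2.2, 2.8)]
inactives = [(0.5, 1.0), (1.0, 0.8), (0.7, 1.4), (1.2, 1.1)]
model = fit_lda_2d(actives, inactives)
print(accuracy(model, actives, inactives))
```

A real study would report this percentage separately for the training set and for an external set that was not used to fit the function.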
Affiliation(s)
- Yeniel González-Castañeda
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA)
- Yovani Marrero-Ponce
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Jose O. Guerra
- Chemistry Department, Faculty of Chemistry-Pharmacy. Universidad Central “Marta Abreu” de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
- Yunaimy Echevarría-Díaz
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional (MeM&T), Escuela de Medicina, Colegio de Ciencias de la Salud (COCSA), Departamento de Ciencias de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE)
- Noel Pérez
- Colegio de Ciencias e Ingenierías “El Politécnico”, Universidad San Francisco de Quito (USFQ), Quito, Ecuador
- Facundo Pérez-Giménez
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
- Ana M. Simonet
- Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
- Francisco A. Macías
- Grupo de Alelopatía, Departamento de Química Orgánica, Facultad de Ciencias, Universidad de Cádiz
- Clara M. Nogueiras
- Departamento de Química Orgánica, Facultad de Química, Universidad de La Habana
- Ervelio Olazabal
- Chemical Bioactive Center. Universidad Central “Marta Abreu” de Las Villas, Santa Clara
- Hector Serrano
- Chemical Bioactive Center. Universidad Central “Marta Abreu” de Las Villas, Santa Clara
|
2
|
Agüero-Chapin G, Galpert D, Molina-Ruiz R, Ancede-Gallardo E, Pérez-Machado G, De la Riva GA, Antunes A. Graph Theory-Based Sequence Descriptors as Remote Homology Predictors. Biomolecules 2019; 10:E26. [PMID: 31878100 PMCID: PMC7022958 DOI: 10.3390/biom10010026] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/19/2019] [Revised: 12/16/2019] [Accepted: 12/18/2019] [Indexed: 12/23/2022] Open
Abstract
Alignment-free (AF) methodologies have increased in popularity in the last decades as alternative tools to alignment-based (AB) algorithms for performing comparative sequence analyses. They have been especially useful to detect remote homologs within the twilight zone of highly diverse gene/protein families and superfamilies. The most popular alignment-free methodologies, as well as their applications to classification problems, have been described in previous reviews. Although a new set of graph theory-derived sequence/structural descriptors has been gaining relevance in the detection of remote homology, these descriptors have been omitted as AF predictors when the topic is addressed. Here, we first go over the most popular AF approaches used for detecting homology signals within the twilight zone and then present the state-of-the-art tools encoding graph theory-derived sequence/structure descriptors and their success in identifying remote homologs. We also highlight the trend of integrating AF features/measures with the AB ones, either into the same prediction model or by assembling the predictions from different algorithms using voting/weighting strategies, to improve the detection of remote signals. Lastly, we briefly discuss the efforts made to scale up AB and AF features/measures for the comparison of multiple genomes and proteomes. Alongside the experience gained in remote homology detection with both the most popular AF tools and other less known ones, we report our own experience with the graphical-numerical methodologies MARCH-INSIDE, TI2BioP, and ProtDCal. We also present a new Python-based tool (SeqDivA) with a friendly graphical user interface (GUI) for delimiting the twilight zone by using several similar criteria.
Affiliation(s)
- Guillermin Agüero-Chapin
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
- Deborah Galpert
- Departamento de Ciencia de la Computación. Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara 54830, Cuba
- Reinaldo Molina-Ruiz
- Centro de Bioactivos Químicos (CBQ), Universidad Central “Marta Abreu” de Las Villas (UCLV), Santa Clara 54830, Cuba
- Evys Ancede-Gallardo
- Programa de Doctorado en Fisicoquímica Molecular, Facultad de Ciencias Exactas, Universidad Andrés Bello, Av. República 239, Santiago 8370146, Chile
- Gisselle Pérez-Machado
- EpiDisease S.L. Spin-Off of Centro de Investigación Biomédica en Red de Enfermedades Raras (CIBERER), 46980 Valencia, Spain
- Gustavo A. De la Riva
- Laboratorio de Biotecnología Aplicada S. de R.L. de C.V., GRECA Inc., Carretera La Piedad-Carapán, km 3.5, La Piedad, Michoacán 59300, Mexico
- Tecnológico Nacional de México, Instituto Tecnológico de la Piedad, Av. Ricardo Guzmán Romero, Santa Fe, La Piedad de Cavadas, Michoacán 59370, Mexico
- Agostinho Antunes
- CIIMAR/CIMAR, Interdisciplinary Centre of Marine and Environmental Research, University of Porto, Terminal de Cruzeiros do Porto de Leixões, Av. General Norton de Matos s/n 4450-208 Porto, Portugal
- Department of Biology, Faculty of Sciences, University of Porto, Rua do Campo Alegre, 4169-007 Porto, Portugal
|
3
|
González-Díaz H, Muíño L, Anadón AM, Romaris F, Prado-Prado FJ, Munteanu CR, Dorado J, Sierra AP, Mezo M, González-Warleta M, Gárate T, Ubeira FM. MISS-Prot: web server for self/non-self discrimination of protein residue networks in parasites; theory and experiments in Fasciola peptides and Anisakis allergens. Molecular BioSystems 2011; 7:1938-55. [PMID: 21468430 DOI: 10.1039/c1mb05069a] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/21/2022]
Abstract
Infections caused by human parasites (HPs) affect the poorest 500 million people worldwide but chemotherapy has become expensive, toxic, and/or less effective due to drug resistance. On the other hand, many 3D structures in Protein Data Bank (PDB) remain without function annotation. We need theoretical models to quickly predict biologically relevant Parasite Self Proteins (PSP), which are expressed differentially in a given parasite and are dissimilar to proteins expressed in other parasites and have a high probability to become new vaccines (unique sequence) or drug targets (unique 3D structure). We present herein a model for PSPs in eight different HPs (Ascaris, Entamoeba, Fasciola, Giardia, Leishmania, Plasmodium, Trypanosoma, and Toxoplasma) with 90% accuracy for 15 341 training and validation cases. The model combines protein residue networks, Markov Chain Models (MCM) and Artificial Neural Networks (ANN). The input parameters are the spectral moments of the Markov transition matrix for electrostatic interactions associated with the protein residue complex network calculated with the MARCH-INSIDE software. We implemented this model in a new web-server called MISS-Prot (MARCH-INSIDE Scores for Self-Proteins). MISS-Prot was programmed using PHP/HTML/Python and MARCH-INSIDE routines and is freely available at: . This server is easy to use by non-experts in Bioinformatics who can carry out automatic online upload and prediction with 3D structures deposited at PDB (mode 1). We can also study outcomes of Peptide Mass Fingerprinting (PMFs) and MS/MS for query proteins with unknown 3D structures (mode 2). We illustrated the use of MISS-Prot in experimental and/or theoretical studies of peptides from Fasciola hepatica cathepsin proteases or present on 10 Anisakis simplex allergens (Ani s 1 to Ani s 10). 
In doing so, we combined electrophoresis (1DE), MALDI-TOF Mass Spectroscopy, and MASCOT to seek sequences, Molecular Mechanics + Molecular Dynamics (MM/MD) to generate 3D structures and MISS-Prot to predict PSP scores. MISS-Prot also allows the prediction of PSP proteins in 16 additional species including parasite hosts, fungi pathogens, disease transmission vectors, and biotechnologically relevant organisms.
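The model inputs named in the abstract are spectral moments of a Markov transition matrix. By definition, the kth spectral moment of a matrix is the trace of its kth power. The sketch below illustrates only that definition. The interaction weights are a hypothetical 3-node toy network, and building the transition matrix by row normalization is an assumption rather than the documented MARCH-INSIDE procedure.

```python
def mat_mul(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][t] * b[t][j] for t in range(n)) for j in range(n)]
            for i in range(n)]

def row_normalize(weights):
    """Turn non-negative interaction weights into transition probabilities."""
    rows = []
    for row in weights:
        total = sum(row)
        rows.append([w / total if total else 0.0 for w in row])
    return rows

def spectral_moments(p, k_max):
    """Return [Tr(P^1), ..., Tr(P^k_max)] for a square matrix P."""
    moments = []
    power = p
    for _ in range(k_max):
        moments.append(sum(power[i][i] for i in range(len(p))))
        power = mat_mul(power, p)
    return moments

# Hypothetical residue-residue interaction weights (toy example).
toy_weights = [[0, 1, 1],
               [1, 0, 0],
               [1, 0, 0]]
transition = row_normalize(toy_weights)
print(spectral_moments(transition, 3))
```

In a workflow like the one described, such moments would be computed per protein and passed to the classifier as descriptors.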
Affiliation(s)
- Humberto González-Díaz
- Department of Microbiology & Parasitology, Faculty of Pharmacy, University of Santiago de Compostela, 15782, Santiago de Compostela, Spain.
|
4
|
Casañola-Martin GM, Marrero-Ponce Y, Khan MTH, Khan SB, Torrens F, Pérez-Jiménez F, Rescigno A, Abad C. Bond-based 2D quadratic fingerprints in QSAR studies: virtual and in vitro tyrosinase inhibitory activity elucidation. Chem Biol Drug Des 2010; 76:538-45. [PMID: 20964806 DOI: 10.1111/j.1747-0285.2010.01032.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Indexed: 11/30/2022]
Abstract
In this report, we show the results of quantitative structure-activity relationship (QSAR) studies of tyrosinase inhibitory activity, using bond-based quadratic indices as molecular descriptors (MDs) and linear discriminant analysis (LDA) to generate discriminant functions that predict anti-tyrosinase activity. The best two models [Eqs (6) and (12)] out of the 12 QSAR models developed here show accuracies of 93.51% and 91.21%, as well as high Matthews correlation coefficients (C) of 0.86 and 0.82, respectively, in the training set. The external validation series gives values of 90.00% and 89.44% for Eqs (6) and (12), respectively. Afterwards, a second external prediction data set is used to perform a virtual screening of compounds reported in the literature as active (tyrosinase inhibitors). In a final step, a series of lignans is analysed using the in silico-developed models, and in vitro corroboration of the activity is carried out. Notably, all compounds present greater inhibition values than kojic acid (standard tyrosinase inhibitor: IC₅₀ = 16.67 μM). The results obtained could be used as a framework to speed up the biosilico discovery of leads for the treatment of skin disorders.
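The abstract reports both accuracy and the Matthews correlation coefficient (C) for its discriminant models. As a minimal sketch of how these two statistics are computed from a confusion matrix, the snippet below uses invented counts, not figures from the paper.

```python
from math import sqrt

def accuracy(tp, tn, fp, fn):
    """Percentage of correctly classified cases."""
    return 100.0 * (tp + tn) / (tp + tn + fp + fn)

def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient; 0.0 when a marginal total is empty."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom

# Hypothetical confusion matrix: 50 actives and 50 inactives.
tp, tn, fp, fn = 45, 45, 5, 5
print(accuracy(tp, tn, fp, fn))      # 90.0
print(matthews_cc(tp, tn, fp, fn))   # 0.8
```

Unlike accuracy, C stays informative when the active and inactive classes differ greatly in size, which is presumably why both are reported.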
|
5
|
Ortega-Broche SE, Marrero-Ponce Y, Díaz YE, Torrens F, Pérez-Giménez F. TOMOCOMD-CAMPS and protein bilinear indices - novel bio-macromolecular descriptors for protein research: I. Predicting protein stability effects of a complete set of alanine substitutions in the Arc repressor. FEBS J 2010; 277:3118-46. [DOI: 10.1111/j.1742-4658.2010.07711.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/28/2022]
|
6
|
3D entropy and moments prediction of enzyme classes and experimental-theoretic study of peptide fingerprints in Leishmania parasites. Biochimica et Biophysica Acta - Proteins and Proteomics 2009; 1794:1784-94. [DOI: 10.1016/j.bbapap.2009.08.020] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.4] [Received: 05/27/2009] [Revised: 08/07/2009] [Accepted: 08/17/2009] [Indexed: 11/21/2022]
|
7
|
Nucleotide's bilinear indices: novel bio-macromolecular descriptors for bioinformatics studies of nucleic acids. I. Prediction of paromomycin's affinity constant with HIV-1 Psi-RNA packaging region. J Theor Biol 2009; 259:229-41. [PMID: 19272394 DOI: 10.1016/j.jtbi.2009.02.021] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 08/20/2008] [Revised: 02/24/2009] [Accepted: 02/25/2009] [Indexed: 02/03/2023]
Abstract
A new set of nucleotide-based bio-macromolecular descriptors is presented. This novel approach to bio-macromolecular design from a linear algebra point of view is relevant to nucleic acid quantitative structure-activity relationship (QSAR) studies. These bio-macromolecular indices are based on the calculus of bilinear maps on ℝⁿ [bₘₖ(xₘ, yₘ): ℝⁿ × ℝⁿ → ℝ] in the canonical basis. The nucleic acid bilinear indices are calculated from the kth power of the non-stochastic and stochastic nucleotide graph-theoretic electronic-contact matrices, Mₘᵏ and ˢMₘᵏ, respectively. That is to say, the kth non-stochastic and stochastic nucleic acid bilinear indices are calculated using Mₘᵏ and ˢMₘᵏ as matrix operators of bilinear transformations. Moreover, biochemical information is codified by using different pair combinations of nucleotide-base properties as weightings: the experimental molar absorption coefficient ε₂₆₀ at 260 nm and pH = 7.0, the first (ΔE₁) and second (ΔE₂) single excitation energies in eV, and the first (f₁) and second (f₂) oscillator strength values (of the first singlet excitation energies) of the nucleotide DNA-RNA bases. As an example of this approach, an interaction study of the antibiotic paromomycin with the packaging region of the HIV-1 Psi-RNA was performed, and several linear models were obtained to predict the interaction strength. The best linear model obtained using non-stochastic bilinear indices explains about 91% of the variance of the experimental Log K (R = 0.95 and s = 0.08 × 10⁻⁴ M⁻¹), whereas the best stochastic bilinear indices-based equation accounts for 93% of the Log K variance (R = 0.97 and s = 0.07 × 10⁻⁴ M⁻¹). The leave-one-out (LOO) press statistics evidenced the high predictive ability of both models (q² = 0.86 and s_cv = 0.09 × 10⁻⁴ M⁻¹ for non-stochastic and q² = 0.91 and s_cv = 0.08 × 10⁻⁴ M⁻¹ for stochastic bilinear indices). The nucleic acid bilinear indices-based models compared favorably with other nucleic acid indices-based approaches reported to date. These models also permit the interpretation of the driving forces of the interaction process. In this sense, the developed equations involve short-reaching (k ≤ 3), middle-reaching (4 < k < 9), and far-reaching (k = 10 or greater) nucleotide bilinear indices. This points to electronic and topological nucleotide backbone interactions controlling the stability profile of paromomycin-RNA complexes. Consequently, the present approach represents a novel and rather promising route for theoretical-biology studies.
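The abstract describes the kth bilinear index as a bilinear form whose matrix operator is the kth power of a contact matrix. A minimal sketch of that construction, computing x·Mᵏ·y for k = 0, 1, 2, is given below. The 3-node contact matrix and the two property vectors are hypothetical placeholders for the nucleotide-base weightings, and only the non-stochastic variant is shown.

```python
def mat_vec(m, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]

def bilinear_indices(m, x, y, k_max):
    """Return [x . M^k . y for k = 0..k_max] by repeated application of M."""
    indices = []
    current = list(y)  # holds M^k y, starting from k = 0
    for _ in range(k_max + 1):
        indices.append(sum(xi * ci for xi, ci in zip(x, current)))
        current = mat_vec(m, current)
    return indices

# Hypothetical contact matrix for a 3-unit chain (unit 2 touches units 1 and 3).
contact = [[0, 1, 0],
           [1, 0, 1],
           [0, 1, 0]]
prop_x = [1, 2, 3]   # illustrative property weights, not real base properties
prop_y = [1, 1, 1]
print(bilinear_indices(contact, prop_x, prop_y, 2))  # [6, 8, 12]
```

Low values of k capture only close contacts, while higher powers mix in longer paths, which matches the short-, middle-, and far-reaching distinction drawn in the abstract.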
|
8
|
García I, Munteanu CR, Fall Y, Gómez G, Uriarte E, González-Díaz H. QSAR and complex network study of the chiral HMGR inhibitor structural diversity. Bioorg Med Chem 2009; 17:165-75. [DOI: 10.1016/j.bmc.2008.11.007] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 05/23/2008] [Revised: 10/31/2008] [Accepted: 11/06/2008] [Indexed: 10/21/2022]
|
9
|
Vilar S, González-Díaz H, Santana L, Uriarte E. QSAR model for alignment-free prediction of human breast cancer biomarkers based on electrostatic potentials of protein pseudofolding HP-lattice networks. J Comput Chem 2008; 29:2613-22. [PMID: 18478581 DOI: 10.1002/jcc.21016] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.1] [Indexed: 11/10/2022]
Abstract
Network theory allows relationships to be established between numerical parameters that describe the molecular structure of genes and proteins and their biological properties. These models can be considered as quantitative structure-activity relationships (QSAR) for biopolymers. The work described here concerns the first QSAR model for 122 proteins that are associated with human breast cancer (HBC), as identified experimentally by Sjöblom et al. (Science 2006, 314, 268) from over 10,000 human proteins. In this study, the 122 proteins related to HBC (HBCp) and a control group of 200 proteins that are not related to HBC (non-HBCp) were forced to fold in an HP lattice network. From these networks a series of electrostatic potential parameters (ξₖ) was calculated to describe each protein numerically. The use of ξₖ as an entry point to linear discriminant analysis led to a QSAR model to discriminate between HBCp and non-HBCp, and this model could help to predict the involvement of a certain gene and/or protein in HBC. In addition, validation procedures were carried out on the model and these included an external prediction series and evaluation of an additional series of 1000 non-HBCp. In all cases good levels of classification were obtained with values above 80%. This study represents the first example of a QSAR model for the computational chemistry inspired search of potential HBC protein biomarkers.
Affiliation(s)
- Santiago Vilar
- Unit of Bioinformatics and Connectivity Analysis, Institute of Industrial Pharmacy, and Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela 15782, Spain
|
10
|
Rivera-Borroto O, Marrero-Ponce Y, Meneses-Marcel A, Escario J, Gómez Barrio A, Arán V, Martins Alho M, Montero Pereira D, Nogal J, Torrens F, Ibarra-Velarde F, Montenegro Y, Huesca-Guillén A, Rivera N, Vogel C. Discovery of Novel Trichomonacidals Using LDA-Driven QSAR Models and Bond-Based Bilinear Indices as Molecular Descriptors. QSAR & Combinatorial Science 2008. [DOI: 10.1002/qsar.200610165] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
|
11
|
Cruz-Monteagudo M, Munteanu CR, Borges F, Cordeiro MND, Uriarte E, Chou KC, González-Díaz H. Stochastic molecular descriptors for polymers. 4. Study of complex mixtures with topological indices of mass spectra spiral and star networks: The blood proteome case. Polymer 2008. [DOI: 10.1016/j.polymer.2008.09.070] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 11/26/2022]
|
12
|
Castillo-Garit JA, Martinez-Santiago O, Marrero-Ponce Y, Casañola-Martín GM, Torrens F. Atom-based non-stochastic and stochastic bilinear indices: Application to QSPR/QSAR studies of organic compounds. Chem Phys Lett 2008. [DOI: 10.1016/j.cplett.2008.08.094] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Indexed: 10/21/2022]
|
13
|
Castillo-Garit JA, Marrero-Ponce Y, Escobar J, Torrens F, Rotondo R. A novel approach to predict aquatic toxicity from molecular structure. Chemosphere 2008; 73:415-427. [PMID: 18597811 DOI: 10.1016/j.chemosphere.2008.05.024] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 02/01/2008] [Revised: 04/29/2008] [Accepted: 05/07/2008] [Indexed: 05/26/2023]
Abstract
The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for the prediction of aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset consists of 392 benzene derivatives, separated into training and test sets, for which toxicity data to the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained with non-stochastic (R² = 0.791 and s = 0.344) and stochastic (R² = 0.799 and s = 0.343) linear indices. A leave-one-out (LOO) cross-validation procedure was carried out, achieving values of q² = 0.781 (s_cv = 0.348) and q² = 0.786 (s_cv = 0.350), respectively. In addition, validation with an external test set was performed, yielding significant R²pred values of 0.762 and 0.797. A brief study of the influence of statistical outliers on QSAR model development was also carried out. Finally, our method was compared with other approaches implemented in the Dragon software and achieved better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
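The abstract reports R² for the fitted models and q² from leave-one-out cross-validation. The sketch below shows how the two statistics differ, using the common definition q² = 1 − PRESS / SS_tot. It is a simplification: a single hypothetical descriptor stands in for the multiple linear regression used in the study, and the data points are invented.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return my - slope * mx, slope

def total_ss(ys):
    """Total sum of squares of the responses about their mean."""
    my = sum(ys) / len(ys)
    return sum((y - my) ** 2 for y in ys)

def r_squared(xs, ys):
    """Goodness of fit on the data used to build the model."""
    a, b = fit_line(xs, ys)
    ss_res = sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_res / total_ss(ys)

def loo_q2(xs, ys):
    """Leave-one-out q2: each point is predicted by a model fitted without it."""
    press = 0.0
    for i in range(len(xs)):
        a, b = fit_line(xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:])
        press += (ys[i] - (a + b * xs[i])) ** 2
    return 1.0 - press / total_ss(ys)

# Hypothetical descriptor values and toxicity responses.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
toxicity = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
print(r_squared(descriptor, toxicity), loo_q2(descriptor, toxicity))
```

Because every left-out point is predicted by a model that never saw it, q² is normally somewhat lower than R², as in the values reported above.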
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
14
|
Marrero-Ponce Y, Khan MTH, Casañola Martín GM, Ather A, Sultankhodzhaev MN, Torrens F, Rotondo R. Prediction of tyrosinase inhibition activity using atom-based bilinear indices. ChemMedChem 2008; 2:449-78. [PMID: 17366651 DOI: 10.1002/cmdc.200600186] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Indexed: 11/07/2022]
Abstract
A set of novel atom-based molecular fingerprints is proposed based on a bilinear map similar to that defined in linear algebra. These molecular descriptors (MDs) are proposed as a new means of molecular parametrization easily calculated from 2D molecular information. The nonstochastic and stochastic molecular indices match molecular structure provided by molecular topology by using the kth nonstochastic and stochastic graph-theoretical electronic-density matrices, M(k) and S(k), respectively. Thus, the kth nonstochastic and stochastic bilinear indices are calculated using M(k) and S(k) as matrix operators of bilinear transformations. Chemical information is coded by using different pair combinations of atomic weightings (mass, polarizability, vdW volume, and electronegativity). The results of QSAR studies of tyrosinase inhibitors using the new MDs and linear discriminant analysis (LDA) demonstrate the ability of the bilinear indices in testing biological properties. A database of 246 structurally diverse tyrosinase inhibitors was assembled. An inactive set of 412 drugs with other clinical uses was used; both active and inactive sets were processed by hierarchical and partitional cluster analyses to design training and predicting sets. Twelve LDA-based QSAR models were obtained, the first six using the nonstochastic total and local bilinear indices and the last six with the stochastic MDs. The discriminant models were applied; globally good classifications of 99.58 and 89.96 % were observed for the best nonstochastic and stochastic bilinear indices models in the training set along with high Matthews correlation coefficients (C) of 0.99 and 0.79, respectively, in the learning set. External prediction sets used to validate the models obtained were correctly classified, with accuracies of 100 and 87.78 %, respectively, yielding C values of 1.00 and 0.73. This subset contains 180 active and inactive compounds not considered to fit the models. 
A simulated virtual screen demonstrated this approach in searching tyrosinase inhibitors from compounds never considered in either training or predicting series. These fitted models permitted the selection of new cycloartane compounds isolated from herbal plants as new tyrosinase inhibitors. A good correspondence between theoretical and experimental inhibitory effects on tyrosinase was observed; compound CA6 (IC₅₀ = 1.32 µM) showed higher activity than the reference compounds kojic acid (IC₅₀ = 16.67 µM) and L-mimosine (IC₅₀ = 3.68 µM).
Affiliation(s)
- Yovani Marrero-Ponce
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou) P.O. Box 22085, 46071 Valencia, Spain.
|
15
|
Marrero-Ponce Y, Meneses-Marcel A, Rivera-Borroto OM, García-Domenech R, De Julián-Ortiz JV, Montero A, Escario JA, Barrio AG, Pereira DM, Nogal JJ, Grau R, Torrens F, Vogel C, Arán VJ. Bond-based linear indices in QSAR: computational discovery of novel anti-trichomonal compounds. J Comput Aided Mol Des 2008; 22:523-40. [DOI: 10.1007/s10822-008-9171-1] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Received: 11/06/2006] [Accepted: 01/05/2008] [Indexed: 10/22/2022]
|
16
|
Prado-Prado FJ, González-Díaz H, de la Vega OM, Ubeira FM, Chou KC. Unified QSAR approach to antimicrobials. Part 3: first multi-tasking QSAR model for input-coded prediction, structural back-projection, and complex networks clustering of antiprotozoal compounds. Bioorg Med Chem 2008; 16:5871-80. [PMID: 18485714 DOI: 10.1016/j.bmc.2008.04.068] [Citation(s) in RCA: 107] [Impact Index Per Article: 6.3] [Received: 03/12/2008] [Revised: 04/22/2008] [Accepted: 04/25/2008] [Indexed: 10/22/2022]
Abstract
Several pathogenic parasite species show different susceptibilities to different antiparasite drugs. Unfortunately, almost all structure-based methods are one-task or one-target Quantitative Structure-Activity Relationships (ot-QSAR) that predict the biological activity of drugs against only one parasite species. Consequently, multi-tasking learning to predict drug activity against different species with a single model (mt-QSAR) is vitally important. In the two previous works of the present series we reported two single mt-QSAR models to predict the antimicrobial activity against different fungal (Bioorg. Med. Chem. 2006, 14, 5973-5980) or bacterial species (Bioorg. Med. Chem. 2007, 15, 897-902). These mt-QSARs offer a good opportunity (impractical with ot-QSAR) to construct drug-drug similarity Complex Networks and to map the contribution of sub-structures to function for multiple species. These possibilities were not addressed in our previous works. In the present work, we continue this series toward another important direction of chemotherapy (antiparasite drugs) with the development of an mt-QSAR for more than 500 drugs tested in the literature against different parasites. The data were processed by Linear Discriminant Analysis (LDA), classifying drugs as active or non-active against the different tested parasite species. The model correctly classifies 212 out of 244 (87.0%) cases in the training series and 207 out of 243 compounds (85.4%) in the external validation series. To illustrate the performance of the QSAR for the selection of active drugs, we carried out an additional virtual screening of antiparasite compounds not used in the training or predicting series; the model recognized 97 out of 114 (85.1%) of them. We also give the procedures to construct back-projection maps and to calculate the contribution of sub-structures to the biological activity. Finally, we used the outputs of the QSAR to construct, for the first time, a multi-species Complex Network of antiparasite drugs. The predicted network has 380 nodes (compounds) and 634 edges (pairs of compounds with similar activity). This network allows us to cluster different compounds and to identify, on average, three known compounds similar to a new query compound according to their profile of biological activity. This is the first attempt to calculate probabilities of antiparasitic action of drugs against different parasites.
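The abstract describes a network whose nodes are compounds and whose edges join pairs with similar activity. A minimal sketch of one way to build such a network from per-species activity probabilities follows. The Euclidean distance, the cutoff value, the compound names, and the probability profiles are all assumptions made for illustration; the abstract does not state the similarity rule that was used.

```python
from itertools import combinations
from math import dist

def similarity_network(profiles, cutoff):
    """Link every pair of compounds whose activity profiles lie within cutoff."""
    edges = []
    for (a, pa), (b, pb) in combinations(sorted(profiles.items()), 2):
        if dist(pa, pb) <= cutoff:
            edges.append((a, b))
    return edges

def neighbours(edges, node):
    """Compounds directly linked to node, e.g. known drugs near a new query."""
    linked = set()
    for a, b in edges:
        if a == node:
            linked.add(b)
        elif b == node:
            linked.add(a)
    return sorted(linked)

# Hypothetical predicted probabilities of activity against two parasite species.
toy_profiles = {
    "compound_1": (0.90, 0.10),
    "compound_2": (0.85, 0.15),
    "compound_3": (0.10, 0.90),
}
toy_edges = similarity_network(toy_profiles, cutoff=0.2)
print(toy_edges)
print(neighbours(toy_edges, "compound_1"))
```

Looking up the neighbours of a query node corresponds to the use described in the abstract, where known compounds similar to a new query are identified from the network.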
|
17
|
González-Díaz H, González-Díaz Y, Santana L, Ubeira FM, Uriarte E. Proteomics, networks and connectivity indices. Proteomics 2008; 8:750-78. [DOI: 10.1002/pmic.200700638] [Citation(s) in RCA: 145] [Impact Index Per Article: 8.5] [Indexed: 12/19/2022]
|
18
|
Castillo-Garit JA, Marrero-Ponce Y, Torrens F, Rotondo R. Atom-based stochastic and non-stochastic 3D-chiral bilinear indices and their applications to central chirality codification. J Mol Graph Model 2007; 26:32-47. [PMID: 17110145 DOI: 10.1016/j.jmgm.2006.09.007] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.0] [Received: 05/31/2006] [Revised: 09/08/2006] [Accepted: 09/20/2006] [Indexed: 11/16/2022]
Abstract
Non-stochastic and stochastic 2D bilinear indices have been generalized to codify chemical structure information for chiral drugs, making use of a trigonometric 3D-chirality correction factor. To evaluate the effectiveness of this novel approach in drug design, we modeled the angiotensin-converting enzyme inhibitory activity of a combinatorial library of perindoprilate σ-stereoisomers. Two linear discriminant analysis models, using non-stochastic and stochastic linear indices, were obtained. The models showed an accuracy of 95.65% for the training set and 100% for the external prediction set. Next, the prediction of the σ-receptor antagonist activity of chiral 3-(3-hydroxyphenyl)piperidines was carried out by multiple linear regression analysis. Two statistically significant QSAR models were obtained when non-stochastic (R² = 0.953 and s = 0.238) and stochastic (R² = 0.961 and s = 0.219) 3D-chiral bilinear indices were used. These models showed adequate predictive power (assessed by the leave-one-out cross-validation experiment), yielding values of q² = 0.935 (s_cv = 0.259) and q² = 0.946 (s_cv = 0.235), respectively. Finally, the corticosteroid-binding globulin binding affinity of a set of steroids was predicted. The results obtained are rather similar to those of most 3D-QSAR approaches reported so far. The method was validated by comparison with previous reports applied to the same data set. The non-stochastic and stochastic 3D-chiral linear indices appear to provide a very interesting alternative to other more common 3D-QSAR descriptors.
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830 Villa Clara, Cuba.
|
19
|
Ponce Y, Khan M, Martín G, Ather A, Sultankhodzhaev M, Torrens F, Rotondo R, Alvarado Y. Atom-Based 2D Quadratic Indices in Drug Discovery of Novel Tyrosinase Inhibitors: Results of In Silico Studies Supported by Experimental Results. ACTA ACUST UNITED AC 2007. [DOI: 10.1002/qsar.200610156] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
|
20
|
González-Díaz H, Vilar S, Santana L, Podda G, Uriarte E. On the applicability of QSAR for recognition of miRNA bioorganic structures at early stages of organism and cell development: Embryo and stem cells. Bioorg Med Chem 2007; 15:2544-50. [PMID: 17300944 DOI: 10.1016/j.bmc.2007.01.050] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 10/10/2006] [Revised: 01/24/2007] [Accepted: 01/31/2007] [Indexed: 11/18/2022]
Abstract
Quantitative structure-activity relationship (QSAR) models are applied in bioorganic chemistry mainly to the study of small molecules, while applications to biopolymers remain underdeveloped. MicroRNAs (miRNAs), which are small non-coding RNAs, regulate a variety of biological processes and constitute good candidates to scale up the application of QSAR to biopolymers. The propensity of a small RNA sequence to act as miRNA depends on its secondary structure, which one can explain in terms of folding thermodynamic parameters. Thermodynamic QSAR can then be used, for instance, for fast identification of miRNAs at early stages of development such as embryos and stem cells (called here esmiRNAs), and to gain insight into cellular differentiation processes and diseases such as cancer. First, we calculated folding free energies (DeltaG), enthalpies (DeltaH), and entropies (DeltaS) as well as melting temperatures (T(m)) for 2623 small RNA sequences (including 623 esmiRNAs and 2000 negative control sequences). Next, we sought a QSAR classification model: esmiRNA=0.035 x T(m)-0.078 x DeltaS-8.748. The model correctly recognized 543 (87.2%) of esmiRNAs and 935 (93.5%) of non-esmiRNAs divided into both training and validation series. The model also recognized 908 out of 1000 additional negative control sequences. ROC curve analysis (area=0.93) demonstrated that the present model differs significantly from a random classifier. In addition, we mapped the influence of thermodynamic parameters on esmiRNA activity. Last, a double ordinate Cartesian plot of cross-validated residuals (first ordinate), standard residuals (second ordinate), and leverages (abscissa) defined the domain of applicability of the model as a square area within a +/-2 band for residuals and a leverage threshold of h=0.0074. The present model is the first QSAR model for fast and accurate selection of new esmiRNAs with potential use in bioorganic and medicinal chemistry.
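As an illustration only, the classification function quoted in this abstract can be evaluated with a few lines of Python. The coefficients are taken from the abstract; the function names, the decision threshold of zero, and the example input values are assumptions made for this sketch, since the abstract does not state the cutoff or the units used.

```python
# Coefficients copied from the abstract: esmiRNA = 0.035 x T(m) - 0.078 x DeltaS - 8.748
TM_COEF = 0.035
DELTA_S_COEF = -0.078
INTERCEPT = -8.748


def esmirna_score(tm, delta_s):
    """Return the linear discriminant score for one RNA sequence."""
    return TM_COEF * tm + DELTA_S_COEF * delta_s + INTERCEPT


def is_esmirna(tm, delta_s, threshold=0.0):
    """Classify a sequence; the zero threshold is an assumption, not from the abstract."""
    return esmirna_score(tm, delta_s) > threshold


if __name__ == "__main__":
    # Hypothetical thermodynamic values, chosen only to exercise the function.
    print(esmirna_score(100.0, -100.0), is_esmirna(100.0, -100.0))
```

A higher melting temperature and a more negative folding entropy both raise the score, which is the direction implied by the signs of the two coefficients.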
Affiliation(s)
- Humberto González-Díaz
- Faculty of Pharmacy, University of Santiago de Compostela, Santiago de Compostela 15782, Spain.
|
21
|
Marrero-Ponce Y, Khan MTH, Casañola-Martín GM, Ather A, Sultankhodzhaev MN, García-Domenech R, Torrens F, Rotondo R. Bond-based 2D TOMOCOMD-CARDD approach for drug discovery: aiding decision-making in 'in silico' selection of new lead tyrosinase inhibitors. J Comput Aided Mol Des 2007; 21:167-88. [PMID: 17333484 DOI: 10.1007/s10822-006-9094-7] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 08/23/2006] [Accepted: 12/02/2006] [Indexed: 11/25/2022]
Abstract
In this paper, we present a new set of bond-level TOMOCOMD-CARDD molecular descriptors (MDs), the bond-based bilinear indices, based on a bilinear map similar to those defined in linear algebra. These novel MDs are used here in Quantitative Structure-Activity Relationship (QSAR) studies of tyrosinase inhibitors to find functions that discriminate between tyrosinase inhibitors and inactive compounds. In total, 14 models were obtained, and the best two discriminant functions (Eqs. 32 and 33) showed good global classification of 91.00% and 90.17%, respectively, in the training set. The test set had accuracies of 93.33% and 88.89% for models 32 and 33, respectively. A simulated virtual screening was also carried out to prove the quality of the determined models. In a final step, the fitted models were used in the biosilico identification of newly synthesized tetraketones, where good agreement was observed between the theoretical and experimental results. Four of the novel bioactive chemicals discovered as tyrosinase inhibitors, TK10 (IC(50) = 2.09 microM), TK11 (IC(50) = 2.61 microM), TK21 (IC(50) = 2.06 microM), and TK23 (IC(50) = 3.19 microM), showed more potent activity than L-mimosine (IC(50) = 3.68 microM). In addition, a heterogeneous database of tyrosinase inhibitors was collected for this study and could be a useful tool for researchers working on the tyrosinase enzyme. The current report could help provide some clues for the identification of new chemicals that inhibit the enzyme tyrosinase and that may enter the drug discovery pipeline.
Affiliation(s)
- Yovani Marrero-Ponce
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, Poligon la Coma s/n (detras de Canal Nou), Valencia, Spain.
|
22
|
Marrero-Ponce Y, Torrens F, Alvarado YJ, Rotondo R. Bond-based global and local (bond, group and bond-type) quadratic indices and their applications to computer-aided molecular design. 1. QSPR studies of diverse sets of organic chemicals. J Comput Aided Mol Des 2006; 20:685-701. [PMID: 17186417 DOI: 10.1007/s10822-006-9089-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 08/18/2006] [Accepted: 10/18/2006] [Indexed: 11/26/2022]
Abstract
The concept of atom-based quadratic indices is extended to a series of molecular descriptors (MDs) (both total and local) based on adjacency between edges. The kth edge-adjacency matrix, E(k), denotes the matrix of bond-based quadratic indices (non-stochastic) with respect to the canonical basis set. The kth "stochastic" edge-adjacency matrix, ES(k), is here proposed as a new molecular representation easily calculated from E(k). Then, the kth stochastic bond-based quadratic indices are calculated using ES(k) as operators of quadratic transformations. The study of six representative physicochemical properties of octane isomers was used to compare the ability of both series of MDs to produce significant quantitative structure-property relationship (QSPR) models. Moreover, the general performance of the new MDs in this QSPR study has been evaluated with respect to other well-known sets of 2D/3D indices, and the results obtained showed quite satisfactory behavior of the present method. The novel bond-level MDs were also used for the description and prediction of the boiling point of 28 alkyl-alcohols and for modeling the specific rate constant (log k) of 34 derivatives of 2-furylethylenes. These models were statistically significant and showed very good stability to data variation in leave-one-out (LOO) cross-validation experiments. The comparison with other approaches (edge- and vertex-based connectivity indices, total and local spectral moments, and quantum chemical descriptors as well as E-state/biomolecular encounter parameters) shows good behavior of our method in these QSPR studies. The approach described in this report appears to be a very promising structural invariant, useful for QSPR/QSAR studies, similarity/diversity analysis, and computer-aided "rational" molecular (drug) design.
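The construction summarized in this abstract can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the TOMOCOMD-CARDD implementation: it treats the molecule as a simple graph (no multiple bonds or loops), assumes the "stochastic" matrix is obtained by dividing each row of the kth power of the edge-adjacency matrix by its row sum, and uses a hypothetical bond-weight vector of ones. All function names are invented for this example.

```python
def edge_adjacency(bonds):
    """Build the edge-adjacency matrix: two bonds are adjacent if they share an atom."""
    n = len(bonds)
    matrix = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j and set(bonds[i]) & set(bonds[j]):
                matrix[i][j] = 1.0
    return matrix


def mat_mul(a, b):
    """Multiply two square matrices given as lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def mat_power(matrix, k):
    """Return the kth power of a square matrix (k >= 0)."""
    n = len(matrix)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = mat_mul(result, matrix)
    return result


def row_stochastic(matrix):
    """Divide each row by its sum (assumed definition of the stochastic matrix)."""
    scaled = []
    for row in matrix:
        total = sum(row)
        scaled.append([v / total if total else 0.0 for v in row])
    return scaled


def quadratic_index(matrix, x):
    """Evaluate the quadratic form x^T M x for a bond-property vector x."""
    n = len(x)
    return sum(matrix[i][j] * x[i] * x[j] for i in range(n) for j in range(n))


# n-Butane as a hydrogen-suppressed graph: atoms 0-1-2-3, three C-C bonds.
butane_bonds = [(0, 1), (1, 2), (2, 3)]
weights = [1.0, 1.0, 1.0]  # hypothetical bond weights, for illustration only

E = edge_adjacency(butane_bonds)
q1 = quadratic_index(mat_power(E, 1), weights)
q2 = quadratic_index(mat_power(E, 2), weights)
sq1 = quadratic_index(row_stochastic(mat_power(E, 1)), weights)
print(q1, q2, sq1)
```

With unit weights the non-stochastic index simply sums the entries of the kth matrix power, whereas the stochastic variant is bounded by the number of bonds; chemically meaningful bond weights would be needed to make the descriptors informative.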
Affiliation(s)
- Yovani Marrero-Ponce
- Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Central University of Las Villas, Santa Clara, Villa Clara, 54830, Cuba.
|
23
|
Castillo-Garit JA, Marrero-Ponce Y, Torrens F. Atom-based 3D-chiral quadratic indices. Part 2: Prediction of the corticosteroid-binding globulin binding affinity of the 31 benchmark steroids data set. Bioorg Med Chem 2006; 14:2398-408. [PMID: 16325409 DOI: 10.1016/j.bmc.2005.11.024] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Received: 10/14/2005] [Revised: 11/09/2005] [Accepted: 11/09/2005] [Indexed: 10/25/2022]
Abstract
A quantitative structure-activity relationship (QSAR) study to predict the relative affinities of the steroid 'benchmark' data set to the corticosteroid-binding globulin (CBG) is described. It is shown that the 3D-chiral quadratic indices closely correlate with the measured CBG affinity values for the 31 steroids. The calculated descriptors were correlated with biological data through multiple linear regression. Two statistically significant models were obtained when non-stochastic (R = 0.924 and s = 0.46) as well as stochastic (R = 0.929 and s = 0.46) 3D-chiral quadratic indices were used. A leave-one-out (LOO) approach to model validation is used here; the best results obtained in the cross-validation procedure with non-stochastic (q2 = 0.781) and stochastic (q2 = 0.735) 3D-chiral quadratic indices are better than or similar to those of most of the 3D-QSAR approaches reported so far. These results support the idea that the 3D-chiral quadratic indices may be helpful in predicting the corticosteroid-binding affinity of new compounds.
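The leave-one-out statistic q2 reported in this abstract can be illustrated with a small Python sketch. It assumes the common definition q2 = 1 - PRESS/SS_tot and, for brevity, a single-descriptor least-squares line rather than the multiple linear regression used in the paper; the function names and the data are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def loo_q2(xs, ys):
    """Leave-one-out q2: refit without each point, then predict that point."""
    n = len(ys)
    mean_y = sum(ys) / n
    press = 0.0  # predictive residual sum of squares
    for i in range(n):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        press += (ys[i] - (slope * xs[i] + intercept)) ** 2
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1.0 - press / ss_tot


# Hypothetical descriptor values and activities, for illustration only.
descriptor = [1.0, 2.0, 3.0, 4.0, 5.0]
activity = [1.0, 3.0, 2.0, 5.0, 4.0]
print(loo_q2(descriptor, activity))
```

Because every prediction is made for a compound the fit has not seen, q2 is normally lower than the fitted R2, which is why abstracts such as this one report both.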
Affiliation(s)
- Juan A Castillo-Garit
- Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
24
|
González-Díaz H, Pérez-Bello A, Uriarte E, González-Díaz Y. QSAR study for mycobacterial promoters with low sequence homology. Bioorg Med Chem Lett 2006; 16:547-53. [PMID: 16275068 DOI: 10.1016/j.bmcl.2005.10.057] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 06/28/2005] [Revised: 10/13/2005] [Accepted: 10/18/2005] [Indexed: 11/27/2022]
Abstract
The general belief is that quantitative structure-activity relationship (QSAR) techniques work only for small molecules, protein sequences or, more recently, DNA sequences. However, with non-branched graphs for proteins and DNA sequences, the QSAR often has to be based on powerful non-linear techniques such as support vector machines. In our opinion, linear QSAR models based on RNA could be useful to assign biological activity when alignment techniques fail due to low sequence homology. The idea is based on the high level of branching of the RNA graph. This work introduces the so-called Markov electrostatic potentials (k)xi(M) as a new class of RNA 2D-structure descriptors. Subsequently, we validate these molecular descriptors by solving a QSAR classification problem for mycobacterial promoter sequences (mps), which constitute a very low sequence homology problem. The model developed (mps = -4.664 x (0)xi(M) + 0.991 x (1)xi(M) - 2.432) was intended to predict whether a naturally occurring sequence is an mps or not on the basis of the calculated (k)xi(M) values for the corresponding RNA secondary structure. The RNA-QSAR approach recognises 115/135 mps (85.2%) and 100% of control sequences. Average predictability and robustness were greater than 95%. A previous non-linear model predicts mps with a slightly higher accuracy (97%) but uses a very large parameter space for DNA sequences. Conversely, the (k)xi(M)-based RNA-QSAR encodes more structural information and needs only two variables.
|
25
|
Marrero-Ponce Y, Marrero RM, Torrens F, Martinez Y, Bernal MG, Zaldivar VR, Castro EA, Abalo RG. Non-stochastic and stochastic linear indices of the molecular pseudograph’s atom-adjacency matrix: a novel approach for computational in silico screening and “rational” selection of new lead antibacterial agents. J Mol Model 2005; 12:255-71. [PMID: 16270182 DOI: 10.1007/s00894-005-0024-8] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Received: 09/16/2004] [Accepted: 06/20/2005] [Indexed: 11/25/2022]
Abstract
A novel approach (TOMOCOMD-CARDD) to computer-aided "rational" drug design is illustrated. This approach is based on the calculation of the non-stochastic and stochastic linear indices of the molecular pseudograph's atom-adjacency matrix representing molecular structures. These TOMOCOMD-CARDD descriptors are introduced for the computational (virtual) screening and "rational" selection of new lead antibacterial agents using linear discriminant analysis. The two structure-based antibacterial-activity classification models, including non-stochastic and stochastic indices, correctly classify 91.61% and 90.75%, respectively, of 1525 chemicals in training sets. These models show high Matthews correlation coefficients (MCC=0.84 and 0.82). An external validation process was carried out to assess the robustness and predictive power of the models obtained. These QSAR models permit the correct classification of 91.49% and 89.31% of 505 compounds in an external test set, yielding MCCs of 0.84 and 0.79, respectively. The TOMOCOMD-CARDD approach compares satisfactorily with nine of the most useful models for antimicrobial selection reported to date. Finally, an in silico screening of 87 new chemicals reported in the anti-infective field with antibacterial activities is performed, showing the ability of the TOMOCOMD-CARDD models to identify new lead antibacterial compounds.
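For readers unfamiliar with the Matthews correlation coefficient quoted in this abstract, the standard formula can be sketched in Python as follows. The function name is illustrative, and the confusion-matrix counts in the usage lines are hypothetical; they are not the counts behind the MCC values reported in the paper.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from a 2x2 confusion matrix.

    Returns 0.0 when any marginal total is zero, a common convention
    for the otherwise undefined case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Hypothetical counts, chosen only to show the range of the statistic.
print(matthews_cc(50, 50, 0, 0))    # perfect classification
print(matthews_cc(25, 25, 25, 25))  # no better than chance
```

Unlike plain accuracy, the coefficient uses all four cells of the confusion matrix, so it stays informative when the active and inactive classes differ in size.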
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
26
|
González-Díaz H, Agüero-Chapin G, Varona-Santos J, Molina R, de la Riva G, Uriarte E. 2D RNA-QSAR: assigning ACC oxidase family membership with stochastic molecular descriptors; isolation and prediction of a sequence from Psidium guajava L. Bioorg Med Chem Lett 2005; 15:2932-7. [PMID: 15878661 DOI: 10.1016/j.bmcl.2005.03.017] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Received: 12/13/2004] [Revised: 03/03/2005] [Accepted: 03/04/2005] [Indexed: 11/17/2022]
Abstract
Quantitative structure-activity relationship (QSAR) techniques for small molecules could be applied to nucleic acids. Unfortunately, almost all molecular descriptors are more successful at encoding branching information than sequences and/or cannot be back-projected. A solution for scaling the QSAR problem up to RNA may be to transform sequences into secondary structures first. Our group has used Markovian negentropies as molecular descriptors for drug design with preliminary results in bioinformatics [Bioinformatics 2003, 19, 2079]. However, RNA-QSAR studies on RNA molecules have not been described to date. Novel Markovian negentropies have been introduced here as molecular descriptors for 2D-RNA structures. An RNA-QSAR study of the ACC proteins from different plants has been carried out. The QSAR recognizes 19/20 sequences (95.0%) within the ACC family and 12/17 (70.6%) of the control group sequences. The model has a high Matthews' regression coefficient (C = 0.68). Overall cross-validation average accuracies were 14 out of 15 for ACC sequences (93.3%) and 10 out of 13 for control sequences (76.9%). Finally, ACC oxidase family membership was assigned to a new sequence isolated for the first time in this work from Psidium guajava L. A backprojection map for this sequence identifies the left stem (40%) and the main stem (45%) as highly important substructures. Results of an nBLAST experiment are consistent with this finding and indicate a high conservation score (>70) for left stem and main stem; whereas major loop, right stem, cap and major loop right half were hardly conserved.
Affiliation(s)
- Humberto González-Díaz
- Department of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, Spain.
|
27
|
Marrero Ponce Y, Castillo Garit JA, Nodarse D. Linear indices of the 'macromolecular graph's nucleotides adjacency matrix' as a promising approach for bioinformatics studies. Part 1: prediction of paromomycin's affinity constant with HIV-1 psi-RNA packaging region. Bioorg Med Chem 2005; 13:3397-404. [PMID: 15848751 DOI: 10.1016/j.bmc.2005.03.010] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.6] [Received: 01/14/2005] [Revised: 03/01/2005] [Accepted: 03/02/2005] [Indexed: 10/25/2022]
Abstract
The design of novel anti-HIV compounds has now become a crucial area for scientists around the world. In this paper, a new set of macromolecular descriptors of relevance to nucleic acid QSAR/QSPR studies, the nucleic acids' linear indices (calculated from the macromolecular graph's nucleotide adjacency matrix), is presented. A study of the interaction of the antibiotic paromomycin with the packaging region of the HIV-1 psi-RNA has been performed as an example of this approach. A multiple linear regression model predicted the local binding affinity constants [Log K (10(-4) M(-1))] between a specific nucleotide and the aforementioned antibiotic. The linear model explains more than 87% of the variance of the experimental Log K (R = 0.93 and s = 0.102 x 10(-4) M(-1)), and leave-one-out press statistics evidenced its predictive ability (q2 = 0.82 and s(cv) = 0.108 x 10(-4) M(-1)). The comparison with other approaches (macromolecular quadratic indices, Markovian negentropies and 'stochastic' spectral moments) reveals good behavior of our method.
Affiliation(s)
- Yovani Marrero Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Chemical Bioactive Center, Central University of Las Villas, Santa Clara 54830, Villa Clara, Cuba.
|
28
|
Marrero-Ponce Y, Castillo-Garit JA. 3D-chiral Atom, Atom-type, and Total Non-stochastic and Stochastic Molecular Linear Indices and their Applications to Central Chirality Codification. J Comput Aided Mol Des 2005; 19:369-83. [PMID: 16231198 DOI: 10.1007/s10822-005-7575-8] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 02/25/2005] [Accepted: 05/18/2005] [Indexed: 10/25/2022]
Abstract
Non-stochastic and stochastic 2D linear indices have been generalized to codify chemical structure information for chiral drugs, making use of a trigonometric 3D-chirality correction factor. These descriptors circumvent the inability of conventional 2D non-stochastic [Y. Marrero-Ponce. J. Chem. Inf. Comput. Sci. 44 (2004) 2010] and stochastic [Y. Marrero-Ponce, et al. Bioorg. Med. Chem., 13 (2005) 1293] linear indices to distinguish sigma-stereoisomers. In order to test the potential of this novel approach in drug design, we have modelled the angiotensin-converting enzyme inhibitory activity of perindoprilate's sigma-stereoisomers combinatorial library. Two linear discriminant analysis models, using non-stochastic and stochastic linear indices, were obtained. The models showed accuracies of 100% and 96.65% for the training set, and 88.88% and 100% for the external test set, respectively. Canonical regression analysis corroborated the statistical quality of these models (R(can) of 0.78 and of 0.77) and was also used to compute biological activity canonical scores for each compound. After that, the prediction of the sigma-receptor antagonists of chiral 3-(3-hydroxyphenyl)piperidines by multiple linear regression analysis was carried out. Two statistically significant QSAR models were obtained when non-stochastic (R2 = 0.982 and s = 0.157) and stochastic (R2 = 0.941 and s = 0.267) 3D-chiral linear indices were used. The predictive power was assessed by the leave-one-out cross-validation experiment, yielding values of q2 = 0.982 (s(cv) = 0.186) and q2 = 0.90 (s(cv) = 0.319), respectively. Finally, the prediction of the corticosteroid-binding globulin binding affinity of a set of steroids was performed. The best results obtained in the cross-validation procedure with non-stochastic (q2 = 0.904) and stochastic (q2 = 0.88) 3D-chiral linear indices are rather similar to those of most of the 3D-QSAR approaches reported so far. The method was validated by comparison with previous reports applied to the same data set. The non-stochastic and stochastic 3D-chiral linear indices appear to provide an interesting alternative to other more common 3D-QSAR descriptors.
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Chemical Bioactive Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|
29
|
Marrero-Ponce Y, Medina-Marrero R, Castillo-Garit JA, Romero-Zaldivar V, Torrens F, Castro EA. Protein linear indices of the ‘macromolecular pseudograph α-carbon atom adjacency matrix’ in bioinformatics. Part 1: Prediction of protein stability effects of a complete set of alanine substitutions in Arc repressor. Bioorg Med Chem 2005; 13:3003-15. [PMID: 15781410 DOI: 10.1016/j.bmc.2005.01.062] [Citation(s) in RCA: 48] [Impact Index Per Article: 2.4] [Received: 12/22/2004] [Revised: 01/28/2005] [Accepted: 01/31/2005] [Indexed: 10/25/2022]
Abstract
A novel approach to bio-macromolecular design from a linear algebra point of view is introduced. A protein's total (whole protein) and local (one or more amino acids) linear indices are a new set of bio-macromolecular descriptors of relevance to protein QSAR/QSPR studies. These amino-acid level biochemical descriptors are based on the calculation of linear maps on R^n [f_k(x_mi): R^n --> R^n] in the canonical basis. These bio-macromolecular indices are calculated from the kth power of the macromolecular pseudograph alpha-carbon atom adjacency matrix. Total linear indices are linear functionals on R^n. That is, the kth total linear indices are linear maps from R^n to the scalar field R [f_k(x_m): R^n --> R]. Thus, the kth total linear indices are calculated by summing the amino-acid linear indices of all amino acids in the protein molecule. A study of the protein stability effects for a complete set of alanine substitutions in the Arc repressor illustrates this approach. A quantitative model that discriminates near wild-type stability alanine mutants from the reduced-stability ones in a training series was obtained. This model permitted the correct classification of 97.56% (40/41) and 91.67% (11/12) of proteins in the training and test set, respectively. It shows a high Matthews correlation coefficient (MCC=0.952) for the training set and an MCC=0.837 for the external prediction set. Additionally, canonical regression analysis corroborated the statistical quality of the classification model (Rcanc=0.824). This analysis was also used to compute biological stability canonical scores for each Arc alanine mutant. On the other hand, the linear piecewise regression model compared favorably with the linear regression one in predicting the melting temperature (tm) of the Arc alanine mutants. The linear model explains almost 81% of the variance of the experimental tm (R=0.90 and s=4.29), and the LOO press statistics evidenced its predictive ability (q2=0.72 and scv=4.79). Moreover, the TOMOCOMD-CAMPS method produced a linear piecewise regression (R=0.97) between protein backbone descriptors and tm values for alanine mutants of the Arc repressor. A break-point value of 51.87 degrees C characterized two mutant clusters and coincided perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permitted the interpretation of the driving forces of such a folding process, indicating that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its alanine mutants.
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara, 54830 Villa Clara, Cuba.
|
30
|
Marrero-Ponce Y, Medina-Marrero R, Torrens F, Martinez Y, Romero-Zaldivar V, Castro EA. Atom, atom-type, and total nonstochastic and stochastic quadratic fingerprints: a promising approach for modeling of antibacterial activity. Bioorg Med Chem 2005; 13:2881-99. [PMID: 15781398 DOI: 10.1016/j.bmc.2005.02.015] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.7] [Received: 12/13/2004] [Accepted: 02/09/2005] [Indexed: 11/16/2022]
Abstract
The TOpological MOlecular COMputer Design (TOMOCOMD-CARDD) approach has been introduced for the classification and design of antimicrobial agents using computer-aided molecular design. For this purpose, atom, atom-type, and total quadratic indices have been generalized to codify chemical structure information. In this sense, stochastic quadratic indices have been introduced for the description of the molecular structure. These stochastic fingerprints are based on a simple model for the intramolecular movement of all valence-bond electrons. In this work, a complete data set containing 1006 antimicrobial agents is collected and presented. Two structure-based antibacterial activity classification models have been generated. The models (including nonstochastic and stochastic indices) correctly classify more than 90% of 1525 compounds in training sets. These models permit the correct classification of 92.28% and 89.31% of 505 compounds in external test sets. The TOMOCOMD-CARDD approach also compares satisfactorily with nine of the most useful models for antimicrobial selection reported to date. Finally, a virtual screening of 87 new compounds reported in the anti-infective field with antibacterial activities is performed, showing the ability of the TOMOCOMD-CARDD models to identify new antibacterial leads.
Affiliation(s)
- Yovani Marrero-Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara 54830, Villa Clara, Cuba.
|
31
|
Ponce YM, Marrero RM, Castro EA, Ramos de Armas R, Díaz HG, Zaldivar VR, Torrens F. Protein quadratic indices of the "macromolecular pseudograph's alpha-carbon atom adjacency matrix". 1. Prediction of Arc repressor alanine-mutant's stability. Molecules 2004; 9:1124-47. [PMID: 18007508 DOI: 10.3390/91201124] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.9] [Received: 06/02/2004] [Revised: 12/12/2004] [Accepted: 12/13/2004] [Indexed: 11/16/2022] Open
Abstract
This report describes a new set of macromolecular descriptors of relevance to protein QSAR/QSPR studies, the protein's quadratic indices. These descriptors are calculated from the macromolecular pseudograph's alpha-carbon atom adjacency matrix. A study of the protein stability effects for a complete set of alanine substitutions in Arc repressor illustrates this approach. Quantitative Structure-Stability Relationship (QSSR) models allow discriminating between near wild-type stability and reduced-stability A-mutants. A linear discriminant function correctly discriminates 85.4% (35/41) and 91.67% (11/12) of the near wild-type stability/reduced-stability mutants in the training and test series, respectively. The model's overall predictability ranges from 80.49% to 82.93% when n varies from 2 to 10 in leave-n-out cross-validation procedures. This value stabilizes around 80.49% when n is > 6. Additionally, canonical regression analysis corroborates the statistical quality of the classification model (Rcanc = 0.72, p-level <0.0001). This analysis was also used to compute biological stability canonical scores for each Arc A-mutant. On the other hand, the nonlinear piecewise regression model compares favorably with the linear regression one in predicting the melting temperature (tm) of the Arc A-mutants. The linear model explains almost 72% of the variance of the experimental tm (R = 0.85 and s = 5.64), and LOO press statistics evidenced its predictive ability (q2 = 0.55 and scv = 6.24). However, this linear regression model fails to resolve tm predictions of Arc A-mutants in the external prediction series. Therefore, the use of nonlinear piecewise models was required. The tm values of A-mutants in the training (R = 0.94) and test (R = 0.91) sets are calculated by the piecewise model with a high degree of precision. A break-point value of 51.32 degrees C characterizes two mutant clusters and coincides perfectly with the experimental scale. For this reason, we can use the linear discriminant analysis and piecewise models in combination to classify and predict the stability of the mutant Arc homodimers. These models also permit the interpretation of the driving forces of such a folding process. The models include protein's quadratic indices accounting for hydrophobic (z1), bulk-steric (z2), and electronic (z3) features of the studied molecules. The preponderance of z1 and z3 over z2 indicates the higher importance of the hydrophobic and electronic side chain terms in the folding of the Arc dimer. In this sense, the developed equations involve short-reaching (k <= 3), middle-reaching (3 < k <= 7) and far-reaching (k = 8 or greater) z1, z2, and z3 protein's quadratic indices. This situation indicates that topologic/topographic protein backbone interactions control the stability profile of wild-type Arc and its A-mutants. Consequently, the present approach represents a novel and very promising approach to mathematical research in the biological sciences.
Affiliation(s)
- Yovani Marrero Ponce
- Department of Pharmacy, Faculty of Chemical-Pharmacy, Central University of Las Villas, Santa Clara 54830, Villa Clara, Cuba.
|