1
|
Marrero-Ponce Y, Teran JE, Contreras-Torres E, García-Jacas CR, Perez-Castillo Y, Cubillan N, Pérez-Giménez F, Valdés-Martini JR. LEGO-based generalized set of two linear algebraic 3D bio-macro-molecular descriptors: Theory and validation by QSARs. J Theor Biol 2020; 485:110039. [DOI: 10.1016/j.jtbi.2019.110039] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 06/08/2019] [Revised: 09/11/2019] [Accepted: 10/02/2019] [Indexed: 11/28/2022]
|
2
|
Terán JE, Marrero-Ponce Y, Contreras-Torres E, García-Jacas CR, Vivas-Reyes R, Terán E, Torres FJ. Tensor Algebra-based Geometrical (3D) Biomacro-Molecular Descriptors for Protein Research: Theory, Applications and Comparison with other Methods. Sci Rep 2019; 9:11391. [PMID: 31388082 PMCID: PMC6684663 DOI: 10.1038/s41598-019-47858-2] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 02/22/2019] [Accepted: 07/22/2019] [Indexed: 11/16/2022]
Abstract
In this report, a new type of tridimensional (3D) biomacro-molecular descriptor for proteins is proposed. These descriptors make use of multi-linear algebra concepts based on the application of 3-linear forms (i.e., Canonical Trilinear (Tr), Trilinear Cubic (TrC), Trilinear-Quadratic-Bilinear (TrQB) and so on) as a specific case of the N-linear algebraic forms. The kth 3-tuple similarity-dissimilarity spatial matrices (tensor form) are used to transform and represent the chemical information available in the relationships between three amino acids of a protein. Several metrics (Minkowski-type, wave-edge, etc.) and multi-metrics (triangle area, bond angle, etc.) are proposed for extracting the interaction information, as well as probabilistic transformations (e.g., simple stochastic and mutual probability) to achieve matrix normalization. A generalized procedure is proposed in which amino acid level-based indices are fused by aggregation operators to calculate the descriptors. The results demonstrate that the proposed 3D biomacro-molecular indices perform better than other approaches in SCOP-based discrimination and in the prediction of protein folding rates when simple linear parametric models are used. It can be concluded that the proposed method allows the definition of 3D biomacro-molecular descriptors that contain orthogonal information capable of providing better models for applications in protein science.
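The trilinear idea in the abstract above can be illustrated with a minimal sketch. It assumes a triangle-area multi-metric over three points (one point per residue), a trilinear form of the type T(x, y, z) = Σ t_ijk · x_i · y_j · z_k, and a slice-wise scaling that stands in for the "simple stochastic" transformation. The function names, the normalization rule, and the residue weights are hypothetical and are not taken from the authors' software.

```python
from itertools import product
from math import dist, sqrt


def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (Heron's formula)."""
    a, b, c = dist(p, q), dist(q, r), dist(p, r)
    s = (a + b + c) / 2.0
    # Clamp tiny negative values caused by floating-point rounding.
    return sqrt(max(s * (s - a) * (s - b) * (s - c), 0.0))


def area_tensor(coords):
    """n x n x n tensor whose (i, j, k) entry is the area spanned by points i, j, k."""
    n = len(coords)
    tensor = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for i, j, k in product(range(n), repeat=3):
        tensor[i][j][k] = triangle_area(coords[i], coords[j], coords[k])
    return tensor


def slice_normalize(tensor):
    """Assumed 'simple stochastic' scaling: every i-slice is divided by its total."""
    out = []
    for plane in tensor:
        total = sum(sum(row) for row in plane)
        out.append([[v / total if total else 0.0 for v in row] for row in plane])
    return out


def trilinear_form(tensor, x, y, z):
    """Sum of t[i][j][k] * x[i] * y[j] * z[k] over all index triples."""
    n = len(tensor)
    return sum(
        tensor[i][j][k] * x[i] * y[j] * z[k]
        for i, j, k in product(range(n), repeat=3)
    )


# Three hypothetical residue positions forming a 3-4-5 right triangle.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
weights = [1.0, 1.0, 1.0]  # placeholder amino acid property values

tensor = area_tensor(coords)
raw_index = trilinear_form(tensor, weights, weights, weights)
norm_index = trilinear_form(slice_normalize(tensor), weights, weights, weights)
```

In practice the three weight vectors could hold different amino acid properties; using the same vector three times is only the simplest case.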
Affiliation(s)
- Julio E Terán
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
- Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador.
- Universidad de San Buenaventura - Cartagena - Facultad de Ciencias de la Salud - Grupo de Investigación Microbiología & Ambiente (GIMA) - Calle Real de Ternera, Diagonal 32, No. 30-966, Cartagena, Código postal: 1300 10, Colombia.
| | - Ernesto Contreras-Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
| | - César R García-Jacas
- Cátedras CONACYT - Departamento de Ciencia de la Computación, Centro de Investigación Científica y de Educación Superior de Ensenada (CICESE), Ensenada, Baja California, Mexico
| | - Ricardo Vivas-Reyes
- Grupo de Química Cuántica y Teórica de la Universidad de Cartagena-Facultad de Ciencias Exactas y Naturales. Programa de Química. Campus de San Pablo and Grupo GINUMED Corporacion Universitaria Rafal Nuñez. Facultad de Salud. Programa de Medicina., Cartagena, Colombia
- Grupo CipTec, Facultad de Ingenierias. Fundacion Universitaria Tecnologico Comfenalco - Cartagena, Cartagena, Bolívar, Colombia
| | - Enrique Terán
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Translacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador
| | - F Javier Torres
- Universidad San Francisco de Quito (USFQ), Grupo de Química Computacional y Teórica (QCT-USFQ), Departamento de Ingeniería Química, and Instituto de Simulación Computacional (ISC-USFQ), Quito, Pichincha, Ecuador
|
3
|
García-Jacas CR, Marrero-Ponce Y, Cortés-Guzmán F, Suárez-Lezcano J, Martinez-Rios FO, García-González LA, Pupo-Meriño M, Martinez-Mayorga K. Enhancing Acute Oral Toxicity Predictions by using Consensus Modeling and Algebraic Form-Based 0D-to-2D Molecular Encodes. Chem Res Toxicol 2019; 32:1178-1192. [PMID: 31066547 DOI: 10.1021/acs.chemrestox.9b00011] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Indexed: 12/13/2022]
Abstract
Quantitative structure-activity relationships (QSAR) are introduced to predict acute oral toxicity (AOT), by using the QuBiLS-MAS (acronym for quadratic, bilinear and N-Linear maps based on graph-theoretic electronic-density matrices and atomic weightings) framework for the molecular encoding. Three training sets were employed to build the models: EPA training set (5931 compounds), EPA-full training set (7413 compounds), and Zhu training set (10 152 compounds). Additionally, the EPA test set (1482 compounds) was used for the validation of the QSAR models built on the EPA training set, while the ProTox (425 compounds) and T3DB (284 compounds) external sets were employed for the assessment of all the models. The k-nearest neighbor, multilayer perceptron, random forest, and support vector machine procedures were employed to build several base (individual) models. The base models with R_EPA-training ≥ 0.75 (R = correlation coefficient) and MAE_EPA-training ≤ 0.5 (MAE = mean absolute error) were retained to build consensus models. As a result, two consensus models based on the minimum operator and denoted as M19 and M22, as well as a consensus model based on the weighted average operator and denoted as M24, were selected as the best ones for each training set considered. According to the applicability domain (AD) analysis performed, model M19 (built on the EPA training set) has MAE_test-AD = 0.4044, MAE_ProTox-AD = 0.4067 and MAE_T3DB-AD = 0.2586 on the EPA test set, ProTox external set, and T3DB external set, respectively; whereas model M22 (built on the EPA-full set) and model M24 (built on the Zhu set) present MAE_ProTox-AD = 0.3992 and MAE_T3DB-AD = 0.2286, and MAE_ProTox-AD = 0.3773 and MAE_T3DB-AD = 0.2471 on the two external sets accounted for, respectively. These outcomes were compared and statistically validated with respect to 14 QSAR methods (e.g., admetSAR, ProTox-II) from the literature. As a result, model M22 presents the best overall performance.
In addition, a retrospective study on 261 drugs withdrawn due to their toxic/side effects was performed, to assess the usefulness of prospectively using the proposed QSAR models in the labeling of chemicals. A comparison with the methods from the literature was also made. As a result, model M22 has the best ability to label a compound as toxic according to the globally harmonized system of classification and labeling of chemicals. Therefore, it can be concluded that the proposed models, especially model M22, constitute prominent tools for studying AOT, as they provide the best results among all the methods examined. Freely available software was also developed for use in virtual screening tasks ( http://tomocomd.com/apps/ptoxra ).
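The consensus step described in the abstract above (a minimum operator and a weighted average operator applied to the predictions of base models, scored by MAE) can be sketched as follows. This is only an illustration under stated assumptions: the function names, the toy predictions, and the weights are hypothetical, and the sketch does not reproduce how M19, M22 or M24 were actually built.

```python
def consensus_min(predictions):
    """Per-compound minimum over the base-model predictions."""
    return [min(column) for column in zip(*predictions)]


def consensus_weighted_average(predictions, weights):
    """Per-compound weighted mean over the base-model predictions."""
    total = sum(weights)
    return [
        sum(w * p for w, p in zip(weights, column)) / total
        for column in zip(*predictions)
    ]


def mean_absolute_error(y_true, y_pred):
    """MAE between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy data: two hypothetical base models, two compounds.
base_predictions = [
    [2.0, 3.0],  # base model A
    [2.5, 2.0],  # base model B
]
observed = [2.0, 2.5]

min_consensus = consensus_min(base_predictions)
avg_consensus = consensus_weighted_average(base_predictions, weights=[1.0, 3.0])
mae_min = mean_absolute_error(observed, min_consensus)
```

A base model would typically be admitted to the consensus only if it passes thresholds such as those quoted in the abstract (R ≥ 0.75 and MAE ≤ 0.5 on the training set); that filtering step is omitted here.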
Affiliation(s)
- César R García-Jacas
- Departamento de Ciencias de la Computación , Centro de Investigación Científica y de Educación Superior de Ensenada , Ensenada , Baja California , México
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito, Grupo de Medicina Molecular y Traslacional, Colegio de Ciencias de la Salud, Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Pichincha, Ecuador; Grupo de Investigación Ambiental, Programas Ambientales, Facultad de Ingenierías, Fundacion Universitaria Tecnologico Comfenalco-Cartagena, Cr44 DN 30 A, 91, Cartagena, Bolívar, Colombia
| | - Fernando Cortés-Guzmán
- Instituto de Química , Universidad Nacional Autónoma de México , Ciudad de México , México
| | - José Suárez-Lezcano
- Pontificia Universidad Católica del Ecuador Sede Esmeraldas , Esmeraldas , Ecuador
| | | | - Luis A García-González
- Grupo de Investigación de Bioinformática , Universidad de las Ciencias Informáticas , La Habana , Cuba
| | - Mario Pupo-Meriño
- Grupo de Investigación de Bioinformática , Universidad de las Ciencias Informáticas , La Habana , Cuba
|
4
|
Casañola-Martin GM, Pham-The H, Castillo-Garit JA, Le-Thi-Thu H. Atom based linear index descriptors in QSAR-machine learning classifiers for the prediction of ubiquitin-proteasome pathway activity. Med Chem Res 2018. [DOI: 10.1007/s00044-017-2091-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
|
5
|
Valdés-Martiní JR, Marrero-Ponce Y, García-Jacas CR, Martinez-Mayorga K, Barigye SJ, Vaz d'Almeida YS, Pham-The H, Pérez-Giménez F, Morell CA. QuBiLS-MAS, open source multi-platform software for atom- and bond-based topological (2D) and chiral (2.5D) algebraic molecular descriptors computations. J Cheminform 2017; 9:35. [PMID: 29086120 PMCID: PMC5462671 DOI: 10.1186/s13321-017-0211-5] [Citation(s) in RCA: 50] [Impact Index Per Article: 7.1] [Received: 11/08/2016] [Accepted: 04/07/2017] [Indexed: 11/10/2022]
Abstract
BACKGROUND In previous reports, Marrero-Ponce et al. proposed algebraic formalisms for characterizing topological (2D) and chiral (2.5D) molecular features through atom- and bond-based ToMoCoMD-CARDD (acronym for Topological Molecular Computational Design-Computer Aided Rational Drug Design) molecular descriptors (MDs). These MDs codify molecular information based on the bilinear, quadratic and linear algebraic forms and the graph-theoretical electronic-density and edge-adjacency matrices in order to consider atom- and bond-based relations, respectively. These MDs have been successfully applied in the screening of chemical compounds for different therapeutic applications, including antimalarials, antibacterials, tyrosinase inhibitors and so on. To compute these MDs, a computational program with the same name was initially developed. However, this in-house software barely offered the functionalities required in contemporary molecular modeling tasks and had inherent limitations that made it impractical to use. Therefore, the present manuscript introduces the QuBiLS-MAS (acronym for Quadratic, Bilinear and N-Linear mapS based on graph-theoretic electronic-density Matrices and Atomic weightingS) software designed to compute topological (0-2.5D) molecular descriptors based on bilinear, quadratic and linear algebraic forms for atom- and bond-based relations.
RESULTS The QuBiLS-MAS module was designed as standalone software, in which extensions and generalizations of the former ToMoCoMD-CARDD 2D-algebraic indices are implemented, considering the following aspects: (a) two new matrix normalization approaches based on double-stochastic and mutual probability formalisms; (b) topological constraints (cut-offs) to take into account particular inter-atomic relations; (c) six additional atomic properties to be used as weighting schemes in the calculation of the molecular vectors; (d) four new local-fragments to consider molecular regions of interest; (e) number of lone-pair electrons in chemical structure defined by diagonal coefficients in matrix representations; and (f) several aggregation operators (invariants) applied over atom/bond-level descriptors in order to compute global indices. This software permits the parallel computation of the indices, contains a batch processing module and data curation functionalities. This program was developed in Java v1.7 using the Chemistry Development Kit library (version 1.4.19). The QuBiLS-MAS software consists of two components: a desktop interface (GUI) and an API library allowing for the easy integration of the latter in chemoinformatics applications. The relevance of the novel extensions and generalizations implemented in this software is demonstrated through three studies. Firstly, a comparative Shannon's entropy based variability study for the proposed QuBiLS-MAS and the DRAGON indices demonstrates superior performance for the former. A principal component analysis reveals that the QuBiLS-MAS approach captures chemical information orthogonal to that codified by the DRAGON descriptors. Lastly, a QSAR study for the binding affinity to the corticosteroid-binding globulin using Cramer's steroid dataset is carried out. 
CONCLUSIONS From these analyses, it is revealed that the QuBiLS-MAS approach for atom-pair relations yields similar-to-superior performance with regard to other QSAR methodologies reported in the literature. Therefore, the QuBiLS-MAS approach constitutes a useful tool for the diversity analysis of chemical compound datasets and high-throughput screening of structure-activity data.
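The Shannon's entropy based variability study mentioned in the RESULTS above can be illustrated with a small sketch: a descriptor whose values spread evenly over many bins across a dataset has high entropy, while a constant descriptor has zero entropy. The equal-width binning scheme, the bin count, and the function name below are assumptions made for illustration; the abstract does not state the discretization the authors used.

```python
from collections import Counter
from math import log2


def shannon_entropy(values, bins=4):
    """Shannon entropy (in bits) of a descriptor after equal-width binning."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return 0.0  # a constant descriptor carries no information
    width = (hi - lo) / bins
    # Map each value to a bin index; the maximum value falls in the last bin.
    counts = Counter(min(int((v - lo) / width), bins - 1) for v in values)
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in counts.values())


# Toy descriptor columns over a four-compound dataset.
spread_descriptor = [0.0, 1.0, 2.0, 3.0]    # one value per bin
constant_descriptor = [5.0, 5.0, 5.0, 5.0]  # no variability

entropy_spread = shannon_entropy(spread_descriptor, bins=4)
entropy_constant = shannon_entropy(constant_descriptor, bins=4)
```

With this scheme the maximum attainable entropy is log2(bins), so the bin count must be the same for every descriptor family being compared.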
Affiliation(s)
- José R Valdés-Martiní
- StreelBridge Laboratories, SteelBridge Consulting Technology Solutions, Miami, FL, USA
| | - Yovani Marrero-Ponce
- Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA), Escuela de Medicina, Edificio de Especialidades Médicas, Quito, Ecuador; Universidad San Francisco de Quito (USFQ), Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica, 170157, Quito, Pichincha, Ecuador; Computer-Aided Molecular "Biosilico" Discovery and Bioinformatics Research International Network (CAMD-BIR IN), Cumbayá, Quito, Ecuador; Grupo de Investigación Ambiental (GIA), Fundación Universitaria Tecnológico de Comfenalco, Facultad de Ingenierías, Programa de Ingeniería de Procesos, Cartagena de Indias, Bolívar, Colombia; Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain.
| | - César R García-Jacas
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México; Escuela de Sistemas y Computación, Pontificia Universidad Católica del Ecuador Sede Esmeraldas (PUCESE), Esmeraldas, Ecuador; Grupo de Investigación de Bioinformática, Universidad de las Ciencias Informáticas (UCI), Havana, Cuba
| | - Karina Martinez-Mayorga
- Instituto de Química, Universidad Nacional Autónoma de México (UNAM), Ciudad de México, México
| | - Stephen J Barigye
- Facultad de Medicina, Universidad de Las Américas, Quito, Pichincha, Ecuador
| | | | - Hai Pham-The
- Department of Pharmaceutical Chemistry, Hanoi University of Pharmacy, 13-15 Le Thanh Tong, Hoan Kiem, Hanoi, Vietnam
| | - Facundo Pérez-Giménez
- Unidad de Investigación de Diseño de Fármacos y Conectividad Molecular, Departamento de Química Física, Facultad de Farmacia, Universitat de València, Valencia, Spain
| | - Carlos A Morell
- Laboratorio de Inteligencia Artificial, Centro de Estudios de Informática (CEI), Facultad de Matemática, Física y Computación, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, Villa Clara, Cuba
|
6
|
Martínez-Santiago O, Marrero-Ponce Y, Vivas-Reyes R, Rivera-Borroto OM, Hurtado E, Treto-Suarez MA, Ramos Y, Vergara-Murillo F, Orozco-Ugarriza ME, Martínez-López Y. Exploring the QSAR's predictive truthfulness of the novel N-tuple discrete derivative indices on benchmark datasets. SAR QSAR Environ Res 2017; 28:367-389. [PMID: 28590848 DOI: 10.1080/1062936x.2017.1326403] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 03/06/2017] [Accepted: 04/27/2017] [Indexed: 06/07/2023]
Abstract
Graph derivative indices (GDIs) have recently been defined over N-atoms (N = 2, 3 and 4) simultaneously, which are based on the concept of derivatives in discrete mathematics (finite difference), metaphorical to the derivative concept in classical mathematical analysis. These molecular descriptors (MDs) codify topo-chemical and topo-structural information based on the concept of the derivative of a molecular graph with respect to a given event (S) over duplex, triplex and quadruplex relations of atoms (vertices). These GDIs have been successfully applied in the description of physicochemical properties like reactivity, solubility and chemical shift, among others, and in several comparative quantitative structure activity/property relationship (QSAR/QSPR) studies. Although satisfactory results have been obtained in previous modelling studies with the aforementioned indices, it is necessary to develop new, more rigorous analysis to assess the true predictive performance of the novel structure codification. So, in the present paper, an assessment and statistical validation of the performance of these novel approaches in QSAR studies are executed, as well as a comparison with those of other QSAR procedures reported in the literature. To achieve the main aim of this research, QSARs were developed on eight chemical datasets widely used as benchmarks in the evaluation/validation of several QSAR methods and/or many different MDs (fundamentally 3D MDs). Three to seven variable QSAR models were built for each chemical dataset, according to the original dissection into training/test sets. The models were developed by using multiple linear regression (MLR) coupled with a genetic algorithm as the feature wrapper selection technique in the MobyDigs software. Each family of GDIs (for duplex, triplex and quadruplex) behaves similarly in all modelling, although there were some exceptions. 
However, when all families were used in combination, the results achieved were quantitatively higher than those reported by other authors in similar experiments. Comparisons with respect to external correlation coefficients (q²_ext) revealed that the models based on GDIs possess superior predictive ability in seven of the eight datasets analysed, outperforming methodologies based on similar or more complex techniques and confirming the good predictive power of the obtained models. For the q²_ext values, the non-parametric comparison revealed significantly different results to those reported so far, which demonstrated that the models based on DIVATI's indices presented the best global performance and yielded significantly better predictions than the 12 0-3D QSAR procedures used in the comparison. Therefore, GDIs are suitable for structure codification of the molecules and constitute a good alternative to build QSARs for the prediction of physicochemical, biological and environmental endpoints.
Affiliation(s)
- O Martínez-Santiago
- a Department of Chemical Sciences, Central University 'Marta Abreu' of Las Villas, Santa Clara, Cuba
- b Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Quito , Ecuador
- c Group of Quantum and Theoretical Chemistry , University of Cartagena , Cartagena de Indias , Colombia
- d Facultad de Ingeniería , Grupo CipTec, Fundación Universitaria Tecnológico Comfenalco , Cartagena de Indias , Colombia
| | - Y Marrero-Ponce
- b Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Quito , Ecuador
- e Escuela de Medicina, Edificio de Especialidades Médicas , Universidad San Francisco de Quito (USFQ), Grupo de Medicina Molecular y Traslacional (MeM&T), Colegio de Ciencias de la Salud (COCSA) , Av. Interoceánica Km 12 ½, Cumbayá , Ecuador
- f Instituto de Simulación Computacional (ISC-USFQ), Diego de Robles y vía Interoceánica , Quito , Ecuador
- g Grupo de Investigación Ambiental (GIA) , Fundación Universitaria Tecnológico de Comfenalco , Cartagena de Indias , Colombia
| | - R Vivas-Reyes
- c Group of Quantum and Theoretical Chemistry , University of Cartagena , Cartagena de Indias , Colombia
- d Facultad de Ingeniería , Grupo CipTec, Fundación Universitaria Tecnológico Comfenalco , Cartagena de Indias , Colombia
| | - O M Rivera-Borroto
- b Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Quito , Ecuador
- h Departamento de Química Física Aplicada , Universidad Autónoma de Madrid (UAM) , Madrid , España
| | - E Hurtado
- b Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Quito , Ecuador
| | - M A Treto-Suarez
- i Center of Applied Nanosciences (CENAP), Andres Bello University , Chile
| | - Y Ramos
- j Department of Economic Sciences , University of Camagüey , Camagüey , Cuba
| | - F Vergara-Murillo
- c Group of Quantum and Theoretical Chemistry , University of Cartagena , Cartagena de Indias , Colombia
- d Facultad de Ingeniería , Grupo CipTec, Fundación Universitaria Tecnológico Comfenalco , Cartagena de Indias , Colombia
| | - M E Orozco-Ugarriza
- k Seccional Cartagena y Grupo de Investigación Traslacional en Biomedicina & Biotecnología - GITB&B , Universidad del Sinú - Elías Bechara Zainúm , Cartagena de Indias , Colombia
| | - Y Martínez-López
- b Unit of Computer-Aided Molecular 'Biosilico' Discovery and Bioinformatics Research International Network (CAMD-BIR IN) , Quito , Ecuador
- l Grupo de Investigación de Inteligencia Artificial (AIRES) , Universidad de Camagüey , Camagüey , Cuba
|
7
|
Castillo-Garit JA, del Toro-Cortés O, Vega MC, Rolón M, Rojas de Arias A, Casañola-Martin GM, Escario JA, Gómez-Barrio A, Marrero-Ponce Y, Torrens F, Abad C. Bond-based bilinear indices for computational discovery of novel trypanosomicidal drug-like compounds through virtual screening. Eur J Med Chem 2015; 96:238-44. [PMID: 25884114 DOI: 10.1016/j.ejmech.2015.03.063] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 10/28/2014] [Revised: 02/27/2015] [Accepted: 03/27/2015] [Indexed: 11/25/2022]
Abstract
Two-dimensional bond-based bilinear indices and linear discriminant analysis are used in this report to perform a quantitative structure-activity relationship study to identify new trypanosomicidal compounds. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop the theoretical models. Two discriminant models, computed using bond-based bilinear indices, are developed and both show accuracies higher than 86% for training and test sets. The stochastic model correctly identifies nine out of eleven compounds of a set of organic chemicals obtained from our synthetic collaborators. The in vitro antitrypanosomal activity of this set against epimastigote forms of Trypanosoma cruzi is assayed. Both models show good agreement between theoretical predictions and experimental results. Three compounds showed IC50 values for epimastigote elimination (AE) lower than 50 μM, while benznidazole, which was used as the reference compound, showed IC50 = 54.7 μM. The IC50 value for cytotoxicity of these compounds is at least 5 times greater than their IC50 value for AE. Finally, the present algorithm constitutes a step forward in the search for efficient ways of discovering new antitrypanosomal compounds.
Affiliation(s)
- Juan Alberto Castillo-Garit
- Centro de Estudio de Química Aplicada, Facultad de Química-Farmacia, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba; Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba; Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain; Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, P.O. Box 22085, E-46071, València, Spain.
| | - Oremia del Toro-Cortés
- Centro de Estudio de Química Aplicada, Facultad de Química-Farmacia, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
| | - Maria C Vega
- Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
| | - Miriam Rolón
- Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
| | - Antonieta Rojas de Arias
- Centro para el Desarrollo de la Investigacion Cientifica (CEDIC) and Fundación Moisés Bertoni/Laboratorios Díaz Gill, Pai Perez 265 casi Mariscal Estigarribia, Asuncion, Paraguay
| | - Gerardo M Casañola-Martin
- Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba; Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain; Centro de Información y Gestión Tecnológica, Ministerio de Ciencia Tecnología y Medio Ambiente (CITMA), 65100, Ciego de Ávila, Cuba
| | - José A Escario
- Departamento de Parasitología, Facultad de Farmacia, UCM, Pza. Ramón y Cajal s/n, 28040, Madrid, Spain
| | - Alicia Gómez-Barrio
- Departamento de Parasitología, Facultad de Farmacia, UCM, Pza. Ramón y Cajal s/n, 28040, Madrid, Spain
| | - Yovani Marrero-Ponce
- Environmental and Computational Chemistry Group, Facultad de Química Farmacéutica, Universidad de Cartagena, Cartagena de Indias, Bolívar, Colombia
| | - Francisco Torrens
- Institut Universitari de Ciència Molecular, Universitat de València, Edifici d'Instituts de Paterna, P.O. Box 22085, E-46071, València, Spain
| | - Concepción Abad
- Departament de Bioquímica i Biologia Molecular, Universitat de València, E-46100, Burjassot, Spain
|
8
|
García-Jacas CR, Marrero-Ponce Y, Acevedo-Martínez L, Barigye SJ, Valdés-Martiní JR, Contreras-Torres E. QuBiLS-MIDAS: a parallel free-software for molecular descriptors computation based on multilinear algebraic maps. J Comput Chem 2014; 35:1395-409. [PMID: 24889018 DOI: 10.1002/jcc.23640] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 02/18/2014] [Revised: 04/22/2014] [Accepted: 04/23/2014] [Indexed: 11/12/2022]
Abstract
The present report introduces the QuBiLS-MIDAS software belonging to the ToMoCoMD-CARDD suite for the calculation of three-dimensional molecular descriptors (MDs) based on the two-linear (bilinear), three-linear, and four-linear (multilinear or N-linear) algebraic forms. It is thus the only software that computes these tensor-based indices. These descriptors establish relations for two, three, and four atoms by using several (dis-)similarity metrics or multimetrics, matrix transformations, cutoffs, local calculations and aggregation operators. The theoretical background of these N-linear indices is also presented. The QuBiLS-MIDAS software was developed in the Java programming language and employs the Chemistry Development Kit library for the manipulation of the chemical structures and the calculation of the atomic properties. This software is composed of a user-friendly desktop interface and an Application Programming Interface library. The former was created to simplify the configuration of the different options of the MDs, whereas the library was designed to allow its easy integration into other software for chemoinformatics applications. This program provides functionalities for data cleaning tasks and for batch processing of the molecular indices. In addition, it offers parallel calculation of the MDs through the use of all available processors in current computers. The complexity studies of the main algorithms demonstrate that these were implemented efficiently with respect to their trivial implementation. Lastly, the performance tests reveal that this software scales suitably when the number of processors is increased. Therefore, the QuBiLS-MIDAS software constitutes a useful application for the computation of the molecular indices based on N-linear algebraic maps and it can be used freely to perform chemoinformatics studies.
Affiliation(s)
- César R García-Jacas
- Grupo de Investigación de Bioinformática, Centro de Estudio de Matemática Computacional, Universidad de las Ciencias Informáticas, La Habana, Cuba; Unit of Computer-Aided Molecular "Biosilico" Discovery and Bioinformatic Research (CAMD-BIR Unit), Faculty of Chemistry-Pharmacy, Universidad Central "Marta Abreu" de Las Villas, Santa Clara, 54830, Villa Clara, Cuba
|
9
|
Nesmerak K, Toropov AA, Toropova AP. SMILES-based quantitative structure–retention relationships for RP HPLC of 1-phenyl-5-benzylsulfanyltetrazoles. Struct Chem 2013. [DOI: 10.1007/s11224-013-0293-2] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/29/2022]
|
10
|
Brito-Sánchez Y, Castillo-Garit JA, Le-Thi-Thu H, González-Madariaga Y, Torrens F, Marrero-Ponce Y, Rodríguez-Borges JE. Comparative study to predict toxic modes of action of phenols from molecular structures. SAR QSAR Environ Res 2013; 24:235-251. [PMID: 23437773 DOI: 10.1080/1062936x.2013.766260] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Indexed: 06/01/2023]
Abstract
Quantitative structure-activity relationship models for the prediction of mode of toxic action (MOA) of 221 phenols to the ciliated protozoan Tetrahymena pyriformis using atom-based quadratic indices are reported. The phenols represent a variety of MOAs including polar narcotics, weak acid respiratory uncouplers, pro-electrophiles and soft electrophiles. Linear discriminant analysis (LDA) and four machine learning (ML) techniques, namely k-nearest neighbours (k-NN), support vector machine (SVM), classification trees (CTs) and artificial neural networks (ANNs), have been used to develop several models with high accuracies and predictive capabilities for distinguishing between four MOAs. Most of them showed global accuracy of over 90%, and false alarm rate values were below 2.9% for the training set. Cross-validation, complementary subsets and external test set validations were performed, with good behaviour in all cases. Our models compare favourably with other previously published models, and in general the models obtained with ML techniques show better results than those developed with linear techniques. We developed unsupervised and supervised consensus models, and their results were better than those of our ML models, of the rule-based approach and of other previously published ensemble models. This investigation highlights the merits of ML-based techniques as an alternative to other more traditional methods for modelling MOA.
Affiliation(s)
- Y Brito-Sánchez
- Unit of Computer-Aided Molecular Biosilico Discovery and Bioinformatic Research, Faculty of Chemistry-Pharmacy, Universidad Central Marta Abreu de Las Villas, Santa Clara, Cuba
|
11
|
Quintero FA, Patel SJ, Muñoz F, Sam Mannan M. Review of Existing QSAR/QSPR Models Developed for Properties Used in Hazardous Chemicals Classification System. Ind Eng Chem Res 2012. [DOI: 10.1021/ie301079r] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Indexed: 02/05/2023]
Affiliation(s)
- Flor A. Quintero
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
- Departamento de Ingeniería Química, Universidad de los Andes, Cr.1 Este #19 A-40, Bogotá D.C., Colombia
- Suhani J. Patel
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
- Felipe Muñoz
- Departamento de Ingeniería Química, Universidad de los Andes, Cr.1 Este #19 A-40, Bogotá D.C., Colombia
- M. Sam Mannan
- Mary Kay O’Connor Process Safety Center, Artie McFerrin Department of Chemical Engineering, Texas A&M University System, College Station, Texas 77843-3122, United States
|
12
|
Castillo-Garit JA, del Toro-Cortés O, Kouznetsov VV, Puentes CO, Romero Bohórquez AR, Vega MC, Rolón M, Escario JA, Gómez-Barrio A, Marrero-Ponce Y, Torrens F, Abad C. Identification In Silico and In Vitro of Novel Trypanosomicidal Drug-Like Compounds. Chem Biol Drug Des 2012; 80:38-45. [DOI: 10.1111/j.1747-0285.2012.01378.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Indexed: 11/26/2022]
|
13
|
Ligand-based discovery of novel trypanosomicidal drug-like compounds: In silico identification and experimental support. Eur J Med Chem 2011; 46:3324-30. [DOI: 10.1016/j.ejmech.2011.04.057] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.3] [Received: 11/15/2010] [Revised: 04/26/2011] [Accepted: 04/26/2011] [Indexed: 01/08/2023]
|
14
|
Ortega-Broche SE, Marrero-Ponce Y, Díaz YE, Torrens F, Pérez-Giménez F. TOMOCOMD-CAMPS and protein bilinear indices - novel bio-macromolecular descriptors for protein research: I. Predicting protein stability effects of a complete set of alanine substitutions in the Arc repressor. FEBS J 2010; 277:3118-46. [DOI: 10.1111/j.1742-4658.2010.07711.x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 11/28/2022]
|
15
|
Toropov AA, Toropova AP, Benfenati E, Leszczynska D, Leszczynski J. SMILES-based optimal descriptors: QSAR analysis of fullerene-based HIV-1 PR inhibitors by means of balance of correlations. J Comput Chem 2010; 31:381-92. [PMID: 19479738 DOI: 10.1002/jcc.21333] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/10/2022]
Abstract
Quantitative structure-activity relationships (QSAR) for prediction of binding affinities (pEC50, i.e., minus decimal logarithm of the 50% effective concentration) of 20 fullerene-derivative inhibitors of the HIV-1 PR (human immunodeficiency virus type 1 protease) have been developed by application of the optimal descriptors approach calculated with SMILES (simplified molecular input line entry system). The applied models were constructed by the balance of correlations. Three different splits of the experimental data into subtraining set, calibration set, and test set were examined. Comparison of the classic scheme (training-test system) and the balance of correlations (subtraining-calibration-test system) shows that the balance of correlations gives more robust predictions than the classic scheme for the pEC50 of the fullerene derivatives.
Affiliation(s)
- Andrey A Toropov
- Institute of Geology and Geophysics, Laboratory of Physicochemical Methods of Analysis, Khodzhibaev St. 49, 100041 Tashkent, Uzbekistan.
|
16
|
Toropov AA, Toropova AP, Benfenati E, Manganaro A. QSPR modeling of enthalpies of formation for organometallic compounds by SMART-based optimal descriptors. J Comput Chem 2009; 30:2576-82. [PMID: 19373829 DOI: 10.1002/jcc.21263] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
A quantitative structure-property relationship (QSPR) model for the prediction of gas-phase enthalpy of formation has been developed, using descriptors based on the SMART notation, an alternative to SMILES, as the chemical information. The model is a one-variable equation. The SMART-based descriptors are calculated with correlation weights of SMART attributes, which are obtained by the Monte Carlo method. The model addresses organometallic compounds. The statistical characteristics of the model are the following: n = 104, r² = 0.9944, s = 19.6 kJ/mol, F = 18,269 (training set) and n = 28, r² = 0.9909, s = 28.8 kJ/mol, F = 2832 (test set).
Affiliation(s)
- Andrey A Toropov
- Institute of Geology and Geophysics, Laboratory of Physicochemical Methods of Analysis, Khodzhibaev Street 49, 100041 Tashkent, Uzbekistan.
|
17
|
Castillo-Garit JA, Vega MC, Rolon M, Marrero-Ponce Y, Kouznetsov VV, Torres DFA, Gómez-Barrio A, Bello AA, Montero A, Torrens F, Pérez-Giménez F. Computational discovery of novel trypanosomicidal drug-like chemicals by using bond-based non-stochastic and stochastic quadratic maps and linear discriminant analysis. Eur J Pharm Sci 2009; 39:30-6. [PMID: 19854271 DOI: 10.1016/j.ejps.2009.10.007] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 03/16/2009] [Revised: 10/01/2009] [Accepted: 10/13/2009] [Indexed: 11/28/2022]
Abstract
Herein we present results of quantitative structure-activity relationship (QSAR) studies to classify and design, in a rational way, new antitrypanosomal compounds by using non-stochastic and stochastic bond-based quadratic indices. A data set of 440 organic chemicals, 143 with antitrypanosomal activity and 297 having other clinical uses, is used to develop QSAR models based on linear discriminant analysis (LDA). The non-stochastic model correctly classifies more than 93% and 95% of chemicals in the training and external prediction groups, respectively. On the other hand, the stochastic model shows an accuracy of about 87% for both series. As an experiment of virtual lead generation, the present approach is finally satisfactorily applied to the virtual evaluation of 9 compounds already synthesized in house. The in vitro antitrypanosomal activity of this series against epimastigote forms of Trypanosoma cruzi is assayed. The model is able to predict correctly the behaviour of the majority of these compounds. Four compounds (FER16, FER32, FER33 and FER 132) showed more than 70% epimastigote inhibition at a concentration of 100 μg/mL (86.74%, 78.12%, 88.85% and 72.10%, respectively) and two of these chemicals, FER16 (78.22% of AE) and FER33 (81.31% of AE), also showed good activity at a concentration of 10 μg/mL. At the same concentration, compound FER16 showed a lower value of cytotoxicity (15.44%), and compound FER33 showed a very low value of 1.37%. Taking into account all these results, we can say that these three compounds can be optimized in forthcoming works, but we consider that compound FER33 is the best candidate. Even though none of them proved more active than Nifurtimox, the current results constitute a step forward in the search for efficient ways to discover new lead antitrypanosomals.
Affiliation(s)
- Juan Alberto Castillo-Garit
- Applied Chemistry Research Center, Faculty of Chemistry-Pharmacy, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
|