1
Busch JD, Stone NE, Pemberton GL, Roberts ML, Turner RE, Thornton NB, Sahl JW, Lemmer D, Buckmeier G, Davis SK, Guerrero-Solorio RI, Karim S, Klafke G, Thomas DB, Olafson PU, Ueti M, Mosqueda J, Scoles GA, Wagner DM. Fourteen anti-tick vaccine targets are variably conserved in cattle fever ticks. Parasit Vectors 2025; 18:140. [PMID: 40234925 PMCID: PMC12001435 DOI: 10.1186/s13071-025-06683-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Accepted: 01/23/2025] [Indexed: 04/17/2025] Open
Abstract
BACKGROUND Rhipicephalus (Boophilus) microplus causes significant cattle production losses worldwide because it transmits Babesia bovis and B. bigemina, the causative agents of bovine babesiosis. Control of these ticks has primarily relied on treatment of cattle with chemical acaricides, but frequent use, exacerbated by the one-host lifecycle of these ticks, has led to high-level resistance to multiple classes of acaricides. Consequently, new approaches for control, such as anti-tick vaccines, are critically important. Key to this approach is targeting highly conserved antigenic epitopes to reduce the risk of vaccine escape in heterologous tick populations. METHODS We evaluated amino acid conservation within 14 tick proteins across 167 R. microplus collected from geographically diverse locations in the Americas and Pakistan using polymerase chain reaction (PCR) amplicon sequencing and in silico translation of exons. RESULTS We found that amino acid conservation varied considerably across these proteins. Only the voltage-dependent anion channel (VDAC) was fully conserved in all R. microplus samples (protein similarity 1.0). Four other proteins were highly conserved: the aquaporin RmAQP1 (0.989), vitellogenin receptor (0.985), serpin-1 (0.985), and subolesin (0.981). In contrast, the glycoprotein Bm86 was one of the least conserved (0.889). The Bm86 sequence used in the original Australian TickGARD vaccine carried many amino acid replacements compared with the R. microplus populations examined here, supporting the hypothesis that this vaccine target is not optimal for use in the Americas. By mapping amino acid replacements onto predicted three-dimensional (3D) protein models, we also identified amino acid changes within several small-peptide vaccines targeting portions of the aquaporin RmAQP2, chitinase, and Bm86. 
CONCLUSIONS These findings emphasize the importance of thoroughly analyzing protein variation within anti-tick vaccine targets across diverse tick populations before selecting candidate vaccine antigens. When considering protein conservation alone, RmAQP1, vitellogenin receptor, serpin-1, subolesin, and especially VDAC rank as high-priority anti-tick vaccine candidates for use in the Americas and perhaps globally.
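The abstract reports a per-protein similarity score (for example 1.0 for VDAC and 0.889 for Bm86) but does not define how it was computed. As a minimal sketch, assuming the score behaves like a mean pairwise amino acid identity over pre-aligned, equal-length translated sequences, the idea can be illustrated as follows. The function names and the toy sequences are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def pairwise_identity(a: str, b: str) -> float:
    """Fraction of aligned positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    if not a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def mean_pairwise_identity(seqs: list[str]) -> float:
    """Average identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    if not pairs:
        return 1.0  # a single sequence is trivially identical to itself
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy example (hypothetical data): one of three samples carries a T->S replacement.
toy = ["MKTAYIAK", "MKTAYIAK", "MKTSYIAK"]
print(round(mean_pairwise_identity(toy), 3))
```

A fully conserved protein gives exactly 1.0 under this measure, and every replacement observed in any sample lowers the score in proportion to how many sample pairs it separates.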
Affiliation(s)
- Joseph D Busch
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Nathan E Stone
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Grant L Pemberton
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Mackenzie L Roberts
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Rebekah E Turner
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Natalie B Thornton
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Jason W Sahl
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
- Darrin Lemmer
  - TGen-North, 3051 W. Shamrell Blvd #106, Flagstaff, AZ, 86005, USA
- Greta Buckmeier
  - USDA, ARS, KBUSLIRL-LAPRU, 2700 Fredericksburg Rd., Kerrville, TX, 78028-9184, USA
- Sara K Davis
  - USDA, ARS, ADRU, Washington State University, 3003 ADBF, Pullman, WA, 99164-6630, USA
- Roberto I Guerrero-Solorio
  - Immunology and Vaccine Research Laboratory, Natural Sciences College, Autonomous University of Querétaro, 76230, Querétaro, Mexico
- Shahid Karim
  - School of Biological, Environmental, and Earth Sciences, University of Southern Mississippi, 118 College Drive, Hattiesburg, MS, 39406, USA
- Guilherme Klafke
  - Instituto de Pesquisas Veterinarias Desidério Finamor, Estrada do conde, 6000, Eldorado do sul, 92990-000, Brazil
- Donald B Thomas
  - Cattle Fever Tick Research Laboratory, USDA, ARS, Moore Air Base, Building 6419, 22675 N. Moorefield Road, Edinburg, TX, 78541, USA
- Pia U Olafson
  - USDA, ARS, KBUSLIRL-LAPRU, 2700 Fredericksburg Rd., Kerrville, TX, 78028-9184, USA
- Massaro Ueti
  - USDA, ARS, ADRU, Washington State University, 3003 ADBF, Pullman, WA, 99164-6630, USA
- Juan Mosqueda
  - Immunology and Vaccine Research Laboratory, Natural Sciences College, Autonomous University of Querétaro, 76230, Querétaro, Mexico
- Glen A Scoles
  - USDA, ARS, IIBBL, Beltsville Agricultural Research Center, 10300 Baltimore Ave., Beltsville, MD, 20705, USA
- David M Wagner
  - Pathogen and Microbiome Institute, Northern Arizona University, 1395 S. Knoles Dr. Bldg 56, Flagstaff, AZ, 86011-4073, USA
2
Kleandrova VV, Cordeiro MNDS, Speck-Planche A. In Silico Approach for Antibacterial Discovery: PTML Modeling of Virtual Multi-Strain Inhibitors Against Staphylococcus aureus. Pharmaceuticals (Basel) 2025; 18:196. [PMID: 40006010 PMCID: PMC11858522 DOI: 10.3390/ph18020196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2024] [Revised: 01/20/2025] [Accepted: 01/29/2025] [Indexed: 02/27/2025] Open
Abstract
Background/Objectives: Infectious diseases caused by Staphylococcus aureus (S. aureus) have become alarming health issues worldwide due to the ever-increasing emergence of multidrug resistance. In silico approaches can accelerate the identification and/or design of versatile antibacterial chemicals with the ability to target multiple S. aureus strains with varying degrees of drug resistance. Here, we develop a perturbation theory machine learning model based on a multilayer perceptron neural network (PTML-MLP) for the prediction and design of versatile virtual inhibitors against S. aureus strains. Methods: To develop the PTML-MLP model, chemical and biological data associated with antibacterial activity against S. aureus strains were retrieved from the ChEMBL database. We applied the Box-Jenkins approach to convert the topological indices into multi-label graph-theoretical indices; the latter were used as inputs for the creation of the PTML-MLP model. Results: The PTML-MLP model exhibited accuracy higher than 80% in both training and test sets. The physicochemical and structural interpretation of the PTML-MLP model was performed through the fragment-based topological design (FBTD) approach. Such interpretations permitted the analysis of different molecular fragments with favorable contributions to the multi-strain antibacterial activity and the design of four new drug-like molecules using different fragments as building blocks. The designed molecules were predicted/confirmed by our PTML model as multi-strain inhibitors of diverse S. aureus strains, thus representing promising chemotypes to be considered for future synthesis and biological testing of versatile anti-S. aureus agents. Conclusions: This work envisages promising applications of PTML modeling for early antibacterial drug discovery and related antimicrobial research areas.
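The abstract names the Box-Jenkins approach without giving its formula. A common reading in the PTML literature is a deviation term: each descriptor value minus the mean of that descriptor over all cases sharing the same experimental condition (here, for instance, the same S. aureus strain). The sketch below assumes that reading. The exact averaging set and any normalisation used by the authors are not stated in the abstract, and the function name and toy data are hypothetical.

```python
from collections import defaultdict


def box_jenkins_deviation(rows):
    """Return value - mean(value | condition) for each (value, condition) row.

    Assumption: the per-condition mean is taken over all supplied rows.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for value, cond in rows:
        sums[cond] += value
        counts[cond] += 1
    means = {cond: sums[cond] / counts[cond] for cond in sums}
    return [value - means[cond] for value, cond in rows]


# Toy example (hypothetical data): one topological index measured under two strains.
toy = [(2.0, "strain_A"), (4.0, "strain_A"), (10.0, "strain_B")]
deviations = box_jenkins_deviation(toy)
print(deviations)
```

The resulting deviations carry both chemical information (the index) and assay context (the condition label), which is what lets a single model cover several strains at once.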
Affiliation(s)
- Alejandro Speck-Planche
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal; (V.V.K.); (M.N.D.S.C.)
3
Kleandrova VV, Cordeiro MNDS, Speck-Planche A. Perturbation Theory Machine Learning Model for Phenotypic Early Antineoplastic Drug Discovery: Design of Virtual Anti-Lung-Cancer Agents. Applied Sciences 2024; 14:9344. [DOI: 10.3390/app14209344] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/03/2025]
Abstract
Lung cancer is the most frequently diagnosed malignant neoplasm worldwide and is associated with high mortality. Currently, developing antineoplastic agents is a challenging, time-consuming, and costly process. Computational methods can speed up the early discovery of anti-lung-cancer chemicals. Here, we report a perturbation theory machine learning model based on a multilayer perceptron (PTML-MLP) for phenotypic early antineoplastic drug discovery, enabling the rational design and prediction of new molecules as virtual versatile inhibitors of multiple lung cancer cell lines. The PTML-MLP model achieved an accuracy above 80%. We applied the fragment-based topological design (FBTD) approach to physicochemically and structurally interpret the PTML-MLP model. This enabled the extraction of suitable fragments with a positive influence on anti-lung-cancer activity against the different lung cancer cell lines. Guided by these interpretations, we assembled several suitable fragments to design four novel molecules, which were predicted by the PTML-MLP model as versatile anti-lung-cancer agents. Such predictions of potent multi-cellular anticancer activity against diverse lung cancer cell lines were rigorously confirmed by a well-established virtual screening tool reported in the literature. The present work envisages new opportunities for the application of PTML models to accelerate early antineoplastic discovery from phenotypic assays.
Affiliation(s)
- Valeria V. Kleandrova
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- M. Natália D. S. Cordeiro
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Alejandro Speck-Planche
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
4
López-Cortés A, Cabrera-Andrade A, Echeverría-Garcés G, Echeverría-Espinoza P, Pineda-Albán M, Elsitdie N, Bueno-Miño J, Cruz-Segundo CM, Dorado J, Pazos A, Gonzáles-Díaz H, Pérez-Castillo Y, Tejera E, Munteanu CR. Unraveling druggable cancer-driving proteins and targeted drugs using artificial intelligence and multi-omics analyses. Sci Rep 2024; 14:19359. [PMID: 39169044 PMCID: PMC11339426 DOI: 10.1038/s41598-024-68565-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/04/2024] [Accepted: 07/25/2024] [Indexed: 08/23/2024] Open
Abstract
The druggable proteome refers to proteins that can bind to small molecules with appropriate chemical affinity, inducing a favorable clinical response. Predicting druggable proteins through screening and in silico modeling is imperative for drug design. To contribute to this field, we developed an accurate predictive classifier for druggable cancer-driving proteins using amino acid composition descriptors of protein sequences and 13 machine learning linear and non-linear classifiers. The optimal classifier was achieved with the support vector machine method, utilizing 200 tri-amino acid composition descriptors. The high performance of the model is evident from an area under the receiver operating characteristic curve (AUROC) of 0.975 ± 0.003 and an accuracy of 0.929 ± 0.006 (threefold cross-validation). The machine learning prediction model was enhanced with multi-omics approaches, including the target-disease evidence score, the shortest pathways to cancer hallmarks, structure-based ligandability assessment, unfavorable prognostic protein analysis, and the oncogenic variome. Additionally, we performed a drug repurposing analysis to identify drugs with the highest affinity capable of targeting the best predicted proteins. As a result, we identified 79 key druggable cancer-driving proteins with the highest ligandability, and 23 of them demonstrated unfavorable prognostic significance across 16 TCGA PanCancer types: CDKN2A, BCL10, ACVR1, CASP8, JAG1, TSC1, NBN, PREX2, PPP2R1A, DNM2, VAV1, ASXL1, TPR, HRAS, BUB1B, ATG7, MARK3, SETD2, CCNE1, MUTYH, CDKN2C, RB1, and SMARCA4. Moreover, we prioritized 11 clinically relevant drugs targeting these proteins. This strategy effectively predicts and prioritizes biomarkers, therapeutic targets, and drugs for in-depth studies in clinical trials. Scripts are available at https://github.com/muntisa/machine-learning-for-druggable-proteins .
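Tri-amino acid composition descriptors are, in their usual form, the relative frequencies of overlapping three-residue windows in a protein sequence. The sketch below illustrates that general definition only. It is not the authors' script from the repository cited in the abstract, the function name is hypothetical, and the selection of 200 descriptors out of the 8000 possible tripeptides is not reproduced.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def tripeptide_composition(seq: str) -> dict[str, float]:
    """Relative frequency of each overlapping tripeptide in a protein sequence.

    Windows containing non-standard residue codes (e.g. X) are skipped.
    """
    seq = seq.upper()
    windows = [seq[i:i + 3] for i in range(len(seq) - 2)]
    valid = [w for w in windows if all(ch in AMINO_ACIDS for ch in w)]
    if not valid:
        return {}
    counts = Counter(valid)
    total = len(valid)
    return {tri: n / total for tri, n in counts.items()}


# Toy example (hypothetical sequence): windows are MKT, KTM, TMK, MKT.
comp = tripeptide_composition("MKTMKT")
print(comp)
```

Each protein becomes a sparse vector over tripeptides, which can then be passed to any standard classifier such as the support vector machine mentioned in the abstract.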
Affiliation(s)
- Andrés López-Cortés
  - Cancer Research Group (CRG), Faculty of Medicine, Universidad de Las Américas, Quito, Ecuador
- Alejandro Cabrera-Andrade
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
  - Escuela de Enfermería, Facultad de Ciencias de la Salud, Universidad de Las Américas, Quito, Ecuador
- Gabriela Echeverría-Garcés
  - Centro de Referencia Nacional de Genómica, Secuenciación y Bioinformática, Instituto Nacional de Investigación en Salud Pública "Leopoldo Izquieta Pérez", Quito, Ecuador
  - Latin American Network for the Implementation and Validation of Clinical Pharmacogenomics Guidelines (RELIVAF-CYTED), Santiago, Chile
- Micaela Pineda-Albán
  - Cancer Research Group (CRG), Faculty of Medicine, Universidad de Las Américas, Quito, Ecuador
- Nicole Elsitdie
  - Cancer Research Group (CRG), Faculty of Medicine, Universidad de Las Américas, Quito, Ecuador
- José Bueno-Miño
  - Cancer Research Group (CRG), Faculty of Medicine, Universidad de Las Américas, Quito, Ecuador
- Carlos M Cruz-Segundo
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Tecnológico de Estudios Superiores de Jocotitlán, Jocotitlán, Mexico
- Julian Dorado
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), University of A Coruna, A Coruña, Spain
- Alejandro Pazos
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), University of A Coruna, A Coruña, Spain
  - Biomedical Research Institute of A Coruna (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruña, Spain
- Humberto Gonzáles-Díaz
  - Department of Organic Chemistry II, University of the Basque Country UPV/EHU, Biscay, Spain
  - IKERBASQUE, Basque Foundation for Science, Biscay, Spain
- Eduardo Tejera
  - Grupo de Bio-Quimioinformática, Universidad de Las Américas, Quito, Ecuador
- Cristian R Munteanu
  - RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, A Coruña, Spain
  - Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), University of A Coruna, A Coruña, Spain
  - Biomedical Research Institute of A Coruna (INIBIC), University Hospital Complex of A Coruna (CHUAC), A Coruña, Spain
5
He S, Segura Abarrategi J, Bediaga H, Arrasate S, González-Díaz H. On the additive artificial intelligence-based discovery of nanoparticle neurodegenerative disease drug delivery systems. Beilstein Journal of Nanotechnology 2024; 15:535-555. [PMID: 38774585 PMCID: PMC11106676 DOI: 10.3762/bjnano.15.47] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 04/23/2024] [Indexed: 05/24/2024]
Abstract
Neurodegenerative diseases are characterized by slowly progressing neuronal cell death. Conventional drug treatment strategies often fail because of poor solubility, low bioavailability, and the inability of the drugs to effectively cross the blood-brain barrier. Therefore, the development of new neurodegenerative disease drugs (NDDs) requires immediate attention. Nanoparticle (NP) systems are of increasing interest for transporting NDDs to the central nervous system. However, discovering effective nanoparticle neuronal disease drug delivery systems (N2D3Ss) is challenging because of the vast number of combinations of NP and NDD compounds, as well as the various assays involved. Artificial intelligence/machine learning (AI/ML) algorithms have the potential to accelerate this process by predicting the most promising NDD and NP candidates for assaying. Nevertheless, the relatively limited amount of reported data on N2D3S activity compared to assayed NDDs makes AI/ML analysis challenging. In this work, the IFPTML technique, which combines information fusion (IF), perturbation theory (PT), and machine learning (ML), was employed to address this challenge. Initially, we conducted the fusion into a unified dataset comprising 4403 NDD assays from ChEMBL and 260 NP cytotoxicity assays from journal articles. Through a resampling process, three new working datasets were generated, each containing 500,000 cases. We utilized linear discriminant analysis (LDA) along with artificial neural network (ANN) algorithms, such as multilayer perceptron (MLP) and deep learning networks (DLN), to construct linear and non-linear IFPTML models. The IFPTML-LDA models exhibited sensitivity (Sn) and specificity (Sp) values in the range of 70% to 73% (>375,000 training cases) and 70% to 80% (>125,000 validation cases), respectively. In contrast, the IFPTML-MLP and IFPTML-DLN achieved Sn and Sp values in the range of 85% to 86% for both training and validation series. 
Additionally, IFPTML-ANN models showed an area under the receiver operating characteristic curve (AUROC) of approximately 0.93 to 0.95. These results indicate that the IFPTML models could serve as valuable tools in the design of drug delivery systems for neurosciences.
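The sensitivity (Sn) and specificity (Sp) values quoted above follow the standard confusion-matrix definitions, Sn = TP / (TP + FN) and Sp = TN / (TN + FP). A minimal sketch of that calculation is given below. The function name and the toy labels are hypothetical and have no connection to the IFPTML datasets.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (Sn, Sp) for binary labels, with 1 = positive and 0 = negative."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return tp / (tp + fn), tn / (tn + fp)


# Toy example (hypothetical labels): 4 positives and 4 negatives.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
sn, sp = sensitivity_specificity(y_true, y_pred)
print(sn, sp)
```

Reporting both values for the training and validation series separately, as the abstract does, makes it easier to spot a model that favours one class or overfits the training data.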
Affiliation(s)
- Shan He
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
- Julen Segura Abarrategi
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Harbil Bediaga
  - IKERDATA S.L., ZITEK, UPV/EHU, Rectorate Building, nº6, 48940 Leioa, Greater Bilbao, Basque Country, Spain
  - Painting Department, Fine Arts Faculty, University of the Basque Country UPV/EHU, 48940, Leioa, Biscay, Basque Country, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
  - Instituto Biofisika (UPV/EHU-CSIC), 48940 Leioa, Spain
  - IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
6
Baltasar-Marchueta M, Llona L, M-Alicante S, Barbolla I, Ibarluzea MG, Ramis R, Salomon AM, Fundora B, Araujo A, Muguruza-Montero A, Nuñez E, Pérez-Olea S, Villanueva C, Leonardo A, Arrasate S, Sotomayor N, Villarroel A, Bergara A, Lete E, González-Díaz H. Identification of Riluzole derivatives as novel calmodulin inhibitors with neuroprotective activity by a joint synthesis, biosensor, and computational guided strategy. Biomed Pharmacother 2024; 174:116602. [PMID: 38636396 DOI: 10.1016/j.biopha.2024.116602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2024] [Revised: 04/10/2024] [Accepted: 04/11/2024] [Indexed: 04/20/2024] Open
Abstract
The development of new molecules for the treatment of calmodulin-related cardiovascular or neurodegenerative diseases is an interesting goal. In this work, we introduce a novel strategy with four main steps: (1) chemical synthesis of target molecules, (2) Förster resonance energy transfer (FRET) biosensor development and in vitro biological assay of new derivatives, (3) cheminformatics model development and in vivo activity prediction, and (4) docking studies. This strategy is illustrated with a case study. First, a series of 4-substituted Riluzole derivatives 1-3 was synthesized through a strategy that involves the construction of the 4-bromoriluzole framework and its further functionalization via palladium catalysis or organolithium chemistry. Next, a FRET biosensor for monitoring Ca2+-dependent CaM-ligand interactions was developed and used for the in vitro assay of the Riluzole derivatives. In particular, the best inhibition (80%) was observed for 4-methoxyphenylriluzole 2b. In addition, we trained and validated a new Networks Invariant, Information Fusion, Perturbation Theory, and Machine Learning (NIFPTML) model for predicting probability profiles of in vivo biological activity parameters in different regions of the brain. We then used this model to predict the in vivo activity of the compounds experimentally studied in vitro. Finally, a docking study conducted on Riluzole and its derivatives provided valuable insights into their binding conformations with the target protein, involving calmodulin and the SK4 channel. This new combined strategy may be useful to reduce assay costs (animals, materials, time, and human resources) in the drug discovery process of calmodulin inhibitors.
Affiliation(s)
- Maider Baltasar-Marchueta
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Leire Llona
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Iratxe Barbolla
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Markel Garcia Ibarluzea
  - Donostia International Physics Center, Donostia, Spain; Departament of Physics, University of the Basque Country, UPV/EHU, Leioa, Spain
- Rafael Ramis
  - Donostia International Physics Center, Donostia, Spain; Departament of Physics, University of the Basque Country, UPV/EHU, Leioa, Spain
- Ane Miren Salomon
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Brenda Fundora
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Ariane Araujo
  - Biofisika Institute, CSIC-UPV/EHU, Leioa 48940, Spain
- Eider Nuñez
  - Biofisika Institute, CSIC-UPV/EHU, Leioa 48940, Spain
- Scarlett Pérez-Olea
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Christian Villanueva
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Aritz Leonardo
  - Donostia International Physics Center, Donostia, Spain; Departament of Physics, University of the Basque Country, UPV/EHU, Leioa, Spain
- Sonia Arrasate
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Nuria Sotomayor
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Aitor Bergara
  - Donostia International Physics Center, Donostia, Spain; Departament of Physics, University of the Basque Country, UPV/EHU, Leioa, Spain
- Esther Lete
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain
- Humberto González-Díaz
  - Department of Organic and Inorganic Chemistry, University of the Basque Country UPV/EHU, Leioa 48940, Spain; Biofisika Institute, CSIC-UPV/EHU, Leioa 48940, Spain; IKERBASQUE, Basque Foundation for Science, Bilbao 48011, Spain
7
Kleandrova VV, Cordeiro MNDS, Speck-Planche A. Optimizing drug discovery using multitasking models for quantitative structure-biological effect relationships: an update of the literature. Expert Opin Drug Discov 2023; 18:1231-1243. [PMID: 37639708 DOI: 10.1080/17460441.2023.2251385] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 08/21/2023] [Indexed: 08/31/2023]
Abstract
INTRODUCTION Drug discovery has provided modern societies with the means to fight against many diseases. In this sense, computational methods have been at the forefront, playing an important role in rationalizing the search for novel drugs. Yet current computational methods remain limited in tackling phenomena such as the multi-genic nature of diseases and drug resistance. Multi-tasking models for quantitative structure-biological effect relationships (mtk-QSBER) have emerged to overcome such limitations. AREAS COVERED The present review provides an update on the fundamentals and applications of the mtk-QSBER models as tools to accelerate multiple stages/substages of the drug discovery process. EXPERT OPINION Computational approaches are extremely important for the rationalization of the search for novel and efficacious therapeutic agents. However, they need to focus more on the multi-target drug discovery paradigm. In this sense, mtk-QSBER models are particularly suited for multi-target drug discovery, offering encouraging opportunities across multiple therapeutic areas and scientific disciplines associated with drug discovery.
Affiliation(s)
- Valeria V Kleandrova
  - Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Russian Biotechnological University, Moscow, Russian Federation
- M Natália D S Cordeiro
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, Porto, Portugal
- Alejandro Speck-Planche
  - LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, Porto, Portugal
8
Parizi LF, Githaka NW, Logullo C, Zhou J, Onuma M, Termignoni C, da Silva Vaz I. Universal Tick Vaccines: Candidates and Remaining Challenges. Animals (Basel) 2023; 13:2031. [PMID: 37370541 DOI: 10.3390/ani13122031] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/05/2023] [Revised: 05/29/2023] [Accepted: 06/17/2023] [Indexed: 06/29/2023] Open
Abstract
Recent advancements in molecular biology, particularly regarding massively parallel sequencing technologies, have enabled scientists to gain more insight into the physiology of ticks. While there has been progress in identifying tick proteins and the pathways they are involved in, the specificities of tick-host interaction at the molecular level are not yet fully understood. Indeed, the development of effective commercial tick vaccines has been slower than expected. While omics studies have pointed to some potential vaccine immunogens, selecting suitable antigens for a multi-antigenic vaccine is very complex due to the participation of redundant molecules in biological pathways. The expansion of ticks and their pathogens into new territories and exposure to new hosts makes it necessary to evaluate vaccine efficacy in unusual and non-domestic host species. This situation makes ticks and tick-borne diseases an increasing threat to animal and human health globally, demanding an urgent availability of vaccines against multiple tick species and their pathogens. This review discusses the challenges and advancements in the search for universal tick vaccines, including promising new antigen candidates, and indicates future directions in this crucial research field.
Affiliation(s)
- Luís Fernando Parizi
  - Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brazil
- Carlos Logullo
  - Instituto de Bioquímica Médica Leopoldo de Meis, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-853, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-853, Brazil
- Jinlin Zhou
  - Key Laboratory of Animal Parasitology of Ministry of Agriculture, Shanghai Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Shanghai 200241, China
- Misao Onuma
  - Department of Infectious Diseases, Graduate School of Veterinary Medicine, Hokkaido University, Sapporo 060-0818, Japan
- Carlos Termignoni
  - Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-853, Brazil
  - Departamento de Bioquímica, Universidade Federal do Rio Grande do Sul, Porto Alegre 90040-060, Brazil
- Itabajara da Silva Vaz
  - Centro de Biotecnologia, Universidade Federal do Rio Grande do Sul, Porto Alegre 91501-970, Brazil
  - Instituto Nacional de Ciência e Tecnologia em Entomologia Molecular, Universidade Federal do Rio de Janeiro, Rio de Janeiro 21941-853, Brazil
  - Faculdade de Veterinária, Universidade Federal do Rio Grande do Sul, Porto Alegre 91540-000, Brazil
9
Santiago C, Ortega-Tenezaca B, Barbolla I, Fundora-Ortiz B, Arrasate S, Dea-Ayuela MA, González-Díaz H, Sotomayor N, Lete E. Prediction of Antileishmanial Compounds: General Model, Preparation, and Evaluation of 2-Acylpyrrole Derivatives. J Chem Inf Model 2022; 62:3928-3940. [PMID: 35946598 PMCID: PMC9986876 DOI: 10.1021/acs.jcim.2c00731] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/30/2022]
Abstract
In this work, the SOFT.PTML tool has been used to pre-process a ChEMBL dataset of pre-clinical assays of antileishmanial compound candidates. A comparative study of different ML algorithms, such as logistic regression (LOGR), support vector machine (SVM), and random forests (RF), has shown that the IFPTML-LOGR model presents excellent values of specificity and sensitivity (81-98%) in training and validation series. The use of this software has been illustrated with a practical case study focused on a series of 28 derivatives of 2-acylpyrroles 5a,b, obtained through a Pd(II)-catalyzed C-H radical acylation of pyrroles. Their in vitro leishmanicidal activity against visceral (L. donovani) and cutaneous (L. amazonensis) leishmaniasis was evaluated, finding that compounds 5bc (IC50 = 30.87 μM, SI > 10.17) and 5bd (IC50 = 16.87 μM, SI > 10.67) were approximately 6-fold more selective than the drug of reference (miltefosine) in in vitro assays against L. amazonensis promastigotes. In addition, most of the compounds showed low cytotoxicity, CC50 > 100 μg/mL in J774 cells. Interestingly, the IFPTML-LOGR model correctly predicts the relative biological activity of this series of acylpyrroles. A computational high-throughput screening (cHTS) study of 2-acylpyrroles 5a,b has been performed calculating >20,700 activity scores vs a large space of 647 assays involving multiple Leishmania species, cell lines, and potential target proteins. Overall, the study demonstrates that the SOFT.PTML all-in-one strategy is useful for obtaining IFPTML models through a user-friendly interface, making the work easier and faster than before. The present work also points to 2-acylpyrroles as new lead compounds worthy of further optimization as antileishmanial hits.
Affiliation(s)
- Carlos Santiago
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain
| | - Bernabé Ortega-Tenezaca
- Department of Computer Science and Information Technologies, University of A Coruña (UDC), 15071, A Coruña, Spain
| | - Iratxe Barbolla
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain.,BIOFISIKA. Basque Center for Biophysics CSIC-UPV/EHU, 48940, Bilbao, Spain
| | - Brenda Fundora-Ortiz
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain
| | - Sonia Arrasate
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain
| | - María Auxiliadora Dea-Ayuela
- Departamento de Farmacia, Facultad de Ciencias de la Salud, Universidad CEU Cardenal Herrera, 46115 Alfara del Patriarca, Valencia, Spain
| | - Humberto González-Díaz
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain.,BIOFISIKA. Basque Center for Biophysics CSIC-UPV/EHU, 48940, Bilbao, Spain.,IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
| | - Nuria Sotomayor
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain
| | - Esther Lete
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080 Bilbao, Spain
| |
|
10
|
Prediction of B cell epitopes in proteins using a novel sequence similarity-based method. Sci Rep 2022; 12:13739. [PMID: 35962028 PMCID: PMC9374694 DOI: 10.1038/s41598-022-18021-1] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 05/10/2022] [Accepted: 08/03/2022] [Indexed: 11/29/2022] Open
Abstract
Prediction of B cell epitopes that can replace the antigen for antibody production and detection is of great interest for research and the biotech industry. Here, we developed a novel BLAST-based method to predict linear B cell epitopes. To that end, we generated a BLAST-formatted database upon a dataset of 62,730 known linear B cell epitope sequences and considered as a B cell epitope any peptide sequence producing ungapped BLAST hits to this database with identity ≥ 80% and length ≥ 8. We examined B cell epitope predictions by this method in tenfold cross-validations in which we considered various types of non-B cell epitopes, including 62,730 peptide sequences with verified negative B cell assays. As a result, we obtained values of accuracy, specificity and sensitivity of 72.54 ± 0.27%, 81.59 ± 0.37% and 63.49 ± 0.43%, respectively. In an independent dataset incorporating 503 B cell epitopes, this method reached accuracy, specificity and sensitivity of 74.85%, 99.20% and 50.50%, respectively, outperforming state-of-the-art methods to predict linear B cell epitopes. We implemented this BLAST-based approach to predict B cell epitopes at http://imath.med.ucm.es/bepiblast.
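The decision rule stated in this abstract (an ungapped hit with identity ≥ 80% and length ≥ 8) can be sketched as a simple filter over already-parsed BLAST hits. This is a minimal illustration, not the authors' implementation: the `Hit` container, its field names, and the function name are hypothetical, and running BLAST itself is left out.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Hit:
    """One hit of a query peptide against the epitope database (hypothetical container)."""
    identity_pct: float  # percent identity over the aligned region
    length: int          # alignment length in residues
    gap_opens: int       # number of gap openings; 0 means the hit is ungapped


def is_predicted_epitope(hits: Iterable[Hit],
                         min_identity: float = 80.0,
                         min_length: int = 8) -> bool:
    """Return True if any ungapped hit meets both thresholds from the abstract."""
    return any(
        h.gap_opens == 0
        and h.identity_pct >= min_identity
        and h.length >= min_length
        for h in hits
    )
```

A peptide with no hits, or with only gapped or short hits, is treated as a non-epitope under this rule.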
|
11
|
Diéguez-Santana K, Casañola-Martin GM, Torres R, Rasulev B, Green JR, González-Díaz H. Machine Learning Study of Metabolic Networks vs ChEMBL Data of Antibacterial Compounds. Mol Pharm 2022; 19:2151-2163. [PMID: 35671399 PMCID: PMC9986951 DOI: 10.1021/acs.molpharmaceut.2c00029] [Citation(s) in RCA: 9] [Impact Index Per Article: 3.0] [Indexed: 11/29/2022]
Abstract
Antibacterial drugs (AD) change the metabolic status of bacteria, contributing to bacterial death. However, antibiotic resistance and the emergence of multidrug-resistant bacteria increase interest in understanding metabolic network (MN) mutations and the interaction of AD vs MN. In this study, we employed the IFPTML = Information Fusion (IF) + Perturbation Theory (PT) + Machine Learning (ML) algorithm on a huge dataset from the ChEMBL database, which contains >155,000 AD assays vs >40 MNs of multiple bacteria species. We built a linear discriminant analysis (LDA) and 17 ML models centered on the linear index and based on atoms to predict antibacterial compounds. The IFPTML-LDA model presented the following results for the training subset: specificity (Sp) = 76% out of 70,000 cases, sensitivity (Sn) = 70%, and Accuracy (Acc) = 73%. The same model also presented the following results for the validation subsets: Sp = 76%, Sn = 70%, and Acc = 73.1%. Among the IFPTML nonlinear models, the k nearest neighbors (KNN) showed the best results with Sn = 99.2%, Sp = 95.5%, Acc = 97.4%, and Area Under Receiver Operating Characteristic (AUROC) = 0.998 in training sets. In the validation series, the Random Forest had the best results: Sn = 93.96% and Sp = 87.02% (AUROC = 0.945). The IFPTML linear and nonlinear models regarding the ADs vs MNs have good statistical parameters, and they could contribute toward finding new metabolic mutations in antibiotic resistance and reducing time/costs in antibacterial drug research.
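The statistics quoted in this and neighboring abstracts (Sn, Sp, Acc) follow the standard confusion-matrix definitions. The sketch below shows those definitions only; the function name and the example counts are hypothetical and are not taken from the paper's data.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) from confusion-matrix counts."""
    sn = tp / (tp + fn)                    # sensitivity: fraction of actives recovered
    sp = tn / (tn + fp)                    # specificity: fraction of inactives recovered
    acc = (tp + tn) / (tp + fn + tn + fp)  # accuracy: fraction of all cases classified correctly
    return sn, sp, acc


# Hypothetical, balanced example counts (100 active and 100 inactive cases).
sn, sp, acc = classification_metrics(tp=70, fn=30, tn=76, fp=24)
```

With equally sized classes, accuracy is the mean of sensitivity and specificity, which is why values such as Sn = 70% and Sp = 76% are compatible with Acc = 73%.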
Affiliation(s)
- Karel Diéguez-Santana
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain.,Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
| | - Gerardo M Casañola-Martin
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States.,Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
| | - Roldan Torres
- Universidad Regional Amazónica IKIAM, Tena, Napo 150150, Ecuador
| | - Bakhtiyor Rasulev
- Department of Coatings and Polymeric Materials, North Dakota State University, Fargo, North Dakota 58102, United States
| | - James R Green
- Department of Systems and Computer Engineering, Carleton University, K1S5B6 Ottawa, Ontario, Canada
| | - Humbert González-Díaz
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain.,BIOFISIKA, Basque Center for Biophysics CSIC-UPVEH, 48940 Leioa, Spain.,IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
| |
|
12
|
PTML Modeling for Pancreatic Cancer Research: In Silico Design of Simultaneous Multi-Protein and Multi-Cell Inhibitors. Biomedicines 2022; 10:491. [PMID: 35203699 PMCID: PMC8962338 DOI: 10.3390/biomedicines10020491] [Citation(s) in RCA: 23] [Impact Index Per Article: 7.7] [Received: 01/29/2022] [Revised: 02/10/2022] [Accepted: 02/15/2022] [Indexed: 02/07/2023] Open
Abstract
Pancreatic cancer (PANC) is a dangerous type of cancer that is a major cause of mortality worldwide and exhibits a remarkably poor prognosis. To date, discovering anti-PANC agents remains a very complex and expensive process. Computational approaches can accelerate the search for anti-PANC agents. We report for the first time two models that combined perturbation theory with machine learning via a multilayer perceptron network (PTML-MLP) to perform the virtual design and prediction of molecules that can simultaneously inhibit multiple PANC cell lines and PANC-related proteins, such as caspase-1, tumor necrosis factor-alpha (TNF-alpha), and the insulin-like growth factor 1 receptor (IGF1R). Both PTML-MLP models exhibited accuracies higher than 78%. Using the interpretation from one of the PTML-MLP models as a guideline, we extracted different molecular fragments desirable for the inhibition of the PANC cell lines and the aforementioned PANC-related proteins and then assembled some of those fragments to form three new molecules. The two PTML-MLP models predicted the designed molecules as potentially versatile anti-PANC agents through inhibition of the three PANC-related proteins and multiple PANC cell lines. Conclusions: This work opens new horizons for the application of the PTML modeling methodology to anticancer research.
|
13
|
Diéguez-Santana K, González-Díaz H. Towards machine learning discovery of dual antibacterial drug-nanoparticle systems. Nanoscale 2021; 13:17854-17870. [PMID: 34671801 DOI: 10.1039/d1nr04178a] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Indexed: 06/13/2023]
Abstract
Artificial Intelligence/Machine Learning (AI/ML) algorithms may speed up the design of DADNP systems formed by Antibacterial Drugs (AD) and Nanoparticles (NP). In this work, we used the IFPTML = Information Fusion (IF) + Perturbation-Theory (PT) + Machine Learning (ML) algorithm for the first time to study a large dataset of putative DADNP systems composed of >165 000 ChEMBL AD assays and 300 NP assays vs. multiple bacteria species. We trained alternative models with Linear Discriminant Analysis (LDA), Artificial Neural Networks (ANN), Bayesian Networks (BNN), K-Nearest Neighbour (KNN) and other algorithms. The IFPTML-LDA model was simpler, with values of Sp ≈ 90% and Sn ≈ 74% in both training (>124 K cases) and validation (>41 K cases) series. The IFPTML-ANN and KNN models are notably more complicated, although they are more balanced, with Sn ≈ Sp ≈ 88.5%-99.0% and AUROC ≈ 0.94-0.99 in both series. We also carried out a simulation (>1900 calculations) of the expected behavior for putative DADNPs in 72 different biological assays. The putative DADNPs studied are formed by 27 different drugs with multiple classes of NP and types of coats. In addition, we tested the validity of our additive model with 80 DADNP complexes experimentally synthesized and biologically tested (reported in >45 papers). All these DADNPs show values of MIC < 50 μg mL⁻¹ (cutoff used), better than the MIC of AD and NP alone (synergistic or additive effect). The assays involve DADNP complexes with 10 types of NP, 6 coating materials, NP size range 5-100 nm vs. 15 different antibiotics, and 12 bacteria species. The IFPTML-LDA model correctly classified 100% (80 out of 80) of the DADNP complexes as biologically active. The IFPTML additive strategy may become a useful tool to assist the design of DADNP systems for antibacterial therapy, taking into consideration only information about the AD and NP components separately.
Affiliation(s)
- Karel Diéguez-Santana
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| | - Humberto González-Díaz
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Basque Center for Biophysics CSIC-UPVEH, University of Basque Country UPV/EHU, 48940 Leioa, Spain.
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
| |
|
14
|
Prediction of Anti-Glioblastoma Drug-Decorated Nanoparticle Delivery Systems Using Molecular Descriptors and Machine Learning. Int J Mol Sci 2021; 22:11519. [PMID: 34768951 PMCID: PMC8584266 DOI: 10.3390/ijms222111519] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/14/2021] [Revised: 10/08/2021] [Accepted: 10/22/2021] [Indexed: 12/22/2022] Open
Abstract
The theoretical prediction of drug-decorated nanoparticles (DDNPs) has become a very important task in medical applications. For the current paper, Perturbation Theory Machine Learning (PTML) models were built to predict the probability of different pairs of drugs and nanoparticles creating DDNP complexes with anti-glioblastoma activity. PTML models use the perturbations of molecular descriptors of drugs and nanoparticles as inputs in experimental conditions. The raw dataset was obtained by mixing the nanoparticle experimental data with drug assays from the ChEMBL database. Ten types of machine learning methods have been tested. Only 41 features have been selected for 855,129 drug-nanoparticle complexes. The best model was obtained with the Bagging classifier, an ensemble meta-estimator based on 20 decision trees, with an area under the receiver operating characteristic curve (AUROC) of 0.96 and an accuracy of 87% (test subset). This model could be useful for the virtual screening of nanoparticle-drug complexes in glioblastoma. All the calculations can be reproduced with the datasets and Python scripts, which are freely available as a GitHub repository from the authors.
|
15
|
Computational Drug Repurposing for Antituberculosis Therapy: Discovery of Multi-Strain Inhibitors. Antibiotics (Basel) 2021; 10:1005. [PMID: 34439055 PMCID: PMC8388932 DOI: 10.3390/antibiotics10081005] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 08/01/2021] [Revised: 08/15/2021] [Accepted: 08/17/2021] [Indexed: 12/13/2022] Open
Abstract
Tuberculosis remains the most afflicting infectious disease known by humankind, with one quarter of the population estimated to have it in the latent state. Discovering antituberculosis drugs is a challenging, complex, expensive, and time-consuming task. To overcome the substantial costs and accelerate drug discovery and development, drug repurposing has emerged as an attractive alternative to find new applications for “old” drugs and where computational approaches play an essential role by filtering the chemical space. This work reports the first multi-condition model based on quantitative structure–activity relationships and an ensemble of neural networks (mtc-QSAR-EL) for the virtual screening of potential antituberculosis agents able to act as multi-strain inhibitors. The mtc-QSAR-EL model exhibited an accuracy higher than 85%. A physicochemical and fragment-based structural interpretation of this model was provided, and a large dataset of agency-regulated chemicals was virtually screened, with the mtc-QSAR-EL model identifying already proven antituberculosis drugs while proposing chemicals with great potential to be experimentally repurposed as antituberculosis (multi-strain inhibitors) agents. Some of the most promising molecules identified by the mtc-QSAR-EL model as antituberculosis agents were also confirmed by another computational approach, supporting the capabilities of the mtc-QSAR-EL model as an efficient tool for computational drug repurposing.
|
16
|
Kleandrova VV, Speck-Planche A. The QSAR Paradigm in Fragment-Based Drug Discovery: From the Virtual Generation of Target Inhibitors to Multi-Scale Modeling. Mini Rev Med Chem 2021; 20:1357-1374. [PMID: 32013845 DOI: 10.2174/1389557520666200204123156] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 09/10/2019] [Revised: 10/21/2019] [Accepted: 10/28/2019] [Indexed: 12/24/2022]
Abstract
Fragment-Based Drug Design (FBDD) has established itself as a promising approach in modern drug discovery, accelerating and improving lead optimization, while playing a crucial role in diminishing the high attrition rates at all stages in the drug development process. On the other hand, FBDD has benefited from the application of computational methodologies, where the models derived from the Quantitative Structure-Activity Relationships (QSAR) have become consolidated tools. This mini-review focuses on the evolution and main applications of the QSAR paradigm in the context of FBDD in the last five years. This report places particular emphasis on the QSAR models derived from fragment-based topological approaches to extract physicochemical and/or structural information, allowing the design of potentially novel mono- or multi-target inhibitors from relatively large and heterogeneous databases. Here, we also discuss the need to apply multi-scale modeling, to exemplify how different datasets based on target inhibition can be simultaneously integrated and predicted together with other relevant endpoints such as the biological activity against non-biomolecular targets, as well as in vitro and in vivo toxicity and pharmacokinetic properties. In this context, seminal papers are briefly analyzed. As huge amounts of data continue to accumulate in the domains of the chemical, biological and biomedical sciences, it has become clear that drug discovery must be viewed as a multi-scale optimization process. An ideal multi-scale approach should integrate diverse chemical and biological data and also serve as a knowledge generator, enabling the design of potentially optimal chemicals that may become therapeutic agents.
Affiliation(s)
- Valeria V Kleandrova
- Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Moscow State University of Food Production, Volokolamskoe Shosse 11, 125080, Moscow, Russian Federation
| | - Alejandro Speck-Planche
- Department of Chemistry, Institute of Pharmacy, I.M. Sechenov First Moscow State Medical University, Trubetskaya Str., 8, b. 2, 119992, Moscow, Russian Federation
| |
|
17
|
Barbolla I, Hernández-Suárez L, Quevedo-Tumailli V, Nocedo-Mena D, Arrasate S, Dea-Ayuela MA, González-Díaz H, Sotomayor N, Lete E. Palladium-mediated synthesis and biological evaluation of C-10b substituted Dihydropyrrolo[1,2-b]isoquinolines as antileishmanial agents. Eur J Med Chem 2021; 220:113458. [PMID: 33901901 DOI: 10.1016/j.ejmech.2021.113458] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 02/15/2021] [Revised: 03/12/2021] [Accepted: 04/05/2021] [Indexed: 11/26/2022]
Abstract
The development of new molecules for the treatment of leishmaniasis, a neglected parasitic disease, is urgent, as current anti-leishmanial therapeutics are hampered by drug toxicity and resistance. The pyrrolo[1,2-b]isoquinoline core was selected as the starting point, and palladium-catalyzed Heck-initiated cascade reactions were developed for the synthesis of a series of C-10 substituted derivatives. Their in vitro leishmanicidal activity against visceral (L. donovani) and cutaneous (L. amazonensis) leishmaniasis was evaluated. The best activity was found, in general, for the 10-arylmethyl substituted pyrroloisoquinolines. In particular, 2ad (IC50 = 3.30 μM, SI > 77.01) and 2bb (IC50 = 3.93 μM, SI > 58.77) were approximately 10-fold more potent and selective than the drug of reference (miltefosine) against L. amazonensis in in vitro promastigote assays, while 2ae was the most active compound in the in vitro amastigote assays (IC50 = 33.59 μM, SI > 8.93). Notably, almost all compounds showed low cytotoxicity, CC50 > 100 μg/mL in J774 cells (the highest tested dose). In addition, we have developed the first Perturbation Theory Machine Learning (PTML) algorithm able to predict simultaneously multiple biological activity parameters (IC50, Ki, etc.) vs. any Leishmania species and target protein, with high values of specificity (>98%) and sensitivity (>90%) in both training and validation series. Therefore, this model may be useful to reduce time and assay costs (material and human resources) in the drug discovery process.
Affiliation(s)
- Iratxe Barbolla
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain
| | - Leidi Hernández-Suárez
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain
| | - Viviana Quevedo-Tumailli
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain; RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, 15071, A Coruña, Spain; Universidad Estatal Amazónica UEA, Puyo, 160150, Pastaza, Ecuador
| | - Deyani Nocedo-Mena
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain
| | - Sonia Arrasate
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain
| | - María Auxiliadora Dea-Ayuela
- Departamento de Farmacia, Facultad de Ciencias de La Salud, Universidad CEU Cardenal Herrera, Edificio Seminario S/n, 46113, Moncada, Valencia, Spain
| | - Humberto González-Díaz
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain; Basque Center for Biophysics CSIC-UPV/EHU, University of the Basque Country UPV/EHU, 48940, Bilbao, Spain; IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain.
| | - Nuria Sotomayor
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain.
| | - Esther Lete
- Departamento de Química Orgánica e Inorgánica, Facultad de Ciencia y Tecnología, Universidad Del País Vasco / Euskal Herriko Unibertsitatea UPV/EHU, Apdo. 644, 48080, Bilbao, Spain.
| |
|
18
|
Sampaio-Dias IE, Rodríguez-Borges JE, Yáñez-Pérez V, Arrasate S, Llorente J, Brea JM, Bediaga H, Viña D, Loza MI, Caamaño O, García-Mera X, González-Díaz H. Synthesis, Pharmacological, and Biological Evaluation of 2-Furoyl-Based MIF-1 Peptidomimetics and the Development of a General-Purpose Model for Allosteric Modulators (ALLOPTML). ACS Chem Neurosci 2021; 12:203-215. [PMID: 33347281 DOI: 10.1021/acschemneuro.0c00687] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Indexed: 12/27/2022] Open
Abstract
This work describes the synthesis and pharmacological evaluation of 2-furoyl-based Melanostatin (MIF-1) peptidomimetics as dopamine D2 modulating agents. Eight novel peptidomimetics were tested for their ability to enhance the maximal effect of tritiated N-propylapomorphine ([3H]-NPA) at D2 receptors (D2R). In this series, 2-furoyl-l-leucylglycinamide (6a) produced a statistically significant increase in the maximal [3H]-NPA response at 10 pM (11 ± 1%), comparable to the effect of MIF-1 (18 ± 9%) at the same concentration. This result supports previous evidence that the replacement of the proline residue by heteroaromatic scaffolds is tolerated at the allosteric binding site of MIF-1. Biological assays performed for peptidomimetic 6a using cortex neurons from 19-day-old Wistar-Kyoto rat embryos suggest that 6a displays no neurotoxicity up to 100 μM. Overall, the pharmacological and toxicological profile and the structural simplicity of 6a make this peptidomimetic a potential lead compound for further development and optimization, paving the way for the development of novel modulating agents of D2R suitable for the treatment of CNS-related diseases. Additionally, the pharmacological and biological data herein reported, along with >20 000 outcomes of preclinical assays, were used to seek a general model to predict the allosteric modulatory potential of molecular candidates for a myriad of target receptors, organisms, cell lines, and biological activity parameters based on perturbation theory (PT) ideas and machine learning (ML) techniques, abbreviated as ALLOPTML. By doing so, ALLOPTML shows high specificity Sp = 89.2/89.4%, sensitivity Sn = 71.3/72.2%, and accuracy Ac = 86.1/86.4% in training/validation series, respectively. To the best of our knowledge, ALLOPTML is the first general-purpose chemoinformatic tool using a PTML-based model for the multioutput and multicondition prediction of allosteric compounds, which is expected to save both time and resources during the early drug discovery of allosteric modulators.
Affiliation(s)
- Ivo E. Sampaio-Dias
- LAQV/REQUIMTE, Dept. of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - José E. Rodríguez-Borges
- LAQV/REQUIMTE, Dept. of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
| | - Víctor Yáñez-Pérez
- Dept. of Organic Chemistry II, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
| | - Sonia Arrasate
- Dept. of Pharmacology, Faculty of Medicine and Nursing, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
| | - Javier Llorente
- Dept. of Pharmacology, Faculty of Medicine and Nursing, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
- Dept. of Pharmacology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - José M. Brea
- Innopharma Screening Platform, Biofarma Research group, Centre of Research in Molecular Medicine and Chronic Diseases CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Harbil Bediaga
- Dept. of Organic Chemistry II, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
- Dept. of Physical Chemistry, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
| | - Dolores Viña
- Dept. of Pharmacology, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Centre of Research in Molecular Medicine and Chronic Diseases CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - María Isabel Loza
- Innopharma Screening Platform, Biofarma Research group, Centre of Research in Molecular Medicine and Chronic Diseases CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Olga Caamaño
- Dept. of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Xerardo García-Mera
- Dept. of Organic Chemistry, Faculty of Pharmacy, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
| | - Humberto González-Díaz
- Dept. of Organic Chemistry II, University of Basque Country (UPV-EHU), 48940 Leioa, Spain
- Basque Center for Biophysics (CSIC UPV/EHU), University of Basque Country (UPV-EHU), 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
| |
|
19
|
Ortega-Tenezaca B, Quevedo-Tumailli V, Bediaga H, Collados J, Arrasate S, Madariaga G, Munteanu CR, Cordeiro MND, González-Díaz H. PTML Multi-Label Algorithms: Models, Software, and Applications. Curr Top Med Chem 2020; 20:2326-2337. [DOI: 10.2174/1568026620666200916122616] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 06/18/2020] [Revised: 07/19/2020] [Accepted: 07/20/2020] [Indexed: 12/17/2022]
Abstract
By combining Machine Learning (ML) methods with Perturbation Theory (PT), it is possible to develop predictive models for a variety of response targets. Such a combination, often known as Perturbation Theory Machine Learning (PTML) modeling, comprises a set of techniques that can handle various physical and chemical properties of different organisms, complex biological or material systems under multiple input conditions. In so doing, these techniques effectively integrate a manifold of diverse chemical and biological data into a single computational framework that can then be applied for screening lead chemicals as well as to find clues for improving the targeted response(s). PTML models have thus been extremely helpful in drug or material design efforts and found to be predictive and applicable across a broad space of systems. After a brief outline of the applied methodology, this work reviews the different uses of PTML in Medicinal Chemistry, as well as in other applications. Finally, we cover the development of software available nowadays for setting up PTML models from large datasets.
Affiliation(s)
| | | | - Harbil Bediaga
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| | - Jon Collados
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| | - Sonia Arrasate
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| | - Gotzon Madariaga
- Department of Condensed Matter Physics, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| | - Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruna, Spain
| | - M. Natália D.S. Cordeiro
- LAQV@REQUIMTE, Department of Chemistry and Biochemistry, University of Porto, 4169-007 Porto, Portugal
| | - Humbert González-Díaz
- Department of Organic and Inorganic Chemistry, University of Basque Country UPV/EHU, 48940 Leioa, Spain
| |
|
20
|
Kleandrova VV, Speck-Planche A. PTML Modeling for Alzheimer’s Disease: Design and Prediction of Virtual Multi-Target Inhibitors of GSK3B, HDAC1, and HDAC6. Curr Top Med Chem 2020; 20:1661-1676. [DOI: 10.2174/1568026620666200607190951] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 11/19/2019] [Revised: 12/12/2019] [Accepted: 01/05/2020] [Indexed: 01/23/2023]
Abstract
Background:
Alzheimer’s disease is characterized by a progressive pattern of cognitive and
functional impairment, which ultimately leads to death. Computational approaches have played an important
role in the context of drug discovery for anti-Alzheimer's therapies. However, most of the computational
models reported to date have been focused on only one protein associated with Alzheimer's,
while relying on small datasets of structurally related molecules.
Objective:
We introduce the first model combining perturbation theory and machine learning based on
artificial neural networks (PTML-ANN) for simultaneous prediction and design of inhibitors of three
Alzheimer’s disease-related proteins, namely glycogen synthase kinase 3 beta (GSK3B), histone deacetylase
1 (HDAC1), and histone deacetylase 6 (HDAC6).
Methods:
The PTML-ANN model was obtained from a dataset retrieved from ChEMBL, and it relied on
a classification approach to predict chemicals as active or inactive.
Results:
The PTML-ANN model displayed sensitivity and specificity higher than 85% in both training
and test sets. The physicochemical and structural interpretation of the molecular descriptors in the model
permitted the direct extraction of fragments suggested to favorably contribute to enhancing the multitarget
inhibitory activity. Based on this information, we assembled ten molecules from several fragments
with positive contributions. Seven of these molecules were predicted as triple target inhibitors while the
remaining three were predicted as dual-target inhibitors. The estimated physicochemical properties of
the designed molecules complied with Lipinski’s rule of five and its variants.
Conclusion:
This work opens new horizons toward the design of multi-target inhibitors for anti-Alzheimer's
therapies.
Affiliation(s)
- Valeria V. Kleandrova
- Laboratory of Fundamental and Applied Research of Quality and Technology of Food Production, Moscow State University of Food Production, Volokolamskoe Shosse 11, 125080, Moscow, Russian Federation
- Alejandro Speck-Planche
- Programa Institucional de Fomento a la Investigacion, Desarrollo e Innovacion, Universidad Tecnologica Metropolitana, Ignacio Valdivieso 2409, P.O. Box 8940577, San Joaquin, Santiago, Chile
|
21
|
Urista DV, Carrué DB, Otero I, Arrasate S, Quevedo-Tumailli VF, Gestal M, González-Díaz H, Munteanu CR. Prediction of Antimalarial Drug-Decorated Nanoparticle Delivery Systems with Random Forest Models. BIOLOGY 2020; 9:biology9080198. [PMID: 32751710 PMCID: PMC7465777 DOI: 10.3390/biology9080198] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 06/24/2020] [Revised: 07/22/2020] [Accepted: 07/27/2020] [Indexed: 12/13/2022]
Abstract
Drug-decorated nanoparticles (DDNPs) have important medical applications. The current work combined Perturbation Theory with Machine Learning and Information Fusion (PTMLIF). Thus, PTMLIF models were proposed to predict the probability of nanoparticle–compound/drug complexes having antimalarial activity (against Plasmodium). The aim is to save experimental resources and time by using a virtual screening for DDNPs. The raw data was obtained by the fusion of experimental data for nanoparticles with compound chemical assays from the ChEMBL database. The inputs for the eight Machine Learning classifiers were transformed features of drugs/compounds and nanoparticles as perturbations of molecular descriptors in specific experimental conditions (experiment-centered features). The resulting dataset contains 107 input features and 249,992 examples. The best classification model was provided by Random Forest, with 27 selected features of drugs/compounds and nanoparticles in all experimental conditions considered. The high performance of the model was demonstrated by the mean Area Under the Receiver Operating Characteristics (AUC) in a test subset with a value of 0.9921 ± 0.000244 (10-fold cross-validation). The results demonstrated the power of information fusion of the experimental-centered features of drugs/compounds and nanoparticles for the prediction of nanoparticle–compound antimalarial activity. The scripts and dataset for this project are available in the open GitHub repository.
Affiliation(s)
- Diana V. Urista
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
- Diego B. Carrué
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Iago Otero
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Sonia Arrasate
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
- Viviana F. Quevedo-Tumailli
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Universidad Estatal Amazónica UEA, Km. 2 1/2 vía Puyo a Tena (paso lateral), Puyo 160150, Pastaza, Ecuador
- Marcos Gestal
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country (UPV/EHU), Sarriena w/n, 48940 Leioa, Spain; (D.V.U.); (S.A.); (H.G.-D.)
- IKERBASQUE, Basque Foundation for Science, Alameda Urquijo 36, 48011 Bilbao, Spain
- Basque Centre for Biophysics CSIC-UPVEHU, University of Basque Country UPV/EHU, Barrio Sarriena, 48940 Leioa, Spain
- Cristian R. Munteanu
- RNASA-IMEDIR, Computer Science Faculty, CITIC, University of A Coruna, Campus Elviña s/n, 15071 A Coruña, Spain; (D.B.C.); (I.O.); (V.F.Q.-T.); (M.G.)
- Biomedical Research Institute of A Coruña (INIBIC), Hospital Teresa Herrera, Xubias de Arriba 84, 15006 A Coruña, Spain
- Correspondence:
|
22
|
Panda SS, Girgis AS, Thomas SJ, Capito JE, George RF, Salman A, El-Manawaty MA, Samir A. Synthesis, pharmacological profile and 2D-QSAR studies of curcumin-amino acid conjugates as potential drug candidates. Eur J Med Chem 2020; 196:112293. [PMID: 32311607 DOI: 10.1016/j.ejmech.2020.112293] [Citation(s) in RCA: 30] [Impact Index Per Article: 6.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2020] [Revised: 03/28/2020] [Accepted: 03/28/2020] [Indexed: 02/01/2023]
Abstract
A series of curcumin bis-conjugates 3a-q, 5a-k and 6a-k were synthesized in good yields utilizing an optimized reaction condition. We explored the effect of different amino acids and protecting groups on biological activities of curcumin. The conjugates were screened for anti-inflammatory, analgesic and antimicrobial properties. Some of the conjugates showed promising biological observations with a potency comparable with the standard references. The variations in biological properties concerning different amino acids and protecting groups are interesting observations. Effects of the synthesized conjugates on splenocytes and the production of nitric oxide by lipopolysaccharide-stimulated peritoneal macrophages are correlated with the observed anti-inflammatory properties. We have also established the safety profile of the most active conjugates. Robust 2D-QSAR studies supported and validated biological data.
Affiliation(s)
- Siva S Panda
- Department of Chemistry and Physics, Augusta University, Augusta, GA, 30912, USA
- Adel S Girgis
- Department of Pesticide Chemistry, National Research Centre, Dokki, Giza, 12622, Egypt
- Sean J Thomas
- Department of Chemistry and Physics, Augusta University, Augusta, GA, 30912, USA
- Jason E Capito
- Department of Chemistry and Physics, Augusta University, Augusta, GA, 30912, USA
- Riham F George
- Pharmaceutical Chemistry Department, Faculty of Pharmacy, Cairo University, Cairo, 11562, Egypt
- Asmaa Salman
- Medical and Pharmaceutical Chemistry Department, National Research Centre, Dokki, Giza, 12622, Egypt
- May A El-Manawaty
- Drug Bioassay-Cell Culture Laboratory, Pharmacognosy Department, National Research Centre, Dokki, Giza, 12622, Egypt
- Ahmed Samir
- Microbiology Department, Faculty of Veterinary Medicine, Cairo University, Cairo, Egypt
|
23
|
Álvarez-Machancoses Ó, DeAndrés Galiana EJ, Cernea A, Fernández de la Viña J, Fernández-Martínez JL. On the Role of Artificial Intelligence in Genomics to Enhance Precision Medicine. PHARMACOGENOMICS & PERSONALIZED MEDICINE 2020; 13:105-119. [PMID: 32256101 PMCID: PMC7090191 DOI: 10.2147/pgpm.s205082] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 12/02/2019] [Accepted: 02/17/2020] [Indexed: 12/21/2022]
Abstract
The complexity of orphan diseases, which are those that do not have an effective treatment, together with the high dimensionality of the genetic data used for their analysis and the high degree of uncertainty in the understanding of the mechanisms and genetic pathways which are involved in their development, motivate the use of advanced techniques of artificial intelligence and in-depth knowledge of molecular biology, which is crucial in order to find plausible solutions in drug design, including drug repositioning. Particularly, we show that the use of robust deep sampling methodologies of the altered genetics serves to obtain meaningful results and dramatically decreases the cost of research and development in drug design, influencing very positively the use of precision medicine and the outcomes in patients. The target-centric approach and the use of strong prior hypotheses that are not matched against reality (disease genetic data) are undoubtedly the cause of the high number of drug design failures and attrition rates. Sampling and prediction under uncertain conditions cannot be avoided in the development of precision medicine.
Affiliation(s)
- Óscar Álvarez-Machancoses
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo 33007, Spain; DeepBiosInsights, NETGEV (Maof Tech), Dimona 8610902, Israel
- Enrique J DeAndrés Galiana
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo 33007, Spain
- Ana Cernea
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo 33007, Spain
- J Fernández de la Viña
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo 33007, Spain
|
24
|
Lin X, Li X, Lin X. A Review on Applications of Computational Methods in Drug Screening and Design. Molecules 2020; 25:E1375. [PMID: 32197324 PMCID: PMC7144386 DOI: 10.3390/molecules25061375] [Citation(s) in RCA: 275] [Impact Index Per Article: 55.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/30/2020] [Revised: 03/16/2020] [Accepted: 03/16/2020] [Indexed: 12/27/2022] Open
Abstract
Drug development is one of the most significant processes in the pharmaceutical industry. Various computational methods have dramatically reduced the time and cost of drug discovery. In this review, we first discussed the roles of multiscale biomolecular simulations in identifying drug binding sites on the target macromolecule and elucidating drug action mechanisms. Then, virtual screening methods (e.g., molecular docking, pharmacophore modeling, and QSAR) as well as structure- and ligand-based classical/de novo drug design were introduced and discussed. Last, we explored the development of machine learning methods and their applications in the aforementioned computational methods to speed up the drug discovery process. Also, several application examples of combining various methods were discussed. A combination of different methods to jointly solve the tough problem at different scales and dimensions will be an inevitable trend in drug screening and design.
Affiliation(s)
- Xiaoqian Lin
- Institute of Single Cell Engineering, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
- Xiu Li
- School of Chemistry and Material Science, Shanxi Normal University, Linfen 041004, China
- Xubo Lin
- Institute of Single Cell Engineering, Beijing Advanced Innovation Center for Biomedical Engineering, Beihang University, Beijing 100191, China
- School of Biological Science and Medical Engineering, Beihang University, Beijing 100191, China
|
25
|
Santana R, Zuluaga R, Gañán P, Arrasate S, Onieva Caracuel E, González-Díaz H. PTML Model of ChEMBL Compounds Assays for Vitamin Derivatives. ACS COMBINATORIAL SCIENCE 2020; 22:129-141. [PMID: 32011854 DOI: 10.1021/acscombsci.9b00166] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/11/2022]
Abstract
Determining the biological activity of vitamin derivatives is needed given that organic synthesis of analogs of vitamins is an active field of interest for medicinal chemistry, pharmaceuticals, and food additives. Accordingly, scientists from different disciplines perform preclinical assays (nij) with a considerable combination of assay conditions (cj). Indeed, the ChEMBL platform contains a database that includes results from 36 220 different biological activity bioassays of 21 240 different vitamins and vitamin derivatives. These assays are heterogeneous in terms of their combinations of assay conditions cj. They are focused on >500 different biological activity parameters (c0), >340 different targets (c1), >6200 types of cell (c2), >120 organisms of assay (c3), and >60 assay strains (c4). It includes a total of >1850 niacin assays, >1580 tretinoin assays, >1580 retinol assays, 857 ascorbic acid assays, etc. Given how difficult this combinatorial data is for researchers to assimilate, we propose to build a model by combining perturbation theory (PT) and machine learning (ML). Through this study, we propose a PTML (PT + ML) combinatorial model for ChEMBL results on the biological activity of vitamins and vitamin derivatives. The linear discriminant analysis (LDA) model presented the following results for the training subset: specificity (%) = 90.38, sensitivity (%) = 87.51, and accuracy (%) = 89.89. The model showed the following results for the external validation subset: specificity (%) = 90.58, sensitivity (%) = 87.72, and accuracy (%) = 90.09. Different types of linear and nonlinear PTML models, such as logistic regression (LR), classification tree (CT), naïve Bayes (NB), and random forest (RF), were applied to contrast the capacity of prediction. The PTML-LDA model predicts with more accuracy by applying combinatorial descriptors. In addition, a PCA experiment with chemical structure descriptors allowed us to characterize the high structural diversity of the chemical space studied. In any case, PTML models using chemical structure descriptors do not improve the performance of the PTML-LDA model based on ALOGP and PSA. We can conclude that the three-variable PTML-LDA model is a simplified and adaptable tool for predicting, across different experiment combinations, the biological activity of vitamin derivatives.
Affiliation(s)
- Ricardo Santana
- DeustoTech-Fundación Deusto, Avda. Universidades, 24, 48007 Bilbao, Spain
- Grupo de Investigación sobre Nuevos Materiales, Universidad Pontificia Bolivariana UPB, 050031, Medellín, Colombia
- Robin Zuluaga
- Facultad de Ingeniería Agroindustrial, Universidad Pontificia Bolivariana UPB, 050031, Medellín, Colombia
- Piedad Gañán
- Facultad de Ingeniería Química, Universidad Pontificia Bolivariana UPB, 050031, Medellín, Colombia
- Sonia Arrasate
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
|
26
|
Sunita, Sajid A, Singh Y, Shukla P. Computational tools for modern vaccine development. Hum Vaccin Immunother 2020; 16:723-735. [PMID: 31545127 PMCID: PMC7227725 DOI: 10.1080/21645515.2019.1670035] [Citation(s) in RCA: 53] [Impact Index Per Article: 10.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/07/2019] [Revised: 08/28/2019] [Accepted: 09/13/2019] [Indexed: 12/12/2022] Open
Abstract
Vaccines play an essential role in controlling the rates of fatality and morbidity. Vaccines not only arrest the beginning of different diseases but also assign a gateway for its elimination and reduce toxicity. This review gives an overview of the possible uses of computational tools for vaccine design. Moreover, we have described the initiatives of utilizing the diverse computational resources by exploring the immunological databases for developing epitope-based vaccines, peptide-based drugs, and other resources of immunotherapeutics. Finally, the applications of multi-graft and multivalent scaffolding, codon optimization and antibodyomics tools in identifying and designing in silico vaccine candidates are described.
Affiliation(s)
- Sunita
- Enzyme Technology and Protein Bioinformatics Laboratory, Department of Microbiology, Maharshi Dayanand University, Rohtak, India
- Bacterial Pathogenesis Laboratory, Department of Zoology, University of Delhi, Delhi
- Andaleeb Sajid
- National Institutes of Health, National Cancer Institute, Bethesda, MD, USA
- Yogendra Singh
- Bacterial Pathogenesis Laboratory, Department of Zoology, University of Delhi, Delhi
- Pratyoosh Shukla
- Enzyme Technology and Protein Bioinformatics Laboratory, Department of Microbiology, Maharshi Dayanand University, Rohtak, India
|
27
|
Montes-Bageneta I, Akesolo U, López S, Merino M, Anakabe E, Arrasate S. Pollutants in Organic Chemistry and Medicinal Chemistry Education Laboratory. Experimental and Machine Learning Studies. Curr Top Med Chem 2020; 20:720-730. [PMID: 32066360 DOI: 10.2174/1568026620666200211110043] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2019] [Revised: 12/27/2019] [Accepted: 12/27/2019] [Indexed: 11/22/2022]
Abstract
AIMS Computational modelling may help us to detect the more important factors governing this process in order to optimize it. BACKGROUND The generation of hazardous organic waste in teaching and research laboratories poses a big problem that universities have to manage. METHODS In this work, we report on the experimental measurement of waste generation on the chemical education laboratories within our department. We measured the waste generated in the teaching laboratories of the Organic Chemistry Department II (UPV/EHU), in the second semester of the 2017/2018 academic year. Likewise, to know the anthropogenic and social factors related to the generation of waste, a questionnaire has been utilized. We focused on all students of Experimentation in Organic Chemistry (EOC) and Organic Chemistry II (OC2) subjects. It helped us to know their prior knowledge about waste, awareness of the problem of separate organic waste and the correct use of the containers. These results, together with the volumetric data, have been analyzed with statistical analysis software. We obtained two Perturbation-Theory Machine Learning (PTML) models including chemical, operational, and academic factors. The dataset analyzed included 6050 cases of laboratory practices vs. practices of reference. RESULTS These models predict the values of acetone waste with R2 = 0.88 and non-halogenated waste with R2 = 0.91. CONCLUSION This work opens a new gate to the implementation of more sustainable techniques and a circular economy with the aim of improving the quality of university education processes.
Affiliation(s)
- Iker Montes-Bageneta
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Urtzi Akesolo
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Sara López
- Faculty of Science and Technology, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Maria Merino
- Department of Applied Mathematics, Statistics, and Operational Research, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Eneritz Anakabe
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Sonia Arrasate
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
|
28
|
Carracedo-Reboredo P, Corona R, Martinez-Nunes M, Fernandez-Lozano C, Tsiliki G, Sarimveis H, Aranzamendi E, Arrasate S, Sotomayor N, Lete E, Munteanu CR, González-Díaz H. MCDCalc: Markov Chain Molecular Descriptors Calculator for Medicinal Chemistry. Curr Top Med Chem 2019; 20:305-317. [PMID: 31878856 DOI: 10.2174/1568026620666191226092431] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/25/2019] [Revised: 09/17/2019] [Accepted: 09/17/2019] [Indexed: 11/22/2022]
Abstract
BACKGROUND Cheminformatics models are able to predict different outputs (activity, property, chemical reactivity) in single molecules or complex molecular systems (catalyzed organic synthesis, metabolic reactions, nanoparticles, etc.). OBJECTIVE Cheminformatics prediction of complex catalytic enantioselective reactions is a major goal in organic synthesis research and the chemical industry. Markov Chain Molecular Descriptors (MCDs) have been widely used to solve Cheminformatics problems. There are different types of Markov chain descriptors, such as Markov-Shannon entropies (Shk), Markov Means (Mk), Markov Moments (πk), etc. However, there are other possible MCDs that have not been used before. In addition, MCDs are very often calculated with specific software that is not always available to general users, and there is no publicly available R library for the calculation of MCDs. This limits the availability of MCD-based Cheminformatics procedures. METHODS We studied the enantiomeric excess ee(%)[Rcat] for 324 α-amidoalkylation reactions. These reactions have a complex mechanism depending on various factors. The model includes MCDs of the substrate, solvent, chiral catalyst, and product, along with values of time of reaction, temperature, load of catalyst, etc. We tested several Machine Learning regression algorithms. The Random Forest regression model has R2 > 0.90 in training and test. Secondly, the biological activity of 5644 compounds against colorectal cancer was studied. RESULTS We developed a model able to predict, with Specificity and Sensitivity of 70-82%, the cases of preclinical assays in both training and validation series. CONCLUSION The work shows the potential of the new tool for computational studies in organic and medicinal chemistry.
Affiliation(s)
- Paula Carracedo-Reboredo
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071, A Coruña, Spain; Group of Artificial Neural Networks and Adaptative Systems, Medical Imaging, and Diagnostic Radiology (RNASA-IMEDIR), Institute of Biomedical Research of Coruna (INIBIC), Hospital Complex of University of A Coruna (CHUAC), Sergas, University of Coruna (UDC), Xubias de arriba 84, 15006, A Coruna, Spain; Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Ramiro Corona
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Mikel Martinez-Nunes
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Carlos Fernandez-Lozano
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071, A Coruña, Spain; Group of Artificial Neural Networks and Adaptative Systems, Medical Imaging, and Diagnostic Radiology (RNASA-IMEDIR), Institute of Biomedical Research of Coruna (INIBIC), Hospital Complex of University of A Coruna (CHUAC), Sergas, University of Coruna (UDC), Xubias de arriba 84, 15006, A Coruna, Spain
- Georgia Tsiliki
- Institute for the Management of Information Systems, ATHENA Research and Innovation Centre, 15125, Athens, Greece
- Haralambos Sarimveis
- School of Chemical Engineering, National Technical University of Athens, Zografou, Campus, 15780, Athens, Greece; Pharma-Informatics Unit, ATHENA Research and Innovation Centre, 15125, Athens, Greece
- Eider Aranzamendi
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Sonia Arrasate
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Nuria Sotomayor
- Group of Artificial Neural Networks and Adaptative Systems, Medical Imaging, and Diagnostic Radiology (RNASA-IMEDIR), Institute of Biomedical Research of Coruna (INIBIC), Hospital Complex of University of A Coruna (CHUAC), Sergas, University of Coruna (UDC), Xubias de arriba 84, 15006, A Coruna, Spain
- Esther Lete
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain
- Cristian Robert Munteanu
- Department of Computer Science and Information Technologies, Faculty of Computer Science, University of A Coruña, CITIC, Campus Elviña s/n, 15071, A Coruña, Spain; Group of Artificial Neural Networks and Adaptative Systems, Medical Imaging, and Diagnostic Radiology (RNASA-IMEDIR), Institute of Biomedical Research of Coruna (INIBIC), Hospital Complex of University of A Coruna (CHUAC), Sergas, University of Coruna (UDC), Xubias de arriba 84, 15006, A Coruna, Spain
- Humbert González-Díaz
- Basque Center for Biophysics, University of the Basque Country UPV/EHU, 48940, Leioa, Bilbao, Spain; IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
|
29
|
Diez-Alarcia R, Yáñez-Pérez V, Muneta-Arrate I, Arrasate S, Lete E, Meana JJ, González-Díaz H. Big Data Challenges Targeting Proteins in GPCR Signaling Pathways; Combining PTML-ChEMBL Models and [35S]GTPγS Binding Assays. ACS Chem Neurosci 2019; 10:4476-4491. [PMID: 31618004 DOI: 10.1021/acschemneuro.9b00302] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/18/2022] Open
Abstract
G-protein-coupled receptors (GPCRs), also known as 7-transmembrane receptors, are the single largest class of drug targets. Consequently, a large number of preclinical assays having GPCRs as molecular targets have been released to public sources like the Chemical European Molecular Biology Laboratory (ChEMBL) database. These data are also very complex, covering changes in drug chemical structure and assay conditions like c0 = activity parameter (Ki, IC50, etc.), c1 = target protein, c2 = cell line, c3 = assay organism, etc., which makes the analysis of these databases difficult and places it at the border of a Big Data challenge. One of the aims of this work is to develop a computational model able to predict new GPCR-targeting drugs taking into consideration multiple conditions of assay. Another objective is to perform new predictive and experimental studies of selective 5-HT2A receptor agonists, antagonists, or inverse agonists in humans, comparing the results with those from the literature. In this work, we combined Perturbation Theory (PT) and Machine Learning (ML) to seek a general PTML model for this data set. We analyzed 343 738 unique compounds with 812 072 end points (assay outcomes), with 185 different experimental parameters, 592 protein targets, 51 cell lines, and/or 55 organisms (species). The best PTML linear model found has only three input variables and predicted 56 202/58 653 positive outcomes (sensitivity = 95.8%) and 470 230/550 401 control cases (specificity = 85.4%) in training series. The model also predicted correctly 18 732/19 549 (95.8%) of positive outcomes and 156 739/183 469 (85.4%) of cases in external validation series. To illustrate its practical use, we used the model to predict the outcomes of six different 5-HT2A receptor drugs, namely, TCB-2, DOI, DOB, altanserin, pimavanserin, and nelotanserin, in a very large number of different pharmacological assays. 5-HT2A receptors are altered in schizophrenia and represent a drug target for antipsychotic therapeutic activity. The model correctly predicted 93.83% (76 of 86) experimental results for these compounds reported in ChEMBL. Moreover, [35S]GTPγS binding assays were performed experimentally with the same six drugs with the aim of determining their potency and efficacy in the modulation of G-proteins in human brain tissue. The antagonist ketanserin was included as an inactive drug with demonstrated affinity for 5-HT2A/C receptors. Our results demonstrate that some of these drugs, previously described as serotonin 5-HT2A receptor agonists, antagonists, or inverse agonists, are not so specific and show different intrinsic activity to that previously reported. Overall, this work opens a new gate for the prediction of GPCR-targeting compounds.
Affiliation(s)
- Rebeca Diez-Alarcia
- Centro de Investigación Biomédica en Red en Salud Mental, 48940 Leioa, Spain
- J. Javier Meana
- Centro de Investigación Biomédica en Red en Salud Mental, 48940 Leioa, Spain
- Humbert González-Díaz
- Biophysics Institute, CSIC-UPV/EHU, University of the Basque Country UPV/EHU, Leioa, 48940, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
|
30
|
Speck-Planche A. Multiple Perspectives in Anti-cancer Drug Discovery: From old Targets and Natural Products to Innovative Computational Approaches. Anticancer Agents Med Chem 2019; 19:146-147. [PMID: 31298144 DOI: 10.2174/187152061902190418105054] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Affiliation(s)
- Alejandro Speck-Planche
- Research Program on Biomedical Informatics (GRIB) Hospital del Mar Medical Research Institute (IMIM) Barcelona, Spain
|
31
|
Pérez-Parras Toledano J, García-Pedrajas N, Cerruela-García G. Multilabel and Missing Label Methods for Binary Quantitative Structure-Activity Relationship Models: An Application for the Prediction of Adverse Drug Reactions. J Chem Inf Model 2019; 59:4120-4130. [PMID: 31514503 DOI: 10.1021/acs.jcim.9b00611] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]
Abstract
The prediction of adverse drug reactions in the discovery of new medicines is highly challenging. In the task of predicting the adverse reactions of chemical compounds, information about different targets is often available. Although we can focus on every adverse drug reaction prediction separately, multilabel approaches have been proven useful in many research areas for taking advantage of the relationship among the targets. However, when approaching the prediction problem from a multilabel point of view, we have to deal with the lack of information for some labels. This missing labels problem is a relevant issue in the field of cheminformatics approaches. This paper aims to predict the adverse drug reaction of commercial drugs using a multilabel approach where the possible presence of missing labels is also taken into consideration. We propose the use of multilabel methods to deal with the prediction of a large set of 27 different adverse reaction targets. We also propose the use of multilabel methods specifically designed to deal with the missing labels problem to test their ability to solve this difficult problem. The results show the validity of the proposed approach, demonstrating a superior performance of the multilabel method compared with the single-label approach in addressing the problem of adverse drug reaction prediction.
Affiliation(s)
- José Pérez-Parras Toledano
- University of Córdoba, Department of Computing and Numerical Analysis, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain
- Nicolás García-Pedrajas
- University of Córdoba, Department of Computing and Numerical Analysis, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain
- Gonzalo Cerruela-García
- University of Córdoba, Department of Computing and Numerical Analysis, Campus de Rabanales, Albert Einstein Building, E-14071 Córdoba, Spain
|
32
|
Munteanu CR, Gestal M, Martínez-Acevedo YG, Pedreira N, Pazos A, Dorado J. Improvement of Epitope Prediction Using Peptide Sequence Descriptors and Machine Learning. Int J Mol Sci 2019; 20:ijms20184362. [PMID: 31491969 PMCID: PMC6770149 DOI: 10.3390/ijms20184362] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/31/2019] [Revised: 08/26/2019] [Accepted: 08/30/2019] [Indexed: 01/27/2023] Open
Abstract
In this work, we improved a previous model used for the prediction of proteomes as new B-cell epitopes in vaccine design. The predicted epitope activity of a queried peptide is based on its sequence, a known reference epitope sequence under specific experimental conditions. The peptide sequences were transformed into molecular descriptors of sequence recurrence networks and were mixed under experimental conditions. The new models were generated using 709,100 instances of pair descriptors for query and reference peptide sequences. Using perturbations of the initial descriptors under sequence or assay conditions, 10 transformed features were used as inputs for seven Machine Learning methods. The best model was obtained with random forest classifiers, with an Area Under the Receiver Operating Characteristic curve (AUROC) of 0.981 ± 0.0005 for the external validation series (five-fold cross-validation). The database included information about 83,683 peptide sequences, 1448 epitope organisms, 323 host organisms, 15 types of in vivo processes, 28 experimental techniques, and 505 adjuvant additives. The current model could improve the in silico predictions of epitopes for vaccine design. The script and results are available as a free repository.
Affiliation(s)
- Cristian R Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Marcos Gestal
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
- Yunuen G Martínez-Acevedo
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Unidad Profesional Interdisciplinaria de Biotecnología, National Polytechnic Institute (IPN), Ticoman, 07340 Mexico City, Mexico
- Nieves Pedreira
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Alejandro Pazos
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), 15006 A Coruña, Spain
- Julián Dorado
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruna, 15071 A Coruña, Spain
- Centro de Investigación en Tecnologías de la Información y las Comunicaciones (CITIC), Campus de Elviña s/n, 15071 A Coruña, Spain
|
33
|
Vásquez-Domínguez E, Armijos-Jaramillo VD, Tejera E, González-Díaz H. Multioutput Perturbation-Theory Machine Learning (PTML) Model of ChEMBL Data for Antiretroviral Compounds. Mol Pharm 2019; 16:4200-4212. [PMID: 31426639 DOI: 10.1021/acs.molpharmaceut.9b00538] [Citation(s) in RCA: 22] [Impact Index Per Article: 3.7] [Indexed: 12/22/2022]
Abstract
Retroviral infections, such as HIV, remain diseases with no cure to date. Defining the target proteins of new antiretroviral compounds is a major goal for medicine and pharmaceutical chemistry. ChEMBL manages Big Data features with a complex data set, which is hard to organize. The large number of characteristics that must be described in order to predict new drug candidates for retroviral infections makes this information difficult to analyze. For this reason, we propose to develop a new predictive model combining perturbation theory (PT) bases and machine learning (ML) modeling to create a new tool that can take advantage of all the available information. The PTML model proposed in this work for the ChEMBL data set of preclinical experimental assays for antiretroviral compounds consists of a linear equation with four variables. The PT operators used are founded on multicondition moving averages, combining different features and reducing the difficulty of managing all the data. More than 140 000 preclinical assays for 56 105 compounds with different characteristics or experimental conditions have been carried out and can be found in the ChEMBL database, covering combinations with 359 biological activity parameters (c0), 55 protein accessions (c1), 83 cell lines (c2), 64 organisms of assay (c3), and 773 subtypes or strains. We have included 150 148 preclinical experimental assays for HIV virus, 1188 for HTLV virus, 84 for simian immunodeficiency virus, 370 for murine leukemia virus, 119 for Rous sarcoma virus, 1581 for MMTV, etc. We also included 5277 assays for hepatitis B virus. The developed PTML model reached considerable values of sensitivity (73.05% for training and 73.10% for validation), specificity (86.61% for training and 87.17% for validation), and accuracy (75.84% for training and 75.98% for validation). We also compared alternative PTML models with different PT operators such as covariance, moments, and exponential terms.
Finally, we made a comparison between literature ML models with our PTML model and also artificial neural network (ANN) nonlinear models. We conclude that this PTML model is the first one to consider multiple characteristics of preclinical experimental antiretroviral assays combined, generating a simple, useful, and adaptable instrument, which could reduce time and costs in antiretroviral drugs research.
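The abstract says only that the PT operators are "founded on multicondition moving averages" and does not give their form. As a rough illustration, and not the authors' implementation, the sketch below assumes one common form: the deviation of a case's descriptor from the average of that descriptor over all cases sharing the same assay condition. The function name, the `logp` descriptor, and the toy cases are hypothetical.

```python
from collections import defaultdict


def moving_average_deviations(cases, condition_key, descriptor_key):
    """Return D_i - <D(c_j)> for every case, where <D(c_j)> is the mean of the
    descriptor over all cases that share the same value of one condition.

    Assumed operator form; the abstract does not state the exact equation.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for case in cases:
        condition = case[condition_key]
        sums[condition] += case[descriptor_key]
        counts[condition] += 1
    return [
        case[descriptor_key] - sums[case[condition_key]] / counts[case[condition_key]]
        for case in cases
    ]


# Hypothetical toy data: one descriptor, one condition (cell line, c2 in the abstract).
toy_cases = [
    {"logp": 2.0, "cell_line": "A"},
    {"logp": 4.0, "cell_line": "A"},
    {"logp": 1.0, "cell_line": "B"},
]
deviations = moving_average_deviations(toy_cases, "cell_line", "logp")
print(deviations)
```

Under this assumption, a linear model would take one such deviation term per condition type (c0, c1, c2, ...) as its input variables.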
Affiliation(s)
- Emilia Vásquez-Domínguez
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- Faculty of Engineering and Applied Sciences-Biotechnology, Universidad de Las Américas (UDLA), 170125 Quito, Ecuador
- Vinicio Danilo Armijos-Jaramillo
- Faculty of Engineering and Applied Sciences-Biotechnology, Universidad de Las Américas (UDLA), 170125 Quito, Ecuador
- Bio-chemioinformatics group, Universidad de Las Américas (UDLA), 170125 Quito, Ecuador
- Eduardo Tejera
- Faculty of Engineering and Applied Sciences-Biotechnology, Universidad de Las Américas (UDLA), 170125 Quito, Ecuador
- Bio-chemioinformatics group, Universidad de Las Américas (UDLA), 170125 Quito, Ecuador
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
|
34
|
Martínez-Arzate SG, Sánchez-Bermúdez JC, Sotelo-Gómez S, Diaz-Albiter HM, Hegazy-Hassan W, Tenorio-Borroto E, Barbabosa-Pliego A, Vázquez-Chagoyán JC. Genetic diversity of Bm86 sequences in Rhipicephalus (Boophilus) microplus ticks from Mexico: analysis of haplotype distribution patterns. BMC Genet 2019; 20:56. [PMID: 31299900 PMCID: PMC6626424 DOI: 10.1186/s12863-019-0754-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 11/14/2018] [Accepted: 06/20/2019] [Indexed: 11/25/2022] Open
Abstract
Background Ticks are a problem for cattle production, mainly in tropical and subtropical regions, because they generate great economic losses. Acaricides and vaccines have been used to try to keep tick populations under control. This has proven difficult given the resistance to acaricides and vaccines observed in ticks. Resistance to rBm86 protein-based vaccines has been associated with the genetic diversity of Bm86 among the ectoparasite's populations. So far, neither the genetic diversity nor the spatial distribution of circulating Bm86 haplotypes has been studied within the Mexican territory. Here, we explored the genetic diversity of 125 Bm86 cDNA gene sequences from R. microplus from 10 endemic areas of Mexico by analyzing haplotype distribution patterns to help in understanding the population genetic structure of Mexican ticks. Results Our results showed an average nucleotide identity among the Mexican isolates of 98.3%, ranging from 91.1 to 100%. Divergence between the Mexican and Yeerongpilly (the Bm86 reference vaccine antigen) sequences ranged from 3.1 to 7.4%. Based on the geographic distribution of Bm86 haplotypes in Mexico, our results suggest gene flow occurrence within different regions of the Mexican territory, and even the USA. Conclusions The polymorphism of Bm86 found in the populations included in this study could account for the poor efficacy of the current Bm86 antigen-based commercial vaccine in many regions of Mexico. Our data may contribute towards designing new, highly specific Bm86 antigen vaccine candidates against R. microplus circulating in Mexico.
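The identity and divergence percentages above are pairwise comparisons of aligned sequences. The abstract does not say how they were computed, so the following is only a minimal sketch: it assumes two already-aligned sequences of equal length and skips any column containing a gap (`-`), which is one of several possible conventions. The function name and the short example sequences are hypothetical, not data from the study.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored (an assumption;
    other conventions count gaps as mismatches).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy alignment: 9 ungapped columns, 8 of them identical.
identity = percent_identity("ATGCCGTA-A", "ATGACGTAGA")
divergence = 100.0 - identity
print(f"identity = {identity:.1f}%, divergence = {divergence:.1f}%")
```

Divergence is taken here simply as 100 minus identity; model-corrected distances would give slightly different values.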
Affiliation(s)
- S G Martínez-Arzate
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- J C Sánchez-Bermúdez
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- S Sotelo-Gómez
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- H M Diaz-Albiter
- Wellcome Centre for Molecular Parasitology, University of Glasgow, University Place, Glasgow, G12 8TA, UK
- Colegio de la Frontera del Sur, Carretera Villahermosa-Reforma Km 15.5, Ranchería Guineo, sección II, CP 86280, Villahermosa, Tabasco, Mexico
- W Hegazy-Hassan
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- E Tenorio-Borroto
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- A Barbabosa-Pliego
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
- J C Vázquez-Chagoyán
- Centro de Investigación y Estudios Avanzados en Salud Animal, Facultad de Medicina Veterinaria y Zootecnia, Universidad Autónoma del Estado de México, Kilometro 15.5 Carretera Panamericana, CP 50200, Toluca-Atlacomulco, Mexico
|
35
|
Álvarez-Machancoses Ó, Fernández-Martínez JL. Using artificial intelligence methods to speed up drug discovery. Expert Opin Drug Discov 2019; 14:769-777. [DOI: 10.1080/17460441.2019.1621284] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Indexed: 02/06/2023]
Affiliation(s)
- Óscar Álvarez-Machancoses
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo, Spain
- Juan Luis Fernández-Martínez
- Group of Inverse Problems, Optimization and Machine Learning, Department of Mathematics, University of Oviedo, Oviedo, Spain
|
36
|
Concu R, D. S. Cordeiro MN, Munteanu CR, González-Díaz H. PTML Model of Enzyme Subclasses for Mining the Proteome of Biofuel Producing Microorganisms. J Proteome Res 2019; 18:2735-2746. [DOI: 10.1021/acs.jproteome.8b00949] [Citation(s) in RCA: 15] [Impact Index Per Article: 2.5] [Indexed: 12/29/2022]
Affiliation(s)
- Riccardo Concu
- LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- M. Natália D. S. Cordeiro
- LAQV@REQUIMTE/Department of Chemistry and Biochemistry, Faculty of Sciences, University of Porto, 4169-007 Porto, Portugal
- Cristian R. Munteanu
- RNASA-IMEDIR, Computer Science Faculty, University of A Coruña, 15071 A Coruña, Spain
- INIBIC Biomedical Research Institute of Coruña, CHUAC University Hospital, 15006 A Coruña, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Biscay, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
|
37
|
Nocedo-Mena D, Cornelio C, Camacho-Corona MDR, Garza-González E, Waksman de Torres N, Arrasate S, Sotomayor N, Lete E, González-Díaz H. Modeling Antibacterial Activity with Machine Learning and Fusion of Chemical Structure Information with Microorganism Metabolic Networks. J Chem Inf Model 2019; 59:1109-1120. [PMID: 30802402 DOI: 10.1021/acs.jcim.9b00034] [Citation(s) in RCA: 41] [Impact Index Per Article: 6.8] [Indexed: 12/31/2022]
Abstract
Predicting the activity of new chemical compounds over pathogenic microorganisms with different metabolic reaction networks (MRNs) is an important goal due to the different susceptibility to antibiotics. The ChEMBL database contains >160 000 outcomes of preclinical assays of antimicrobial activity for 55 931 compounds with >365 parameters of activity (MIC, IC50, etc.) and >90 bacteria strains of >25 bacterial species. In addition, the Leong and Barabàsi data set includes >40 MRNs of microorganisms. However, there are no models able to predict antibacterial activity for multiple assays considering both drug and MRN structures at the same time. In this work, we combined perturbation theory, machine learning, and information fusion techniques to develop the first PTMLIF model. The best linear model found presented values of specificity = 90.31/90.40 and sensitivity = 88.14/88.07 in training/validation series. We carried out a comparison to nonlinear artificial neural network (ANN) techniques and previous models from the literature. Next, we illustrated the practical use of the model with an experimental case study. We reported for the first time the isolation and characterization of terpenes from the plant Cissus incisa. The antibacterial activity of the terpenes was experimentally determined. The more active compounds were phytol and α-amyrin, with MIC = 100 μg/mL for vancomycin-resistant Enterococcus faecium and Acinetobacter baumannii resistant to carbapenems. These compounds are already known from other sources. However, they have been isolated and evaluated for the first time here against several strains of multidrug-resistant bacteria including World Health Organization (WHO) priority pathogens. Last, we used the model to predict the activity of these compounds versus other microorganisms with different MRNs in order to find other potential targets.
Affiliation(s)
- Deyani Nocedo-Mena
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- Facultad de Ciencias Químicas, Universidad Autónoma de Nuevo León, CP 66455 San Nicolás de los Garza, Nuevo León, México
- Carlos Cornelio
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- María Del Rayo Camacho-Corona
- Facultad de Ciencias Químicas, Universidad Autónoma de Nuevo León, CP 66455 San Nicolás de los Garza, Nuevo León, México
- Elvira Garza-González
- Servicio de Gastroenterología, Hospital Universitario, Dr. Eleuterio González, Universidad Autónoma de Nuevo León, CP 64460 Monterrey, Nuevo León, México
- Noemi Waksman de Torres
- Facultad de Medicina, Universidad Autónoma de Nuevo León, CP 64460 Monterrey, Nuevo León, México
- Sonia Arrasate
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- Nuria Sotomayor
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- Esther Lete
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of the Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Biscay, Spain
|
38
|
Speck-Planche A. Combining Ensemble Learning with a Fragment-Based Topological Approach To Generate New Molecular Diversity in Drug Discovery: In Silico Design of Hsp90 Inhibitors. ACS OMEGA 2018; 3:14704-14716. [PMID: 30555986 PMCID: PMC6289491 DOI: 10.1021/acsomega.8b02419] [Citation(s) in RCA: 23] [Impact Index Per Article: 3.3] [Received: 09/18/2018] [Accepted: 10/23/2018] [Indexed: 05/05/2023]
Abstract
Machine learning methods have revolutionized modern science, providing fast and accurate solutions to multiple problems. However, they are commonly treated as "black boxes". Therefore, in important scientific fields such as medicinal chemistry and drug discovery, machine learning methods are restricted almost exclusively to the task of performing predictions of large and heterogeneous data sets of chemicals. The lack of interpretability prevents the full exploitation of the machine learning models as generators of new chemical knowledge. This work focuses on the development of an ensemble learning model for the prediction and design of potent dual heat shock protein 90 (Hsp90) inhibitors. The model displays accuracy higher than 80% in both training and test sets. To use the ensemble model as a generator of new chemical knowledge, three steps were followed. First, a physicochemical and/or structural interpretation was provided for each molecular descriptor present in the ensemble learning model. Second, the term "pseudolinear equation" was introduced within the context of machine learning to calculate the relative quantitative contributions of different molecular fragments to the inhibitory activity against the two Hsp90 isoforms studied here. Finally, by assembling the fragments with positive contributions, new molecules were designed, being predicted as potent Hsp90 inhibitors. According to Lipinski's rule of five, the designed molecules were found to exhibit potentially good oral bioavailability, a primordial property that chemicals must have to pass early stages in drug discovery. The present approach based on the combination of ensemble learning and fragment-based topological design holds great promise in drug discovery, and it can be adapted and applied to many different scientific disciplines.
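Lipinski's rule of five, mentioned above as the oral-bioavailability filter, flags a molecule when its molecular weight exceeds 500 Da, its logP exceeds 5, it has more than 5 hydrogen-bond donors, or it has more than 10 hydrogen-bond acceptors. The sketch below applies those thresholds to precomputed descriptor values. The class and function names and the two example molecules are hypothetical, and allowing at most one violation is a commonly used convention assumed here, not something stated in the abstract.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Precomputed properties of one molecule (values supplied by the caller)."""
    mol_weight: float   # molecular weight in Da
    logp: float         # octanol-water partition coefficient
    h_donors: int       # hydrogen-bond donors
    h_acceptors: int    # hydrogen-bond acceptors


def lipinski_violations(d: Descriptors) -> int:
    """Count how many of the four rule-of-five thresholds are exceeded."""
    return sum([
        d.mol_weight > 500,
        d.logp > 5,
        d.h_donors > 5,
        d.h_acceptors > 10,
    ])


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """True if the molecule exceeds no more than `max_violations` thresholds
    (the default of 1 is an assumed convention)."""
    return lipinski_violations(d) <= max_violations


# Hypothetical descriptor values, not taken from the paper.
candidate_ok = Descriptors(mol_weight=420.5, logp=3.2, h_donors=2, h_acceptors=6)
candidate_bad = Descriptors(mol_weight=650.0, logp=6.1, h_donors=3, h_acceptors=8)
print(passes_rule_of_five(candidate_ok), passes_rule_of_five(candidate_bad))
```

In practice the descriptor values would come from a cheminformatics toolkit; they are passed in directly here to keep the example dependency-free.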
|
39
|
Ferreira da Costa J, Silva D, Caamaño O, Brea JM, Loza MI, Munteanu CR, Pazos A, García-Mera X, González-Díaz H. Perturbation Theory/Machine Learning Model of ChEMBL Data for Dopamine Targets: Docking, Synthesis, and Assay of New l-Prolyl-l-leucyl-glycinamide Peptidomimetics. ACS Chem Neurosci 2018; 9:2572-2587. [PMID: 29791132 DOI: 10.1021/acschemneuro.8b00083] [Citation(s) in RCA: 37] [Impact Index Per Article: 5.3] [Indexed: 12/14/2022] Open
Abstract
Predicting drug-protein interactions (DPIs) for target proteins involved in dopamine pathways is a very important goal in medicinal chemistry. We can tackle this problem using Molecular Docking or Machine Learning (ML) models for one specific protein. Unfortunately, these models fail to account for large and complex big data sets of preclinical assays reported in public databases. This includes multiple conditions of assays, such as different experimental parameters, biological assays, target proteins, cell lines, organism of the target, or organism of assay. On the other hand, perturbation theory (PT) models allow us to predict the properties of a query compound or molecular system in experimental assays with multiple boundary conditions based on a previously known case of reference. In this work, we report the first PTML (PT + ML) study of a large ChEMBL data set of preclinical assays of compounds targeting dopamine pathway proteins. The best PTML model found predicts 50 000 cases with accuracy of 70-91% in training and external validation series. We also compared the linear PTML model with alternative PTML models trained with multiple nonlinear methods (artificial neural network (ANN), Random Forest, Deep Learning, etc.). Some of the nonlinear methods outperform the linear model but at the cost of a notable increase in the complexity of the model. We illustrated the practical use of the new model with a proof-of-concept theoretical-experimental study. We reported for the first time the organic synthesis, chemical characterization, and pharmacological assay of a new series of l-prolyl-l-leucyl-glycinamide (PLG) peptidomimetic compounds. In addition, we performed a molecular docking study for some of these compounds with the software AutoDock Vina. The work ends with a PTML model predictive study of the outcomes of the new compounds in a large number of assays.
Therefore, this study offers a new computational methodology for predicting the outcome for any compound in new assays. This PTML method focuses on the prediction with a simple linear model of multiple pharmacological parameters (IC50, EC50, Ki, etc.) for compounds in assays involving different cell lines used, organisms of the protein target, or organism of assay for proteins in the dopamine pathway.
Affiliation(s)
- Joana Ferreira da Costa
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- David Silva
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Olga Caamaño
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- José M. Brea
- CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Pharmacology, Pharmacy and Pharmaceutical Technology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Maria Isabel Loza
- CIMUS, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Department of Pharmacology, Pharmacy and Pharmaceutical Technology, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Cristian R. Munteanu
- Instituto de Investigacion Biomedica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, 15006, Spain
- Alejandro Pazos
- Instituto de Investigacion Biomedica de A Coruña (INIBIC), Complexo Hospitalario Universitario de A Coruña (CHUAC), A Coruña, 15006, Spain
- Computer Science Department, Faculty of Computer Science, University of A Coruna, 15071 A Coruña, Spain
- Xerardo García-Mera
- Department of Organic Chemistry, University of Santiago de Compostela, 15782 Santiago de Compostela, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
|
40
|
Bediaga H, Arrasate S, González-Díaz H. PTML Combinatorial Model of ChEMBL Compounds Assays for Multiple Types of Cancer. ACS COMBINATORIAL SCIENCE 2018; 20:621-632. [PMID: 30240186 DOI: 10.1021/acscombsci.8b00090] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Indexed: 11/29/2022]
Abstract
Determining the target proteins of new anticancer compounds is a very important task in Medicinal Chemistry. In this sense, chemists carry out preclinical assays with a high number of combinations of experimental conditions (cj). In fact, the ChEMBL database contains outcomes of 65 534 different anticancer activity preclinical assays for 35 565 different chemical compounds (1.84 assays per compound). These assays cover different combinations of cj formed from >70 different biological activity parameters (c0), >300 different drug targets (c1), >230 cell lines (c2), and 5 organisms of assay (c3) or organisms of the target (c4). It includes a total of 45 833 assays in leukemia, 6227 assays in breast cancer, 2499 assays in ovarian cancer, 3499 in colon cancer, 3159 in lung cancer, 2750 in prostate cancer, 601 in melanoma, etc. This is a very complex data set with multiple Big Data features. These data are hard for researchers to rationalize in order to extract useful relationships and predict new compounds. In this context, we propose to combine perturbation theory (PT) ideas and machine learning (ML) modeling to solve this combinatorial-like problem. In this work, we report a PTML (PT + ML) model for the ChEMBL data set of preclinical assays of anticancer compounds. This is a simple linear model with only three variables. The model presented values of area under receiver operating curve = AUROC = 0.872, specificity = Sp(%) = 90.2, sensitivity = Sn(%) = 70.6, and overall accuracy = Ac(%) = 87.7 in training series. The model also has Sp(%) = 90.1, Sn(%) = 71.4, and Ac(%) = 87.8 in external validation series. The model uses PT operators based on multicondition moving averages to capture all the complexity of the data set. We also compared the model with nonlinear artificial neural network (ANN) models, obtaining similar results.
This confirms the hypothesis of a linear relationship between the PT operators and the classification as anticancer compounds in different combinations of assay conditions. Last, we compared the model with other PTML models reported in the literature, concluding that this is the only PTML model able to predict activity against multiple types of cancer. This model is a simple but versatile tool for the prediction of the targets of anticancer compounds taking into consideration multiple combinations of experimental conditions in preclinical assays.
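The Sn, Sp, and Ac figures quoted above are the standard confusion-matrix ratios. The sketch below shows how they are derived from counts of true/false positives and negatives; the function name and the counts are hypothetical and chosen only to make the arithmetic easy to follow, since the abstract does not report the underlying confusion matrix.

```python
def classification_metrics(tp: int, fn: int, tn: int, fp: int):
    """Return (sensitivity, specificity, accuracy) as percentages."""
    sensitivity = 100.0 * tp / (tp + fn)             # Sn: actives correctly predicted
    specificity = 100.0 * tn / (tn + fp)             # Sp: inactives correctly predicted
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)  # Ac: all correct predictions
    return sensitivity, specificity, accuracy


# Hypothetical counts: 100 active cases and 500 inactive cases.
sn, sp, ac = classification_metrics(tp=80, fn=20, tn=450, fp=50)
print(f"Sn = {sn:.1f}%, Sp = {sp:.1f}%, Ac = {ac:.1f}%")
```

Accuracy is the class-size-weighted average of Sn and Sp, so it always lies between them and sits closer to whichever class is larger; an Ac close to Sp, as in the reported values, is what one would expect when inactive cases outnumber active ones.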
Affiliation(s)
- Harbil Bediaga
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Sonia Arrasate
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940, Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Spain
|
41
|
Blay V, Yokoi T, González-Díaz H. Perturbation Theory–Machine Learning Study of Zeolite Materials Desilication. J Chem Inf Model 2018; 58:2414-2419. [DOI: 10.1021/acs.jcim.8b00383] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 01/09/2023]
Affiliation(s)
- Vincent Blay
- Fisher College of Business, The Ohio State University, Gerlach Hall, 2108 Neil Avenue, Columbus, Ohio 43210, United States
- Toshiyuki Yokoi
- Institute of Innovative Research, Chemical Resources Laboratory, Tokyo Institute of Technology, 4259 Nagatsuta, Midori-ku, Yokohama 226-8503, Japan
- Humbert González-Díaz
- Department of Organic Chemistry II, University of Basque Country UPV/EHU, 48940 Leioa, Spain
- IKERBASQUE, Basque Foundation for Science, 48011 Bilbao, Spain
|