1
Savino DF, Silva JV, da Silva Santos S, Lourenço FR, Giarolla J. How do physicochemical properties contribute to inhibitory activity of promising peptides against Zika Virus NS3 protease? J Mol Model 2024; 30:54. [PMID: 38289526 DOI: 10.1007/s00894-024-05843-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/11/2023] [Accepted: 01/09/2024] [Indexed: 02/01/2024]
Abstract
CONTEXT AND RESULTS Flavivirus disease cycles, especially those of Dengue and Yellow Fever, can be observed all over Brazilian territory and represent a great health concern. Additionally, there are no drugs available for therapy. In this scenario, in silico methodologies were applied to obtain physicochemical properties, as well as to better understand the ligand-biological target interaction mode, of 20 previously reported NS2B/NS3 protease inhibitors of Dengue virus. Since the catalytic sites of flaviviruses share similarities, such as the same catalytic triad (His51, Asp75 and Ser135), this series of molecules may also fit in Zika NS3 domains. We performed an exploratory data analysis using statistical methodologies, such as PCA (Principal Component Analysis) and HCA (Hierarchical Cluster Analysis), to help understand how physicochemical properties affect the interactions observed in the docking studies, as well as to build a correlation between the respective ranked characteristics. Based on these studies, peptides were selected for the dynamics simulations, which were useful to better understand the ligand-protein interactions. Information relating to, for instance, energy, ΔG, average number of hydrogen bonds and distance from Ser135 (one of the main amino acids in the catalytic pocket) was discussed. In this sense, peptides 15 (considering ΔG value and number of hydrogen bonds), 7 (ΔG and energy) and 1, 6, 7 and 15 (proximity to Ser135 throughout the dynamics simulation) were highlighted as promising. These results could contribute to future studies on Zika virus drug design, since this infection represents a great concern in neglected populations. METHODS The models were constructed in the ChemDraw software. The ligand parametrization was performed in CHEM3D 17.0 and UCSF Chimera. Docking simulations were carried out in the GOLD software, after redocking validation, using ASP as the scoring function. Additionally, for the dynamics simulations we applied the GROMACS software, mainly exploring binding free energy calculations. Exploratory analysis was carried out in the Minitab 17.3.1 statistical software. Prior to the exploratory analysis, quantum chemical property data for the peptides were collected in a Microsoft Excel spreadsheet and organized for Hierarchical Cluster Analysis (HCA) and Principal Component Analysis (PCA).
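As an illustration of the exploratory step described in this abstract, the following minimal Python sketch autoscales a small descriptor table and groups its rows by hierarchical clustering. The descriptor values, the function names `autoscale` and `single_linkage`, and the choice of single linkage with Euclidean distance are all assumptions made for the example; the authors ran their analysis in Minitab, and their settings are not stated here.

```python
import math
from statistics import mean, pstdev


def autoscale(rows):
    """Centre each column on its mean and divide by its standard deviation."""
    columns = list(zip(*rows))
    mus = [mean(c) for c in columns]
    sds = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, mus, sds)] for row in rows]


def single_linkage(rows, n_clusters):
    """Agglomerative clustering: repeatedly merge the two closest clusters."""
    clusters = [[i] for i in range(len(rows))]

    def gap(a, b):
        # Single linkage: distance between the closest pair of members.
        return min(math.dist(rows[i], rows[j]) for i in a for j in b)

    while len(clusters) > n_clusters:
        pairs = [
            (i, j)
            for i in range(len(clusters))
            for j in range(i + 1, len(clusters))
        ]
        i, j = min(pairs, key=lambda p: gap(clusters[p[0]], clusters[p[1]]))
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical descriptor table: one row per peptide, two made-up properties.
descriptors = [
    [1.0, 200.0],
    [1.1, 210.0],
    [3.0, 400.0],
    [3.2, 390.0],
]

clusters = single_linkage(autoscale(descriptors), n_clusters=2)
print(clusters)
```

Autoscaling matters in a sketch like this because descriptors on very different numeric scales would otherwise dominate the distance calculation.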
Affiliation(s)
- Débora Feliciano Savino
- Department of Pharmacy, School of Pharmaceutical Sciences, University of São Paulo (USP), Professor Lineu Prestes Avenue, 580, Building 13, São Paulo, SP, 05508-900, Brazil
- João Vitor Silva
- Department of Pharmacy, School of Pharmaceutical Sciences, University of São Paulo (USP), Professor Lineu Prestes Avenue, 580, Building 13, São Paulo, SP, 05508-900, Brazil
- Soraya da Silva Santos
- Department of Pharmacy, School of Pharmaceutical Sciences, University of São Paulo (USP), Professor Lineu Prestes Avenue, 580, Building 13, São Paulo, SP, 05508-900, Brazil
- Felipe Rebello Lourenço
- Department of Pharmacy, School of Pharmaceutical Sciences, University of São Paulo (USP), Professor Lineu Prestes Avenue, 580, Building 13, São Paulo, SP, 05508-900, Brazil
- Jeanine Giarolla
- Department of Pharmacy, School of Pharmaceutical Sciences, University of São Paulo (USP), Professor Lineu Prestes Avenue, 580, Building 13, São Paulo, SP, 05508-900, Brazil.
2
Prediction of antischistosomal small molecules using machine learning in the era of big data. Mol Divers 2021; 26:1597-1607. [PMID: 34351547 DOI: 10.1007/s11030-021-10288-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/24/2021] [Accepted: 07/24/2021] [Indexed: 12/13/2022]
Abstract
Schistosomiasis is a neglected tropical disease caused by helminths of the Schistosoma genus. Despite its high morbidity and socio-economic burden, only a handful of therapeutics exist, with praziquantel being the main drug. Praziquantel is an old drug, registered for human use in 1982, and has since been administered en masse for chemotherapy, risking the development of resistance; hence the need for new drugs with different mechanisms of action. This review examines the use of machine learning (ML) in this era of big data to aid in the prediction of novel antischistosomal molecules. It first discusses the challenges of drug discovery in schistosomiasis. Big data and its characteristics are then explained, and some open databases from which large biochemical datasets on schistosomiasis can be obtained for ML model development are examined. The concepts of artificial intelligence, ML, and deep learning, and their drug applications in schistosomiasis, are explored. The use of binary classification in predicting antischistosomal compounds, and some algorithms that have been applied, including random forest and naive Bayesian, are discussed. In this review, some deep learning algorithms (deep neural networks) are proposed as novel algorithms for predicting antischistosomal molecules via binary classification. Databases specifically designed for housing bioactivity data on antischistosomal molecules, enriched with functional genomic datasets and ontologies, are thus urgently needed for developing predictive ML models. This review shows the application of machine learning techniques for the discovery of novel antischistosomal small molecules via binary classification in the era of big data.
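To make the idea of binary active/inactive classification with a naive Bayesian model concrete, here is a minimal Python sketch of a Bernoulli naive Bayes classifier over binary fingerprint-like vectors. The toy fingerprints, labels, and function names are hypothetical and are not taken from the review; a real study would use a cheminformatics toolkit and curated bioactivity data.

```python
import math


def train_bernoulli_nb(X, y, alpha=1.0):
    """Fit log class priors and Laplace-smoothed per-class bit probabilities."""
    n_features = len(X[0])
    log_prior, bit_prob = {}, {}
    for label in (0, 1):
        rows = [x for x, t in zip(X, y) if t == label]
        log_prior[label] = math.log(len(rows) / len(X))
        bit_prob[label] = [
            (sum(r[j] for r in rows) + alpha) / (len(rows) + 2 * alpha)
            for j in range(n_features)
        ]
    return log_prior, bit_prob


def predict_active_probability(model, x):
    """Return the posterior probability that x belongs to class 1 (active)."""
    log_prior, bit_prob = model
    log_score = {}
    for label in (0, 1):
        score = log_prior[label]
        for bit, p in zip(x, bit_prob[label]):
            score += math.log(p if bit else 1.0 - p)
        log_score[label] = score
    # Normalise in log space for numerical stability.
    top = max(log_score.values())
    e0 = math.exp(log_score[0] - top)
    e1 = math.exp(log_score[1] - top)
    return e1 / (e0 + e1)


# Hypothetical 4-bit "fingerprints"; label 1 = active, 0 = inactive.
X_train = [
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
    [0, 0, 0, 1],
]
y_train = [1, 1, 1, 0, 0, 0]

model = train_bernoulli_nb(X_train, y_train)
p_like_active = predict_active_probability(model, [1, 1, 0, 0])
p_like_inactive = predict_active_probability(model, [0, 0, 1, 1])
print(round(p_like_active, 3), round(p_like_inactive, 3))
```

The Laplace pseudo-count `alpha` keeps every bit probability strictly between 0 and 1, so a bit never seen in one class does not force a zero posterior.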
3
Ben Khalaf N, Pham S, Romeo G, Abdelghany S, Intagliata S, Sedillo P, Salerno L, Gonzales J, Fathallah DM, Perkins DJ, Hurwitz I, Pittalà V. A computer-aided approach to identify novel Leishmania major protein disulfide isomerase inhibitors for treatment of leishmaniasis. J Comput Aided Mol Des 2021; 35:297-314. [PMID: 33615401 DOI: 10.1007/s10822-021-00374-w] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/21/2020] [Accepted: 01/15/2021] [Indexed: 12/19/2022]
Abstract
Leishmaniasis is an infectious disease caused by parasites of the genus Leishmania and transmitted by the bite of a sand fly. To date, most available drugs for treatment are toxic and beyond the economic means of those affected by the disease. Protein disulfide isomerase (PDI) is a chaperone protein that plays a major role in the folding of newly synthesized proteins, specifically assisting in disulfide bond formation, breakage, or rearrangement in all non-native proteins. In previous work, we demonstrated that Leishmania major PDI (LmPDI) has an essential role in pathogen virulence. Furthermore, inhibition of LmPDI further blocked parasite infection in macrophages. In this study, we utilized a computer-aided approach to design a series of LmPDI inhibitors. Fragment-based virtual screening allowed for the understanding of the inhibitors' modes of action on LmPDI active sites. The generated compounds obtained after multiple rounds of virtual screening were synthesized and significantly inhibited target LmPDI reductase activity and were shown to decrease in vitro parasite growth in human monocyte-derived macrophages. This novel cheminformatics and synthetic approach led to the identification of a new series of compounds that might be optimized into novel drugs, likely more specific and less toxic for the treatment of leishmaniasis.
Affiliation(s)
- Noureddine Ben Khalaf
- Department of Life Sciences, Health Biotechnology Program, College of Graduates Studies, King Fahd Chair for Health Biotechnology, Arabian Gulf University, Road 2904 Building 293, Manama, 329, Kingdom of Bahrain.
- Susie Pham
- Center for Global Health, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Giuseppe Romeo
- Department of Drug Sciences, University of Catania, V.le A. Doria 6, 95125, Catania, Italy
- Sara Abdelghany
- Department of Molecular Medicine, Princess Al-Jawhara Center for Genetics and Inherited Diseases, College of Medicine and Medical Sciences, Arabian Gulf University, Manama, Bahrain
- Sebastiano Intagliata
- Department of Drug Sciences, University of Catania, V.le A. Doria 6, 95125, Catania, Italy
- Peter Sedillo
- Center for Global Health, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Loredana Salerno
- Department of Drug Sciences, University of Catania, V.le A. Doria 6, 95125, Catania, Italy
- Jessica Gonzales
- Center for Global Health, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Dahmani M Fathallah
- Department of Life Sciences, Health Biotechnology Program, College of Graduates Studies, King Fahd Chair for Health Biotechnology, Arabian Gulf University, Road 2904 Building 293, Manama, 329, Kingdom of Bahrain
- Douglas J Perkins
- Center for Global Health, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Ivy Hurwitz
- Center for Global Health, University of New Mexico Health Sciences Center, Albuquerque, NM, USA
- Valeria Pittalà
- Department of Drug Sciences, University of Catania, V.le A. Doria 6, 95125, Catania, Italy
4
Zorn KM, Sun S, McConnon CL, Ma K, Chen EK, Foil DH, Lane TR, Liu LJ, El-Sakkary N, Skinner DE, Ekins S, Caffrey CR. A Machine Learning Strategy for Drug Discovery Identifies Anti-Schistosomal Small Molecules. ACS Infect Dis 2021; 7:406-420. [PMID: 33434015 PMCID: PMC7887754 DOI: 10.1021/acsinfecdis.0c00754] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Indexed: 02/08/2023]
Abstract
Schistosomiasis is a chronic and painful disease of poverty caused by the flatworm parasite Schistosoma. Drug discovery for antischistosomal compounds predominantly employs in vitro whole organism (phenotypic) screens against two developmental stages of Schistosoma mansoni, post-infective larvae (somules) and adults. We generated two rule books and associated scoring systems to normalize 3898 phenotypic data points to enable machine learning. The data were used to generate eight Bayesian machine learning models with the Assay Central software according to parasite’s developmental stage and experimental time point (≤24, 48, 72, and >72 h). The models helped predict 56 active and nonactive compounds from commercial compound libraries for testing. When these were screened against S. mansoni in vitro, the prediction accuracy for active and inactives was 61% and 56% for somules and adults, respectively; also, hit rates were 48% and 34%, respectively, far exceeding the typical 1–2% hit rate for traditional high throughput screens.
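The comparison between the reported hit rates and the typical screening baseline can be expressed as a fold enrichment. The short Python sketch below works through that arithmetic using only the percentages quoted in this abstract (48% and 34% versus 1 to 2%); the function name `fold_enrichment` and the framing as "enrichment" are assumptions made for illustration, not terms used by the authors.

```python
def fold_enrichment(hit_rate_percent, baseline_percent):
    """Ratio of an observed hit rate to a baseline hit rate."""
    return hit_rate_percent / baseline_percent


# Hit rates reported in the abstract, in percent.
reported = {"somules": 48, "adults": 34}

# Typical high throughput screening baseline quoted in the abstract, in percent.
baseline_low, baseline_high = 1, 2

# For each stage: (enrichment vs. the 2% baseline, enrichment vs. the 1% baseline).
enrichment = {
    stage: (
        fold_enrichment(rate, baseline_high),
        fold_enrichment(rate, baseline_low),
    )
    for stage, rate in reported.items()
}

for stage, (low, high) in enrichment.items():
    print(f"{stage}: {low:.0f}x to {high:.0f}x over baseline")
```

On these figures the somule hit rate is 24 to 48 times the baseline and the adult hit rate 17 to 34 times, which is what "far exceeding" amounts to numerically.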
Affiliation(s)
- Kimberley M. Zorn
- Collaborations Pharmaceuticals, 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Shengxi Sun
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Cecelia L. McConnon
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Kelley Ma
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Eric K. Chen
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Daniel H. Foil
- Collaborations Pharmaceuticals, 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Thomas R. Lane
- Collaborations Pharmaceuticals, 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Lawrence J. Liu
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Nelly El-Sakkary
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Danielle E. Skinner
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
- Sean Ekins
- Collaborations Pharmaceuticals, 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina 27606, United States
- Conor R. Caffrey
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California 92093-0021, United States
5
Barros RPC, Scotti L, Scotti MT. Exploring Secondary Metabolites Database of Apocynaceae, Menispermaceae, and Annonaceae to Select Potential Anti-HCV Compounds. Curr Top Med Chem 2019; 19:900-913. [PMID: 31074368 DOI: 10.2174/1568026619666190510094228] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 11/14/2018] [Revised: 03/27/2019] [Accepted: 03/27/2019] [Indexed: 12/14/2022]
Abstract
BACKGROUND Hepatitis C is a disease that constitutes a serious global health problem; it is often asymptomatic and difficult to diagnose, and about 60-80% of infected patients develop chronic diseases over time. As there is no vaccine against hepatitis C virus (HCV), developing new cheap treatments is a big challenge. OBJECTIVE The search for new drugs from natural products has been outstanding in recent years. The aim of this study was to combine structure-based and ligand-based virtual screening (VS) techniques to select potentially active molecules against four HCV target proteins from an in-house secondary metabolite dataset (SistematX). MATERIALS AND METHODS From the ChEMBL database, we selected four sets of 1199, 355, 290 and 237 chemical structures with inhibitory activity against different targets of HCV to create random forest models with an accuracy value higher than 82% for cross-validation and test sets. Afterward, a ligand-based virtual screen of the entire database of 1848 secondary metabolites stored in SistematX was performed. In addition, a structure-based virtual screening was also performed for the same set of secondary metabolites using molecular docking. RESULTS Finally, using a consensus analysis approach combining ligand-based and structure-based VS, three alkaloids were selected as potential anti-HCV compounds. CONCLUSION The selected structures are a starting point for further studies in order to develop new anti-HCV compounds based on natural products.
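One simple way to combine a ligand-based score with a structure-based score is to put both on a common 0 to 1 scale and rank compounds by their average. The Python sketch below illustrates that idea only; the compound names, probabilities, docking scores, the min-max scaling, and the equal weighting are all hypothetical, and the consensus scheme actually used by the authors is not described in this abstract.

```python
def min_max(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def consensus_rank(names, ligand_prob, docking_score):
    """Rank compounds by the mean of ligand-based and structure-based scores.

    Assumes ligand_prob is a predicted probability of activity in [0, 1]
    and that more negative docking scores indicate better predicted binding.
    """
    dock_scaled = min_max([-s for s in docking_score])
    combined = {
        name: (p + d) / 2.0
        for name, p, d in zip(names, ligand_prob, dock_scaled)
    }
    return sorted(names, key=lambda n: combined[n], reverse=True)


# Hypothetical screening results for four placeholder compounds.
names = ["cmpd_A", "cmpd_B", "cmpd_C", "cmpd_D"]
ligand_prob = [0.91, 0.85, 0.40, 0.88]    # e.g. random forest probabilities
docking_score = [-9.5, -8.0, -9.8, -5.0]  # e.g. docking energies

ranking = consensus_rank(names, ligand_prob, docking_score)
print(ranking)
```

In this toy data `cmpd_C` docks best and `cmpd_D` scores well in the ligand-based model, yet neither tops the list, because the consensus rewards compounds that do well on both criteria.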
Affiliation(s)
- Renata P C Barros
- Post-Graduate Program in Natural Synthetic Bioactive Products, Federal University of Paraiba, Joao Pessoa, Brazil
- Luciana Scotti
- Post-Graduate Program in Natural Synthetic Bioactive Products, Federal University of Paraiba, Joao Pessoa, Brazil
- Marcus T Scotti
- Post-Graduate Program in Natural Synthetic Bioactive Products, Federal University of Paraiba, Joao Pessoa, Brazil
6
Hernandez HW, Soeung M, Zorn KM, Ashoura N, Mottin M, Andrade CH, Caffrey CR, de Siqueira-Neto JL, Ekins S. High Throughput and Computational Repurposing for Neglected Diseases. Pharm Res 2018; 36:27. [PMID: 30560386 PMCID: PMC6792295 DOI: 10.1007/s11095-018-2558-3] [Citation(s) in RCA: 28] [Impact Index Per Article: 4.0] [Received: 10/26/2018] [Accepted: 12/09/2018] [Indexed: 12/21/2022]
Abstract
Purpose Neglected tropical diseases (NTDs) are a heterogeneous group of communicable diseases that are found within the poorest populations of the world. There are 23 NTDs that have been prioritized by the World Health Organization, which are endemic in 149 countries and affect more than 1.4 billion people, costing these developing economies billions of dollars annually. The NTDs result from four different types of causative pathogen: protozoa, bacteria, helminths and viruses. The majority of the diseases lack effective treatments. Therefore, new therapeutics for NTDs are desperately needed. Methods We describe various high throughput screening and computational approaches that have been performed in recent years. We have collated the molecules identified in these studies and calculated molecular properties. Results Numerous global repurposing efforts have yielded some promising compounds for various neglected tropical diseases. These compounds, when analyzed, appear drug-like, as one would expect. Several large datasets are also now in the public domain and this enables machine learning models to be constructed that then facilitate the discovery of new molecules for these pathogens. Conclusions In the space of a few years many groups have performed either experimental or computational repurposing high throughput screens against neglected diseases. These have identified compounds which in many cases are already approved drugs. Such approaches perhaps offer a more efficient way to develop treatments which are generally not a focus for global pharmaceutical companies because of the economics or the lack of a viable market. Other diseases could perhaps benefit from these repurposing approaches. Electronic supplementary material The online version of this article (10.1007/s11095-018-2558-3) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Melinda Soeung
- MD Anderson Cancer Center, University of Texas, Houston, Texas, USA
- Kimberley M Zorn
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina, 27606, USA
- Melina Mottin
- LabMol - Laboratory for Molecular Modeling and Drug Design Faculdade de Farmacia, Universidade Federal de Goias - UFG, Goiânia, GO, 74605-170, Brazil
- Carolina Horta Andrade
- LabMol - Laboratory for Molecular Modeling and Drug Design Faculdade de Farmacia, Universidade Federal de Goias - UFG, Goiânia, GO, 74605-170, Brazil
- Conor R Caffrey
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, California, 92093, USA
- Jair Lage de Siqueira-Neto
- Center for Discovery and Innovation in Parasitic Diseases, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, California, 92093, USA
- Sean Ekins
- Collaborations Pharmaceuticals Inc., 840 Main Campus Drive, Lab 3510, Raleigh, North Carolina, 27606, USA.
7
Ferreira LLG, Andricopulo AD. Chemoinformatics Strategies for Leishmaniasis Drug Discovery. Front Pharmacol 2018; 9:1278. [PMID: 30443215 PMCID: PMC6221941 DOI: 10.3389/fphar.2018.01278] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 08/08/2018] [Accepted: 10/18/2018] [Indexed: 12/15/2022] Open
Abstract
Leishmaniasis is a fatal neglected tropical disease (NTD) that is caused by more than 20 species of Leishmania parasites. The disease kills approximately 20,000 people each year and more than 1 billion are susceptible to infection. Although a few compounds are available, the therapeutic arsenal faces drawbacks such as drug resistance, toxicity issues, high treatment costs, and accessibility problems, which highlight the need for novel treatment options. Worldwide efforts have been made toward that aim and, as in other therapeutic areas, chemoinformatics has contributed significantly to leishmaniasis drug discovery. Breakthrough advances in the comprehension of the parasites’ molecular biology have enabled the design of high-affinity ligands for a number of macromolecular targets. In addition, the use of chemoinformatics has allowed highly accurate predictions of biological activity and physicochemical and pharmacokinetic properties of novel antileishmanial compounds. This review puts into perspective the current context of leishmaniasis drug discovery and focuses on the use of chemoinformatics to develop better therapies for this life-threatening condition.
Affiliation(s)
- Leonardo L G Ferreira
- Laboratory of Medicinal and Computational Chemistry, Center for Research and Innovation in Biodiversity and Drug Discovery, São Carlos Institute of Physics, University of São Paulo, São Carlos, Brazil
- Adriano D Andricopulo
- Laboratory of Medicinal and Computational Chemistry, Center for Research and Innovation in Biodiversity and Drug Discovery, São Carlos Institute of Physics, University of São Paulo, São Carlos, Brazil
8
Collaborative drug discovery for More Medicines for Tuberculosis (MM4TB). Drug Discov Today 2016; 22:555-565. [PMID: 27884746 DOI: 10.1016/j.drudis.2016.10.009] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 05/13/2016] [Revised: 10/11/2016] [Accepted: 10/21/2016] [Indexed: 01/30/2023]
Abstract
Neglected disease drug discovery is generally poorly funded compared with major diseases and hence there is an increasing focus on collaboration and precompetitive efforts such as public-private partnerships (PPPs). The More Medicines for Tuberculosis (MM4TB) project is one such collaboration funded by the EU with the goal of discovering new drugs for tuberculosis. Collaborative Drug Discovery has provided a commercial web-based platform called CDD Vault which is a hosted collaborative solution for securely sharing diverse chemistry and biology data. Using CDD Vault alongside other commercial and free cheminformatics tools has enabled support of this and other large collaborative projects, aiding drug discovery efforts and fostering collaboration. We will describe CDD's efforts in assisting with the MM4TB project.
9
Ekins S. The Next Era: Deep Learning in Pharmaceutical Research. Pharm Res 2016; 33:2594-603. [PMID: 27599991 DOI: 10.1007/s11095-016-2029-7] [Citation(s) in RCA: 127] [Impact Index Per Article: 14.1] [Received: 07/10/2016] [Accepted: 08/23/2016] [Indexed: 01/22/2023]
Abstract
Over the past decade we have witnessed the increasing sophistication of machine learning algorithms applied in daily use from internet searches, voice recognition, social network software to machine vision software in cameras, phones, robots and self-driving cars. Pharmaceutical research has also seen its fair share of machine learning developments. For example, applying such methods to mine the growing datasets that are created in drug discovery not only enables us to learn from the past but to predict a molecule's properties and behavior in future. The latest machine learning algorithm garnering significant attention is deep learning, which is an artificial neural network with multiple hidden layers. Publications over the last 3 years suggest that this algorithm may have advantages over previous machine learning methods and offer a slight but discernable edge in predictive performance. The time has come for a balanced review of this technique but also to apply machine learning methods such as deep learning across a wider array of endpoints relevant to pharmaceutical research for which the datasets are growing such as physicochemical property prediction, formulation prediction, absorption, distribution, metabolism, excretion and toxicity (ADME/Tox), target prediction and skin permeation, etc. We also show that there are many potential applications of deep learning beyond cheminformatics. It will be important to perform prospective testing (which has been carried out rarely to date) in order to convince skeptics that there will be benefits from investing in this technique.
Affiliation(s)
- Sean Ekins
- Collaborations Pharmaceuticals, Inc, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina, 27526, USA; Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California, 94010, USA.
10
Ekins S, Mietchen D, Coffee M, Stratton TP, Freundlich JS, Freitas-Junior L, Muratov E, Siqueira-Neto J, Williams AJ, Andrade C. Open drug discovery for the Zika virus. F1000Res 2016; 5:150. [PMID: 27134728 PMCID: PMC4841202 DOI: 10.12688/f1000research.8013.1] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Accepted: 02/08/2016] [Indexed: 01/20/2023] Open
Abstract
The Zika virus (ZIKV) outbreak in the Americas has caused global concern that we may be on the brink of a healthcare crisis. The lack of research on ZIKV in the over 60 years that we have known about it has left us with little in the way of starting points for drug discovery. Our response can build on previous efforts with virus outbreaks and lean heavily on work done on other flaviviruses such as dengue virus. We provide some suggestions of what might be possible and propose an open drug discovery effort that mobilizes global science efforts and provides leadership, which thus far has been lacking. We also provide a listing of potential resources and molecules that could be prioritized for testing as in vitro assays for ZIKV are developed. We propose also that in order to incentivize drug discovery, a neglected disease priority review voucher should be available to those who successfully develop an FDA approved treatment. Learning from the response to the ZIKV, the approaches to drug discovery used and the success and failures will be critical for future infectious disease outbreaks.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry Inc, Fuquay-Varina, NC, USA; Collaborations Pharmaceuticals Inc., Fuquay-Varina, NC, USA; Collaborative Drug Discovery Inc., Burlingame, CA, USA
- Megan Coffee
- The International Rescue Committee, NY, NY, USA
- Thomas P Stratton
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, NJ, USA
- Joel S Freundlich
- Department of Pharmacology, Physiology and Neuroscience, Rutgers University-New Jersey Medical School, Newark, NJ, USA; Division of Infectious Diseases, Department of Medicine, and the Ruy V. Lourenço Center for the Study of Emerging and Re-emerging Pathogens, Rutgers University-New Jersey Medical School, Newark, NJ, USA
- Lucio Freitas-Junior
- Chemical Biology and Screening Platform, Brazilian Laboratory of Biosciences (LNBio), CNPEM, Campinas, Brazil
- Eugene Muratov
- Laboratory for Molecular Modeling, Division of Chemical Biology and Medicinal Chemistry, Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, NC, USA
- Jair Siqueira-Neto
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, San Diego, CA, USA
- Carolina Andrade
- LabMol - Laboratory for Molecular Modeling and Drug Design, Faculty of Pharmacy, Federal University of Goias, Goiânia, Brazil
11
Ekins S, Lage de Siqueira-Neto J, McCall LI, Sarker M, Yadav M, Ponder EL, Kallel EA, Kellar D, Chen S, Arkin M, Bunin BA, McKerrow JH, Talcott C. Machine Learning Models and Pathway Genome Data Base for Trypanosoma cruzi Drug Discovery. PLoS Negl Trop Dis 2015; 9:e0003878. [PMID: 26114876 PMCID: PMC4482694 DOI: 10.1371/journal.pntd.0003878] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Received: 03/27/2015] [Accepted: 06/05/2015] [Indexed: 12/21/2022] Open
Abstract
Background Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The current clinical and preclinical pipeline for T. cruzi is extremely sparse and lacks drug target diversity. Methodology/Principal Findings In the present study we developed a computational approach that utilized data from several public whole-cell, phenotypic high throughput screens that have been completed for T. cruzi by the Broad Institute, including a single screen of over 300,000 molecules in the search for chemical probes as part of the NIH Molecular Libraries program. We have also compiled and curated relevant biological and chemical compound screening data including (i) compounds and biological activity data from the literature, (ii) high throughput screening datasets, and (iii) predicted metabolites of T. cruzi metabolic pathways. This information was used to help us identify compounds and their potential targets. We have constructed a Pathway Genome Data Base for T. cruzi. In addition, we have developed Bayesian machine learning models that were used to virtually screen libraries of compounds. Ninety-seven compounds were selected for in vitro testing, and 11 of these were found to have EC50 < 10 μM. We progressed five compounds to an in vivo mouse efficacy model of Chagas disease and validated that the machine learning model could identify in vitro active compounds not in the training set, as well as known positive controls. The antimalarial pyronaridine possessed 85.2% efficacy in the acute Chagas mouse model. We have also proposed potential targets (for future verification) for this compound based on structural similarity to known compounds with targets in T. cruzi. Conclusions/Significance We have demonstrated how combining chemoinformatics and bioinformatics for T. cruzi drug discovery can bring interesting in vivo active molecules to light that may have been overlooked. The approach we have taken is broadly applicable to other NTDs.
Chagas disease is a neglected tropical disease (NTD) caused by the eukaryotic parasite Trypanosoma cruzi. The disease is endemic to Latin America but is increasingly found in North America and Europe, primarily through immigration, and the spread of this disease is bringing new attention to the need for novel, safe, and effective therapeutics to treat T. cruzi infection. We have used data from a phenotypic screen to build Bayesian models to predict anti-parasitic activity against T. cruzi in vitro. These models were used to score various small libraries of molecules. We selected fewer than 100 compounds for testing and found in vitro actives, some of which were tested in an in vivo efficacy model. We identified the antimalarial pyronaridine as having in vivo efficacy, which provides us with a new starting point for further investigation and optimization.
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, Burlingame, California, United States of America
- Collaborations in Chemistry, Fuquay-Varina, North Carolina, United States of America
- Jair Lage de Siqueira-Neto
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
- Laura-Isobel McCall
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
- Malabika Sarker
- SRI International, Menlo Park, California, United States of America
- Maneesh Yadav
- SRI International, Menlo Park, California, United States of America
- Elizabeth L. Ponder
- Chemistry, Engineering & Medicine for Human Health (ChEM-H), Stanford, California, United States of America
- E. Adam Kallel
- Collaborative Drug Discovery, Burlingame, California, United States of America
- Danielle Kellar
- Department of Pathology, University of California, San Francisco, San Francisco, California, United States of America
- Steven Chen
- Small Molecule Discovery Center and Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, United States of America
- Michelle Arkin
- Small Molecule Discovery Center and Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, California, United States of America
- Barry A. Bunin
- Collaborative Drug Discovery, Burlingame, California, United States of America
- James H. McKerrow
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, San Diego, California, United States of America
- Carolyn Talcott
- SRI International, Menlo Park, California, United States of America
|
12
|
Clark AM, Ekins S. Open Source Bayesian Models. 2. Mining a "Big Dataset" To Create and Validate Models with ChEMBL. J Chem Inf Model 2015; 55:1246-60. [PMID: 25995041 DOI: 10.1021/acs.jcim.5b00144] [Citation(s) in RCA: 63] [Impact Index Per Article: 6.3] [Indexed: 02/05/2023]
Abstract
In an associated paper, we have described a reference implementation of Laplacian-corrected naïve Bayesian model building using extended connectivity (ECFP)- and molecular function class (FCFP)-type fingerprints of maximum diameter 6. As a follow-up, we have now undertaken a large-scale validation study in order to ensure that the technique generalizes to a broad variety of drug discovery datasets. To achieve this, we have used the ChEMBL (version 20) database and split it into more than 2000 separate datasets, each of which consists of compounds and measurements with the same target and activity measurement. In order to test these datasets with the two-state Bayesian classification, we developed an automated algorithm for detecting a suitable threshold for active/inactive designation, which we applied to all collections. With these datasets, we were able to establish that our Bayesian model implementation is effective for the large majority of cases, and we were able to quantify the impact of fingerprint folding on the receiver operator curve cross-validation metrics. We were also able to study the impact that the choice of training/testing set partitioning has on the resulting recall rates. The datasets have been made publicly available to be downloaded, along with the corresponding model data files, which can be used in conjunction with the CDK and several mobile apps. We have also explored some novel visualization methods which leverage the structural origins of the ECFP/FCFP fingerprints to attribute regions of a molecule responsible for positive and negative contributions to activity. The ability to score molecules across thousands of relevant datasets across organisms may also help to assess desirable and undesirable off-target effects as well as suggest potential targets for compounds derived from phenotypic screens.
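For readers unfamiliar with the method named in this abstract, the following is a minimal sketch of a Laplacian-corrected naïve Bayesian classifier over sparse fingerprint features. It is not the authors' reference implementation. It assumes the commonly cited form of the estimator, in which each feature contributes log((A_F + 1) / (T_F · P + 1)), where T_F is the number of training molecules containing the feature, A_F is the number of those that are active, and P is the overall fraction of actives. The function names and the integer feature identifiers (standing in for ECFP/FCFP hash codes) are hypothetical.

```python
import math
from collections import Counter


def train_laplacian_nb(fingerprints, labels):
    """Build per-feature log contributions.

    fingerprints: list of sets of int feature ids (stand-ins for ECFP/FCFP
                  hash codes; hypothetical, no real fingerprinting is done here)
    labels:       list of bool, True for active
    """
    n_total = len(fingerprints)
    if n_total == 0 or n_total != len(labels):
        raise ValueError("need equal-length, non-empty inputs")
    base_rate = sum(1 for y in labels if y) / n_total  # P = A / N

    feat_total = Counter()   # T_F: molecules containing feature F
    feat_active = Counter()  # A_F: active molecules containing feature F
    for fp, y in zip(fingerprints, labels):
        for f in fp:
            feat_total[f] += 1
            if y:
                feat_active[f] += 1

    # Laplacian correction (assumed form): rarely seen features are pulled
    # toward a contribution of zero instead of an extreme ratio.
    return {
        f: math.log((feat_active[f] + 1.0) / (feat_total[f] * base_rate + 1.0))
        for f in feat_total
    }


def score(model, fingerprint):
    """Sum the contributions of the features present; unseen features add 0."""
    return sum(model.get(f, 0.0) for f in fingerprint)
```

The score is an unnormalized sum, so it is suited to ranking molecules rather than to reading off a probability; a threshold for the active/inactive call would have to be chosen separately.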
Affiliation(s)
- Alex M Clark
- †Molecular Materials Informatics, Inc., 1900 St. Jacques No. 302, Montreal H3J 2S1, Quebec, Canada
- Sean Ekins
- ‡Collaborations Pharmaceuticals, Inc., 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
- §Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
- ∥Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
|
13
|
Ekins S, Clark AM, Swamidass SJ, Litterman N, Williams AJ. Bigger data, collaborative tools and the future of predictive drug discovery. J Comput Aided Mol Des 2014; 28:997-1008. [PMID: 24943138 PMCID: PMC4198464 DOI: 10.1007/s10822-014-9762-y] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 03/29/2014] [Accepted: 06/09/2014] [Indexed: 12/31/2022]
Abstract
Over the past decade we have seen a growth in the provision of chemistry data and cheminformatics tools as either free websites or software-as-a-service commercial offerings. These have transformed how we find molecule-related data and use such tools in our research. There have also been efforts to improve collaboration between researchers either openly or through secure transactions using commercial tools. A major challenge in the future will be how such databases and software approaches handle larger amounts of data as it accumulates from high-throughput screening, and how they enable the user to draw insights, make predictions and move projects forward. We now discuss how information from some drug discovery datasets can be made more accessible and how privacy of data should not overwhelm the desire to share it at an appropriate time with collaborators. We also discuss additional software tools that could be made available and provide our thoughts on the future of predictive drug discovery in this age of big data. We use some examples from our own research on neglected diseases, collaborations, mobile apps and algorithm development to illustrate these ideas.
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, NC, 27526, USA
|
14
|
Operational modelling to guide implementation and scale-up of diagnostic tests within the health system: exploring opportunities for parasitic disease diagnostics based on example application for tuberculosis. Parasitology 2014; 141:1795-802. [PMID: 25035934 DOI: 10.1017/s0031182014000985] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 11/07/2022]
Abstract
Research and innovation in the diagnosis of infectious and parasitic diseases have led to the development of several promising diagnostic tools; for example, in malaria there is extensive literature concerning the use of rapid diagnostic tests. This means policymakers in many low- and middle-income countries need to make difficult decisions about which of the recommended tools and approaches to implement and scale up. The test characteristics (e.g. sensitivity and specificity) of the tools alone are not a sufficient basis on which to make these decisions, as policymakers also need to consider the best combination of tools, whether the new tools should complement or replace existing diagnostics, and who should be tested. Diagnostic strategies need dovetailing to different epidemiology and structural resource constraints (e.g. existing diagnostic pathways, human resources and laboratory capacity). We propose operational modelling to assist with these complex decisions. Projections of patient, health system and cost impacts are essential, and operational modelling of the relevant elements of the health system could provide these projections and support rational decisions. We demonstrate how the technique of operational modelling, applied in the developing world to support decisions on diagnostics for tuberculosis, could in a parallel way provide useful insights to support implementation of appropriate diagnostic innovations for parasitic diseases.
|
15
|
Ekins S, Freundlich JS, Reynolds RC. Are bigger data sets better for machine learning? Fusing single-point and dual-event dose response data for Mycobacterium tuberculosis. J Chem Inf Model 2014; 54:2157-65. [PMID: 24968215 DOI: 10.1021/ci500264r] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.7] [Indexed: 02/07/2023]
Abstract
Tuberculosis is a major, neglected disease for which the quest to find new treatments continues. There is an abundance of data from large phenotypic screens in the public domain against Mycobacterium tuberculosis (Mtb). Since machine learning methods can learn from past data, we were interested in addressing whether more data builds better models. We now describe using Bayesian machine learning to assess whether we can improve our models by combining the large quantities of single-point data with the much smaller (higher quality) dual-event data sets, which use dose-response data for both whole-cell antitubercular activity and Vero cell cytotoxicity. We have evaluated 12 models ranging from different single-point, dual-event dose-response, single-point and dual-event dose-response as well as combined data sets for three distinct data sets from the same laboratory. We used a fourth data set of active and inactive compounds from the same group as well as a smaller set of 177 active compounds from GlaxoSmithKline as test sets. Our data suggest that combining single-point with dual-event dose-response data does not diminish the internal or external predictive ability of the models based on the receiver operator curve (ROC) for these models (internal ROC range 0.83-0.91, external ROC range 0.62-0.83) compared to the orders of magnitude smaller dual-event models (internal ROC range 0.6-0.83 and external ROC 0.54-0.83). In conclusion, models developed with 1200-5000 compounds appear to be as predictive as those generated with 25 000-350 000 molecules. Our results have implications for justifying further high-throughput screening versus focused testing based on model predictions.
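The ROC values quoted in this abstract are areas under the receiver operator curve. As a reminder of what that number measures, here is a minimal sketch that computes it directly from its rank interpretation: the probability that a randomly chosen active is scored above a randomly chosen inactive, with ties counted as one half. The function name is hypothetical, and the quadratic pairwise loop is chosen for clarity, not for the data set sizes discussed above.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    scores: list of float model scores, higher meaning "more likely active"
    labels: list of bool, True for active
    """
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # active correctly ranked above inactive
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is why an external ROC of 0.54 indicates a model barely better than chance.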
Affiliation(s)
- Sean Ekins
- Collaborations in Chemistry, 5616 Hilltop Needmore Road, Fuquay-Varina, North Carolina 27526, United States
|
16
|
Njoroge M, Njuguna NM, Mutai P, Ongarora DSB, Smith PW, Chibale K. Recent approaches to chemical discovery and development against malaria and the neglected tropical diseases human African trypanosomiasis and schistosomiasis. Chem Rev 2014; 114:11138-63. [PMID: 25014712 DOI: 10.1021/cr500098f] [Citation(s) in RCA: 82] [Impact Index Per Article: 7.5] [Indexed: 12/14/2022]
Affiliation(s)
- Paul W Smith
- Novartis Institute for Tropical Diseases, Singapore 138670, Singapore
|
17
|
Pollastri MP. Finding new collaboration models for enabling neglected tropical disease drug discovery. PLoS Negl Trop Dis 2014; 8:e2866. [PMID: 24992488 PMCID: PMC4081034 DOI: 10.1371/journal.pntd.0002866] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/04/2023] Open
Affiliation(s)
- Michael P. Pollastri
- Department of Chemistry & Chemical Biology, Northeastern University, Egan Research Center, Boston, Massachusetts, United States of America
|
18
|
Ekins S, Freundlich JS, Reynolds RC. Fusing dual-event data sets for Mycobacterium tuberculosis machine learning models and their evaluation. J Chem Inf Model 2013; 53:3054-63. [PMID: 24144044 DOI: 10.1021/ci400480s] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.0] [Indexed: 12/25/2022]
Abstract
The search for new tuberculosis treatments continues as we need to find molecules that can act more quickly, be accommodated in multidrug regimens, and overcome ever-increasing levels of drug resistance. Multiple large-scale phenotypic high-throughput screens against Mycobacterium tuberculosis (Mtb) have generated dose-response data, enabling the generation of machine learning models. These models also incorporated cytotoxicity data and were recently validated with a large external data set. A cheminformatics data-fusion approach followed by Bayesian machine learning, Support Vector Machine, or Recursive Partitioning model development (based on publicly available Mtb screening data) was used to compare individual data sets and subsequent combined models. A set of 1924 commercially available molecules with promising antitubercular activity (and lack of relative cytotoxicity to Vero cells) was used to evaluate the predictive nature of the models. We demonstrate that combining three data sets incorporating antitubercular and cytotoxicity data in Vero cells from our previous screens results in an external validation receiver operator curve (ROC) value of 0.83 (Bayesian or RP Forest). Models that do not have the highest 5-fold cross-validation ROC scores can outperform other models in a test-set-dependent manner. We demonstrate with predictions for a recently published set of Mtb leads from GlaxoSmithKline that no single machine learning model may be enough to identify compounds of interest. Data set fusion represents a further useful strategy for machine learning construction as illustrated with Mtb. Coverage of chemistry and Mtb target spaces may also be limiting factors for the whole-cell screening data generated to date.
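To make the idea of "dual-event" labelling and data set fusion concrete, the sketch below shows one way it could be set up. It is an illustration under stated assumptions, not the procedure used in the paper: the abstract does not give activity cutoffs or say how compounds appearing in more than one data set were reconciled. The thresholds are therefore left as caller-supplied parameters, the selectivity ratio (cytotoxicity concentration divided by potency concentration) is an assumed criterion, and the "active in any source" merge rule is hypothetical.

```python
def dual_event_label(potency_um, cytotox_um, max_potency_um, min_selectivity):
    """Label a compound active only if it is both potent and selective.

    potency_um:      whole-cell inhibitory concentration in micromolar (lower is better)
    cytotox_um:      Vero cell cytotoxic concentration in micromolar (higher is better)
    max_potency_um:  caller-chosen potency cutoff (hypothetical, not from the paper)
    min_selectivity: caller-chosen cytotox/potency ratio cutoff (hypothetical)
    """
    if potency_um <= 0:
        raise ValueError("potency must be positive")
    selectivity = cytotox_um / potency_um
    return potency_um <= max_potency_um and selectivity >= min_selectivity


def fuse(*datasets):
    """Merge {compound_id: is_active} dicts into one training set.

    Assumed rule: a compound seen in several sets is active if any set says so.
    """
    fused = {}
    for dataset in datasets:
        for compound_id, is_active in dataset.items():
            fused[compound_id] = fused.get(compound_id, False) or is_active
    return fused
```

A different reconciliation rule (for example, majority vote or dropping conflicting compounds) would change the class balance of the fused set, so the choice should be reported alongside any ROC figures.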
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, California 94010, United States
|