51
|
Abstract
BACKGROUND Reverse docking approaches have been explored in previous studies on drug discovery to overcome some problems in traditional virtual screening. However, current reverse docking approaches are problematic in that the target spaces of those studies were rather small, and their applications were limited to identifying new drug targets. In this study, we expanded the scope of target space to a set of all protein structures currently available and developed several new applications of the reverse docking method. RESULTS We generated a 2D matrix of docking scores among all the possible protein structures in yeast and human and 35 famous drugs. By clustering the docking profile data and then comparing them with fingerprint-based clustering of drugs, we first showed that our data contained accurate information on their chemical properties. Next, we showed that our method could be used to predict the druggability of target proteins. We also showed that a combination of sequence similarity and docking profile similarity could predict the enzyme EC numbers more accurately than sequence similarity alone. In two case studies, 5-fluorouracil and cycloheximide, we showed that our method can successfully identify target proteins. CONCLUSIONS By using a large number of protein structures, we improved the sensitivity of reverse docking and showed that using as many protein structures as possible was important in finding real binding targets.
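The comparison between docking-profile similarity and fingerprint similarity described in this abstract can be illustrated with a minimal sketch. The abstract does not say which similarity measures were used, so Pearson correlation (for docking profiles) and the Tanimoto coefficient (for fingerprints) are assumptions here, and the drug names, scores, and fingerprint bits are made-up toy data.

```python
from math import sqrt

def pearson(u, v):
    """Pearson correlation between two equal-length docking-score profiles."""
    n = len(u)
    mu_u, mu_v = sum(u) / n, sum(v) / n
    cov = sum((a - mu_u) * (b - mu_v) for a, b in zip(u, v))
    var_u = sum((a - mu_u) ** 2 for a in u)
    var_v = sum((b - mu_v) ** 2 for b in v)
    return cov / sqrt(var_u * var_v)

def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    union = len(bits_a | bits_b)
    return len(bits_a & bits_b) / union if union else 0.0

# Toy data (hypothetical): one docking score per protein structure for each drug.
docking_scores = {
    "drug_A": [-7.1, -5.2, -9.0, -4.4],
    "drug_B": [-6.9, -5.0, -8.7, -4.9],
    "drug_C": [-4.0, -8.8, -5.1, -7.5],
}
# Toy fingerprints (hypothetical): indices of the bits that are set.
fingerprints = {
    "drug_A": {1, 4, 7, 9},
    "drug_B": {1, 4, 7, 8},
    "drug_C": {2, 3, 5},
}

def most_similar(name, table, sim):
    """Return the other entry in `table` that is most similar to `name`."""
    others = [d for d in table if d != name]
    return max(others, key=lambda d: sim(table[name], table[d]))

print(most_similar("drug_A", docking_scores, pearson))
print(most_similar("drug_A", fingerprints, tanimoto))
```

If the two views agree on nearest neighbours, as they do for this toy data, the docking profiles carry information that is consistent with chemical similarity.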
Affiliation(s)
- Minho Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology, 291 Daehak-ro, Yuseong-gu, Daejeon, 305-701, Korea
|
52
|
Affiliation(s)
- Michael Bieler
- Boehringer Ingelheim Pharma GmbH & Co. KG; Lead Discovery and Optimization Support; 88397; Biberach/Riss; Germany
- Herbert Koeppen
- Boehringer Ingelheim Pharma GmbH & Co. KG; Lead Discovery and Optimization Support; 88397; Biberach/Riss; Germany
|
53
|
Xie L, Kinnings SL, Xie L, Bourne PE. Predicting the Polypharmacology of Drugs: Identifying New Uses through Chemoinformatics, Structural Informatics, and Molecular Modeling‐Based Approaches. Drug Repositioning 2012:163-205. [DOI: 10.1002/9781118274408.ch7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 01/03/2025]
|
54
|
Sehnal D, Vařeková RS, Huber HJ, Geidl S, Ionescu CM, Wimmerová M, Koča J. SiteBinder: an improved approach for comparing multiple protein structural motifs. J Chem Inf Model 2012; 52:343-59. [PMID: 22296449 DOI: 10.1021/ci200444d] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 12/21/2022]
Abstract
There is a paramount need to develop new techniques and tools that will extract as much information as possible from the ever growing repository of protein 3D structures. We report here on the development of a software tool for the multiple superimposition of large sets of protein structural motifs. Our superimposition methodology performs a systematic search for the atom pairing that provides the best fit. During this search, the RMSD values for all chemically relevant pairings are calculated by quaternion algebra. The number of evaluated pairings is markedly decreased by using PDB annotations for atoms. This approach guarantees that the best fit will be found and can be applied even when sequence similarity is low or does not exist at all. We have implemented this methodology in the Web application SiteBinder, which is able to process up to thousands of protein structural motifs in a very short time, and which provides an intuitive and user-friendly interface. Our benchmarking analysis has shown the robustness, efficiency, and versatility of our methodology and its implementation by the successful superimposition of 1000 experimentally determined structures for each of 32 eukaryotic linear motifs. We also demonstrate the applicability of SiteBinder using three case studies. We first compared the structures of 61 PA-IIL sugar binding sites containing nine different sugars, and we found that the sugar binding sites of PA-IIL and its mutants have a conserved structure despite their binding different sugars. We then superimposed over 300 zinc finger central motifs and revealed that the molecular structure in the vicinity of the Zn atom is highly conserved. Finally, we superimposed 12 BH3 domains from pro-apoptotic proteins. Our findings come to support the hypothesis that there is a structural basis for the functional segregation of BH3-only proteins into activators and enablers.
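The two ingredients named in this abstract, an RMSD obtained by quaternion algebra and a search over atom pairings, can be sketched as follows. This is not SiteBinder's code: it uses the standard quaternion formulation of optimal superposition (the largest eigenvalue of a 4x4 matrix built from the coordinate correlations), and the brute-force permutation search stands in for the annotation-restricted search the abstract describes. The function names are hypothetical, and NumPy is assumed to be available.

```python
from itertools import permutations

import numpy as np

def quaternion_rmsd(a, b):
    """Minimum RMSD between paired point sets a[i] <-> b[i] after optimal
    rotation and translation, via the quaternion eigenvalue method."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    a = a - a.mean(axis=0)  # remove translation by centering both sets
    b = b - b.mean(axis=0)
    s = a.T @ b             # 3x3 correlation matrix of the coordinates
    sxx, sxy, sxz = s[0]
    syx, syy, syz = s[1]
    szx, szy, szz = s[2]
    n = np.array([
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ])
    lam = np.linalg.eigvalsh(n)[-1]  # eigenvalues ascend, so take the last one
    msd = (np.sum(a * a) + np.sum(b * b) - 2.0 * lam) / len(a)
    return float(np.sqrt(max(msd, 0.0)))  # clamp tiny negative round-off

def best_pairing_rmsd(a, b):
    """Try every pairing of atoms in b against a and keep the best fit.
    Factorial cost: only practical for very small motifs."""
    return min(
        quaternion_rmsd(a, [b[i] for i in perm])
        for perm in permutations(range(len(b)))
    )
```

Restricting the permutations to atoms that share the same annotation, as the abstract indicates, is what keeps the number of evaluated pairings manageable in practice; the exhaustive loop above only illustrates the idea.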
Affiliation(s)
- David Sehnal
- National Centre for Biomolecular Research, Faculty of Science and CEITEC-Central European Institute of Technology, Masaryk University Brno, Kamenice 5, 62500 Brno-Bohunice, Czech Republic
|
55
|
Daminelli S, Haupt VJ, Reimann M, Schroeder M. Drug repositioning through incomplete bi-cliques in an integrated drug–target–disease network. Integr Biol (Camb) 2012; 4:778-88. [DOI: 10.1039/c2ib00154c] [Citation(s) in RCA: 44] [Impact Index Per Article: 3.4] [Indexed: 12/16/2022]
|
56
|
Spitzer R, Cleves AE, Jain AN. Surface-based protein binding pocket similarity. Proteins 2011; 79:2746-63. [PMID: 21769944 DOI: 10.1002/prot.23103] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 01/28/2011] [Revised: 05/06/2011] [Accepted: 05/25/2011] [Indexed: 11/08/2022]
Abstract
Protein similarity comparisons may be made on a local or global basis and may consider sequence information or differing levels of structural information. We present a local three-dimensional method that compares protein binding site surfaces in full atomic detail. The approach is based on the morphological similarity method which has been widely applied for global comparison of small molecules. We apply the method to all-by-all comparisons of two sets of human protein kinases, a very diverse set of ATP-bound proteins from multiple species, and three heterogeneous benchmark protein binding site data sets. Cases of disagreement between sequence-based similarity and binding site similarity yield informative examples. Where sequence similarity is very low, high pocket similarity can reliably identify important binding motifs. Where sequence similarity is very high, significant differences in pocket similarity are related to ligand binding specificity and similarity. Local protein binding pocket similarity provides qualitatively complementary information to other approaches, and it can yield quantitative information in support of functional annotation.
Affiliation(s)
- Russell Spitzer
- Department of Bioengineering and Therapeutic Sciences, Helen Diller Family Comprehensive Cancer Center, University of California, San Francisco, California 94158-9001, USA
|
57
|
González-Díaz H, Prado-Prado F, Sobarzo-Sánchez E, Haddad M, Maurel Chevalley S, Valentin A, Quetin-Leclercq J, Dea-Ayuela MA, Teresa Gomez-Muños M, Munteanu CR, José Torres-Labandeira J, García-Mera X, Tapia RA, Ubeira FM. NL MIND-BEST: A web server for ligands and proteins discovery—Theoretic-experimental study of proteins of Giardia lamblia and new compounds active against Plasmodium falciparum. J Theor Biol 2011; 276:229-49. [DOI: 10.1016/j.jtbi.2011.01.010] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Received: 10/04/2010] [Revised: 12/02/2010] [Accepted: 01/10/2011] [Indexed: 10/18/2022]
|
58
|
Xie L, Evangelidis T, Xie L, Bourne PE. Drug discovery using chemical systems biology: weak inhibition of multiple kinases may contribute to the anti-cancer effect of nelfinavir. PLoS Comput Biol 2011; 7:e1002037. [PMID: 21552547 PMCID: PMC3084228 DOI: 10.1371/journal.pcbi.1002037] [Citation(s) in RCA: 119] [Impact Index Per Article: 8.5] [Received: 10/19/2010] [Accepted: 03/14/2011] [Indexed: 11/18/2022] [Open Access]
Abstract
Nelfinavir is a potent HIV-protease inhibitor with pleiotropic effects in cancer cells. Experimental studies connect its anti-cancer effects to the suppression of the Akt signaling pathway, but the actual molecular targets remain unknown. Using a structural proteome-wide off-target pipeline, which integrates molecular dynamics simulation and MM/GBSA free energy calculations with ligand binding site comparison and biological network analysis, we identified putative human off-targets of Nelfinavir and analyzed the impact on the associated biological processes. Our results suggest that Nelfinavir is able to inhibit multiple members of the protein kinase-like superfamily, which are involved in the regulation of cellular processes vital for carcinogenesis and metastasis. The computational predictions are supported by kinase activity assays and are consistent with existing experimental and clinical evidence. This finding provides a molecular basis to explain the broad-spectrum anti-cancer effect of Nelfinavir and presents opportunities to optimize the drug as a targeted polypharmacology agent.
Affiliation(s)
- Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
- Thomas Evangelidis
- School of Pharmacy and Pharmaceutical Sciences, University of Manchester, Manchester, United Kingdom
- Lei Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
- Department of Computer Science, Hunter College, the City University of New York, New York, United States of America
- Philip E. Bourne
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
|
59
|
Use of secondary structure element information in drug design: polypharmacology and conserved motifs in protein–ligand binding and protein–protein interfaces. Future Med Chem 2011; 3:699-708. [DOI: 10.4155/fmc.11.26] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Indexed: 11/17/2022] [Open Access]
Abstract
The structure-based design of small-molecule inhibitors of protein–ligand and protein–protein interfaces is a key component of drug discovery. The underlying protein interactions can be regarded based on structural similarity of the secondary structure elements: similarities around the binding site (‘ligand-sensing cores’) or in the protein interface (‘interface-sensing surfaces’) in otherwise unrelated proteins can be useful in predicting polypharmacology and identifying new lead structures. Even small conserved motifs can provide similar interaction patterns in proteins with a completely different fold and function. The identification of these structural similarities can help in the design of new drugs by guiding further optimization. Here, the concepts and ideas based on secondary structure element similarities and their successful applications in drug design are reviewed and discussed.
|
60
|
Kinnings SL, Liu N, Tonge PJ, Jackson RM, Xie L, Bourne PE. A machine learning-based method to improve docking scoring functions and its application to drug repurposing. J Chem Inf Model 2011; 51:408-19. [PMID: 21291174 PMCID: PMC3076728 DOI: 10.1021/ci100369f] [Citation(s) in RCA: 135] [Impact Index Per Article: 9.6] [Indexed: 12/21/2022]
Abstract
Docking scoring functions are notoriously weak predictors of binding affinity. They typically assign a common set of weights to the individual energy terms that contribute to the overall energy score; however, these weights should be gene family dependent. In addition, they incorrectly assume that individual interactions contribute toward the total binding affinity in an additive manner. In reality, noncovalent interactions often depend on one another in a nonlinear manner. In this paper, we show how the use of support vector machines (SVMs), trained by associating sets of individual energy terms retrieved from molecular docking with the known binding affinity of each compound from high-throughput screening experiments, can be used to improve the correlation between known binding affinities and those predicted by the docking program eHiTS. We construct two prediction models: a regression model trained using IC50 values from BindingDB, and a classification model trained using active and decoy compounds from the Directory of Useful Decoys (DUD). Moreover, to address the issue of overrepresentation of negative data in high-throughput screening data sets, we have designed a multiple-planar SVM training procedure for the classification model. The increased performance that both SVMs give when compared with the original eHiTS scoring function highlights the potential for using nonlinear methods when deriving overall energy scores from their individual components. We apply the above methodology to train a new scoring function for direct inhibitors of Mycobacterium tuberculosis (M.tb) InhA. By combining ligand binding site comparison with the new scoring function, we propose that phosphodiesterase inhibitors can potentially be repurposed to target M.tb InhA. Our methodology may be applied to other gene families for which target structures and activity data are available, as demonstrated in the work presented here.
Affiliation(s)
- Sarah L. Kinnings
- Institute of Molecular and Cellular Biology and Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Nina Liu
- Institute of Chemical Biology & Drug Discovery, Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
- Peter J. Tonge
- Institute of Chemical Biology & Drug Discovery, Department of Chemistry, Stony Brook University, Stony Brook, NY 11794, USA
- Richard M. Jackson
- Institute of Molecular and Cellular Biology and Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, LS2 9JT, United Kingdom
- Lei Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
- Department of Computer Science, Hunter College, The City University of New York, New York, NY 10065, USA
- Philip E. Bourne
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
|
61
|
Wang Z, Liu J, Cheng Y, Wang Y. Fangjiomics: in search of effective and safe combination therapies. J Clin Pharmacol 2011; 51:1132-51. [PMID: 21209238 DOI: 10.1177/0091270010382913] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Indexed: 01/15/2023]
Abstract
Millennia-old Chinese medicine treats disease with many combination therapies involving ingredients used in clinic practice. Fangjiomics is the science of identifying and designing effective mixtures of bioactive agents and elucidating their modes of action beyond those of Chinese patent medicines. Omics profiling and quantitative optimal modeling have been used to associate the various responses with biological pathways related to disease phenotype. Fangjiomics seeks to study myriad compatible combinations that may act through multiple targets, modes of action, and biological pathways balancing on off-target and on-target effects. This approach may lead to the discovery of controllable array-designed therapies to combine less potent elements that are more effective collectively but have fewer adverse side effects than does any element singly.
Affiliation(s)
- Zhong Wang
- Institute of Basic Research in Clinical Medicine, China Academy of Chinese Medical Sciences, Beijing, China.
|
62
|
Moll M, Bryant DH, Kavraki LE. The LabelHash algorithm for substructure matching. BMC Bioinformatics 2010; 11:555. [PMID: 21070651 PMCID: PMC2996407 DOI: 10.1186/1471-2105-11-555] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 07/08/2010] [Accepted: 11/11/2010] [Indexed: 01/01/2025] [Open Access]
Abstract
Background There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity. Results We present LabelHash, a novel algorithm for matching substructural motifs to large collections of protein structures. The algorithm consists of two phases. In the first phase the proteins are preprocessed in a fashion that allows for instant lookup of partial matches to any motif. In the second phase, partial matches for a given motif are expanded to complete matches. The general applicability of the algorithm is demonstrated with three different case studies. First, we show that we can accurately identify members of the enolase superfamily with a single motif. Next, we demonstrate how LabelHash can complement SOIPPA, an algorithm for motif identification and pairwise substructure alignment. Finally, a large collection of Catalytic Site Atlas motifs is used to benchmark the performance of the algorithm. LabelHash runs very efficiently in parallel; matching a motif against all proteins in the 95% sequence identity filtered non-redundant Protein Data Bank typically takes no more than a few minutes. The LabelHash algorithm is available through a web server and as a suite of standalone programs at http://labelhash.kavrakilab.org. The output of the LabelHash algorithm can be further analyzed with Chimera through a plugin that we developed for this purpose. Conclusions LabelHash is an efficient, versatile algorithm for large-scale substructure matching. When LabelHash is running in parallel, motifs can typically be matched against the entire PDB on the order of minutes. 
The algorithm is able to identify functional homologs beyond the twilight zone of sequence identity and even beyond fold similarity. The three case studies presented in this paper illustrate the versatility of the algorithm.
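The two-phase idea in this abstract (preprocess structures for fast lookup of partial matches, then expand partial matches to complete ones) can be illustrated with a small sketch. This is a simplified stand-in and not the published LabelHash implementation: the data layout, the use of residue pairs as the hashed partial match, the distance-only consistency check (which cannot tell mirror images apart), and the `cutoff` and `tol` parameters are all assumptions made for illustration.

```python
from collections import defaultdict
from math import dist

def build_index(residues, cutoff=10.0):
    """Phase 1: hash every ordered residue pair closer than `cutoff`
    under its pair of labels, so partial matches are a dict lookup."""
    index = defaultdict(list)
    for i, (lab_i, pos_i) in enumerate(residues):
        for j, (lab_j, pos_j) in enumerate(residues):
            if i != j and dist(pos_i, pos_j) <= cutoff:
                index[(lab_i, lab_j)].append((i, j))
    return index

def match_motif(motif, residues, index, tol=1.0):
    """Phase 2: expand pair seeds from the index into complete matches.
    Yields tuples of residue indices, one per motif position."""
    def consistent(partial, cand, k):
        # Candidate for motif position k must reproduce the motif's
        # distances to every already matched position, within `tol`.
        return all(
            abs(dist(residues[cand][1], residues[m][1])
                - dist(motif[k][1], motif[q][1])) <= tol
            for q, m in enumerate(partial)
        )

    def expand(partial):
        k = len(partial)
        if k == len(motif):
            yield tuple(partial)
            return
        for cand, (label, _) in enumerate(residues):
            if (label == motif[k][0] and cand not in partial
                    and consistent(partial, cand, k)):
                yield from expand(partial + [cand])

    d01 = dist(motif[0][1], motif[1][1])
    for i, j in index.get((motif[0][0], motif[1][0]), []):
        if abs(dist(residues[i][1], residues[j][1]) - d01) <= tol:
            yield from expand([i, j])

# Toy structure and motif (hypothetical labels and coordinates).
structure = [
    ("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.0, 0.0, 0.0)),
    ("SER", (0.0, 4.0, 0.0)), ("GLY", (20.0, 20.0, 20.0)),
    ("HIS", (30.0, 0.0, 0.0)),
]
motif = [
    ("HIS", (10.0, 10.0, 10.0)), ("ASP", (13.0, 10.0, 10.0)),
    ("SER", (10.0, 14.0, 10.0)),
]
matches = list(match_motif(motif, structure, build_index(structure)))
print(matches)
```

The index is built once per structure and reused for any motif, which is the property that makes the first phase worth its cost when many motifs are screened.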
Affiliation(s)
- Mark Moll
- Department of Computer Science, Rice University, Houston, TX 77005, USA.
|
63
|
Kinnings SL, Xie L, Fung KH, Jackson RM, Xie L, Bourne PE. The Mycobacterium tuberculosis drugome and its polypharmacological implications. PLoS Comput Biol 2010; 6:e1000976. [PMID: 21079673 PMCID: PMC2973814 DOI: 10.1371/journal.pcbi.1000976] [Citation(s) in RCA: 73] [Impact Index Per Article: 4.9] [Received: 07/08/2010] [Accepted: 09/24/2010] [Indexed: 11/26/2022] [Open Access]
Abstract
We report a computational approach that integrates structural bioinformatics, molecular modelling and systems biology to construct a drug-target network on a structural proteome-wide scale. The approach has been applied to the genome of Mycobacterium tuberculosis (M.tb), the causative agent of one of today's most widely spread infectious diseases. The resulting drug-target interaction network for all structurally characterized approved drugs bound to putative M.tb receptors, we refer to as the ‘TB-drugome’. The TB-drugome reveals that approximately one-third of the drugs examined have the potential to be repositioned to treat tuberculosis and that many currently unexploited M.tb receptors may be chemically druggable and could serve as novel anti-tubercular targets. Furthermore, a detailed analysis of the TB-drugome has shed new light on the controversial issues surrounding drug-target networks [1]–[3]. Indeed, our results support the idea that drug-target networks are inherently modular, and further that any observed randomness is mainly caused by biased target coverage. The TB-drugome (http://funsite.sdsc.edu/drugome/TB) has the potential to be a valuable resource in the development of safe and efficient anti-tubercular drugs. More generally the methodology may be applied to other pathogens of interest with results improving as more of their structural proteomes are determined through the continued efforts of structural biology/genomics. The worldwide increase in multi-drug resistant TB poses a great threat to human health and highlights the need to identify new anti-tubercular agents. We have developed a computational strategy to link the structural proteome of Mycobacterium tuberculosis, the causative agent of tuberculosis, to all structurally characterized approved drugs, and hence construct a proteome-wide drug-target network – the TB-drugome. 
The TB-drugome has the potential to be a valuable resource in the development of safe and efficient anti-tubercular drugs. More generally, the proteome-wide and multi-scale view of target and drug space may facilitate a systematic drug discovery process, which concurrently takes into account the disease mechanism and druggability of targets, the drug-likeness and ADMET properties of chemical compounds, and the genetic dispositions of individuals. Ultimately it may help to reduce the high attrition rate in drug development through a better understanding of drug-receptor interactions on a large scale.
Affiliation(s)
- Sarah L. Kinnings
- Institute of Molecular and Cellular Biology and Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, United Kingdom
- San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, United States of America
- Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
- Kingston H. Fung
- Bioinformatics Program, University of California, San Diego, La Jolla, California, United States of America
- Richard M. Jackson
- Institute of Molecular and Cellular Biology and Astbury Centre for Structural Molecular Biology, University of Leeds, Leeds, United Kingdom
- Lei Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
- * E-mail: (LX); (PEB)
- Philip E. Bourne
- San Diego Supercomputer Center, University of California, San Diego, La Jolla, California, United States of America
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, La Jolla, California, United States of America
- * E-mail: (LX); (PEB)
|
64
|
Gherardini PF, Ausiello G, Helmer-Citterich M. Superpose3D: a local structural comparison program that allows for user-defined structure representations. PLoS One 2010; 5:e11988. [PMID: 20700534 PMCID: PMC2916828 DOI: 10.1371/journal.pone.0011988] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Received: 04/27/2010] [Accepted: 07/08/2010] [Indexed: 11/19/2022] [Open Access]
Abstract
Local structural comparison methods can be used to find structural similarities involving functional protein patches such as enzyme active sites and ligand binding sites. The outcome of such analyses is critically dependent on the representation used to describe the structure. Indeed different categories of functional sites may require the comparison program to focus on different characteristics of the protein residues. We have therefore developed superpose3D, a novel structural comparison software that lets users specify, with a powerful and flexible syntax, the structure description most suited to the requirements of their analysis. Input proteins are processed according to the user's directives and the program identifies sets of residues (or groups of atoms) that have a similar 3D position in the two structures. The advantages of using such a general purpose program are demonstrated with several examples. These test cases show that no single representation is appropriate for every analysis, hence the usefulness of having a flexible program that can be tailored to different needs. Moreover we also discuss how to interpret the results of a database screening where a known structural motif is searched against a large ensemble of structures. The software is written in C++ and is released under the open source GPL license. Superpose3D does not require any external library, runs on Linux, Mac OSX, Windows and is available at http://cbm.bio.uniroma2.it/superpose3D.
Affiliation(s)
- Pier Federico Gherardini
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Gabriele Ausiello
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
- Manuela Helmer-Citterich
- Centre for Molecular Bioinformatics, Department of Biology, University of Rome “Tor Vergata”, Rome, Italy
|
65
|
Ren J, Williams N, Clementi L, Krishnan S, Li WW. Opal web services for biomedical applications. Nucleic Acids Res 2010; 38:W724-31. [PMID: 20529877 PMCID: PMC2896135 DOI: 10.1093/nar/gkq503] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.0] [Indexed: 11/13/2022] [Open Access]
Abstract
Biomedical applications have become increasingly complex, and they often require large-scale high-performance computing resources with a large number of processors and memory. The complexity of application deployment and the advances in cluster, grid and cloud computing require new modes of support for biomedical research. Scientific Software as a Service (sSaaS) enables scalable and transparent access to biomedical applications through simple standards-based Web interfaces. Towards this end, we built a production web server (http://ws.nbcr.net) in August 2007 to support the bioinformatics application called MEME. The server has grown since to include docking analysis with AutoDock and AutoDock Vina, electrostatic calculations using PDB2PQR and APBS, and off-target analysis using SMAP. All the applications on the servers are powered by Opal, a toolkit that allows users to wrap scientific applications easily as web services without any modification to the scientific codes, by writing simple XML configuration files. Opal allows both web forms-based access and programmatic access of all our applications. The Opal toolkit currently supports SOAP-based Web service access to a number of popular applications from the National Biomedical Computation Resource (NBCR) and affiliated collaborative and service projects. In addition, Opal’s programmatic access capability allows our applications to be accessed through many workflow tools, including Vision, Kepler, Nimrod/K and VisTrails. From mid-August 2007 to the end of 2009, we have successfully executed 239 814 jobs. The number of successfully executed jobs more than doubled from 205 to 411 per day between 2008 and 2009. The Opal-enabled service model is useful for a wide range of applications. It provides for interoperation with other applications with Web Service interfaces, and allows application developers to focus on the scientific tool and workflow development. Web server availability: http://ws.nbcr.net.
Affiliation(s)
- Jingyuan Ren
- National Biomedical Computation Resource, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA.
|
66
|
Ren J, Xie L, Li WW, Bourne PE. SMAP-WS: a parallel web service for structural proteome-wide ligand-binding site comparison. Nucleic Acids Res 2010; 38:W441-4. [PMID: 20484373 PMCID: PMC2896174 DOI: 10.1093/nar/gkq400] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/28/2022] [Open Access]
Abstract
The proteome-wide characterization and analysis of protein ligand-binding sites and their interactions with ligands can provide pivotal information in understanding the structure, function and evolution of proteins and for designing safe and efficient therapeutics. The SMAP web service (SMAP-WS) meets this need through parallel computations designed for 3D ligand-binding site comparison and similarity searching on a structural proteome scale. SMAP-WS implements a shape descriptor (the Geometric Potential) that characterizes both local and global topological properties of the protein structure and which can be used to predict the likely ligand-binding pocket [Xie,L. and Bourne,P.E. (2007) A robust and efficient algorithm for the shape description of protein structures and its application in predicting ligand-binding sites. BMC bioinformatics, 8 (Suppl. 4.), S9.]. Subsequently a sequence order independent profile–profile alignment (SOIPPA) algorithm is used to detect and align similar pockets thereby finding protein functional and evolutionary relationships across fold space [Xie, L. and Bourne, P.E. (2008) Detecting evolutionary relationships across existing fold space, using sequence order-independent profile-profile alignments. Proc. Natl Acad. Sci. USA, 105, 5441–5446]. An extreme value distribution model estimates the statistical significance of the match [Xie, L., Xie, L. and Bourne, P.E. (2009) A unified statistical model to support local sequence order independent similarity searching for ligand-binding sites and its application to genome-based drug discovery. Bioinformatics, 25, i305–i312.]. These algorithms have been extensively benchmarked and shown to outperform most existing algorithms. Moreover, several predictions resulting from SMAP-WS have been validated experimentally. Thus far SMAP-WS has been applied to predict drug side effects, and to repurpose existing drugs for new indications. 
SMAP-WS provides both a user-friendly web interface and programming API for scientists to address a wide range of compute intense questions in biology and drug discovery. AVAILABILITY SMAP-WS is available from the URL http://smap.nbcr.net.
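The abstract states that an extreme value distribution model estimates the statistical significance of a binding-site match. As a minimal sketch of that kind of calculation, the function below computes the upper-tail probability of a Gumbel (type I extreme value) distribution. The choice of the Gumbel form and the location and scale parameters `mu` and `beta` are assumptions for illustration; the fitted model and parameter values used by SMAP-WS are not given in this abstract.

```python
from math import exp, expm1

def gumbel_p_value(score, mu, beta):
    """P(S >= score) for a Gumbel distribution with location `mu` and
    scale `beta`: 1 - exp(-exp(-(score - mu) / beta))."""
    z = (score - mu) / beta
    # -expm1(x) equals 1 - exp(x) and stays accurate when the result is tiny.
    return -expm1(-exp(-z))

# Hypothetical alignment scores against a hypothetical background fit.
for s in (0.0, 5.0, 10.0):
    print(s, gumbel_p_value(s, mu=0.0, beta=1.0))
```

For scores far above `mu` the p-value decays approximately as `exp(-(score - mu) / beta)`, so each additional `beta` units of score lowers the p-value by roughly a factor of e.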
Affiliation(s)
- Jingyuan Ren
- San Diego Supercomputer Center, National Biomedical Computation Resource and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
| | - Lei Xie
- San Diego Supercomputer Center, National Biomedical Computation Resource and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- *To whom correspondence should be addressed. Tel: +1 858 822 3686; Fax: +1 858 822 0873;
- Wilfred W. Li
- San Diego Supercomputer Center, National Biomedical Computation Resource and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
- *Correspondence may also be addressed to Wilfred W. Li. Tel: +1 858 534 0591; Fax: +1 858 822 1619;
- Philip E. Bourne
- San Diego Supercomputer Center, National Biomedical Computation Resource and Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA
|
67
|
Hoffmann B, Zaslavskiy M, Vert JP, Stoven V. A new protein binding pocket similarity measure based on comparison of clouds of atoms in 3D: application to ligand prediction. BMC Bioinformatics 2010; 11:99. [PMID: 20175916 PMCID: PMC2838872 DOI: 10.1186/1471-2105-11-99] [Citation(s) in RCA: 66] [Impact Index Per Article: 4.4] [Received: 08/27/2009] [Accepted: 02/22/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Predicting which molecules can bind to a given binding site of a protein with known 3D structure is important for deciphering protein function and is useful in drug design. A classical assumption in structural biology is that proteins with similar 3D structures have related molecular functions, and therefore may bind similar ligands. However, proteins that do not display any overall sequence or structure similarity may also bind similar ligands if they contain similar binding sites. Quantitatively assessing the similarity between binding sites may therefore be useful to propose new ligands for a given pocket, based on those known for similar pockets. RESULTS We propose a new method to quantify the similarity between binding pockets, and explore its relevance for ligand prediction. We represent each pocket by a cloud of atoms, and assess the similarity between two pockets by aligning their atoms in 3D space and comparing the resulting configurations with a convolution kernel. Pocket alignment and comparison are possible even when the corresponding proteins share no sequence or overall structure similarities. In order to predict ligands for a given target pocket, we compare it to an ensemble of pockets with known ligands to identify the most similar pockets. We discuss two criteria to evaluate the performance of a binding pocket similarity measure in the context of ligand prediction, namely, area under the ROC curve (AUC scores) and classification-based scores. We show that the latter is better suited to evaluate the methods with respect to ligand prediction, and demonstrate the relevance of our new binding site similarity compared to existing similarity measures. CONCLUSIONS This study demonstrates the relevance of the proposed method to identify ligands binding to known binding pockets. We also provide a new benchmark for future work in this field. The new method and the benchmark are available at http://cbio.ensmp.fr/paris/.
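The comparison step described in the abstract, scoring two atom clouds with a convolution kernel, can be illustrated with a small sketch. It assumes a Gaussian kernel summed over all atom pairs and clouds that are already superimposed; the alignment step, any atom typing, and the bandwidth `sigma` are not specified in the abstract, so the names and defaults below are hypothetical rather than the authors' method.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def cloud_kernel(a: Sequence[Point], b: Sequence[Point], sigma: float = 1.0) -> float:
    """Sum a Gaussian kernel over every pair of atoms from clouds a and b."""
    total = 0.0
    for p in a:
        for q in b:
            d2 = sum((pi - qi) ** 2 for pi, qi in zip(p, q))
            total += math.exp(-d2 / (2.0 * sigma ** 2))
    return total


def cloud_similarity(a: Sequence[Point], b: Sequence[Point], sigma: float = 1.0) -> float:
    """Normalized kernel value: 1.0 for identical clouds, smaller otherwise."""
    if not a or not b:
        raise ValueError("both clouds must contain at least one atom")
    k_ab = cloud_kernel(a, b, sigma)
    # Dividing by the self-similarities removes the dependence on cloud size.
    return k_ab / math.sqrt(cloud_kernel(a, a, sigma) * cloud_kernel(b, b, sigma))
```

To rank candidate pockets for ligand prediction under this sketch, one would compute `cloud_similarity` between the target pocket and each pocket with a known ligand, then take the ligands of the highest-scoring pockets.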
Affiliation(s)
- Brice Hoffmann
- Mines ParisTech, Centre for Computational Biology, Fontainebleau, France
|
68
|
Ekins S, Bradford J, Dole K, Spektor A, Gregory K, Blondeau D, Hohman M, Bunin BA. A collaborative database and computational models for tuberculosis drug discovery. MOLECULAR BIOSYSTEMS 2010; 6:840-51. [PMID: 20567770 DOI: 10.1039/b917766c] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Indexed: 12/22/2022]
Abstract
The search for molecules with activity against Mycobacterium tuberculosis (Mtb) employs many approaches in parallel, including high-throughput screening and computational methods. We have developed a database (CDD TB) to capture public and private Mtb data while enabling data mining and collaborations with other researchers. We have used the public data along with several cheminformatics approaches to produce models that describe active and inactive compounds. We have compared these datasets to those for known FDA-approved drugs and between Mtb active and inactive compounds. The distribution of polar surface area and pKa of active compounds was found to be a statistically significant determinant of activity against Mtb. Hydrophobicity was not always statistically significant. Bayesian classification models for 220,463 molecules were generated and tested with external molecules, and enabled the discrimination of active or inactive substructures from other datasets in the CDD TB. Computational pharmacophores based on known Mtb drugs were able to map to and retrieve a small subset of some of the Mtb datasets, including a high percentage of Mtb actives. The combination of the database, dataset analysis, Bayesian and pharmacophore models provides new insights into molecular properties and features that are determinants of activity in whole cells. This study provides novel insights into the key 1D molecular descriptors, 2D chemical substructures and 3D pharmacophores which can be used to mine the chemistry space, prioritizing those molecules with a higher probability of activity against Mtb.
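The abstract mentions Bayesian classification models that discriminate active from inactive substructures but does not describe how they were built. The following is a minimal sketch of one common approach, a naive Bayes score with Laplace smoothing over substructure features that are present in a molecule. The feature names, the smoothing constant `alpha`, and both function names are hypothetical, and this is not claimed to be the model used in the study.

```python
import math
from collections import Counter
from typing import Dict, List, Set, Tuple


def train_feature_weights(
    data: List[Tuple[Set[str], bool]], alpha: float = 1.0
) -> Dict[str, float]:
    """Return a smoothed log-likelihood ratio (active vs. inactive) per feature."""
    n_active = sum(1 for _, is_active in data if is_active)
    n_inactive = len(data) - n_active
    in_active: Counter = Counter()
    in_inactive: Counter = Counter()
    for features, is_active in data:
        (in_active if is_active else in_inactive).update(features)
    weights = {}
    for f in set(in_active) | set(in_inactive):
        # Laplace smoothing keeps the ratio finite for features seen in one class only.
        p_active = (in_active[f] + alpha) / (n_active + 2 * alpha)
        p_inactive = (in_inactive[f] + alpha) / (n_inactive + 2 * alpha)
        weights[f] = math.log(p_active / p_inactive)
    return weights


def score_molecule(features: Set[str], weights: Dict[str, float]) -> float:
    """Sum the weights of the features present; unseen features contribute 0."""
    return sum(weights.get(f, 0.0) for f in features)
```

In this sketch a positive score suggests that a molecule's substructures are enriched among the training actives, and the per-feature weights indicate which substructures drive that call.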
Affiliation(s)
- Sean Ekins
- Collaborative Drug Discovery, 1633 Bayshore Highway, Suite 342, Burlingame, CA 94403, USA.
|
69
|
A multidimensional strategy to detect polypharmacological targets in the absence of structural and sequence homology. PLoS Comput Biol 2010; 6:e1000648. [PMID: 20098496 PMCID: PMC2799658 DOI: 10.1371/journal.pcbi.1000648] [Citation(s) in RCA: 57] [Impact Index Per Article: 3.8] [Received: 10/09/2009] [Accepted: 12/16/2009] [Indexed: 01/18/2023]
Abstract
Conventional drug design embraces the “one gene, one drug, one disease” philosophy. Polypharmacology, which focuses on multi-target drugs, has emerged as a new paradigm in drug discovery. The rational design of drugs that act via polypharmacological mechanisms can produce compounds that exhibit increased therapeutic potency and against which resistance is less likely to develop. Additionally, identifying multiple protein targets is critical for side-effect prediction. One third of potential therapeutic compounds fail in clinical trials or are later removed from the market due to unacceptable side effects often caused by off-target binding. In the current work, we introduce a multidimensional strategy for the identification of secondary targets of known small-molecule inhibitors in the absence of global structural and sequence homology with the primary target protein. To demonstrate the utility of the strategy, we identify several targets of 4,5-dihydroxy-3-(1-naphthyldiazenyl)-2,7-naphthalenedisulfonic acid, a known micromolar inhibitor of Trypanosoma brucei RNA editing ligase 1. As it is capable of identifying potential secondary targets, the strategy described here may play a useful role in future efforts to reduce drug side effects and/or to increase polypharmacology.
Proteins play a critical role in human disease; bacteria, viruses, and parasites have unique proteins that can interfere with human health, and dysfunctional human proteins can likewise lead to illness. In order to find cures, scientists often try to identify small molecules (drugs) that can inhibit disease-causing proteins. The goal is to identify a molecule that can fit snugly into the pockets and grooves, or “active sites,” on the protein's surface. Unfortunately, drugs that inhibit a single disease-causing protein are problematic. A single protein can evolve to evade drug action. Additionally, when only one protein is targeted, drug potency is often diminished. Single drugs that simultaneously target multiple disease-causing proteins are much more effective. On the other hand, if scientists are not careful, the drugs they design might inhibit essential human proteins in addition to inhibiting their intended targets, leading to unexpected side effects. In our current work, we have developed a computer-based procedure that can be used to identify proteins with similar active sites. Once unexpected protein targets have been identified, scientists can modify drugs under development in order to increase the simultaneous inhibition of multiple disease-causing proteins while avoiding potential side effects by decreasing the inhibition of useful human proteins.
|
70
|
Drug discovery using chemical systems biology: repositioning the safe medicine Comtan to treat multi-drug and extensively drug resistant tuberculosis. PLoS Comput Biol 2009; 5:e1000423. [PMID: 19578428 PMCID: PMC2699117 DOI: 10.1371/journal.pcbi.1000423] [Citation(s) in RCA: 226] [Impact Index Per Article: 14.1] [Received: 01/14/2009] [Accepted: 05/28/2009] [Indexed: 01/03/2023]
Abstract
The rise of multi-drug resistant (MDR) and extensively drug resistant (XDR) tuberculosis around the world, including in industrialized nations, poses a great threat to human health and defines a need to develop new, effective and inexpensive anti-tubercular agents. Previously we developed a chemical systems biology approach to identify off-targets of major pharmaceuticals on a proteome-wide scale. In this paper we further demonstrate the value of this approach through the discovery that existing commercially available drugs, prescribed for the treatment of Parkinson's disease, have the potential to treat MDR and XDR tuberculosis. These drugs, entacapone and tolcapone, are predicted to bind to the enzyme InhA and directly inhibit substrate binding. The prediction is validated by in vitro and InhA kinetic assays using tablets of Comtan, whose active component is entacapone. The minimal inhibition concentration (MIC99) of entacapone for Mycobacterium tuberculosis (M. tuberculosis) is approximately 260.0 µM, well below the toxicity concentration determined by an in vitro cytotoxicity model using a human neuroblastoma cell line. Moreover, kinetic assays indicate that Comtan inhibits InhA activity by 47.0% at an entacapone concentration of approximately 80 µM. Thus the active component in Comtan represents a promising lead compound for developing a new class of anti-tubercular therapeutics with excellent safety profiles. More generally, the protocol described in this paper can be included in a drug discovery pipeline in an effort to discover novel drug leads with desired safety profiles, and therefore accelerate the development of new drugs.
|
71
|
Drug discovery using chemical systems biology: identification of the protein-ligand binding network to explain the side effects of CETP inhibitors. PLoS Comput Biol 2009; 5:e1000387. [PMID: 19436720 PMCID: PMC2676506 DOI: 10.1371/journal.pcbi.1000387] [Citation(s) in RCA: 188] [Impact Index Per Article: 11.8] [Received: 01/22/2009] [Accepted: 04/13/2009] [Indexed: 01/11/2023]
Abstract
Systematic identification of protein-drug interaction networks is crucial to correlate complex modes of drug action to clinical indications. We introduce a novel computational strategy to identify protein-ligand binding profiles on a genome-wide scale and apply it to elucidating the molecular mechanisms associated with the adverse drug effects of Cholesteryl Ester Transfer Protein (CETP) inhibitors. CETP inhibitors are a new class of preventive therapies for the treatment of cardiovascular disease. However, clinical studies indicated that one CETP inhibitor, Torcetrapib, has deadly off-target effects as a result of hypertension, and hence it has been withdrawn from phase III clinical trials. We have identified a panel of off-targets for Torcetrapib and other CETP inhibitors from the human structural genome and mapped those targets to biological pathways via the literature. The predicted protein-ligand network is consistent with experimental results from multiple sources and reveals that the side-effect of CETP inhibitors is modulated through the combinatorial control of multiple interconnected pathways. Given that combinatorial control is a common phenomenon observed in many biological processes, our findings suggest that adverse drug effects might be minimized by fine-tuning multiple off-target interactions using single or multiple therapies. This work extends the scope of chemogenomics approaches and exemplifies the role that systems biology has in the future of drug discovery.
Both the cost to launch a new drug and the attrition rate during the late stage of the drug discovery and development process are increasing. Torcetrapib is a case in point, having been withdrawn from phase III clinical trials after 15 years of development and an estimated cost of US$800 million. Torcetrapib represents a new class of therapies for the treatment of cardiovascular disease; however, clinical studies indicated that Torcetrapib has deadly side-effects as a result of hypertension. To understand the origins of these adverse drug reactions from Torcetrapib and other related drugs undergoing clinical trials, we introduce a systematic strategy to identify off-targets in the human structural proteome and investigate the roles of these off-targets in impacting human physiology and pathology using biochemical pathway analysis. Our findings suggest that potential side-effects of a new drug can be identified at an early stage of the development cycle and be minimized by fine-tuning multiple off-target interactions. The hope is that this can reduce both the cost of drug development and the mortality rates during clinical trials.
|