1
Reim T, Ehrt C, Graef J, Günther S, Meents A, Rarey M. SiteMine: Large-scale binding site similarity searching in protein structure databases. Arch Pharm (Weinheim) 2024; 357:e2300661. [PMID: 38335311 DOI: 10.1002/ardp.202300661] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/13/2023] [Revised: 01/10/2024] [Accepted: 01/16/2024] [Indexed: 02/12/2024]
Abstract
Drug discovery and design challenges, such as drug repurposing, analyzing protein-ligand and protein-protein complexes, ligand promiscuity studies, or function prediction, can be addressed by protein binding site similarity analysis. Although numerous tools exist, they all have individual strengths and drawbacks with regard to run time, provision of structure superpositions, and applicability to diverse application domains. Here, we introduce SiteMine, an all-in-one database-driven, alignment-providing binding site similarity search tool to tackle the most pressing challenges of binding site comparison. The performance of SiteMine is evaluated on the ProSPECCTs benchmark, showing a promising performance on most of the data sets. The method performs convincingly regarding all quality criteria for reliable binding site comparison, offering a novel state-of-the-art approach for structure-based molecular design based on binding site comparisons. In a SiteMine showcase, we discuss the high structural similarity between cathepsin L and calpain 1 binding sites and give an outlook on the impact of this finding on structure-based drug design. SiteMine is available at https://uhh.de/naomi.
Affiliation(s)
- Thorben Reim
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Christiane Ehrt
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Joel Graef
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
- Sebastian Günther
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- Alke Meents
- Center for Free-Electron Laser Science CFEL, Deutsches Elektronen-Synchrotron DESY, Hamburg, Germany
- Matthias Rarey
- ZBH - Center for Bioinformatics, Universität Hamburg, Hamburg, Germany
2
Pallante L, Cannariato M, Androutsos L, Zizzi EA, Bompotas A, Hada X, Grasso G, Kalogeras A, Mavroudi S, Di Benedetto G, Theofilatos K, Deriu MA. VirtuousPocketome: a computational tool for screening protein-ligand complexes to identify similar binding sites. Sci Rep 2024; 14:6296. [PMID: 38491261 PMCID: PMC10943019 DOI: 10.1038/s41598-024-56893-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/09/2024] [Accepted: 03/12/2024] [Indexed: 03/18/2024] Open
Abstract
Protein residues within binding pockets play a critical role in determining the range of ligands that can interact with a protein, influencing its structure and function. Identifying structural similarities in proteins offers valuable insights into their function and activation mechanisms, aiding in predicting protein-ligand interactions, anticipating off-target effects, and facilitating the development of therapeutic agents. Numerous computational methods assessing global or local similarity in protein cavities have emerged, but their utilization is impeded by complexity, impractical automation for amino acid pattern searches, and an inability to evaluate the dynamics of scrutinized protein-ligand systems. Here, we present a general, automatic and unbiased computational pipeline, named VirtuousPocketome, aimed at screening huge databases of proteins for similar binding pockets starting from a protein-ligand complex of interest. We demonstrate the pipeline's potential by exploring a recently solved human bitter taste receptor, i.e. the TAS2R46, complexed with strychnine. We pinpointed 145 proteins sharing similar binding sites compared to the analysed bitter taste receptor and the enrichment analysis highlighted the related biological processes, molecular functions and cellular components. This work represents the foundation for future studies aimed at understanding the effective role of tastants outside the gustatory system: this could pave the way towards the rationalization of the diet as a supplement to standard pharmacological treatments and the design of novel tastants-inspired compounds to target other proteins involved in specific diseases or disorders. The proposed pipeline is publicly accessible, can be applied to any protein-ligand complex, and could be expanded to screen any database of protein structures.
Affiliation(s)
- Lorenzo Pallante
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Marco Cannariato
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Eric A Zizzi
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Agorakis Bompotas
- Industrial Systems Institute, Athena Research Center, 265 04, Patras, Greece
- Xhesika Hada
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
- Gianvito Grasso
- Dalle Molle Institute for Artificial Intelligence IDSIA USI-SUPSI, 6962, Lugano-Viganello, Switzerland
- Seferina Mavroudi
- Department of Nursing, School of Health Rehabilitation Sciences, University of Patras, 265 04, Patras, Greece
- Marco A Deriu
- Department of Mechanical and Aerospace Engineering, Politecnico di Torino, PolitoBIOMedLab, 10129, Torino, Italy
3
Zhao Z, Bourne PE. How Ligands Interact with the Kinase Hinge. ACS Med Chem Lett 2023; 14:1503-1508. [PMID: 37974950 PMCID: PMC10641887 DOI: 10.1021/acsmedchemlett.3c00212] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 05/17/2023] [Accepted: 10/03/2023] [Indexed: 11/19/2023] Open
Abstract
ATP-competitive kinase inhibitors form hydrogen bond interactions with the kinase hinge region at the adenine binding site. Thus, it is crucial to explore hinge-ligand recognition as part of a rational drug design strategy. Here, harnessing known ligand-bound kinase structures and experimental assay resources, we first created a kinase structure-assay database (KSAD) containing 2705 nM ligand-bound kinase complexes. Then, using KSAD, we systematically investigate hinge-ligand binding patterns using interaction fingerprints, thereby delineating 15 different hydrogen-bond interaction modes. We believe these results will be valuable for de novo drug design and/or scaffold hopping of kinase-targeted drugs.
Affiliation(s)
- Zheng Zhao
- School of Data Science and Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
- Philip E. Bourne
- School of Data Science and Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia 22904, United States
4
Eguida M, Rognan D. Estimating the Similarity between Protein Pockets. Int J Mol Sci 2022; 23:12462. [PMID: 36293316 PMCID: PMC9604425 DOI: 10.3390/ijms232012462] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2022] [Revised: 10/15/2022] [Accepted: 10/16/2022] [Indexed: 10/28/2023] Open
Abstract
With the exponential increase in publicly available protein structures, the comparison of protein binding sites naturally emerged as a scientific topic to explain observations or generate hypotheses for ligand design, notably to predict ligand selectivity for on- and off-targets, explain polypharmacology, and design target-focused libraries. The current review summarizes the state-of-the-art computational methods applied to pocket detection and comparison as well as structural druggability estimates. The major strengths and weaknesses of current pocket descriptors, alignment methods, and similarity search algorithms are presented. Lastly, an exhaustive survey of both retrospective and prospective applications in diverse medicinal chemistry scenarios illustrates the capability of the existing methods and the hurdle that still needs to be overcome for more accurate predictions.
Affiliation(s)
- Didier Rognan
- Laboratoire d’Innovation Thérapeutique, UMR7200 CNRS-Université de Strasbourg, 67400 Illkirch, France
5
Li Y, Xu Y, Yu Y. CRNNTL: Convolutional Recurrent Neural Network and Transfer Learning for QSAR Modeling in Organic Drug and Material Discovery. Molecules 2021; 26:7257. [PMID: 34885843 PMCID: PMC8658888 DOI: 10.3390/molecules26237257] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/08/2021] [Revised: 11/25/2021] [Accepted: 11/26/2021] [Indexed: 11/16/2022] Open
Abstract
Molecular latent representations, derived from autoencoders (AEs), have been widely used for drug or material discovery over the past couple of years. In particular, a variety of machine learning methods based on latent representations have shown excellent performance on quantitative structure–activity relationship (QSAR) modeling. However, their sequence features have not been considered in most cases. In addition, data scarcity is still the main obstacle for deep learning strategies, especially for bioactivity datasets. In this study, we propose the convolutional recurrent neural network and transfer learning (CRNNTL) method inspired by the applications of polyphonic sound detection and electrocardiogram classification. Our model takes advantage of both convolutional and recurrent neural networks for feature extraction, as well as the data augmentation method. According to QSAR modeling on 27 datasets, CRNNTL can outperform or compete with state-of-the-art methods in both drug and material properties. In addition, the performance on one isomers-based dataset indicates that its excellent performance results from the improved ability in global feature extraction while the ability in local feature extraction is maintained. Then, the transfer learning results show that CRNNTL can overcome data scarcity when choosing relevant source datasets. Finally, the high versatility of our model is shown by using different latent representations as inputs from other types of AEs.
Affiliation(s)
- Yaqin Li
- West China Tianfu Hospital, Sichuan University, Chengdu 610041, China
- Correspondence: (Y.L.); (Y.Y.)
- Yongjin Xu
- Department of Chemistry and Molecular Biology, University of Gothenburg, Kemivägen 10, 41296 Gothenburg, Sweden
- Yi Yu
- Department of Chemistry and Molecular Biology, University of Gothenburg, Kemivägen 10, 41296 Gothenburg, Sweden
- Correspondence: (Y.L.); (Y.Y.)
6
Bhadra A, Yeturu K. Site2Vec: a reference frame invariant algorithm for vector embedding of protein–ligand binding sites. Machine Learning: Science and Technology 2021. [DOI: 10.1088/2632-2153/abad88] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022] Open
Abstract
Protein–ligand interactions are one of the fundamental types of molecular interactions in living systems. Ligands are small molecules that interact with protein molecules at specific regions on their surfaces called binding sites. Binding sites also determine ADMET properties of a drug molecule. Tasks such as assessment of protein functional similarity and detection of side effects of drugs need identification of similar binding sites of disparate proteins across diverse pathways. To this end, methods for computing similarities between binding sites are still evolving and remain an active area of research. Machine learning methods for similarity assessment require feature descriptors of binding sites. Traditional methods based on hand-engineered motifs and atomic configurations are not scalable across several thousands of sites. In this regard, deep neural network algorithms are now deployed which can capture very complex input feature space. However, one fundamental challenge in applying deep learning to structures of binding sites is the input representation and the reference frame. We report here a novel algorithm, Site2Vec, that derives reference frame invariant vector embedding of a protein–ligand binding site. The method is based on pairwise distances between representative points and chemical compositions in terms of constituent amino acids of a site. The vector embedding serves as a locality sensitive hash function for proximity queries and determining similar sites. The method has been the top performer with more than 95% quality scores in extensive benchmarking studies carried out over 10 data sets and against 23 other site comparison methods in the field. The algorithm serves for high throughput processing and has been evaluated for stability with respect to reference frame shifts, coordinate perturbations and residue mutations. We also provide the method as a standalone executable and a web service hosted at http://services.iittp.ac.in/bioinfo/home.
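The abstract above describes an embedding built from pairwise distances and amino acid composition. The following is only a minimal sketch of why such features are independent of the reference frame; it is not the Site2Vec algorithm. The function names, the bin width, the number of bins, and the toy coordinates are all assumptions made for illustration.

```python
import math
from collections import Counter
from itertools import combinations

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes, fixed order


def embed_site(points, residues, bin_width=2.0, n_bins=10):
    """Hypothetical embedding: pairwise-distance histogram + residue counts.

    Distances between points do not change under rotation or translation,
    so the histogram part does not depend on the coordinate frame.
    """
    hist = [0] * n_bins
    for p, q in combinations(points, 2):
        d = math.dist(p, q)
        # Distances beyond the last bin edge are clamped into the last bin.
        hist[min(int(d / bin_width), n_bins - 1)] += 1
    counts = Counter(residues)
    composition = [counts.get(aa, 0) for aa in AMINO_ACIDS]
    return hist + composition


def rotate_z_and_shift(points, angle, shift):
    """Rigid motion: rotate about the z axis by `angle`, then translate."""
    c, s = math.cos(angle), math.sin(angle)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in points
    ]


# Toy site: four representative points (in Å) and their residue types.
site_points = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.0, 0.0), (1.0, 1.0, 5.0)]
site_residues = ["H", "D", "S", "G"]

original = embed_site(site_points, site_residues)
moved = embed_site(
    rotate_z_and_shift(site_points, 0.7, (10.0, -5.0, 2.5)), site_residues
)
print(original == moved)
```

The toy coordinates were chosen so that no distance falls near a bin edge; with real data, hard bin edges make a histogram sensitive to small coordinate perturbations, which is one reason an actual method would need a more robust encoding.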
7
Wang C, Kurgan L. Survey of Similarity-Based Prediction of Drug-Protein Interactions. Curr Med Chem 2021; 27:5856-5886. [PMID: 31393241 DOI: 10.2174/0929867326666190808154841] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 11/07/2017] [Revised: 04/16/2018] [Accepted: 10/23/2018] [Indexed: 12/20/2022]
Abstract
Therapeutic activity of a significant majority of drugs is determined by their interactions with proteins. Databases of drug-protein interactions (DPIs) primarily focus on the therapeutic protein targets while the knowledge of the off-targets is fragmented and partial. One way to bridge this knowledge gap is to employ computational methods to predict protein targets for a given drug molecule, or interacting drugs for given protein targets. We survey a comprehensive set of 35 methods that were published in high-impact venues and that predict DPIs based on similarity between drugs and similarity between protein targets. We analyze the internal databases of known DPIs that these methods utilize to compute similarities, and investigate how they are linked to the 12 publicly available source databases. We discuss contents, impact and relationships between these internal and source databases, as well as the timeline of their releases and publications. The 35 predictors exploit and often combine three types of similarities that consider drug structures, drug profiles, and target sequences. We review the predictive architectures of these methods, their impact, and we explain how their internal DPI databases are linked to the source databases. We also include a detailed timeline of the development of these predictors and discuss the underlying limitations of the current resources and predictive tools. Finally, we provide several recommendations concerning the future development of the related databases and methods.
Affiliation(s)
- Chen Wang
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
- Lukasz Kurgan
- Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, United States
8
Artificial intelligence in the early stages of drug discovery. Arch Biochem Biophys 2020; 698:108730. [PMID: 33347838 DOI: 10.1016/j.abb.2020.108730] [Citation(s) in RCA: 21] [Impact Index Per Article: 4.2] [Received: 11/05/2020] [Revised: 12/11/2020] [Accepted: 12/14/2020] [Indexed: 02/07/2023]
Abstract
Although the use of computational methods within the pharmaceutical industry is well established, there is an urgent need for new approaches that can improve and optimize the pipeline of drug discovery and development. In spite of the fact that there is no unique solution for this need for innovation, there has recently been a strong interest in the use of Artificial Intelligence for this purpose. As a matter of fact, not only there have been major contributions from the scientific community in this respect, but there has also been a growing partnership between the pharmaceutical industry and Artificial Intelligence companies. Beyond these contributions and efforts there is an underlying question, which we intend to discuss in this review: can the intrinsic difficulties within the drug discovery process be overcome with the implementation of Artificial Intelligence? While this is an open question, in this work we will focus on the advantages that these algorithms provide over the traditional methods in the context of early drug discovery.
9
Lim H, He D, Qiu Y, Krawczuk P, Sun X, Xie L. Rational discovery of dual-indication multi-target PDE/Kinase inhibitor for precision anti-cancer therapy using structural systems pharmacology. PLoS Comput Biol 2019; 15:e1006619. [PMID: 31206508 PMCID: PMC6576746 DOI: 10.1371/journal.pcbi.1006619] [Citation(s) in RCA: 29] [Impact Index Per Article: 4.8] [Received: 11/02/2018] [Accepted: 04/26/2019] [Indexed: 01/09/2023] Open
Abstract
Many complex diseases such as cancer are associated with multiple pathological manifestations. Moreover, the therapeutics for their treatments often lead to serious side effects. Thus, it is needed to develop multi-indication therapeutics that can simultaneously target multiple clinical indications of interest and mitigate the side effects. However, conventional one-drug-one-gene drug discovery paradigm and emerging polypharmacology approach rarely tackle the challenge of multi-indication drug design. For the first time, we propose a one-drug-multi-target-multi-indication strategy. We develop a novel structural systems pharmacology platform 3D-REMAP that uses ligand binding site comparison and protein-ligand docking to augment sparse chemical genomics data for the machine learning model of genome-scale chemical-protein interaction prediction. Experimentally validated predictions systematically show that 3D-REMAP outperforms state-of-the-art ligand-based, receptor-based, and machine learning methods alone. As a proof-of-concept, we utilize the concept of drug repurposing that is enabled by 3D-REMAP to design dual-indication anti-cancer therapy. The repurposed drug can demonstrate anti-cancer activity for cancers that do not have effective treatment as well as reduce the risk of heart failure that is associated with all types of existing anti-cancer therapies. We predict that levosimendan, a PDE inhibitor for heart failure, inhibits serine/threonine-protein kinase RIOK1 and other kinases. Subsequent experiments and systems biology analyses confirm this prediction, and suggest that levosimendan is active against multiple cancers, notably lymphoma, through the direct inhibition of RIOK1 and RNA processing pathway. We further develop machine learning models to predict cancer cell-line's and a patient's response to levosimendan. 
Our findings suggest that levosimendan can be a promising novel lead compound for the development of safe, effective, and precision multi-indication anti-cancer therapy. This study demonstrates the potential of structural systems pharmacology in designing polypharmacology for precision medicine. It may facilitate transforming the conventional one-drug-one-gene-one-disease drug discovery process and single-indication polypharmacology approach into a new one-drug-multi-target-multi-indication paradigm for complex diseases.
Affiliation(s)
- Hansaim Lim
- Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, New York, United States of America
- Di He
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, New York, United States of America
- Yue Qiu
- Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, New York, United States of America
- Patrycja Krawczuk
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
- Xiaoru Sun
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
- Department of Biostatistics, School of Public Health, Shandong University, Jinan, Shandong, People’s Republic of China
- Lei Xie
- Ph.D. Program in Biochemistry, The Graduate Center, The City University of New York, New York, New York, United States of America
- Ph.D. Program in Computer Science, The Graduate Center, The City University of New York, New York, New York, United States of America
- Ph.D. Program in Biology, The Graduate Center, The City University of New York, New York, New York, United States of America
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
10
Ehrt C, Brinkjost T, Koch O. Binding site characterization - similarity, promiscuity, and druggability. MedChemComm 2019; 10:1145-1159. [PMID: 31391887 DOI: 10.1039/c9md00102f] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 02/19/2019] [Accepted: 05/31/2019] [Indexed: 12/19/2022]
Abstract
The elucidation of non-obvious binding site similarities has provided useful indications for the establishment of polypharmacology, the identification of potential off-targets, or the repurposing of known drugs. The concept underlying all of these approaches is promiscuous binding which can be analyzed from a ligand-based or a binding site-based perspective. Herein, we applied methods for the automated analysis and comparison of protein binding sites to study promiscuous binding on a novel dataset of sites in complex with ligands sharing common shape and physicochemical properties. We show the suitability of this dataset for the benchmarking of novel binding site comparison methods. Our investigations also reveal promising directions for further in-depth analyses of promiscuity and druggability in a pocket-centered manner. Drawbacks concerning binding site similarity assessment and druggability prediction are outlined, enabling researchers to avoid the typical pitfalls of binding site analyses.
Affiliation(s)
- Christiane Ehrt
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
- Tobias Brinkjost
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
- Department of Computer Science, TU Dortmund University, Dortmund, Germany
- Oliver Koch
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
11
Abstract
Introduction: The success of binding site comparisons in drug discovery is based on the recognized fact that many different proteins have similar binding sites. Indeed, binding site comparisons have found many uses in drug development and have the potential to dramatically cut the cost and shorten the time necessary for the development of new drugs. Areas covered: The authors review recent methods for comparing protein binding sites and their use in drug repurposing and polypharmacology. They examine emerging fields including the use of binding site comparisons in precision medicine, the prediction of structured water molecules, the search for targets of natural compounds, and their application in the development of protein-based drugs by loop modeling and for comparison of RNA binding sites. Expert opinion: Binding site comparisons have produced many interesting results in drug development, but relatively little work has been done on protein-protein interaction sites, which are particularly relevant in view of the success of biological drugs. Growth of protein loop modeling for modulating biological drugs is anticipated. The fusion of currently distinct methods for the comparison of RNA and protein binding sites into a single comprehensive approach could allow the search for new selective ribosomal antibiotics and initiate pharmaceutical research into other nucleoproteins.
Affiliation(s)
- Janez Konc
- Theory Department, National Institute of Chemistry, Ljubljana, Slovenia
- Faculty of Pharmacy, University of Ljubljana, Ljubljana, Slovenia
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Koper, Slovenia
- Faculty of Chemistry and Chemical Technology, University of Maribor, Maribor, Slovenia
12
Zhao Z, Xie L, Bourne PE. Structural Insights into Characterizing Binding Sites in Epidermal Growth Factor Receptor Kinase Mutants. J Chem Inf Model 2019; 59:453-462. [PMID: 30582689 DOI: 10.1021/acs.jcim.8b00458] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Indexed: 12/25/2022]
Abstract
Over the last two decades epidermal growth factor receptor (EGFR) kinase has become an important target to treat nonsmall cell lung cancer (NSCLC). Currently, three generations of EGFR kinase-targeted small molecule drugs have been FDA approved. They nominally produce a response at the start of treatment and lead to a substantial survival benefit for patients. However, long-term treatment results in acquired drug resistance and further vulnerability to NSCLC. Therefore, novel EGFR kinase inhibitors that specially overcome acquired mutations are urgently needed. To this end, we carried out a comprehensive study of different EGFR kinase mutants using a structural systems pharmacology strategy. Our analysis shows that both wild-type and mutated structures exhibit multiple conformational states that have not been observed in solved crystal structures. We show that this conformational flexibility accommodates diverse types of ligands with multiple types of binding modes. These results provide insights for designing a new generation of EGFR kinase inhibitor that combats acquired drug-resistant mutations through a multiconformation-based drug design strategy.
Affiliation(s)
- Zheng Zhao
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia 22904, United States of America
- Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States of America
- The Graduate Center, The City University of New York, New York, New York 10016, United States of America
- Philip E Bourne
- Department of Biomedical Engineering, University of Virginia, Charlottesville, Virginia 22904, United States of America
- Data Science Institute, University of Virginia, Charlottesville, Virginia 22904, United States of America
13
Abstract
Drugs modulate disease states through their actions on targets in the body. Determining these targets aids the focused development of new treatments, and helps to better characterize those already employed. One means of accomplishing this is through the deployment of in silico methodologies, harnessing computational analytical and predictive power to produce educated hypotheses for experimental verification. Here, we provide an overview of the current state of the art, describe some of the well-established methods in detail, and reflect on how they, and emerging technologies promoting the incorporation of complex and heterogeneous data-sets, can be employed to improve our understanding of (poly)pharmacology.
Affiliation(s)
- Ryan Byrne
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
- Gisbert Schneider
- Department of Chemistry and Applied Biosciences, Swiss Federal Institute of Technology (ETH), Zurich, Switzerland
14
Ehrt C, Brinkjost T, Koch O. A benchmark driven guide to binding site comparison: An exhaustive evaluation using tailor-made data sets (ProSPECCTs). PLoS Comput Biol 2018; 14:e1006483. [PMID: 30408032 PMCID: PMC6224041 DOI: 10.1371/journal.pcbi.1006483] [Citation(s) in RCA: 42] [Impact Index Per Article: 6.0] [Received: 01/02/2018] [Accepted: 09/02/2018] [Indexed: 11/24/2022] Open
Abstract
The automated comparison of protein-ligand binding sites provides useful insights into yet unexplored site similarities. Various stages of computational and chemical biology research can benefit from this knowledge. The search for putative off-targets and the establishment of polypharmacological effects by comparing binding sites led to promising results for numerous projects. Although many cavity comparison methods are available, a comprehensive analysis to guide the choice of a tool for a specific application is wanting. Moreover, the broad variety of binding site modeling approaches, comparison algorithms, and scoring metrics impedes this choice. Herein, we aim to elucidate strengths and weaknesses of binding site comparison methodologies. A detailed benchmark study is the only possibility to rationalize the selection of appropriate tools for different scenarios. Specific evaluation data sets were developed to shed light on multiple aspects of binding site comparison. An assembly of all applied benchmark sets (ProSPECCTs–Protein Site Pairs for the Evaluation of Cavity Comparison Tools) is made available for the evaluation and optimization of further and still emerging methods. The results indicate the importance of such analyses to facilitate the choice of a methodology that complies with the requirements of a specific scientific challenge. Binding site similarities are useful in the context of promiscuity prediction, drug repurposing, the analysis of protein-ligand and protein-protein complexes, function prediction, and further fields of general interest in chemical biology and biochemistry. Many years of research have led to the development of a multitude of methods for binding site analysis and comparison. On the one hand, their availability supports research. On the other hand, the huge number of methods hampers the efficient selection of a specific tool. Our research is dedicated to the analysis of different cavity comparison tools. 
We use several binding site data sets to establish guidelines which can be applied to ensure a successful application of comparison methods by circumventing potential pitfalls.
Collapse
Affiliation(s)
- Christiane Ehrt
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
| | - Tobias Brinkjost
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
- Department of Computer Science, TU Dortmund University, Dortmund, Germany
| | - Oliver Koch
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Dortmund, Germany
| |
|
15
|
Pinzi L, Caporuscio F, Rastelli G. Selection of protein conformations for structure-based polypharmacology studies. Drug Discov Today 2018; 23:1889-1896. [PMID: 30099123 DOI: 10.1016/j.drudis.2018.08.007] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 05/22/2018] [Revised: 08/03/2018] [Accepted: 08/06/2018] [Indexed: 11/29/2022]
Abstract
Several drugs exert their therapeutic effect through the modulation of multiple targets. Structure-based approaches hold great promise for identifying compounds with the desired polypharmacological profiles. These methods use knowledge of the protein binding sites to identify stereoelectronically complementary ligands. The selection of the most suitable protein conformations to be used in the design process is vital, especially for multitarget drug design in which the same ligand has to be accommodated in multiple binding pockets. Herein, we focus on currently available techniques for the selection of the most suitable protein conformations for multitarget drug design, compare the potential advantages and limitations of each method, and comment on how their combination could help in polypharmacology drug design.
Affiliation(s)
- Luca Pinzi
- Department of Life Sciences, University of Modena and Reggio Emilia, Via Giuseppe Campi 103, 41125, Modena, Italy
| | - Fabiana Caporuscio
- Department of Life Sciences, University of Modena and Reggio Emilia, Via Giuseppe Campi 103, 41125, Modena, Italy
| | - Giulio Rastelli
- Department of Life Sciences, University of Modena and Reggio Emilia, Via Giuseppe Campi 103, 41125, Modena, Italy.
| |
|
16
|
Liu T, Ish‐Shalom S, Torng W, Lafita A, Bock C, Mort M, Cooper DN, Bliven S, Capitani G, Mooney SD, Altman RB. Biological and functional relevance of CASP predictions. Proteins 2018; 86 Suppl 1:374-386. [PMID: 28975675 PMCID: PMC5820171 DOI: 10.1002/prot.25396] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/17/2017] [Revised: 09/12/2017] [Accepted: 10/03/2017] [Indexed: 02/06/2023]
Abstract
Our goal is to answer the question: compared with experimental structures, how useful are predicted models for functional annotation? We assessed the functional utility of predicted models by comparing the performances of a suite of methods for functional characterization on the predictions and the experimental structures. We identified 28 sites in 25 protein targets to perform functional assessment. These 28 sites included nine sites with known ligand binding (holo-sites), nine sites that are expected or suggested by experimental authors for small molecule binding (apo-sites), and ten sites containing important motifs, loops, or key residues with important disease-associated mutations. We evaluated the utility of the predictions by comparing their microenvironments to the experimental structures. Overall structural quality correlates with functional utility. However, the best-ranked predictions (global) may not have the best functional quality (local). Our assessment provides an ability to discriminate between predictions with high structural quality. When assessing ligand-binding sites, most prediction methods have higher performance on apo-sites than holo-sites. Some servers show consistently high performance for certain types of functional sites. Finally, many functional sites are associated with protein-protein interaction. We also analyzed biologically relevant features from the protein assemblies of two targets where the active site spanned the protein-protein interface. For the assembly targets, we find that the features in the models are mainly determined by the choice of template.
Affiliation(s)
- Tianyun Liu
- Department of Bioengineering, Stanford University, Stanford, California
| | - Shirbi Ish‐Shalom
- Biomedical Informatics Training Program, Stanford University, Stanford, California
| | - Wen Torng
- Department of Bioengineering, Stanford University, Stanford, California
| | - Aleix Lafita
- Laboratory of Biomolecular Research, Paul Scherrer Institute, Villigen, Switzerland
- Department of Biosystems Science and Engineering, ETH Zurich, 4058 Basel, Switzerland
| | - Christian Bock
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington
- Heidelberg University, Heidelberg, Germany
| | - Matthew Mort
- Institute of Medical Genetics, Cardiff University, United Kingdom
| | - David N Cooper
- Institute of Medical Genetics, Cardiff University, United Kingdom
| | - Spencer Bliven
- Laboratory of Biomolecular Research, Paul Scherrer Institute, Villigen, Switzerland
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
| | - Guido Capitani
- Laboratory of Biomolecular Research, Paul Scherrer Institute, Villigen, Switzerland
- Department of Biology, ETH Zurich, Zurich, Switzerland
| | - Sean D. Mooney
- Department of Biomedical Informatics and Medical Education, University of Washington, Seattle, Washington
| | - Russ B. Altman
- Department of Bioengineering, Stanford University, Stanford, California
| |
|
17
|
Lee J, Konc J, Janežič D, Brooks BR. Global organization of a binding site network gives insight into evolution and structure-function relationships of proteins. Sci Rep 2017; 7:11652. [PMID: 28912495 PMCID: PMC5599562 DOI: 10.1038/s41598-017-10412-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 04/26/2017] [Accepted: 08/07/2017] [Indexed: 01/06/2023] Open
Abstract
The global organization of protein binding sites is analyzed by constructing a weighted network of binding sites based on their structural similarities and detecting communities of structurally similar binding sites based on the minimum description length principle. The analysis reveals that there are two central binding site communities that play the roles of the network hubs of smaller peripheral communities. The sizes of communities follow a power-law distribution, which indicates that the binding sites included in larger communities may be older and have been evolutionary structural scaffolds of more recent ones. Structurally similar binding sites in the same community bind to diverse ligands promiscuously and they are also embedded in diverse domain structures. Understanding the general principles of binding site interplay will pave the way for improved drug design and protein design.
Affiliation(s)
- Juyong Lee
- Department of Chemistry, Kangwon National University, 1 Kangwondaehak-gil, Chuncheon, 24341, Republic of Korea; Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892, United States
| | - Janez Konc
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, SI-6000, Koper, Slovenia; National Institute of Chemistry, Hajdrihova 19, SI-1000, Ljubljana, Slovenia
| | - Dušanka Janežič
- Faculty of Mathematics, Natural Sciences and Information Technologies, University of Primorska, Glagoljaška 8, SI-6000, Koper, Slovenia
| | - Bernard R Brooks
- Laboratory of Computational Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, 20892, United States
| |
|
18
|
Zhao Z, Xie L, Bourne PE. Insights into the binding mode of MEK type-III inhibitors. A step towards discovering and designing allosteric kinase inhibitors across the human kinome. PLoS One 2017; 12:e0179936. [PMID: 28628649 PMCID: PMC5476283 DOI: 10.1371/journal.pone.0179936] [Citation(s) in RCA: 35] [Impact Index Per Article: 4.4] [Received: 02/21/2017] [Accepted: 06/06/2017] [Indexed: 11/18/2022] Open
Abstract
Protein kinases are critical drug targets for treating a large variety of human diseases. Type-III kinase inhibitors have attracted increasing attention as highly selective therapeutics. Thus, understanding the binding mechanism of existing type-III kinase inhibitors provides useful insights into designing new type-III kinase inhibitors. In this work, we have systematically studied the binding mode of MEK-targeted type-III inhibitors using structural systems pharmacology and molecular dynamics simulation. Our studies provide detailed sequence, structure, interaction-fingerprint, pharmacophore and binding-site information on the binding characteristics of MEK type-III kinase inhibitors. We hypothesize that the helix-folding activation loop is a hallmark allosteric binding site for type-III inhibitors. Subsequently, we screened and predicted allosteric binding sites across the human kinome, suggesting other kinases as potential targets suitable for type-III inhibitors.
Affiliation(s)
- Zheng Zhao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, United States of America
- The Graduate Center, The City University of New York, New York, United States of America
| | - Philip E. Bourne
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Office of the Director, National Institutes of Health, Bethesda, Maryland, United States of America
| |
|
19
|
Zhao Z, Liu Q, Bliven S, Xie L, Bourne PE. Determining Cysteines Available for Covalent Inhibition Across the Human Kinome. J Med Chem 2017; 60:2879-2889. [PMID: 28326775 PMCID: PMC5493210 DOI: 10.1021/acs.jmedchem.6b01815] [Citation(s) in RCA: 90] [Impact Index Per Article: 11.3] [Indexed: 12/31/2022]
Abstract
Covalently bound protein kinase inhibitors have been frequently designed to target noncatalytic cysteines at the ATP binding site. Thus, it is important to know if a given cysteine can form a covalent bond. Here we combine a function-site interaction fingerprint method and DFT calculations to determine the potential of cysteines to form a covalent interaction with an inhibitor. By harnessing the human structural kinome, a comprehensive structure-based binding site cysteine data set was assembled. The orientation of the cysteine thiol group indicates which cysteines can potentially form covalent bonds. These covalent inhibitor easy-available cysteines are located within five regions: P-loop, roof of pocket, front pocket, catalytic-2 of the catalytic loop, and DFG-3 close to the DFG peptide. In an independent test set these cysteines covered 95% of covalent kinase inhibitors. This study provides new insights into cysteine reactivity and preference which is important for the prospective development of covalent kinase inhibitors.
Affiliation(s)
- Zheng Zhao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
| | - Qingsong Liu
- High Magnetic Field Laboratory, Chinese Academy of Sciences, Hefei, Anhui 230031, China
| | - Spencer Bliven
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Laboratory of Biomolecular Research, Paul Scherrer Institute, 5232 Villigen PSI, Switzerland
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, NY 10065, USA
- The Graduate Center, The City University of New York, NY 10016, USA
| | - Philip E. Bourne
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892, USA
- Office of the Director, National Institutes of Health, Bethesda, MD 20892, USA
| |
|
20
|
Molecular mechanisms involved in the side effects of fatty acid amide hydrolase inhibitors: a structural phenomics approach to proteome-wide cellular off-target deconvolution and disease association. NPJ Syst Biol Appl 2016; 2:16023. [PMID: 28725477 PMCID: PMC5516858 DOI: 10.1038/npjsba.2016.23] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 04/26/2016] [Revised: 07/14/2016] [Accepted: 08/02/2016] [Indexed: 01/20/2023] Open
Abstract
Fatty acid amide hydrolase (FAAH) is a promising therapeutic target for the treatment of pain and CNS disorders. However, the development of potent and safe FAAH inhibitors is hindered by their off-target mediated side effect that leads to brain cell death. Its physiological off-targets and their associations with phenotypes may not be characterized using existing experimental and computational techniques as these methods fail to have sufficient proteome coverage and/or ignore native biological assemblies (BAs; i.e., protein quaternary structures). To understand the mechanisms of the side effects from FAAH inhibitors and other drugs, we develop a novel structural phenomics approach to identifying the physiological off-targets binding profile in the cellular context and on a structural proteome scale, and investigate the roles of these off-targets in impacting human physiology and pathology using text mining-based phenomics analysis. Using this integrative approach, we discover that FAAH inhibitors may bind to the dimerization interface of NMDA receptor (NMDAR) and several other BAs, and thus disrupt their cellular functions. Specifically, the malfunction of the NMDAR is associated with a wide spectrum of brain disorders that are directly related to the observed side effects of FAAH inhibitors. This finding is consistent with the existing literature, and provides testable hypotheses for investigating the molecular origin of the side effects of FAAH inhibitors. Thus, the in silico method proposed here, which can for the first time predict proteome-wide drug interactions with cellular BAs and link BA–ligand interaction with clinical outcomes, can be valuable in off-target screening. The development and application of such methods will accelerate the development of more safe and effective therapeutics.
|
21
|
Ehrt C, Brinkjost T, Koch O. Impact of Binding Site Comparisons on Medicinal Chemistry and Rational Molecular Design. J Med Chem 2016; 59:4121-51. [PMID: 27046190 DOI: 10.1021/acs.jmedchem.6b00078] [Citation(s) in RCA: 66] [Impact Index Per Article: 7.3] [Indexed: 12/27/2022]
Abstract
Modern rational drug design not only deals with the search for ligands binding to interesting and promising validated targets but also aims to identify the function and ligands of yet uncharacterized proteins having impact on different diseases. Additionally, it contributes to the design of inhibitors with distinct selectivity patterns and the prediction of possible off-target effects. The identification of similarities between binding sites of various proteins is a useful approach to cope with those challenges. The main scope of this perspective is to describe applications of different protein binding site comparison approaches to outline their applicability and impact on molecular design. The article deals with various substantial application domains and provides some outstanding examples to show how various binding site comparison methods can be applied to promote in silico drug design workflows. In addition, we will also briefly introduce the fundamental principles of different protein binding site comparison methods.
Affiliation(s)
- Christiane Ehrt
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Otto-Hahn-Straße 6, 44227 Dortmund, Germany
| | - Tobias Brinkjost
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Otto-Hahn-Straße 6, 44227 Dortmund, Germany; Department of Computer Science, TU Dortmund University, Otto-Hahn-Straße 14, 44224 Dortmund, Germany
| | - Oliver Koch
- Faculty of Chemistry and Chemical Biology, TU Dortmund University, Otto-Hahn-Straße 6, 44227 Dortmund, Germany
| |
|
22
|
Núñez-Vivanco G, Valdés-Jiménez A, Besoaín F, Reyes-Parada M. Geomfinder: a multi-feature identifier of similar three-dimensional protein patterns: a ligand-independent approach. J Cheminform 2016; 8:19. [PMID: 27092185 PMCID: PMC4834829 DOI: 10.1186/s13321-016-0131-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 10/26/2015] [Accepted: 04/04/2016] [Indexed: 11/15/2022] Open
Abstract
Background: Since the structure of proteins is more conserved than the sequence, the identification of conserved three-dimensional (3D) patterns among a set of proteins can be important for protein function prediction, protein clustering, drug discovery and the establishment of evolutionary relationships. Thus, several computational applications to identify, describe and compare 3D patterns (or motifs) have been developed. Often, these tools consider a 3D pattern as that described by the residues surrounding co-crystallized/docked ligands available from X-ray crystal structures or homology models. Nevertheless, many of the protein structures stored in public databases do not provide information about the location and characteristics of ligand binding sites and/or other important 3D patterns such as allosteric sites, enzyme-cofactor interaction motifs, etc. This necessitates the development of new ligand-independent methods to search and compare 3D patterns in all available protein structures. Results: Here we introduce Geomfinder, an intuitive, flexible, alignment-free and ligand-independent web server for detailed estimation of similarities between all pairs of 3D patterns detected in any two given protein structures. We used around 1100 protein structures to form pairs of proteins which were assessed with Geomfinder. In these analyses each protein was considered in only one pair (e.g. in a subset of 100 different proteins, 50 pairs of proteins can be defined). Thus: (a) Geomfinder detected identical pairs of 3D patterns in a series of monoamine oxidase-B structures, which corresponded to the effectively similar ligand binding sites at these proteins; (b) we identified structural similarities among pairs of protein structures which are targets of compounds such as acarbose, benzamidine, adenosine triphosphate and pyridoxal phosphate; these similar 3D patterns are not detected using sequence-based methods; (c) the detailed evaluation of three specific cases showed the versatility of Geomfinder, which was able to discriminate between similar and different 3D patterns related to binding sites of common substrates in a range of diverse proteins. Conclusions: Geomfinder allows detecting similar 3D patterns between any two protein structures, regardless of the divergence among their amino acid sequences. Although the software is not intended for simultaneous multiple comparisons in a large number of proteins, it can be particularly useful in cases such as the structure-based design of multitarget drugs, where a detailed analysis of 3D pattern similarities between a few selected protein targets is essential.
Affiliation(s)
- Gabriel Núñez-Vivanco
- Escuela de Ingeniería Civil en Bioinformática, Universidad de Talca, Avenida Lircay s/n, Talca, Chile; Centro de Bioinformática y Simulación Molecular, Universidad de Talca, 2 Norte 685, Talca, Chile
| | - Alejandro Valdés-Jiménez
- Escuela de Ingeniería Civil en Bioinformática, Universidad de Talca, Avenida Lircay s/n, Talca, Chile
| | - Felipe Besoaín
- Escuela de Ingeniería Civil en Bioinformática, Universidad de Talca, Avenida Lircay s/n, Talca, Chile; Estudis d'Informática, Multimedia i Telecomunicacio, Universitat Oberta de Catalunya, Rambla del Poblenou 15, Barcelona, Spain; Internet Interdisciplinary Institute (IN3), Universitat Oberta de Catalunya, Av. Carl Friedrich Gauss, 5, Castelldefels, Barcelona, Spain
| | - Miguel Reyes-Parada
- School of Medicine, Faculty of Medical Sciences, Universidad de Santiago de Chile, Avenida Libertador Bernardo O'Higgins 3363, Santiago, Chile; Facultad de Ciencias de la Salud, Universidad Autonóma de Chile, 5 Poniente 1670, Talca, Chile
| |
|
23
|
Zhao Z, Xie L, Xie L, Bourne PE. Delineation of Polypharmacology across the Human Structural Kinome Using a Functional Site Interaction Fingerprint Approach. J Med Chem 2016; 59:4326-41. [PMID: 26929980 DOI: 10.1021/acs.jmedchem.5b02041] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 01/26/2023]
Abstract
Targeted polypharmacology of kinases has emerged as a promising strategy to design efficient and safe therapeutics. Here, we perform a systematic study of kinase-ligand binding modes for the human structural kinome at scale (208 kinases, 1777 unique ligands, and their complexes) by integrating chemical genomics and structural genomics data and by introducing a functional site interaction fingerprint (Fs-IFP) method. New insights into kinase-ligand binding modes were obtained. We establish relationships between the features of binding modes, the ligands, and the binding pockets, respectively. We also derive the intrinsic binding specificity and its correlation with amino acid conservation. Third, we explore the landscape of the binding modes and highlight the regions of "selectivity pocket" and "selectivity entrance". Finally, we demonstrate that Fs-IFP similarity is directly correlated to the experimentally determined profile. These improve our understanding of kinase-ligand interactions and contribute to the design of novel polypharmacological therapies targeting kinases.
Affiliation(s)
- Zheng Zhao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, United States
| | - Li Xie
- Scripps Ranch, San Diego, California 92131, United States
| | - Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, New York 10065, United States; The Graduate Center, The City University of New York, New York, New York 10016, United States
| | - Philip E Bourne
- Office of the Director, National Institutes of Health, Bethesda, Maryland 20892, United States
| |
|
24
|
Zhao Z, Martin C, Fan R, Bourne PE, Xie L. Drug repurposing to target Ebola virus replication and virulence using structural systems pharmacology. BMC Bioinformatics 2016; 17:90. [PMID: 26887654 PMCID: PMC4757998 DOI: 10.1186/s12859-016-0941-9] [Citation(s) in RCA: 37] [Impact Index Per Article: 4.1] [Received: 02/22/2015] [Accepted: 02/10/2016] [Indexed: 01/09/2023] Open
Abstract
Background: The recent outbreak of Ebola has been cited as the largest in history. Despite this global health crisis, few drugs are available to efficiently treat Ebola infections. Drug repurposing provides a potentially efficient solution to accelerating the development of therapeutic approaches in response to the Ebola outbreak. To identify such candidates, we use an integrated structural systems pharmacology pipeline which combines proteome-scale ligand binding site comparison, protein-ligand docking, and Molecular Dynamics (MD) simulation. Results: One thousand seven hundred and sixty-six FDA-approved drugs and 259 experimental drugs were screened to identify those with the potential to inhibit the replication and virulence of Ebola, and to determine the binding modes with their respective targets. Initial screening has identified a number of promising hits. Notably, Indinavir, an HIV protease inhibitor, may be effective in reducing the virulence of Ebola. Additionally, an antifungal (Sinefungin) and several anti-viral drugs (e.g. Maraviroc, Abacavir, Telbivudine, and Cidofovir) may inhibit Ebola RNA-directed RNA polymerase through targeting the MTase domain. Conclusions: Identification of safe drug candidates is a crucial first step toward the determination of timely and effective therapeutic approaches to address and mitigate the impact of the Ebola global crisis and future outbreaks of pathogenic diseases. Further in vitro and in vivo testing to evaluate the anti-Ebola activity of these drugs is warranted.
Affiliation(s)
- Zheng Zhao
- High Magnetic Field Laboratory, Chinese Academy of Sciences, Hefei, P. R. China; National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
| | - Che Martin
- The Graduate Center, The City University of New York, New York, USA
| | - Raymond Fan
- Department of Chemistry, Hunter College, The City University of New York, New York, USA
| | - Philip E Bourne
- Office of the Director, National Institutes of Health, Bethesda, MD, USA
| | - Lei Xie
- The Graduate Center, The City University of New York, New York, USA; Department of Computer Science, Hunter College, The City University of New York, New York, USA
| |
|
25
|
Wang C, Hu G, Wang K, Brylinski M, Xie L, Kurgan L. PDID: database of molecular-level putative protein-drug interactions in the structural human proteome. Bioinformatics 2016; 32:579-86. [PMID: 26504143 PMCID: PMC5963357 DOI: 10.1093/bioinformatics/btv597] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 06/25/2015] [Revised: 09/24/2015] [Accepted: 10/12/2015] [Indexed: 12/31/2022] Open
Abstract
MOTIVATION: Many drugs interact with numerous proteins besides their intended therapeutic targets and a substantial portion of these interactions is yet to be elucidated. Protein-Drug Interaction Database (PDID) addresses incompleteness of these data by providing access to putative protein-drug interactions that cover the entire structural human proteome. RESULTS: PDID covers 9652 structures from 3746 proteins and houses 16 800 putative interactions generated from close to 1.1 million accurate, all-atom structure-based predictions for several dozens of popular drugs. The predictions were generated with three modern methods: ILbind, SMAP and eFindSite. They are accompanied by propensity scores that quantify likelihood of interactions and coordinates of the putative location of the binding drugs in the corresponding protein structures. PDID complements the current databases that focus on the curated interactions and the BioDrugScreen database that relies on docking to find putative interactions. Moreover, we also include experimentally curated interactions which are linked to their sources: DrugBank, BindingDB and Protein Data Bank. Our database can be used to facilitate studies related to polypharmacology of drugs including repurposing and explaining side effects of drugs. AVAILABILITY AND IMPLEMENTATION: PDID database is freely available at http://biomine.ece.ualberta.ca/PDID/.
Affiliation(s)
- Chen Wang
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 2V4
| | - Gang Hu
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, People's Republic of China
| | - Kui Wang
- School of Mathematical Sciences and LPMC, Nankai University, Tianjin 300071, People's Republic of China
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA 70803, USA
| | - Lei Xie
- Department of Computer Science, Hunter College, City University of New York (CUNY), New York, NY 10065, USA
| | - Lukasz Kurgan
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada T6G 2V4; Department of Computer Science, Virginia Commonwealth University, Richmond, VA 23284, USA
| |
|
26
|
Liu T, Altman RB. Relating Essential Proteins to Drug Side-Effects Using Canonical Component Analysis: A Structure-Based Approach. J Chem Inf Model 2015; 55:1483-94. [PMID: 26121262 DOI: 10.1021/acs.jcim.5b00030] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Indexed: 12/20/2022]
Abstract
The molecular mechanism of many drug side-effects is unknown and difficult to predict. Previous methods for explaining side-effects have focused on known drug targets and their pathways. However, low affinity binding to proteins that are not usually considered drug targets may also drive side-effects. In order to assess these alternative targets, we used the 3D structures of 563 essential human proteins systematically to predict binding to 216 drugs. We first benchmarked our affinity predictions with available experimental data. We then combined singular value decomposition and canonical component analysis (SVD-CCA) to predict side-effects based on these novel target profiles. Our method predicts side-effects with good accuracy (average AUC: 0.82 for side effects present in <50% of drug labels). We also noted that side-effect frequency is the most important feature for prediction and can confound efforts at elucidating mechanism; our method allows us to remove the contribution of frequency and isolate novel biological signals. In particular, our analysis produces 2768 triplet associations between 50 essential proteins, 99 drugs, and 77 side-effects. Although experimental validation is difficult because many of our essential proteins do not have validated assays, we nevertheless attempted to validate a subset of these associations using experimental assay data. Our focus on essential proteins allows us to find potential associations that would likely be missed if we used recognized drug targets. Our associations provide novel insights about the molecular mechanisms of drug side-effects and highlight the need for expanded experimental efforts to investigate drug binding to proteins more broadly.
Affiliation(s)
- Tianyun Liu
- Department of Genetics, Stanford University, Stanford, California 94305, United States
| | - Russ B Altman
- Department of Genetics and Department of Bioengineering, Stanford University, Stanford, California 94305, United States
| |
|
27
|
Arooj M, Sakkiah S, Cao GP, Kim S, Arulalapperumal V, Lee KW. Finding off-targets, biological pathways, and target diseases for chymase inhibitors via structure-based systems biology approach. Proteins 2015; 83:1209-24. [PMID: 25143259 DOI: 10.1002/prot.24677] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 06/25/2014] [Revised: 08/08/2014] [Accepted: 08/14/2014] [Indexed: 02/03/2023]
Affiliation(s)
- Mahreen Arooj
- School of Biomedical Sciences, Faculty of Health Sciences, Curtin Health Innovation Research Institute (CHIRI); Curtin University Australia
| | - Sugunadevi Sakkiah
- Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science(RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
| | - Guang Ping Cao
- Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science(RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
| | - Songmi Kim
- Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science(RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
| | - Venkatesh Arulalapperumal
- Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science(RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
| | - Keun Woo Lee
- Division of Applied Life Science (BK21 Program); Systems and Synthetic Agrobiotech Center (SSAC), Plant Molecular Biology and Biotechnology Research Center (PMBBRC), Research Institute of Natural Science(RINS), Gyeongsang National University (GNU); 501 Jinju-daero Gazha-dong Jinju 660-701 Republic of Korea
|
28
|
Zhang Y, Zhao Z, Liu H. Deriving Chemically Essential Interactions Based on Active Site Alignments and Quantum Chemical Calculations: A Case Study on Glycoside Hydrolases. ACS Catal 2015. [DOI: 10.1021/cs501709d] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Affiliation(s)
- Yinliang Zhang
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
| | - Zheng Zhao
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031, China
| | - Haiyan Liu
- School of Life Sciences, University of Science and Technology of China, 443 Huangshan Road, Hefei, Anhui 230027, China
- Hefei National Laboratory for Physical Sciences at the Microscales, Hefei, Anhui 230027, China
- Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031, China
|
29
|
Currin A, Swainston N, Day PJ, Kell DB. Synthetic biology for the directed evolution of protein biocatalysts: navigating sequence space intelligently. Chem Soc Rev 2015; 44:1172-239. [PMID: 25503938 PMCID: PMC4349129 DOI: 10.1039/c4cs00351a] [Citation(s) in RCA: 258] [Impact Index Per Article: 25.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2014] [Indexed: 12/21/2022]
Abstract
The amino acid sequence of a protein affects both its structure and its function. Thus, the ability to modify the sequence, and hence the structure and activity, of individual proteins in a systematic way, opens up many opportunities, both scientifically and (as we focus on here) for exploitation in biocatalysis. Modern methods of synthetic biology, whereby increasingly large sequences of DNA can be synthesised de novo, allow an unprecedented ability to engineer proteins with novel functions. However, the number of possible proteins is far too large to test individually, so we need means for navigating the 'search space' of possible protein sequences efficiently and reliably in order to find desirable activities and other properties. Enzymologists distinguish binding (Kd) and catalytic (kcat) steps. In a similar way, judicious strategies have blended design (for binding, specificity and active site modelling) with the more empirical methods of classical directed evolution (DE) for improving kcat (where natural evolution rarely seeks the highest values), especially with regard to residues distant from the active site and where the functional linkages underpinning enzyme dynamics are both unknown and hard to predict. Epistasis (where the 'best' amino acid at one site depends on that or those at others) is a notable feature of directed evolution. The aim of this review is to highlight some of the approaches that are being developed to allow us to use directed evolution to improve enzyme properties, often dramatically. We note that directed evolution differs in a number of ways from natural evolution, including in particular the available mechanisms and the likely selection pressures. Thus, we stress the opportunities afforded by techniques that enable one to map sequence to (structure and) activity in silico, as an effective means of modelling and exploring protein landscapes. 
Because known landscapes may be assessed and reasoned about as a whole, simultaneously, this offers opportunities for protein improvement not readily available to natural evolution on rapid timescales. Intelligent landscape navigation, informed by sequence-activity relationships and coupled to the emerging methods of synthetic biology, offers scope for the development of novel biocatalysts that are both highly active and robust.
Affiliation(s)
- Andrew Currin
- Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
| | - Neil Swainston
- Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
- School of Computer Science, The University of Manchester, Manchester M13 9PL, UK
| | - Philip J. Day
- Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
- Faculty of Medical and Human Sciences, The University of Manchester, Manchester M13 9PT, UK
| | - Douglas B. Kell
- Manchester Institute of Biotechnology, The University of Manchester, 131, Princess St, Manchester M1 7DN, UK; http://dbkgroup.org/; @dbkell; Tel: +44 (0)161 306 4492
- School of Chemistry, The University of Manchester, Manchester M13 9PL, UK
- Centre for Synthetic Biology of Fine and Speciality Chemicals (SYNBIOCHEM), The University of Manchester, 131, Princess St, Manchester M1 7DN, UK
|
30
|
Kaiser F, Eisold A, Labudde D. A Novel Algorithm for Enhanced Structural Motif Matching in Proteins. J Comput Biol 2015; 22:698-713. [PMID: 25695840 DOI: 10.1089/cmb.2014.0263] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022] Open
Abstract
As widely discussed in literature, spatial patterns of amino acids, so-called structural motifs, play an important role in protein function. The functionally responsible part of proteins often lies in an evolutionarily highly conserved spatial arrangement of only a few amino acids, which are held in place tightly by the rest of the structure. Those recurring amino acid arrangements can be seen as patterns in the three-dimensional space and are known as structural motifs. In general, these motifs can mediate various functional interactions, such as DNA/RNA targeting and binding, ligand interactions, substrate catalysis, and stabilization of the protein structure. Hence, characterizing and identifying such conserved structural motifs can contribute to the understanding of structure-function relationships. Therefore, and because of the rapidly increasing number of solved protein structures, it is highly desirable to identify, understand, and moreover to search for structurally scattered amino acid motifs. This work aims at the development and the implementation of a novel and robust matching algorithm to detect structural motifs in large sets of target structures. The proposed methods were combined and implemented to a feature-rich and easy-to-use command line software tool written in Java.
Affiliation(s)
- Florian Kaiser
- Department of Bioinformatics, University of Applied Sciences Mittweida, Mittweida, Germany
| | - Alexander Eisold
- Department of Bioinformatics, University of Applied Sciences Mittweida, Mittweida, Germany
| | - Dirk Labudde
- Department of Bioinformatics, University of Applied Sciences Mittweida, Mittweida, Germany
|
31
|
Brylinski M. eMatchSite: sequence order-independent structure alignments of ligand binding pockets in protein models. PLoS Comput Biol 2014; 10:e1003829. [PMID: 25232727 PMCID: PMC4168975 DOI: 10.1371/journal.pcbi.1003829] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/14/2014] [Accepted: 07/24/2014] [Indexed: 01/26/2023] Open
Abstract
Detecting similarities between ligand binding sites in the absence of global homology between target proteins has been recognized as one of the critical components of modern drug discovery. Local binding site alignments can be constructed using sequence order-independent techniques, however, to achieve a high accuracy, many current algorithms for binding site comparison require high-quality experimental protein structures, preferably in the bound conformational state. This, in turn, complicates proteome scale applications, where only various quality structure models are available for the majority of gene products. To improve the state-of-the-art, we developed eMatchSite, a new method for constructing sequence order-independent alignments of ligand binding sites in protein models. Large-scale benchmarking calculations using adenine-binding pockets in crystal structures demonstrate that eMatchSite generates accurate alignments for almost three times more protein pairs than SOIPPA. More importantly, eMatchSite offers a high tolerance to structural distortions in ligand binding regions in protein models. For example, the percentage of correctly aligned pairs of adenine-binding sites in weakly homologous protein models is only 4–9% lower than those aligned using crystal structures. This represents a significant improvement over other algorithms, e.g. the performance of eMatchSite in recognizing similar binding sites is 6% and 13% higher than that of SiteEngine using high- and moderate-quality protein models, respectively. Constructing biologically correct alignments using predicted ligand binding sites in protein models opens up the possibility to investigate drug-protein interaction networks for complete proteomes with prospective systems-level applications in polypharmacology and rational drug repositioning. eMatchSite is freely available to the academic community as a web-server and a stand-alone software distribution at http://www.brylinski.org/ematchsite.
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, Louisiana, United States of America
- Center for Computation & Technology, Louisiana State University, Baton Rouge, Louisiana, United States of America
|
32
|
Micale G, Pulvirenti A, Giugno R, Ferro A. Proteins comparison through probabilistic optimal structure local alignment. Front Genet 2014; 5:302. [PMID: 25228906 PMCID: PMC4151033 DOI: 10.3389/fgene.2014.00302] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/31/2014] [Accepted: 08/12/2014] [Indexed: 11/13/2022] Open
Abstract
Multiple local structure comparison helps to identify common structural motifs or conserved binding sites in 3D structures in distantly related proteins. Since there is no best way to compare structures and evaluate the alignment, a wide variety of techniques and different similarity scoring schemes have been proposed. Existing algorithms usually compute the best superposition of two structures or attempt to solve it as an optimization problem in a simpler setting (e.g., considering contact maps or distance matrices). Here, we present PROPOSAL (PROteins comparison through Probabilistic Optimal Structure local ALignment), a stochastic algorithm based on iterative sampling for multiple local alignment of protein structures. Our method can efficiently find conserved motifs across a set of protein structures. Only the distances between all pairs of residues in the structures are computed. To show the accuracy and the effectiveness of PROPOSAL we tested it on a few families of protein structures. We also compared PROPOSAL with two state-of-the-art tools for pairwise local alignment on a dataset of manually annotated motifs. PROPOSAL is available as a Java 2D standalone application or a command line program at http://ferrolab.dmi.unict.it/proposal/proposal.html.
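The abstract notes that only inter-residue distances are computed. One generic way to compare two candidate motifs from distances alone, without superposing them, is to compare their internal distance matrices. The sketch below is not the PROPOSAL algorithm (which uses iterative stochastic sampling over multiple structures); it is a minimal illustration under the assumption that residues are represented by single 3D points (for example C-alpha atoms) and that a one-to-one residue correspondence is already given. The names `distance_matrix` and `distance_rmsd` and all coordinates are hypothetical.

```python
import math


def distance_matrix(coords):
    """All pairwise Euclidean distances between residue positions."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def distance_rmsd(coords_a, coords_b):
    """Root-mean-square difference between two internal distance matrices.

    Assumes coords_a[i] corresponds to coords_b[i]. The value is invariant
    to rotation and translation, so no superposition is required.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("motifs must contain the same number of residues")
    n = len(coords_a)
    if n < 2:
        raise ValueError("need at least two residues")
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    squared = [(da[i][j] - db[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical three-residue motif, a rotated and translated copy of it,
# and a distorted variant.
motif_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]
motif_b = [(1.0, 1.0, 1.0), (1.0, 4.0, 1.0), (-3.0, 1.0, 1.0)]
motif_c = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 5.0, 0.0)]

same = distance_rmsd(motif_a, motif_b)       # rigid-body copy
different = distance_rmsd(motif_a, motif_c)  # one residue displaced
```

A known limitation of purely distance-based comparison is that it cannot distinguish a motif from its mirror image, since reflection preserves all pairwise distances.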
Affiliation(s)
- Giovanni Micale
- Department of Computer Science, University of Pisa, Pisa, Italy
| | - Alfredo Pulvirenti
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
| | - Rosalba Giugno
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
| | - Alfredo Ferro
- Department of Clinical and Molecular Biomedicine, University of Catania, Catania, Italy
|
33
|
Chen BY. VASP-E: specificity annotation with a volumetric analysis of electrostatic isopotentials. PLoS Comput Biol 2014; 10:e1003792. [PMID: 25166865 PMCID: PMC4148194 DOI: 10.1371/journal.pcbi.1003792] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/21/2014] [Accepted: 06/17/2014] [Indexed: 12/01/2022] Open
Abstract
Algorithms for comparing protein structure are frequently used for function annotation. By searching for subtle similarities among very different proteins, these algorithms can identify remote homologs with similar biological functions. In contrast, few comparison algorithms focus on specificity annotation, where the identification of subtle differences among very similar proteins can assist in finding small structural variations that create differences in binding specificity. Few specificity annotation methods consider electrostatic fields, which play a critical role in molecular recognition. To fill this gap, this paper describes VASP-E (Volumetric Analysis of Surface Properties with Electrostatics), a novel volumetric comparison tool based on the electrostatic comparison of protein-ligand and protein-protein binding sites. VASP-E exploits the central observation that three dimensional solids can be used to fully represent and compare both electrostatic isopotentials and molecular surfaces. With this integrated representation, VASP-E is able to dissect the electrostatic environments of protein-ligand and protein-protein binding interfaces, identifying individual amino acids that have an electrostatic influence on binding specificity. VASP-E was used to examine a nonredundant subset of the serine and cysteine proteases as well as the barnase-barstar and Rap1a-raf complexes. Based on amino acids established by various experimental studies to have an electrostatic influence on binding specificity, VASP-E identified electrostatically influential amino acids with 100% precision and 83.3% recall. We also show that VASP-E can accurately classify closely related ligand binding cavities into groups with different binding preferences. These results suggest that VASP-E should prove a useful tool for the characterization of specific binding and the engineering of binding preferences in proteins. 
Proteins, the ubiquitous worker molecules of the cell, are a diverse class of molecules that perform very specific tasks. Understanding how proteins achieve specificity is a critical step towards understanding biological systems and a key prerequisite for rationally engineering new proteins. To examine electrostatic influences on specificity in proteins, this paper presents VASP-E, a software tool that generates solid representations of the electrostatic potential fields that surround proteins. VASP-E compares solids with constructive solid geometry, a class of techniques developed first for modeling complex machine parts. We observed that solid representations could quantify the degree of charge complementarity in protein-protein interactions and identify key residues that strengthen or weaken them. VASP-E correctly identified amino acids with established experimental influences on protein-protein binding specificity. We also observed that solid representations of electrostatic fields could identify electrostatic conservations and variations that relate to similarities and differences in binding specificity between proteins and small molecules.
Affiliation(s)
- Brian Y. Chen
- Department of Computer Science and Engineering, P.C. Rossin College of Engineering and Applied Sciences, Lehigh University, Bethlehem, Pennsylvania, United States of America
|
34
|
Niu M, Hu J, Wu S, Xiaoe Z, Xu H, Zhang Y, Zhang J, Yang Y. Structural bioinformatics-based identification of EGFR inhibitor gefitinib as a putative lead compound for BACE. Chem Biol Drug Des 2014; 83:81-8. [PMID: 24516878 DOI: 10.1111/cbdd.12200] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/13/2023]
Abstract
β-secretase (BACE-1) is a potential target for the treatment of Alzheimer's disease (AD). Despite this potential, only a few compounds targeting BACE have entered clinical trials. Herein, we describe the identification of Gefitinib as a potential lead compound for BACE through an integrated approach of structural bioinformatics analysis, experimental assessment and computational analysis. In particular, we performed ELISA and western analysis to assess the effect of Gefitinib using N2a human APP695 cells. In addition, we investigated the binding mechanism of Gefitinib with BACE through molecular docking coupled with molecular dynamics simulations. The computational analyses revealed that hydrophobic contact is a major contributing factor to the binding of Gefitinib with BACE. The results of the study identify Gefitinib as a putative lead compound for BACE. Further optimization studies are warranted to improve its potency and pharmacological properties against BACE for potential AD treatment.
|
35
|
Chen FC, Liao YC, Huang JM, Lin CH, Chen YY, Dou HY, Hsiung CA. Pros and cons of the tuberculosis drugome approach--an empirical analysis. PLoS One 2014; 9:e100829. [PMID: 24971632 PMCID: PMC4074101 DOI: 10.1371/journal.pone.0100829] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/02/2014] [Accepted: 05/27/2014] [Indexed: 01/20/2023] Open
Abstract
Drug-resistant Mycobacterium tuberculosis (MTB), the causative pathogen of tuberculosis (TB), has become a serious threat to global public health. Yet the development of novel drugs against MTB has been lagging. One potentially powerful approach to drug development is computation-aided repositioning of current drugs. However, the effectiveness of this approach has rarely been examined. Here we select the "TB drugome" approach--a protein structure-based method for drug repositioning for tuberculosis treatment--to (1) experimentally validate the efficacy of the identified drug candidates for inhibiting MTB growth, and (2) computationally examine how consistently drug candidates are prioritized, considering changes in input data. Twenty three drugs in the TB drugome were tested. Of them, only two drugs (tamoxifen and 4-hydroxytamoxifen) effectively suppressed MTB growth at relatively high concentrations. Both drugs significantly enhanced the inhibitory effects of three first-line anti-TB drugs (rifampin, isoniazid, and ethambutol). However, tamoxifen is not a top-listed drug in the TB drugome, and 4-hydroxytamoxifen is not approved for use in humans. Computational re-examination of the TB drugome indicated that the rankings were subject to technical and data-related biases. Thus, although our results support the effectiveness of the TB drugome approach for identifying drugs that can potentially be repositioned for stand-alone applications or for combination treatments for TB, the approach requires further refinements via incorporation of additional biological information. Our findings can also be extended to other structure-based drug repositioning methods.
Affiliation(s)
- Feng-Chi Chen
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Department of Life Sciences, National Chiao-Tung University, Hsinchu, Taiwan
- Department of Dentistry, China Medical University, Taichung, Taiwan
| | - Yu-Chieh Liao
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
| | - Jie-Mao Huang
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
| | - Chieh-Hua Lin
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
- Institute of Bioinformatics and Structural Biology, National Tsing Hua University, Hsinchu, Taiwan
| | - Yih-Yuan Chen
- National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
| | - Horng-Yunn Dou
- National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
| | - Chao Agnes Hsiung
- Division of Biostatistics and Bioinformatics, Institute of Population Health Sciences, National Health Research Institutes, Zhunan, Miaoli County, Taiwan
|
36
|
Salentin S, Haupt VJ, Daminelli S, Schroeder M. Polypharmacology rescored: protein-ligand interaction profiles for remote binding site similarity assessment. Prog Biophys Mol Biol 2014; 116:174-86. [PMID: 24923864 DOI: 10.1016/j.pbiomolbio.2014.05.006] [Citation(s) in RCA: 87] [Impact Index Per Article: 7.9] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/06/2014] [Revised: 05/20/2014] [Accepted: 05/26/2014] [Indexed: 11/27/2022]
Abstract
Detection of remote binding site similarity in proteins plays an important role for drug repositioning and off-target effect prediction. Various non-covalent interactions such as hydrogen bonds and van-der-Waals forces drive ligands' molecular recognition by binding sites in proteins. The increasing amount of available structures of protein-small molecule complexes enabled the development of comparative approaches. Several methods have been developed to characterize and compare protein-ligand interaction patterns. Usually implemented as fingerprints, these are mainly used for post processing docking scores and (off-)target prediction. In the latter application, interaction profiles detect similarities in the bound interactions of different ligands and thus identify essential interactions between a protein and its small molecule ligands. Interaction pattern similarity correlates with binding site similarity and is thus contributing to a higher precision in binding site similarity assessment of proteins with distinct global structure. This renders it valuable for existing drug repositioning approaches in structural bioinformatics. Current methods to characterize and compare structure-based interaction patterns - both for protein-small-molecule and protein-protein interactions - as well as their potential in target prediction will be reviewed in this article. The question of how the set of interaction types, flexibility or water-mediated interactions, influence the comparison of interaction patterns will be discussed. Due to the wealth of protein-ligand structures available today, predicted targets can be ranked by comparing their ligand interaction pattern to patterns of the known target. Such knowledge-based methods offer high precision in comparison to methods comparing whole binding sites based on shape and amino acid physicochemical similarity.
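Interaction patterns implemented as fingerprints are commonly compared with a set-overlap measure. The sketch below uses the Tanimoto (Jaccard) coefficient on two toy fingerprints. The feature encoding, pairs of interaction type and ligand-side role, is a hypothetical choice made for illustration and is not the encoding of any specific method covered by the review.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto (Jaccard) similarity between two interaction fingerprints.

    Each fingerprint is a collection of hashable interaction features.
    Returns a value in [0, 1]; two empty fingerprints score 0.0 by convention here.
    """
    a, b = set(fp_a), set(fp_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical interaction features observed in two binding sites.
site_1 = {("hbond", "acceptor"), ("hydrophobic", "aromatic"), ("saltbridge", "cation")}
site_2 = {("hbond", "acceptor"), ("hydrophobic", "aromatic"), ("hbond", "donor")}

similarity = tanimoto(site_1, site_2)  # 2 shared features out of 4 distinct
```

Because the features describe the interactions rather than residue identities or sequence positions, two sites in proteins with unrelated folds can still score highly, which is the property that makes such profiles useful for remote similarity detection.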
|
37
|
Xie L, Ge X, Tan H, Xie L, Zhang Y, Hart T, Yang X, Bourne PE. Towards structural systems pharmacology to study complex diseases and personalized medicine. PLoS Comput Biol 2014; 10:e1003554. [PMID: 24830652 PMCID: PMC4022462 DOI: 10.1371/journal.pcbi.1003554] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/11/2022] Open
Abstract
Genome-Wide Association Studies (GWAS), whole genome sequencing, and high-throughput omics techniques have generated vast amounts of genotypic and molecular phenotypic data. However, these data have not yet been fully explored to improve the effectiveness and efficiency of drug discovery, which continues along a one-drug-one-target-one-disease paradigm. As a partial consequence, both the cost to launch a new drug and the attrition rate are increasing. Systems pharmacology and pharmacogenomics are emerging to exploit the available data and potentially reverse this trend, but, as we argue here, more is needed. To understand the impact of genetic, epigenetic, and environmental factors on drug action, we must study the structural energetics and dynamics of molecular interactions in the context of the whole human genome and interactome. Such an approach requires an integrative modeling framework for drug action that leverages advances in data-driven statistical modeling and mechanism-based multiscale modeling and transforms heterogeneous data from GWAS, high-throughput sequencing, structural genomics, functional genomics, and chemical genomics into unified knowledge. This is not a small task, but, as reviewed here, progress is being made towards the final goal of personalized medicines for the treatment of complex diseases.
Affiliation(s)
- Lei Xie
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
- Ph.D. Program in Computer Science, Biology, and Biochemistry, The Graduate Center, The City University of New York, New York, New York, United States of America
| | - Xiaoxia Ge
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
| | - Hepan Tan
- Department of Computer Science, Hunter College, The City University of New York, New York, New York, United States of America
| | - Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States of America
| | - Yinliang Zhang
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States of America
| | - Thomas Hart
- Department of Biological Sciences, Hunter College, The City University of New York, New York, New York, United States of America
| | - Xiaowei Yang
- School of Public Health, Hunter College, The City University of New York, New York, New York, United States of America
| | - Philip E. Bourne
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, California, United States of America
|
38
|
Hung CL, Hua GJ. Local alignment tool based on Hadoop framework and GPU architecture. Biomed Res Int 2014; 2014:541490. [PMID: 24955362 PMCID: PMC4052794 DOI: 10.1155/2014/541490] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 01/30/2014] [Accepted: 04/14/2014] [Indexed: 11/17/2022]
Abstract
With the rapid growth of next generation sequencing technologies, such as Slex, more and more data have been discovered and published. To analyze such large volumes of data, computational performance is an important issue. Recently, many tools, such as SOAP, have been implemented on Hadoop and GPU parallel computing architectures. BLASTP is an important tool, implemented on GPU architectures, for biologists to compare protein sequences. To deal with big biological data, it is hard to rely on a single GPU. Therefore, we implement a distributed BLASTP by combining Hadoop and multiple GPUs. The experimental results show that the proposed method improves on the performance of BLASTP on a single GPU and also achieves high availability and fault tolerance.
Affiliation(s)
- Che-Lun Hung
- Department of Computer Science and Communication Engineering, Providence University, No. 200, Section 7, Taiwan Boulevard, Shalu District, Taichung 43301, Taiwan
| | - Guan-Jie Hua
- Department of Computer Science and Information Engineering, Providence University, No. 200, Section 7, Taiwan Boulevard, Shalu District, Taichung 43301, Taiwan
|
39
|
Ng C, Hauptman R, Zhang Y, Bourne PE, Xie L. Anti-infectious drug repurposing using an integrated chemical genomics and structural systems biology approach. Pac Symp Biocomput 2014:136-47. [PMID: 24297541 PMCID: PMC6322395] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Subscribe] [Scholar Register] [Indexed: 06/02/2023]
Abstract
The emergence of multi-drug and extensive drug resistance of microbes to antibiotics poses a great threat to human health. Although drug repurposing is a promising solution for accelerating the drug development process, its application to anti-infectious drug discovery is limited by the scope of existing phenotype-, ligand-, or target-based methods. In this paper we introduce a new computational strategy to determine the genome-wide molecular targets of bioactive compounds in both human and bacterial genomes. Our method is based on the use of a novel algorithm, ligand Enrichment of Network Topological Similarity (ligENTS), to map the chemical universe to its global pharmacological space. ligENTS outperforms the state-of-the-art algorithms in identifying novel drug-target relationships. Furthermore, we integrate ligENTS with our structural systems biology platform to identify drug repurposing opportunities via target similarity profiling. Using this integrated strategy, we have identified novel P. falciparum targets of drug-like active compounds from the Malaria Box, and suggest that a number of approved drugs may be active against malaria. This study demonstrates the potential of an integrative chemical genomics and structural systems biology approach to drug repurposing.
Affiliation(s)
- Clara Ng
- Department of Computer Science, Hunter College, the City University of New York, 695 Park Avenue, New York City, NY 10065, U.S.A.
|
40
|
Schumann M, Armen RS. Identification of distant drug off-targets by direct superposition of binding pocket surfaces. PLoS One 2013; 8:e83533. [PMID: 24391782 PMCID: PMC3877058 DOI: 10.1371/journal.pone.0083533] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 08/21/2013] [Accepted: 11/04/2013] [Indexed: 01/23/2023] Open
Abstract
Correctly predicting off-targets for a given molecular structure, which would have the ability to bind a large range of ligands, is both particularly difficult and important if they share no significant sequence or fold similarity with the respective molecular target ("distant off-targets"). A novel approach for identification of off-targets by direct superposition of protein binding pocket surfaces is presented and applied to a set of well-studied and highly relevant drug targets, including representative kinases and nuclear hormone receptors. The entire Protein Data Bank is searched for similar binding pockets and convincing distant off-target candidates were identified that share no significant sequence or fold similarity with the respective target structure. These putative target off-target pairs are further supported by the existence of compounds that bind strongly to both with high topological similarity, and in some cases, literature examples of individual compounds that bind to both. Also, our results clearly show that it is possible for binding pockets to exhibit a striking surface similarity, while the respective off-target shares neither significant sequence nor significant fold similarity with the respective molecular target ("distant off-target").
Affiliation(s)
- Marcel Schumann
- Department of Pharmaceutical Sciences, School of Pharmacy, Thomas Jefferson University, Philadelphia, Pennsylvania, United States of America
- Roger S. Armen
- Department of Pharmaceutical Sciences, School of Pharmacy, Thomas Jefferson University, Philadelphia, Pennsylvania, United States of America
|
41
|
Spitzer R, Cleves AE, Varela R, Jain AN. Protein function annotation by local binding site surface similarity. Proteins 2013; 82:679-94. [PMID: 24166661 DOI: 10.1002/prot.24450] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 05/15/2013] [Revised: 10/02/2013] [Accepted: 10/10/2013] [Indexed: 11/06/2022]
Abstract
Hundreds of protein crystal structures exist for proteins whose function cannot be confidently determined from sequence similarity. Surflex-PSIM, a previously reported surface-based protein similarity algorithm, provides an alternative method for hypothesizing function for such proteins. The method now supports fully automatic binding site detection and is fast enough to screen comprehensive databases of protein binding sites. The binding site detection methodology was validated on apo/holo cognate protein pairs, correctly identifying 91% of ligand binding sites in holo structures and 88% in apo structures where corresponding sites existed. For correctly detected apo binding sites, the cognate holo site was the most similar binding site 87% of the time. PSIM was used to screen a set of proteins that had poorly characterized functions at the time of crystallization, but were later biochemically annotated. Using a fully automated protocol, this set of 8 proteins was screened against ∼60,000 ligand binding sites from the PDB. PSIM correctly identified functional matches that predated query protein biochemical annotation for five out of the eight query proteins. A panel of 12 currently unannotated proteins was also screened, resulting in a large number of statistically significant binding site matches, some of which suggest likely functions for the poorly characterized proteins.
Affiliation(s)
- Russell Spitzer
- Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, California
|
42
|
Haupt VJ, Daminelli S, Schroeder M. Drug Promiscuity in PDB: Protein Binding Site Similarity Is Key. PLoS One 2013; 8:e65894. [PMID: 23805191 PMCID: PMC3689763 DOI: 10.1371/journal.pone.0065894] [Citation(s) in RCA: 107] [Impact Index Per Article: 8.9] [Received: 12/13/2012] [Accepted: 04/30/2013] [Indexed: 11/19/2022] Open
Abstract
Drug repositioning applies established drugs to new disease indications with increasing success. A pre-requisite for drug repurposing is drug promiscuity (polypharmacology) – a drug’s ability to bind to several targets. There is a long standing debate on the reasons for drug promiscuity. Based on large compound screens, hydrophobicity and molecular weight have been suggested as key reasons. However, the results are sometimes contradictory and leave space for further analysis. Protein structures offer a structural dimension to explain promiscuity: Can a drug bind multiple targets because the drug is flexible or because the targets are structurally similar or even share similar binding sites? We present a systematic study of drug promiscuity based on structural data of PDB target proteins with a set of 164 promiscuous drugs. We show that there is no correlation between the degree of promiscuity and ligand properties such as hydrophobicity or molecular weight but a weak correlation to conformational flexibility. However, we do find a correlation between promiscuity and structural similarity as well as binding site similarity of protein targets. In particular, 71% of the drugs have at least two targets with similar binding sites. In order to overcome issues in detection of remotely similar binding sites, we employed a score for binding site similarity: LigandRMSD measures the similarity of the aligned ligands and uncovers remote local similarities in proteins. It can be applied to arbitrary structural binding site alignments. Three representative examples, namely the anti-cancer drug methotrexate, the natural product quercetin and the anti-diabetic drug acarbose are discussed in detail. Our findings suggest that global structural and binding site similarity play a more important role to explain the observed drug promiscuity in the PDB than physicochemical drug properties like hydrophobicity or molecular weight. 
Additionally, we find ligand flexibility to have a minor influence.
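As a rough illustration of the LigandRMSD idea mentioned in this abstract, the sketch below computes a root-mean-square deviation between two copies of a ligand whose coordinates are already expressed in a common frame (that is, after the two binding sites have been superimposed). The function name `ligand_rmsd`, the tuple-based input format, and the one-to-one atom pairing are assumptions made for illustration; the abstract does not specify how atoms are matched or how the score is implemented.

```python
import math

def ligand_rmsd(coords_a, coords_b):
    """RMSD between two equally sized lists of (x, y, z) ligand atom positions.

    Assumption: atom i in coords_a corresponds to atom i in coords_b, and both
    sets are already in the frame produced by the binding site alignment.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    # Sum of squared Euclidean distances over all paired atoms.
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Hypothetical coordinates: the second ligand copy is displaced by (3, 4, 0).
site_1_ligand = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
site_2_ligand = [(3.0, 4.0, 0.0), (4.5, 4.0, 0.0)]
print(ligand_rmsd(site_1_ligand, site_2_ligand))
```

A low value would indicate that the binding site alignment places the two ligands on top of each other, which is the intuition behind using aligned ligands to score remote binding site similarity.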
Affiliation(s)
- Michael Schroeder
- Biotechnology Center (BIOTEC), TU Dresden, Dresden, Germany
|
43
|
Combinatorial clustering of residue position subsets predicts inhibitor affinity across the human kinome. PLoS Comput Biol 2013; 9:e1003087. [PMID: 23754939 PMCID: PMC3675009 DOI: 10.1371/journal.pcbi.1003087] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 09/07/2012] [Accepted: 04/22/2013] [Indexed: 11/22/2022] Open
Abstract
The protein kinases are a large family of enzymes that play fundamental roles in propagating signals within the cell. Because of the high degree of binding site similarity shared among protein kinases, designing drug compounds with high specificity among the kinases has proven difficult. However, computational approaches to comparing the 3-dimensional geometry and physicochemical properties of key binding site residue positions have been shown to be informative of inhibitor selectivity. The Combinatorial Clustering Of Residue Position Subsets (ccorps) method, introduced here, provides a semi-supervised learning approach for identifying structural features that are correlated with a given set of annotation labels. Here, ccorps is applied to the problem of identifying structural features of the kinase ATP binding site that are informative of inhibitor binding. ccorps is demonstrated to make perfect or near-perfect predictions for the binding affinity profile of 8 of the 38 kinase inhibitors studied, while only having overall poor predictive ability for 1 of the 38 compounds. Additionally, ccorps is shown to identify shared structural features across phylogenetically diverse groups of kinases that are correlated with binding affinity for particular inhibitors; such instances of structural similarity among phylogenetically diverse kinases are also shown not to be rare. Finally, these function-specific structural features may serve as potential starting points for the development of highly specific kinase inhibitors.
The kinases are a group of essential signaling proteins within the cell and are the largest family of enzymes encoded by the human genome. The high degree of binding site similarity shared across the protein kinases has made them difficult targets for which to design highly selective inhibitors, but kinome-wide binding site analysis can help predict unintended off-target inhibitions. Given the increasingly large number of available kinase structures, kinome-wide comparative analysis of binding sites is now possible. In this paper, the Combinatorial Clustering Of Residue Position Subsets (ccorps) method is introduced and used to synthesize kinome-wide structure datasets with a kinome-wide inhibitor affinity screening dataset consisting of 38 kinase inhibitors. ccorps identifies structural features of the kinase binding site that are correlated with inhibitor binding and uses these features to predict whether a given inhibitor will be capable of binding to uncharacterized kinases. This paper demonstrates the ability of ccorps to accurately predict inhibitor binding and identify features of the kinase binding site that are unique to kinases capable of binding a given inhibitor.
|
44
|
Csermely P, Korcsmáros T, Kiss HJM, London G, Nussinov R. Structure and dynamics of molecular networks: a novel paradigm of drug discovery: a comprehensive review. Pharmacol Ther 2013; 138:333-408. [PMID: 23384594 PMCID: PMC3647006 DOI: 10.1016/j.pharmthera.2013.01.016] [Citation(s) in RCA: 522] [Impact Index Per Article: 43.5] [Received: 01/22/2013] [Accepted: 01/22/2013] [Indexed: 02/02/2023]
Abstract
Despite considerable progress in genome- and proteome-based high-throughput screening methods and in rational drug design, the increase in approved drugs in the past decade did not match the increase of drug development costs. Network description and analysis not only give a systems-level understanding of drug action and disease complexity, but can also help to improve the efficiency of drug design. We give a comprehensive assessment of the analytical tools of network topology and dynamics. The state-of-the-art use of chemical similarity, protein structure, protein-protein interaction, signaling, genetic interaction and metabolic networks in the discovery of drug targets is summarized. We propose that network targeting follows two basic strategies. The "central hit strategy" selectively targets central nodes/edges of the flexible networks of infectious agents or cancer cells to kill them. The "network influence strategy" works against other diseases, where an efficient reconfiguration of rigid networks needs to be achieved by targeting the neighbors of central nodes/edges. It is shown how network techniques can help in the identification of single-target, edgetic, multi-target and allo-network drug target candidates. We review the recent boom in network methods helping hit identification, lead selection optimizing drug efficacy, as well as minimizing side-effects and drug toxicity. Successful network-based drug development strategies are shown through the examples of infections, cancer, metabolic diseases, neurodegenerative diseases and aging. Summarizing >1200 references we suggest an optimized protocol of network-aided drug development, and provide a list of systems-level hallmarks of drug quality. Finally, we highlight network-related drug development trends helping to achieve these hallmarks by a cohesive, global approach.
Affiliation(s)
- Peter Csermely
- Department of Medical Chemistry, Semmelweis University, P.O. Box 260, H-1444 Budapest 8, Hungary.
|
45
|
Xie L, Ng C, Ali T, Valencia R, Ferreira BL, Xue V, Tanweer M, Zhou D, Haddad GG, Bourne PE, Xie L. Multiscale modeling of the causal functional roles of nsSNPs in a genome-wide association study: application to hypoxia. BMC Genomics 2013; 14 Suppl 3:S9. [PMID: 23819581 PMCID: PMC3665574 DOI: 10.1186/1471-2164-14-s3-s9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023] Open
Abstract
BACKGROUND: It is a great challenge of modern biology to determine the functional roles of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) on complex phenotypes. Statistical and machine learning techniques establish correlations between genotype and phenotype, but may fail to infer the biologically relevant mechanisms. The emerging paradigm of Network-based Association Studies aims to address this problem of statistical analysis. However, a mechanistic understanding of how individual molecular components work together in a system requires knowledge of molecular structures, and their interactions.
RESULTS: To address the challenge of understanding the genetic, molecular, and cellular basis of complex phenotypes, we have, for the first time, developed a structural systems biology approach for genome-wide multiscale modeling of nsSNPs--from the atomic details of molecular interactions to the emergent properties of biological networks. We apply our approach to determine the functional roles of nsSNPs associated with hypoxia tolerance in Drosophila melanogaster. The integrated view of the functional roles of nsSNP at both molecular and network levels allows us to identify driver mutations and their interactions (epistasis) in H, Rad51D, Ulp1, Wnt5, HDAC4, Sol, Dys, GalNAc-T2, and CG33714 genes, all of which are involved in the up-regulation of Notch and Gurken/EGFR signaling pathways. Moreover, we find that a large fraction of the driver mutations are neither located in conserved functional sites, nor responsible for structural stability, but rather regulate protein activity through allosteric transitions, protein-protein interactions, or protein-nucleic acid interactions. This finding should impact future Genome-Wide Association Studies.
CONCLUSIONS: Our studies demonstrate that the consolidation of statistical, structural, and network views of biomolecules and their interactions can provide new insight into the functional role of nsSNPs in Genome-Wide Association Studies, in a way that neither the knowledge of molecular structures nor biological networks alone could achieve. Thus, multiscale modeling of nsSNPs may prove to be a powerful tool for establishing the functional roles of sequence variants in a wide array of applications.
Affiliation(s)
- Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
|
46
|
Kirshner DA, Nilmeier JP, Lightstone FC. Catalytic site identification--a web server to identify catalytic site structural matches throughout PDB. Nucleic Acids Res 2013; 41:W256-65. [PMID: 23680785 PMCID: PMC3692059 DOI: 10.1093/nar/gkt403] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/11/2023] Open
Abstract
The catalytic site identification web server provides the innovative capability to find structural matches to a user-specified catalytic site among all Protein Data Bank proteins rapidly (in less than a minute). The server also can examine a user-specified protein structure or model to identify structural matches to a library of catalytic sites. Finally, the server provides a database of pre-calculated matches between all Protein Data Bank proteins and the library of catalytic sites. The database has been used to derive a set of hypothesized novel enzymatic function annotations. In all cases, matches and putative binding sites (protein structure and surfaces) can be visualized interactively online. The website can be accessed at http://catsid.llnl.gov.
Affiliation(s)
- Felice C. Lightstone
- *To whom correspondence should be addressed. Tel: +1 925 423 8657; Fax: +1 925 423 0785
|
47
|
Hung CL, Hua GJ. Cloud computing for protein-ligand binding site comparison. BioMed Research International 2013; 2013:170356. [PMID: 23762824 PMCID: PMC3671236 DOI: 10.1155/2013/170356] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Received: 12/18/2012] [Accepted: 03/28/2013] [Indexed: 12/30/2022]
Abstract
The proteome-wide analysis of protein-ligand binding sites and their interactions with ligands is important in structure-based drug design and in understanding ligand cross reactivity and toxicity. The well-known and commonly used software, SMAP, has been designed for 3D ligand binding site comparison and similarity searching of a structural proteome. SMAP can also predict drug side effects and reassign existing drugs to new indications. However, the computing scale of SMAP is limited. We have developed a high availability, high performance system that expands the comparison scale of SMAP. This cloud computing service, called Cloud-PLBS, combines the SMAP and Hadoop frameworks and is deployed on a virtual cloud computing platform. To handle the vast amount of experimental data on protein-ligand binding site pairs, Cloud-PLBS exploits the MapReduce paradigm as a management and parallelizing tool. Cloud-PLBS provides a web portal and scalability through which biologists can address a wide range of computer-intensive questions in biology and drug discovery.
Affiliation(s)
- Che-Lun Hung
- Department of Computer Science and Communication Engineering, Providence University, Taiwan Boulevard, Shalu District, Taichung 43301, Taiwan.
|
48
|
Nilmeier JP, Kirshner DA, Wong SE, Lightstone FC. Rapid catalytic template searching as an enzyme function prediction procedure. PLoS One 2013; 8:e62535. [PMID: 23675414 PMCID: PMC3651201 DOI: 10.1371/journal.pone.0062535] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 01/31/2013] [Accepted: 03/22/2013] [Indexed: 11/18/2022] Open
Abstract
We present an enzyme protein function identification algorithm, Catalytic Site Identification (CatSId), based on identification of catalytic residues. The method is optimized for highly accurate template identification across a diverse template library and is also very efficient with regard to time and scalability of comparisons. The algorithm matches three-dimensional residue arrangements in a query protein to a library of manually annotated catalytic residues, the Catalytic Site Atlas (CSA). Two main processes are involved. The first process is a rapid protein-to-template matching algorithm that scales quadratically with target protein size and linearly with template size. The second process incorporates a number of physical descriptors, including binding site predictions, in a logistic scoring procedure to re-score matches found in Process 1. This approach shows very good performance overall, with a Receiver-Operator-Characteristic Area Under Curve (AUC) of 0.971 for the training set evaluated. The procedure is able to process cofactors, ions, nonstandard residues, and point substitutions for residues and ions in a robust and integrated fashion. Sites with only two critical (catalytic) residues are challenging cases, resulting in AUCs of 0.9411 and 0.5413 for the training and test sets, respectively. The remaining sites show excellent performance with AUCs greater than 0.90 for both the training and test data on templates of size greater than two critical (catalytic) residues. The procedure has considerable promise for larger scale searches.
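For readers unfamiliar with the AUC figures quoted in this abstract, the following minimal sketch shows one standard way to compute the area under a ROC curve from match scores and true/false labels, using the pairwise-ranking (Mann-Whitney) identity. The function name `roc_auc` and the toy scores are illustrative assumptions; this is not the evaluation code used by the CatSId authors.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve for binary labels (1 = true match, 0 = false match).

    Uses the identity AUC = P(score of a random positive > score of a random
    negative), counting ties as one half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))

# Hypothetical re-scored template matches: three of the four
# positive/negative pairs are ranked correctly.
print(roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0]))  # 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported test-set value of 0.5413 for two-residue sites in context.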
Affiliation(s)
- Jerome P. Nilmeier
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
- Daniel A. Kirshner
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
- Sergio E. Wong
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
- Felice C. Lightstone
- Biosciences and Biotechnology Division, Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
|
49
|
Shahid M, Shahzad Cheema M, Klenner A, Younesi E, Hofmann-Apitius M. SVM Based Descriptor Selection and Classification of Neurodegenerative Disease Drugs for Pharmacological Modeling. Mol Inform 2013; 32:241-9. [PMID: 27481519 DOI: 10.1002/minf.201200116] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 10/08/2012] [Accepted: 01/07/2013] [Indexed: 11/10/2022]
Abstract
Systems pharmacological modeling of drug mode of action for the next generation of multitarget drugs may open new routes for drug design and discovery. Computational methods are widely used in this context, amongst which support vector machines (SVM) have proven successful in addressing the challenge of classifying drugs with similar features. We have applied a variety of such SVM-based approaches, namely SVM-based recursive feature elimination (SVM-RFE). We use the approach to predict the pharmacological properties of drugs widely used against complex neurodegenerative disorders (NDD) and to build an in-silico computational model for the binary classification of NDD drugs from other drugs. Application of an SVM-RFE model to a set of drugs successfully classified NDD drugs from non-NDD drugs and resulted in an overall accuracy of ∼80% with 10-fold cross-validation using the 40 top-ranked molecular descriptors selected out of a total of 314 descriptors. Moreover, the SVM-RFE method outperformed linear discriminant analysis (LDA) based feature selection and classification. The model reduced the multidimensional descriptor space of drugs dramatically and predicted NDD drugs with high accuracy, while avoiding overfitting. Based on these results, NDD-specific focused libraries of drug-like compounds can be designed and existing NDD-specific drugs can be characterized by a well-characterized set of molecular descriptors.
Affiliation(s)
- Mohammad Shahid
- Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for IT, Dahlmannstr. 2, 53113 Bonn, Germany
- Muhammad Shahzad Cheema
- Rheinische Friedrich-Wilhelms-Universität Bonn, Bonn-Aachen International Center for IT, Dahlmannstr. 2, 53113 Bonn, Germany
- Alexander Klenner
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), 53754 Sankt Augustin, Germany
- Erfan Younesi
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), 53754 Sankt Augustin, Germany
- Martin Hofmann-Apitius
- Department of Bioinformatics, Fraunhofer Institute for Algorithms and Scientific Computing (SCAI), 53754 Sankt Augustin, Germany.
|
50
|
Finding protein targets for small biologically relevant ligands across fold space using inverse ligand binding predictions. Structure 2013; 20:1815-22. [PMID: 23141694 DOI: 10.1016/j.str.2012.09.011] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.3] [Received: 06/07/2012] [Revised: 08/14/2012] [Accepted: 09/16/2012] [Indexed: 01/12/2023]
Abstract
Inverse ligand binding prediction utilizes a few protein-ligand (drug) complexes to predict other secondary therapeutic and off-targets of a given drug molecule on a proteomic scale. We adapt two binding site predictors, FINDSITE and SMAP, to perform the inverse predictions and evaluate them on over 30 representative ligands. Use of just one complex allows the identification of other protein targets; the availability of additional complexes improves the results. Both methods offer comparable quality when using three complexes with diverse proteins. SMAP is better when fewer complexes are available, while FINDSITE provides stronger predictions for smaller ligands. We propose a consensus that combines (and outperforms) the two complementary approaches implemented by FINDSITE and SMAP. Most importantly, we demonstrate that these methods successfully find distant targets that belong to structurally different folds compared to the proteins in the input complexes.
|