1
Jalencas X, Mestres J. Identification of Similar Binding Sites to Detect Distant Polypharmacology. Mol Inform 2013; 32:976-90. [PMID: 27481143 DOI: 10.1002/minf.201300082] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 05/01/2013] [Accepted: 07/29/2013] [Indexed: 01/19/2023]
Abstract
The ability of small molecules to interact with multiple proteins is referred to as polypharmacology. This property is often linked to the therapeutic action of drugs but it is known also to be responsible for many of their side effects. Because of its importance, the development of computational methods that can predict drug polypharmacology has become an important line of research that led recently to the identification of many novel targets for known drugs. Nowadays, the majority of these methods are based on measuring the similarity of a query molecule against the hundreds of thousands of molecules for which pharmacological data on thousands of proteins are available in public sources. However, similarity-based methods are inherently biased by the chemical coverage offered by the active molecules present in those public repositories, which limits significantly their capacity to predict interactions with proteins structurally and functionally unrelated to any of the already known targets for drugs. It is in this respect that structure-based methods aiming at identifying similar binding sites may offer an alternative complementary means to ligand-based methods for detecting distant polypharmacology. The different existing approaches to binding site detection, representation, comparison, and fragmentation are reviewed and recent successful applications presented.
Affiliation(s)
- Xavier Jalencas
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Research Institute & University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain. Fax: +34 93 3160550.
- Jordi Mestres
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Research Institute & University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain. Fax: +34 93 3160550.
2
Johansson MU, Zoete V, Michielin O, Guex N. Defining and searching for structural motifs using DeepView/Swiss-PdbViewer. BMC Bioinformatics 2012; 13:173. [PMID: 22823337 PMCID: PMC3436773 DOI: 10.1186/1471-2105-13-173] [Citation(s) in RCA: 204] [Impact Index Per Article: 17.0] [Received: 09/15/2011] [Accepted: 07/06/2012] [Indexed: 11/10/2022] Open
Abstract
Background: Today, recognition and classification of sequence motifs and protein folds is a mature field, thanks to the availability of numerous comprehensive and easy-to-use software packages and web-based services. Recognition of structural motifs, by comparison, is less well developed and much less frequently used, possibly due to a lack of easily accessible and easy-to-use software.
Results: In this paper, we describe an extension of DeepView/Swiss-PdbViewer through which structural motifs may be defined and searched for in large protein structure databases, and we show that common structural motifs involved in stabilizing protein folds are present in evolutionarily and structurally unrelated proteins, also in deeply buried locations which are not obviously related to protein function.
Conclusions: The possibility to define custom motifs and search for their occurrence in other proteins permits the identification of recurrent arrangements of residues that could have structural implications. The possibility to do so without having to maintain a complex software/hardware installation on site brings this technology to experts and non-experts alike.
Affiliation(s)
- Maria U Johansson
- Vital-IT Group, SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
3
Tang GW, Altman RB. Remote thioredoxin recognition using evolutionary conservation and structural dynamics. Structure 2011; 19:461-70. [PMID: 21481770 DOI: 10.1016/j.str.2011.02.007] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Received: 11/28/2010] [Revised: 02/06/2011] [Accepted: 02/16/2011] [Indexed: 12/25/2022]
Abstract
The thioredoxin family of oxidoreductases plays an important role in redox signaling and control of protein function. Not only are thioredoxins linked to a variety of disorders, but their stable structure has also seen application in protein engineering. Both sequence-based and structure-based tools exist for thioredoxin identification, but remote homolog detection remains a challenge. We developed a thioredoxin predictor using the approach of integrating sequence with structural information. We combined a sequence-based Hidden Markov Model (HMM) with a molecular dynamics enhanced structure-based recognition method (dynamic FEATURE, DF). This hybrid method (HMMDF) has high precision and recall (0.90 and 0.95, respectively) compared with HMM (0.92 and 0.87, respectively) and DF (0.82 and 0.97, respectively). Dynamic FEATURE is sensitive but struggles to resolve closely related protein families, while HMM identifies these evolutionary differences by compromising sensitivity. Our method applied to structural genomics targets makes a strong prediction of a novel thioredoxin.
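The three precision/recall pairs quoted above are easier to compare when reduced to a single number. The sketch below uses the F1 score (the harmonic mean of precision and recall) for that purpose. F1 is an assumption made here for illustration and is not a metric reported in the abstract; the function name `f1_score` is likewise hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (not reported in the abstract)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall) pairs exactly as quoted in the abstract.
methods = {
    "HMMDF": (0.90, 0.95),
    "HMM": (0.92, 0.87),
    "DF": (0.82, 0.97),
}

f1 = {name: f1_score(p, r) for name, (p, r) in methods.items()}
for name, score in f1.items():
    print(f"{name}: F1 = {score:.3f}")
```

Under this summary the hybrid scores about 0.924, against about 0.894 for HMM and 0.889 for DF, which is consistent with the abstract's description of the hybrid as balancing the two component methods.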
Affiliation(s)
- Grace W Tang
- Department of Bioengineering, Stanford University, Stanford, CA 94305, USA
4
Abstract
The complex interactions between proteins and small organic molecules (ligands) are intensively studied because they play key roles in biological processes and drug activities. Here, we present a novel approach to characterize and map the ligand-binding cavities of proteins without direct geometric comparison of structures, based on Principal Component Analysis of cavity properties (related mainly to size, polarity, and charge). This approach can provide valuable information on the similarities and dissimilarities of binding cavities due to mutations, between-species differences, and flexibility upon ligand binding. The presented results show that information on ligand-binding cavity variations can complement information on protein similarity obtained from sequence comparisons. The predictive aspect of the method is exemplified by successful predictions of serine proteases that were not included in the model construction. The presented strategy to compare ligand-binding cavities of related and unrelated proteins has many potential applications within protein and medicinal chemistry, for example in the characterization and mapping of "orphan structures", selection of protein structures for docking studies in structure-based design, and identification of proteins for selectivity screens in drug design programs.
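As a rough illustration of mapping cavities by Principal Component Analysis of their properties, the sketch below fits a PCA model to a small descriptor table and projects a cavity that was left out of model construction. Everything concrete here is an assumption for illustration: the three descriptors (volume, polar fraction, net charge), their values, the autoscaling step, and the function names `fit_pca` and `project` are hypothetical and are not taken from the paper.

```python
import numpy as np


def fit_pca(X, n_components=2):
    """Autoscale the descriptor matrix and extract principal components via SVD."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    std = X.std(axis=0)
    std[std == 0] = 1.0  # guard against constant descriptors
    Z = (X - mean) / std
    _, S, Vt = np.linalg.svd(Z, full_matrices=False)
    explained = (S ** 2) / (S ** 2).sum()
    return mean, std, Vt[:n_components], explained[:n_components]


def project(x, mean, std, components):
    """Map one cavity (or a matrix of cavities) into the score space."""
    return ((np.asarray(x, dtype=float) - mean) / std) @ components.T


# Hypothetical cavity descriptors: volume (A^3), polar fraction, net charge.
X = np.array([
    [450.0, 0.35, -1.0],
    [480.0, 0.40, -1.0],
    [900.0, 0.20, 0.0],
    [870.0, 0.25, 1.0],
    [300.0, 0.60, -2.0],
])

mean, std, components, explained = fit_pca(X, n_components=2)
scores = project(X, mean, std, components)

# A cavity not used to build the model is placed on the same map.
new_cavity_scores = project([460.0, 0.38, -1.0], mean, std, components)
```

Cavities that land close together in the score plot would be read as similar in size, polarity, and charge, without any geometric superposition of the structures.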
5
Xiong B, Wu J, Burk DL, Xue M, Jiang H, Shen J. BSSF: a fingerprint based ultrafast binding site similarity search and function analysis server. BMC Bioinformatics 2010; 11:47. [PMID: 20100327 PMCID: PMC3098077 DOI: 10.1186/1471-2105-11-47] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 07/16/2009] [Accepted: 01/25/2010] [Indexed: 11/17/2022] Open
Abstract
Background: Genome sequencing and post-genomics projects such as structural genomics are extending the frontier of the study of the sequence-structure-function relationship of genes and their products. Although many sequence/structure-based methods have been devised with the aim of deciphering this delicate relationship, there still remain large gaps in this fundamental problem, which continuously drives researchers to develop novel methods to extract relevant information from sequences and structures and to infer the functions of newly identified genes by genomics technology.
Results: Here we present an ultrafast method, named BSSF (Binding Site Similarity & Function), which enables researchers to conduct similarity searches in a comprehensive three-dimensional binding site database extracted from PDB structures. This method utilizes a fingerprint representation of the binding site and a validated statistical Z-score function scheme to judge the similarity between the query and database items, even if their similarities are confined to a sub-pocket. This fingerprint-based similarity measurement was also validated on a known binding site dataset by comparison with geometric hashing, which is a standard 3D similarity method. The comparison clearly demonstrated the utility of this ultrafast method. After the database search, the hit list is further analyzed to provide basic statistical information about the occurrences of Gene Ontology terms and Enzyme Commission numbers, which may benefit researchers by helping them to design further experiments to study the query proteins.
Conclusions: This ultrafast web-based system will not only help researchers interested in drug design and structural genomics to identify similar binding sites, but also assist them by providing further analysis of the hit list from database searching.
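The general idea of ranking fingerprint hits by a Z-score can be sketched as follows. This is not the BSSF implementation: the abstract does not specify the fingerprint encoding or the similarity coefficient, so the Tanimoto coefficient on sets of "on" bits, the database entries, and the function names `tanimoto` and `z_scores` are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def tanimoto(a, b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bits."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def z_scores(query, database):
    """Standardize each hit's similarity against the whole-database distribution."""
    sims = {name: tanimoto(query, fp) for name, fp in database.items()}
    mu = mean(sims.values())
    sigma = pstdev(sims.values())
    if sigma == 0:
        return {name: 0.0 for name in sims}
    return {name: (s - mu) / sigma for name, s in sims.items()}


# Hypothetical binding-site fingerprints (indices of bits that are set).
database = {
    "site_A": {1, 2, 3, 4, 5},
    "site_B": {1, 2, 3, 9},
    "site_C": {7, 8, 9},
    "site_D": {10, 11},
}
query = {1, 2, 3, 4}

ranked = sorted(z_scores(query, database).items(), key=lambda kv: kv[1], reverse=True)
```

Expressing similarity as a Z-score rather than a raw coefficient lets a hit be judged against the background of unrelated sites, which matters when only part of a pocket matches.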
Affiliation(s)
- Bing Xiong
- State Key Laboratory of Drug Research, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, 555 Zuchongzhi Road, Zhangjiang Hi-Tech Park, Pudong, Shanghai, 201203, PR China.
6
Iván G, Szabadka Z, Ordög R, Grolmusz V, Náray-Szabó G. Four spatial points that define enzyme families. Biochem Biophys Res Commun 2009; 383:417-20. [PMID: 19364497 DOI: 10.1016/j.bbrc.2009.04.022] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 03/24/2009] [Accepted: 04/02/2009] [Indexed: 11/26/2022]
Abstract
The catalytic properties of enzymes containing Asp-His-Ser triads have long been investigated in depth. Serine endopeptidases, cutinases, acetylcholinesterases, and cellulases, among other enzymes, contain these triads. We found that the geometric properties of just four points in the spatial structure of these enzymes are, on their own, characteristic of their family.
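One simple way to turn four labelled points into a geometric signature is to record their six pairwise distances, which do not change when the protein is rotated or translated. The abstract does not say which geometric properties the authors used, so the choice of pairwise distances, the tolerance, the coordinates, and the names `four_point_descriptor` and `descriptors_match` below are assumptions made for illustration.

```python
from itertools import combinations
from math import dist


def four_point_descriptor(points):
    """Six pairwise distances between four labelled 3D points, in a fixed order."""
    if len(points) != 4:
        raise ValueError("expected exactly four 3D points")
    return [dist(p, q) for p, q in combinations(points, 2)]


def descriptors_match(a, b, tol=0.5):
    """True when every corresponding distance agrees within tol (same units)."""
    return all(abs(x - y) <= tol for x, y in zip(a, b))


# Hypothetical coordinates (in angstroms) of four points in one enzyme.
site_a = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 5.0)]

# The same arrangement rotated 90 degrees about z, then translated.
site_b = [(-y + 10.0, x - 2.0, z + 7.0) for x, y, z in site_a]

# A different arrangement of four points.
site_c = [(0.0, 0.0, 0.0), (6.0, 0.0, 0.0), (0.0, 8.0, 0.0), (0.0, 0.0, 1.0)]

same_family = descriptors_match(four_point_descriptor(site_a), four_point_descriptor(site_b))
different = descriptors_match(four_point_descriptor(site_a), four_point_descriptor(site_c))
```

Because the points are labelled, the distances are compared in a fixed order rather than sorted, so the correspondence between points in the two structures is preserved.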
Affiliation(s)
- Gábor Iván
- Protein Information Technology Group, Eötvös University, 1117 Budapest, Hungary
7
Skolnick J, Brylinski M. FINDSITE: a combined evolution/structure-based approach to protein function prediction. Brief Bioinform 2009; 10:378-91. [PMID: 19324930 DOI: 10.1093/bib/bbp017] [Citation(s) in RCA: 72] [Impact Index Per Article: 4.8] [Indexed: 11/13/2022] Open
Abstract
A key challenge of the post-genomic era is the identification of the function(s) of all the molecules in a given organism. Here, we review the status of sequence and structure-based approaches to protein function inference and ligand screening that can provide functional insights for a significant fraction of the approximately 50% of ORFs of unassigned function in an average proteome. We then describe FINDSITE, a recently developed algorithm for ligand binding site prediction, ligand screening and molecular function prediction, which is based on binding site conservation across evolutionary distant proteins identified by threading. Importantly, FINDSITE gives comparable results when high-resolution experimental structures as well as predicted protein models are used.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology 250 14th St NW, Atlanta, GA 30318, USA.
8
Kuhn D, Weskamp N, Hüllermeier E, Klebe G. Functional classification of protein kinase binding sites using Cavbase. ChemMedChem 2008; 2:1432-47. [PMID: 17694525 DOI: 10.1002/cmdc.200700075] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.4] [Indexed: 11/06/2022]
Abstract
Increasingly, drug-discovery processes focus on complete gene families. Tools for analyzing similarities and differences across protein families are important for the understanding of key functional features of proteins. Herein we present a method for classifying protein families on the basis of the properties of their active sites. We have developed Cavbase, a method for describing and comparing protein binding pockets, and show its application to the functional classification of the binding pockets of the protein family of protein kinases. A diverse set of kinase cavities is mutually compared and analyzed in terms of recurring functional recognition patterns in the active sites. We are able to propose a relevant classification based on the binding motifs in the active sites. The obtained classification provides a novel perspective on functional properties across protein space. The classification of the MAP and the c-Abl kinases is analyzed in detail, showing a clear separation of the respective kinase subfamilies. Remarkable cross-relations among protein kinases are detected, in contrast to sequence-based classifications, which are not able to detect these relations. Furthermore, our classification is able to highlight features important in the optimization of protein kinase inhibitors. Using small-molecule inhibition data we could rationalize cross-reactivities between unrelated kinases which become apparent in the structural comparison of their binding sites. This procedure helps in the identification of other possible kinase targets that behave similarly in "binding pocket space" to the kinase under consideration.
Affiliation(s)
- Daniel Kuhn
- Department of Pharmaceutical Chemistry, University of Marburg, Marbacher Weg 6, 35032 Marburg, Germany
9
Kupas K, Ultsch A, Klebe G. Large scale analysis of protein-binding cavities using self-organizing maps and wavelet-based surface patches to describe functional properties, selectivity discrimination, and putative cross-reactivity. Proteins 2007; 71:1288-306. [DOI: 10.1002/prot.21823] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]
10
Kuhn D, Weskamp N, Schmitt S, Hüllermeier E, Klebe G. From the similarity analysis of protein cavities to the functional classification of protein families using cavbase. J Mol Biol 2006; 359:1023-44. [PMID: 16697007 PMCID: PMC7094329 DOI: 10.1016/j.jmb.2006.04.024] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.2] [Received: 02/17/2006] [Revised: 03/31/2006] [Accepted: 04/06/2006] [Indexed: 02/05/2023]
Abstract
In this contribution, the classification of protein binding sites using the physicochemical properties exposed to their pockets is presented. We recently introduced Cavbase, a method for describing and comparing protein binding pockets on the basis of the geometrical and physicochemical properties of their active sites. Here, we present algorithmic and methodological enhancements in the Cavbase property description and in the cavity comparison step. We give examples of the Cavbase similarity analysis detecting pronounced similarities in the binding sites of proteins unrelated in sequence. A similarity search using SARS M(pro) protease subpockets as queries retrieved ligands and ligand fragments accommodated in a physicochemical environment similar to that of the query. This allowed the characterization of the protease recognition pockets and the identification of molecular building blocks that can be incorporated into novel antiviral compounds. A cluster analysis procedure for the functional classification of binding pockets was implemented and calibrated using a diverse set of enzyme binding sites. Two relevant protein families, the alpha-carbonic anhydrases and the protein kinases, are used to demonstrate the scope of our cluster approach. We propose a relevant classification of both protein families, on the basis of the binding motifs in their active sites. The classification provides a new perspective on functional properties across a protein family and is able to highlight features important for potency and selectivity. Furthermore, this information can be used to identify possible cross-reactivities among proteins due to similarities in their binding sites.
Key Words
- protein binding pockets
- classification of protein binding pockets
- cluster analysis of protein binding pockets
- protein kinases
- sars protease
- sam, s-adenosyl-methionine
- fad, flavine adenine dinucleotide
- sars, severe acute respiratory syndrome
- cov, coronavirus
- tgev, transmissible gastroenteritis virus
- ca, carbonic anhydrase
- cml, chronic myelogenous leukemia
- map, mitogen-activated protein kinases
- cdks, cyclin-dependent protein kinases
- hb, hydrogen bond
- rmsd, root-mean-square deviations
- upgma, unweighted pair group method with arithmetic mean
- ec, enzyme classification
Affiliation(s)
- Daniel Kuhn
- Department of Pharmaceutical Chemistry, University of Marburg, Marbacher Weg 6, D-35032 Marburg, Germany
11
Laskowski RA, Watson JD, Thornton JM. Protein function prediction using local 3D templates. J Mol Biol 2005; 351:614-26. [PMID: 16019027 DOI: 10.1016/j.jmb.2005.05.067] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.3] [Received: 02/16/2005] [Revised: 05/26/2005] [Accepted: 05/30/2005] [Indexed: 10/25/2022]
Abstract
The prediction of a protein's function from its 3D structure is becoming more and more important as the worldwide structural genomics initiatives gather pace and continue to solve 3D structures, many of which are of proteins of unknown function. Here, we present a methodology for predicting function from structure that shows great promise. It is based on 3D templates that are defined as specific 3D conformations of small numbers of residues. We use four types of template, covering enzyme active sites, ligand-binding residues, DNA-binding residues and reverse templates. The latter are templates generated from the target structure itself and scanned against a representative subset of all known protein structures. Together, the templates provide a fairly thorough coverage of the known structures and ensure that if there is a match to a known structure it is unlikely to be missed. A new scoring scheme provides a highly sensitive means of discriminating between true positive and false positive template matches. In all, the methodology provides a powerful new tool for function prediction to complement those already in use.
Affiliation(s)
- Roman A Laskowski
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
12
Torrance JW, Bartlett GJ, Porter CT, Thornton JM. Using a library of structural templates to recognise catalytic sites and explore their evolution in homologous families. J Mol Biol 2005; 347:565-81. [PMID: 15755451 DOI: 10.1016/j.jmb.2005.01.044] [Citation(s) in RCA: 101] [Impact Index Per Article: 5.3] [Received: 12/14/2004] [Revised: 01/13/2005] [Accepted: 01/19/2005] [Indexed: 11/20/2022]
Abstract
Catalytic site structure is normally highly conserved between distantly related enzymes. As a consequence, templates representing catalytic sites have the potential to succeed at function prediction in cases where methods based on sequence or overall structure fail. There are many methods for searching protein structures for matches to structural templates, but few validated template libraries to use with these methods. We present a library of structural templates representing catalytic sites, based on information from the scientific literature. Furthermore, we analyse homologous template families to discover the diversity within families and the utility of templates for active site recognition. Templates representing the catalytic sites of homologous proteins mostly differ by less than 1 Å root mean square deviation, even when the sequence similarity between the two proteins is low. Within these sets of homologues there is usually no discernible relationship between catalytic site structure similarity and sequence similarity. Because of this structural conservation of catalytic sites, the templates can discriminate between matches to related proteins and random matches with over 85% sensitivity and predictive accuracy. Templates based on protein backbone positions are more discriminating than those based on side-chain atoms. These analyses show encouraging prospects for prediction of functional sites in structural genomics structures of unknown function, and will be of use in analyses of convergent evolution and exploring relationships between active site geometry and chemistry. The template library can be queried via a web server at and is available for download.
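The root mean square deviation between two templates is normally measured after optimal rigid-body superposition. The sketch below computes it with the Kabsch algorithm using NumPy. It assumes the two templates list equivalent atoms in the same order; the coordinates and the function name `kabsch_rmsd` are hypothetical, and this is not the template-matching software used by the authors.

```python
import numpy as np


def kabsch_rmsd(P, Q):
    """RMSD between two equally ordered atom sets after optimal superposition."""
    P = np.asarray(P, dtype=float)
    Q = np.asarray(Q, dtype=float)
    if P.shape != Q.shape:
        raise ValueError("templates must contain the same number of atoms")
    # Remove translation by centring both sets on their centroids.
    Pc = P - P.mean(axis=0)
    Qc = Q - Q.mean(axis=0)
    # Optimal rotation from the SVD of the covariance matrix.
    U, _, Vt = np.linalg.svd(Pc.T @ Qc)
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0  # avoid a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    diff = Pc @ R.T - Qc
    return float(np.sqrt((diff ** 2).sum() / len(P)))


# Hypothetical template: four atoms of a catalytic site (angstroms).
template = np.array([
    [0.0, 0.0, 0.0],
    [1.5, 0.0, 0.0],
    [1.5, 2.0, 0.5],
    [0.0, 2.0, 3.0],
])

# The same site seen in another structure: rotated 30 degrees about z and shifted.
theta = np.radians(30.0)
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
match = template @ rot_z.T + np.array([12.0, -4.0, 7.0])

# A distorted copy with one atom displaced by 1 angstrom.
distorted = match.copy()
distorted[3] += np.array([0.0, 0.0, 1.0])

rmsd_match = kabsch_rmsd(template, match)
rmsd_distorted = kabsch_rmsd(template, distorted)
```

A threshold such as the 1 Å figure quoted in the abstract would then be applied to the returned value to decide whether a candidate site matches the template.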
Affiliation(s)
- James W Torrance
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK.
13
Kupas K, Ultsch A, Klebe G. Comparison of substructural epitopes in enzyme active sites using self-organizing maps. J Comput Aided Mol Des 2005; 18:697-708. [PMID: 15865062 DOI: 10.1007/s10822-004-6553-x] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
Abstract
This paper presents a new algorithm to compare substructural epitopes in protein binding cavities. Through the comparison of binding cavities accommodating well characterized ligands with cavities whose actual guests are yet unknown, it is possible to draw some conclusions on the required shape of a putative ligand likely to bind to the latter cavities. To detect functional relationships among proteins, their binding-site exposed physicochemical characteristics are described by assigning generic pseudocenters to the functional groups of the amino acids flanking the particular active site. The cavities are divided into small local regions of four pseudocenters having the shape of a pyramid with triangular basis. To find similar local regions, an emergent self-organizing map is used for clustering. Two local regions within the same cluster are similar and form the basis for the superpositioning of the corresponding cavities to score this match. First results show that the similarities between enzymes with the same EC number can be found correctly. Enzymes with different EC numbers are detected to have no common substructures. These results indicate the benefit of this method and motivate further studies.
Affiliation(s)
- Katrin Kupas
- Data Bionics Research Group, Department of Computer Science, University of Marburg, Germany
14
Rigden DJ, Galperin MY. The DxDxDG Motif for Calcium Binding: Multiple Structural Contexts and Implications for Evolution. J Mol Biol 2004; 343:971-84. [PMID: 15476814 DOI: 10.1016/j.jmb.2004.08.077] [Citation(s) in RCA: 107] [Impact Index Per Article: 5.4] [Received: 06/30/2004] [Revised: 08/12/2004] [Accepted: 08/25/2004] [Indexed: 11/30/2022]
15
Mullaney EJ, Ullah AHJ. The term phytase comprises several different classes of enzymes. Biochem Biophys Res Commun 2003; 312:179-84. [PMID: 14630039 DOI: 10.1016/j.bbrc.2003.09.176] [Citation(s) in RCA: 183] [Impact Index Per Article: 8.7] [Indexed: 11/21/2022]
Affiliation(s)
- Edward J Mullaney
- Southern Regional Research Center, Agricultural Research Center, United States Department of Agriculture, New Orleans, LA 70124, USA