Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Meng EC, Polacco BJ, Babbitt PC. Superfamily active site templates. Proteins 2004;55:962-76. [PMID: 15146493 DOI: 10.1002/prot.20099] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.2] [Indexed: 11/07/2022]

Cited by Other Article(s)

Riziotis IG, Ribeiro AJM, Borkakoti N, Thornton JM. The 3D Modules of Enzyme Catalysis: Deconstructing Active Sites into Distinct Functional Entities. J Mol Biol 2023;435:168254. [PMID: 37652131 DOI: 10.1016/j.jmb.2023.168254] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/01/2023] [Revised: 08/20/2023] [Accepted: 08/22/2023] [Indexed: 09/02/2023]

Riziotis IG, Thornton JM. Capturing the geometry, function, and evolution of enzymes with 3D templates. Protein Sci 2022;31:e4363. [PMID: 35762726 PMCID: PMC9207746 DOI: 10.1002/pro.4363] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/04/2022] [Revised: 05/06/2022] [Accepted: 05/14/2022] [Indexed: 11/05/2022]

Barnsley KK, Ondrechen MJ. Enzyme active sites: Identification and prediction of function using computational chemistry. Curr Opin Struct Biol 2022;74:102384. [DOI: 10.1016/j.sbi.2022.102384] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/08/2022] [Revised: 03/20/2022] [Accepted: 03/28/2022] [Indexed: 11/03/2022]

Riziotis IG, Ribeiro AJ, Borkakoti N, Thornton JM. Conformational variation in enzyme catalysis: A structural study on catalytic residues. J Mol Biol 2022;434:167517. [PMID: 35240125 PMCID: PMC9005782 DOI: 10.1016/j.jmb.2022.167517] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 12/12/2021] [Revised: 02/21/2022] [Accepted: 02/23/2022] [Indexed: 11/26/2022]

Abstract

• We introduce a pipeline to compare and contrast active sites from homologous enzymes in 3D.
• Comprehensive structural study covering enzymes from a large functional space.
• High heterogeneity in the magnitude of active site flexibility between enzyme families.
• Different catalytic residue types and functions relate to different degrees of flexibility.
• Four paradigms classify enzymes according to their structural behaviour during catalysis.

Conformational variation in catalytic residues can be captured as alternative snapshots in enzyme crystal structures. Addressing the question of whether active site flexibility is an intrinsic and essential property of enzymes for catalysis, we present a comprehensive study on the 3D variation of active sites of 925 enzyme families, using explicit catalytic residue annotations from the Mechanism and Catalytic Site Atlas and structural data from the Protein Data Bank. Through weighted pairwise superposition of the functional atoms of active sites, we captured structural variability at single-residue level and examined the geometrical changes as ligands bind or as mutations occur. We demonstrate that catalytic centres of enzymes can be inherently rigid or flexible to various degrees according to the function they perform, and structural variability most often involves a subset of the catalytic residues, usually those not directly involved in the formation or cleavage of bonds. Moreover, data suggest that 2/3 of active sites are flexible, and in half of those, flexibility is only observed in the side chain. The goal of this work is to characterise our current knowledge of the extent of flexibility at the heart of catalysis and ultimately place our findings in the context of the evolution of catalysis as enzymes evolve new functions and bind different substrates.

Bittrich S, Burley SK, Rose AS. Real-time structural motif searching in proteins using an inverted index strategy. PLoS Comput Biol 2020;16:e1008502. [PMID: 33284792 PMCID: PMC7746303 DOI: 10.1371/journal.pcbi.1008502] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.8] [Received: 04/30/2020] [Revised: 12/17/2020] [Accepted: 11/09/2020] [Indexed: 12/30/2022]

Abstract

Biochemical and biological functions of proteins are the product of both the overall fold of the polypeptide chain, and, typically, structural motifs made up of smaller numbers of amino acids constituting a catalytic center or a binding site that may be remote from one another in amino acid sequence. Detection of such structural motifs can provide valuable insights into the function(s) of previously uncharacterized proteins. Technically, this remains an extremely challenging problem because of the size of the Protein Data Bank (PDB) archive. Existing methods depend on a clustering by sequence similarity and can be computationally slow. We have developed a new approach that uses an inverted index strategy capable of analyzing >170,000 PDB structures with unmatched speed. The efficiency of the inverted index method depends critically on identifying the small number of structures containing the query motif and ignoring most of the structures that are irrelevant. Our approach (implemented at motif.rcsb.org) enables real-time retrieval and superposition of structural motifs, either extracted from a reference structure or uploaded by the user. Herein, we describe the method and present five case studies that exemplify its efficacy and speed for analyzing 3D structures of both proteins and nucleic acids.

The Protein Data Bank (PDB) provides open access to more than 170,000 three-dimensional structures of proteins, nucleic acids, and biological complexes. Similarities between PDB structures give valuable functional and evolutionary insights but such resemblance may not be evident at sequence or global structure level. Throughout the database, there are recurring structural motifs—groups of modest numbers of residues in proximity that, for example, support catalytic activity. Identification of common structural motifs can reveal similarities between proteins and serve as fingerprints for spatial configurations of amino acids, such as the His-Asp-Ser catalytic triad found in serine proteases or the zinc coordination site found in Zinc Finger DNA-binding domains. We present a highly efficient yet flexible strategy that allows users for the first time to search for arbitrary structural motifs across the entire PDB archive in real-time. Our approach scales favorably with the increasing number and complexity of deposited structures, and, also, has the potential to be adapted for other applications in a macromolecular context.
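
The inverted index idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the descriptor (a sorted residue-label pair plus a binned distance), the bin width, the cutoff, and the toy coordinates are all assumptions made for illustration. Each descriptor maps to the set of structures that contain it, so a query only touches structures that share every descriptor of the motif and ignores the rest.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def descriptors(residues, bin_width=1.0, cutoff=8.0):
    """Return hypothetical (label_a, label_b, distance_bin) keys for nearby residue pairs."""
    keys = set()
    for (label_a, pos_a), (label_b, pos_b) in combinations(residues, 2):
        d = dist(pos_a, pos_b)
        if d <= cutoff:
            first, second = sorted((label_a, label_b))
            keys.add((first, second, int(d // bin_width)))
    return keys


def build_index(structures):
    """Map every descriptor to the set of structure IDs in which it occurs."""
    index = defaultdict(set)
    for structure_id, residues in structures.items():
        for key in descriptors(residues):
            index[key].add(structure_id)
    return index


def candidates(index, motif):
    """Intersect the posting lists of all motif descriptors."""
    postings = [index.get(key, set()) for key in descriptors(motif)]
    if not postings:
        return set()
    return set.intersection(*postings)


# Toy data (made up): residue label plus one representative coordinate.
structures = {
    "protA": [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.5, 0.0, 0.0)), ("SER", (0.0, 4.5, 0.0))],
    "protB": [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.5, 0.0, 0.0)), ("GLY", (0.0, 4.5, 0.0))],
    "protC": [("HIS", (1.0, 1.0, 1.0)), ("ASP", (4.5, 1.0, 1.0)), ("SER", (1.0, 5.5, 1.0))],
}
index = build_index(structures)
motif = structures["protA"]
hits = candidates(index, motif)
```

Hard distance bins miss near-matches that fall just across a bin edge, and the surviving candidates would still need a geometric superposition step to be confirmed; both are left out of this sketch.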

Kaiser F, Labudde D. Unsupervised Discovery of Geometrically Common Structural Motifs and Long-Range Contacts in Protein 3D Structures. IEEE/ACM Trans Comput Biol Bioinform 2019;16:671-680. [PMID: 29990265 DOI: 10.1109/tcbb.2017.2786250] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/08/2023]

Parasuram R, Mills CL, Wang Z, Somasundaram S, Beuning PJ, Ondrechen MJ. Local structure based method for prediction of the biochemical function of proteins: Applications to glycoside hydrolases. Methods 2016;93:51-63. [DOI: 10.1016/j.ymeth.2015.11.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 06/01/2015] [Revised: 11/05/2015] [Accepted: 11/09/2015] [Indexed: 01/07/2023]

Kaiser F, Eisold A, Bittrich S, Labudde D. Fit3D: a web application for highly accurate screening of spatial residue patterns in protein structure data. Bioinformatics 2015;32:792-4. [PMID: 26519504 DOI: 10.1093/bioinformatics/btv637] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 08/03/2015] [Accepted: 10/24/2015] [Indexed: 01/12/2023]

Kaiser F, Eisold A, Labudde D. A Novel Algorithm for Enhanced Structural Motif Matching in Proteins. J Comput Biol 2015;22:698-713. [PMID: 25695840 DOI: 10.1089/cmb.2014.0263] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 11/13/2022]

Alderson RG, Barker D, Mitchell JBO. One origin for metallo-β-lactamase activity, or two? An investigation assessing a diverse set of reconstructed ancestral sequences based on a sample of phylogenetic trees. J Mol Evol 2014;79:117-29. [PMID: 25185655 PMCID: PMC4185109 DOI: 10.1007/s00239-014-9639-7] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 03/13/2014] [Accepted: 08/11/2014] [Indexed: 01/04/2023]

Tóth-Petróczy A, Tawfik DS. The robustness and innovability of protein folds. Curr Opin Struct Biol 2014;26:131-8. [PMID: 25038399 DOI: 10.1016/j.sbi.2014.06.007] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.9] [Received: 11/26/2013] [Revised: 06/26/2014] [Accepted: 06/26/2014] [Indexed: 11/30/2022]

He L, Vandin F, Pandurangan G, Bailey-Kellogg C. Ballast: a ball-based algorithm for structural motifs. J Comput Biol 2013;20:137-51. [PMID: 23383999 DOI: 10.1089/cmb.2012.0246] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]

Kirshner DA, Nilmeier JP, Lightstone FC. Catalytic site identification--a web server to identify catalytic site structural matches throughout PDB. Nucleic Acids Res 2013;41:W256-65. [PMID: 23680785 PMCID: PMC3692059 DOI: 10.1093/nar/gkt403] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/11/2023]

Nilmeier JP, Kirshner DA, Wong SE, Lightstone FC. Rapid catalytic template searching as an enzyme function prediction procedure. PLoS One 2013;8:e62535. [PMID: 23675414 PMCID: PMC3651201 DOI: 10.1371/journal.pone.0062535] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 01/31/2013] [Accepted: 03/22/2013] [Indexed: 11/18/2022]

Wang Z, Yin P, Lee JS, Parasuram R, Somarowthu S, Ondrechen MJ. Protein function annotation with Structurally Aligned Local Sites of Activity (SALSAs). BMC Bioinformatics 2013;14 Suppl 3:S13. [PMID: 23514271 PMCID: PMC3584854 DOI: 10.1186/1471-2105-14-s3-s13] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Indexed: 11/30/2022]

Abstract

Background

The prediction of biochemical function from the 3D structure of a protein has proved to be much more difficult than was originally foreseen. A reliable method to test the likelihood of putative annotations and to predict function from structure would add tremendous value to structural genomics data. We report on a new method, Structurally Aligned Local Sites of Activity (SALSA), for the prediction of biochemical function based on a local structural match at the predicted catalytic or binding site.

Results

Implementation of the SALSA method is described. For the structural genomics protein PY01515 (PDB ID 2aqw) from Plasmodium yoelii, it is shown that the putative annotation, Orotidine 5'-monophosphate decarboxylase (OMPDC), is most likely correct. SALSA analysis of YP_001304206.1 (PDB ID 3h3l), a putative sugar hydrolase from Parabacteroides distasonis, shows that its active site does not bear close resemblance to any previously characterized member of its superfamily, the Concanavalin A-like lectins/glucanases. It is noted that three residues in the active site of the thermophilic beta-1,4-xylanase from Nonomuraea flexuosa (PDB ID 1m4w), Y78, E87, and E176, overlap with POOL-predicted residues of similar type, Y168, D153, and E232, in YP_001304206.1. The substrate recognition regions of the two proteins are rather different, suggesting that YP_001304206.1 is a new functional type within the superfamily. A structural genomics protein from Mycobacterium avium (PDB ID 3q1t) has been reported to be an enoyl-CoA hydratase (ECH), but SALSA analysis shows a poor match between the predicted residues for the SG protein and those of known ECHs. A better local structural match is obtained with Anabaena beta-diketone hydrolase (ABDH), a known β-diketone hydrolase from Cyanobacterium anabaena (PDB ID 2j5s). This suggests that the reported ECH function of the SG protein is incorrect and that it is more likely a β-diketone hydrolase.

Conclusions

A local site match provides a more compelling function prediction than that obtainable from a simple 3D structure match. The present method can confirm putative annotations, identify misannotation, and in some cases suggest a more probable annotation.

Wu CY, Hwa YH, Chen YC, Lim C. Hidden relationship between conserved residues and locally conserved phosphate-binding structures in NAD(P)-binding proteins. J Phys Chem B 2012;116:5644-52. [PMID: 22530587 DOI: 10.1021/jp3014332] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 02/06/2023]

Classification of protein functional surfaces using structural characteristics. Proc Natl Acad Sci U S A 2012;109:1170-5. [PMID: 22238424 DOI: 10.1073/pnas.1119684109] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.8] [Indexed: 11/18/2022]

Tseng YY, Li WH. Evolutionary approach to predicting the binding site residues of a protein from its primary sequence. Proc Natl Acad Sci U S A 2011;108:5313-8. [PMID: 21402946 PMCID: PMC3069214 DOI: 10.1073/pnas.1102210108] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 11/18/2022]

Dundas J, Adamian L, Liang J. Structural signatures of enzyme binding pockets from order-independent surface alignment: a study of metalloendopeptidase and NAD binding proteins. J Mol Biol 2011;406:713-29. [PMID: 21145898 PMCID: PMC3061237 DOI: 10.1016/j.jmb.2010.12.005] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 07/02/2010] [Revised: 10/14/2010] [Accepted: 12/03/2010] [Indexed: 10/18/2022]

Abstract

Detecting similarities between local binding surfaces can facilitate identification of enzyme binding sites and prediction of enzyme functions, and aid in our understanding of enzyme mechanisms. Constructing a template of local surface characteristics for a specific enzyme function or binding activity is a challenging task, as the size and shape of the binding surfaces of a biochemical function often vary. Here we introduce the concept of signature binding pockets, which captures information on preserved and varied atomic positions at multiresolution levels. For proteins with complex enzyme binding and activity, multiple signatures arise naturally in our model, forming a signature basis set that characterizes this class of proteins. Both signatures and signature basis sets can be automatically constructed by a method called SOLAR (Signature Of Local Active Regions). This method is based on a sequence-order-independent alignment of computed binding surface pockets. SOLAR also provides a structure-based multiple sequence fragment alignment to facilitate the interpretation of computed signatures. By studying a family of evolutionarily related proteins, we show that for metzincin metalloendopeptidase, which has a broad spectrum of substrate binding, signature and basis set pockets can be used to discriminate metzincins from other enzymes, to predict the subclass of metzincins functions, and to identify specific binding surfaces. Studying unrelated proteins that have evolved to bind to the same NAD cofactor, we constructed signatures of NAD binding pockets and used them to predict NAD binding proteins and to locate NAD binding pockets. By measuring preservation ratio and location variation, our method can identify residues and atoms that are important for binding affinity and specificity. In both cases, we show that signatures and signature basis set reveal significant biological insight.

Moll M, Bryant DH, Kavraki LE. The LabelHash algorithm for substructure matching. BMC Bioinformatics 2010;11:555. [PMID: 21070651 PMCID: PMC2996407 DOI: 10.1186/1471-2105-11-555] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.4] [Received: 07/08/2010] [Accepted: 11/11/2010] [Indexed: 01/01/2025]

Abstract

Background

There is an increasing number of proteins with known structure but unknown function. Determining their function would have a significant impact on understanding diseases and designing new therapeutics. However, experimental protein function determination is expensive and very time-consuming. Computational methods can facilitate function determination by identifying proteins that have high structural and chemical similarity.

Results

We present LabelHash, a novel algorithm for matching substructural motifs to large collections of protein structures. The algorithm consists of two phases. In the first phase the proteins are preprocessed in a fashion that allows for instant lookup of partial matches to any motif. In the second phase, partial matches for a given motif are expanded to complete matches. The general applicability of the algorithm is demonstrated with three different case studies. First, we show that we can accurately identify members of the enolase superfamily with a single motif. Next, we demonstrate how LabelHash can complement SOIPPA, an algorithm for motif identification and pairwise substructure alignment. Finally, a large collection of Catalytic Site Atlas motifs is used to benchmark the performance of the algorithm. LabelHash runs very efficiently in parallel; matching a motif against all proteins in the 95% sequence identity filtered non-redundant Protein Data Bank typically takes no more than a few minutes. The LabelHash algorithm is available through a web server and as a suite of standalone programs at http://labelhash.kavrakilab.org. The output of the LabelHash algorithm can be further analyzed with Chimera through a plugin that we developed for this purpose.

Conclusions

LabelHash is an efficient, versatile algorithm for large-scale substructure matching. When LabelHash is running in parallel, motifs can typically be matched against the entire PDB on the order of minutes. The algorithm is able to identify functional homologs beyond the twilight zone of sequence identity and even beyond fold similarity. The three case studies presented in this paper illustrate the versatility of the algorithm.
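
The two-phase scheme described in this abstract (preprocess for instant lookup of partial matches, then expand partial matches to complete ones) can be sketched in Python as follows. This is only a toy under stated assumptions, not the LabelHash code: the hash key (an ordered pair of residue labels), the distance cutoff, the tolerance, and the function names are hypothetical, and agreement of pairwise distances stands in for the real geometric and statistical checks.

```python
from collections import defaultdict
from itertools import combinations
from math import dist


def build_table(proteins, cutoff=8.0):
    """Phase 1: hash every nearby residue pair under its ordered label pair."""
    table = defaultdict(list)
    for pid, residues in proteins.items():
        for i, j in combinations(range(len(residues)), 2):
            (label_i, pos_i), (label_j, pos_j) = residues[i], residues[j]
            if dist(pos_i, pos_j) <= cutoff:
                # Store both orientations so a lookup needs no sorting.
                table[(label_i, label_j)].append((pid, i, j))
                table[(label_j, label_i)].append((pid, j, i))
    return table


def expand(residues, motif, match, tol):
    """Phase 2: grow a partial match one motif residue at a time."""
    if len(match) == len(motif):
        yield tuple(match)
        return
    label, pos = motif[len(match)]
    for k, (candidate_label, xyz) in enumerate(residues):
        if candidate_label != label or k in match:
            continue
        # Distances to all residues matched so far must agree with the motif.
        if all(abs(dist(xyz, residues[m][1]) - dist(pos, motif[n][1])) <= tol
               for n, m in enumerate(match)):
            yield from expand(residues, motif, match + [k], tol)


def find_matches(proteins, table, motif, tol=0.5):
    """Look up partial matches for the first two motif residues, then expand them."""
    key = (motif[0][0], motif[1][0])
    d01 = dist(motif[0][1], motif[1][1])
    hits = []
    for pid, i, j in table.get(key, []):
        residues = proteins[pid]
        if abs(dist(residues[i][1], residues[j][1]) - d01) <= tol:
            for full in expand(residues, motif, [i, j], tol):
                hits.append((pid, full))
    return hits


# Toy data (made up): residue label plus one representative coordinate.
proteins = {
    "protA": [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.5, 0.0, 0.0)), ("SER", (0.0, 4.5, 0.0))],
    "protB": [("HIS", (0.0, 0.0, 0.0)), ("ASP", (3.5, 0.0, 0.0)), ("GLY", (0.0, 4.5, 0.0))],
    "protC": [("HIS", (1.0, 1.0, 1.0)), ("ASP", (4.5, 1.0, 1.0)), ("SER", (1.0, 5.5, 1.0))],
}
table = build_table(proteins)
matches = find_matches(proteins, table, proteins["protA"])
```

Matching on pairwise distances alone cannot tell a motif from its mirror image, so a fuller version would superpose each hit and score it, for example by RMSD.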

Bryant DH, Moll M, Chen BY, Fofanov VY, Kavraki LE. Analysis of substructural variation in families of enzymatic proteins with applications to protein function prediction. BMC Bioinformatics 2010;11:242. [PMID: 20459833 PMCID: PMC2885373 DOI: 10.1186/1471-2105-11-242] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 09/13/2009] [Accepted: 05/11/2010] [Indexed: 12/02/2022]

Abstract

Background

Structural variations caused by a wide range of physico-chemical and biological sources directly influence the function of a protein. For enzymatic proteins, the structure and chemistry of the catalytic binding site residues can be loosely defined as a substructure of the protein. Comparative analysis of drug-receptor substructures across and within species has been used for lead evaluation. Substructure-level similarity between the binding sites of functionally similar proteins has also been used to identify instances of convergent evolution among proteins. In functionally homologous protein families, shared chemistry and geometry at catalytic sites provide a common, local point of comparison among proteins that may differ significantly at the sequence, fold, or domain topology levels.

Results

This paper describes two key results that can be used separately or in combination for protein function analysis. The Family-wise Analysis of SubStructural Templates (FASST) method uses all-against-all substructure comparison to determine Substructural Clusters (SCs). SCs characterize the binding site substructural variation within a protein family. In this paper we focus on examples of automatically determined SCs that can be linked to phylogenetic distance between family members, segregation by conformation, and organization by homology among convergent protein lineages. The Motif Ensemble Statistical Hypothesis (MESH) framework constructs a representative motif for each protein cluster among the SCs determined by FASST to build motif ensembles that are shown through a series of function prediction experiments to improve the function prediction power of existing motifs.

Conclusions

FASST contributes a critical feedback and assessment step to existing binding site substructure identification methods and can be used for the thorough investigation of structure-function relationships. The application of MESH allows for an automated, statistically rigorous procedure for incorporating structural variation data into protein function prediction pipelines. Our work provides an unbiased, automated assessment of the structural variability of identified binding site substructures among protein structure families and a technique for exploring the relation of substructural variation to protein function. As available proteomic data continues to expand, the techniques proposed will be indispensable for the large-scale analysis and interpretation of structural data.

Bandyopadhyay D, Huan J, Prins J, Snoeyink J, Wang W, Tropsha A. Identification of family-specific residue packing motifs and their use for structure-based protein function prediction: II. Case studies and applications. J Comput Aided Mol Des 2009;23:785-97. [PMID: 19548090 DOI: 10.1007/s10822-009-9277-0] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 07/21/2008] [Accepted: 04/22/2009] [Indexed: 11/25/2022]

Xie L, Xie L, Bourne PE. A unified statistical model to support local sequence order independent similarity searching for ligand-binding sites and its application to genome-based drug discovery. Bioinformatics 2009;25:i305-12. [PMID: 19478004 PMCID: PMC2687974 DOI: 10.1093/bioinformatics/btp220] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.4] [Indexed: 11/13/2022]

Abstract

Functional relationships between proteins that do not share global structure similarity can be established by detecting their ligand-binding-site similarity. For a large-scale comparison, it is critical to accurately and efficiently assess the statistical significance of this similarity. Here, we report an efficient statistical model that supports local sequence-order-independent ligand-binding-site similarity searching. Most existing statistical models only take into account the matching vertices between two sites that are defined by a fixed number of points. In reality, the boundary of the binding site is not known or is dependent on the bound ligand, making these approaches limited. To address these shortcomings and to perform binding-site mapping on a genome-wide scale, we developed a sequence-order-independent profile-profile alignment (SOIPPA) algorithm that is able to detect local similarity between unknown binding sites a priori. The SOIPPA scoring integrates geometric, evolutionary and physical information into a unified framework. However, this imposes a significant challenge in assessing the statistical significance of the similarity because the conventional probability model that is based on fixed-point matching cannot be applied. Here we find that scores for binding-site matching by SOIPPA follow an extreme value distribution (EVD). Benchmark studies show that the EVD model is at least two orders of magnitude faster and more accurate than the non-parametric statistical method in the previous SOIPPA version. Efficient statistical analysis makes it possible to apply SOIPPA to genome-based drug discovery. Consequently, we have applied the approach to the structural genome of Mycobacterium tuberculosis to construct a protein-ligand interaction network. The network reveals highly connected proteins, which represent suitable targets for promiscuous drugs.
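
To make the role of the extreme value distribution concrete, the short Python sketch below turns a similarity score into a p-value. It assumes the Gumbel (type I) form of the EVD, which the abstract does not specify, and the location `mu` and scale `beta` are hypothetical fitted parameters rather than values from the paper.

```python
from math import exp, expm1


def evd_p_value(score, mu, beta):
    """P(S >= score) for a Gumbel distribution with location mu and scale beta.

    The Gumbel CDF is exp(-exp(-z)) with z = (score - mu) / beta, so the
    upper tail is 1 - exp(-exp(-z)); expm1 keeps precision for small tails.
    """
    z = (score - mu) / beta
    return -expm1(-exp(-z))


# Hypothetical parameters, chosen only to show the shape of the curve.
p_typical = evd_p_value(score=10.0, mu=10.0, beta=2.0)
p_high = evd_p_value(score=30.0, mu=10.0, beta=2.0)
```

Because the tail probability comes from a closed-form expression, significance can be assessed without resampling, which is consistent with the speed advantage over a non-parametric method that the abstract reports.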

Tseng YY, Dundas J, Liang J. Predicting protein function and binding profile via matching of local evolutionary and geometric surface patterns. J Mol Biol 2009;387:451-64. [PMID: 19154742 PMCID: PMC2670802 DOI: 10.1016/j.jmb.2008.12.072] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 09/16/2008] [Revised: 12/19/2008] [Accepted: 12/23/2008] [Indexed: 11/25/2022]

Abstract

Inferring protein functions from structures is a challenging task, as a large number of orphan protein structures from structural genomics projects are now solved without their biochemical functions characterized. For proteins binding to similar substrates or ligands and carrying out similar functions, their binding surfaces are under similar physicochemical constraints, and hence the sets of allowed and forbidden residue substitutions are similar. However, it is difficult to isolate such selection pressure due to protein function from selection pressure due to protein folding, and evolutionary relationships reflected by global sequence and structure similarities between proteins are often unreliable for inferring protein function. We have developed a method, called pevoSOAR (pocket-based evolutionary search of amino acid residues), for predicting protein functions by uncovering the amino acid residue substitution pattern due to protein function and separating it from the substitution pattern due to protein folding. We incorporate evolutionary information specific to an individual binding region and match local surfaces on a large scale with millions of precomputed protein surfaces to identify those with similar functions. Our pevoSOAR method also generates a probabilistic model called the computed binding profile that characterizes protein-binding activities that may involve multiple substrates or ligands. We show that our method can be used to predict enzyme functions with accuracy. Our method can also assess enzyme binding specificity and promiscuity. In an objective large-scale test of 100 enzyme families with thousands of structures, our predictions are found to be sensitive and specific: at the stringent specificity level of 99.98%, we can correctly predict enzyme functions for 80.55% of the proteins. The overall area under the receiver operating characteristic curve measuring the performance of our prediction is 0.955, close to the perfect value of 1.00. The best Matthews coefficient is 86.6%. Our method also works well in predicting the biochemical functions of orphan proteins from structural genomics projects.
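
For readers unfamiliar with the performance measures quoted in this abstract, the Python sketch below computes specificity, sensitivity, and the Matthews correlation coefficient from a confusion matrix. The formulas are the standard definitions; the counts are invented for illustration and are not the study's data.

```python
from math import sqrt


def specificity(tp, fp, tn, fn):
    """Fraction of true negatives among all actual negatives."""
    return tn / (tn + fp)


def sensitivity(tp, fp, tn, fn):
    """Fraction of true positives among all actual positives."""
    return tp / (tp + fn)


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient; 0.0 when a marginal total is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts: 100 positives and 10,000 negatives.
counts = dict(tp=80, fp=10, tn=9990, fn=20)
spec = specificity(**counts)
sens = sensitivity(**counts)
mcc = matthews_cc(**counts)
```

With heavily imbalanced classes such as these, specificity stays close to 1 even when a noticeable share of positives is missed, which is why a sensitivity figure and the Matthews coefficient are reported alongside it.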

Zamocky M, Jakopitsch C, Furtmüller PG, Dunand C, Obinger C. The peroxidase-cyclooxygenase superfamily: Reconstructed evolution of critical enzymes of the innate immune system. Proteins 2008;72:589-605. [PMID: 18247411 DOI: 10.1002/prot.21950] [Citation(s) in RCA: 132] [Impact Index Per Article: 7.8] [Indexed: 05/30/2025]

Chien TY, Chang DTH, Chen CY, Weng YZ, Hsu CM. E1DS: catalytic site prediction based on 1D signatures of concurrent conservation. Nucleic Acids Res 2008;36:W291-6. [PMID: 18524800 PMCID: PMC2447799 DOI: 10.1093/nar/gkn324] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/03/2008] [Revised: 04/25/2008] [Accepted: 05/07/2008] [Indexed: 11/21/2022]

Affiliation(s)

Ting-Ying Chien, Darby Tien-Hao Chang, Chien-Yu Chen, Yi-Zhong Weng, Chen-Ming Hsu: Department of Computer Science and Information Engineering, National Taiwan University, Taipei 106, Department of Electrical Engineering, National Cheng Kung University, Tainan 701, Department of Bio-Industrial Mechatronics Engineering, National Taiwan University, Taipei 106 and Department of Computer Science and Engineering, Yuan Ze University, Chung-Li 320, Taiwan, ROC

Xie L, Bourne PE. Detecting evolutionary relationships across existing fold space, using sequence order-independent profile-profile alignments. Proc Natl Acad Sci U S A 2008;105:5441-6. [PMID: 18385384 PMCID: PMC2291117 DOI: 10.1073/pnas.0704422105] [Citation(s) in RCA: 181] [Impact Index Per Article: 10.6] [Received: 05/11/2007] [Indexed: 11/18/2022] Open

Tong W, Williams RJ, Wei Y, Murga LF, Ko J, Ondrechen MJ. Enhanced performance in prediction of protein active sites with THEMATICS and support vector machines. Protein Sci 2007;17:333-41. [PMID: 18096640 DOI: 10.1110/ps.073213608] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.3] [Indexed: 10/22/2022]

Schmidberger JW, Wilce JA, Tsang JSH, Wilce MCJ. Crystal structures of the substrate free-enzyme, and reaction intermediate of the HAD superfamily member, haloacid dehalogenase DehIVa from Burkholderia cepacia MBA4. J Mol Biol 2007;368:706-17. [PMID: 17368477 DOI: 10.1016/j.jmb.2007.02.015] [Citation(s) in RCA: 46] [Impact Index Per Article: 2.6] [Received: 10/21/2006] [Revised: 02/02/2007] [Accepted: 02/07/2007] [Indexed: 11/17/2022]

Tseng YY, Liang J. Predicting enzyme functional surfaces and locating key residues automatically from structures. Ann Biomed Eng 2007;35:1037-42. [PMID: 17294116 DOI: 10.1007/s10439-006-9241-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Received: 09/20/2006] [Accepted: 11/27/2006] [Indexed: 10/23/2022]

Wei Y, Ringe D, Wilson MA, Ondrechen MJ. Identification of functional subclasses in the DJ-1 superfamily proteins. PLoS Comput Biol 2007;3:e10. [PMID: 17257049 PMCID: PMC1782040 DOI: 10.1371/journal.pcbi.0030010] [Citation(s) in RCA: 58] [Impact Index Per Article: 3.2] [Received: 09/11/2006] [Accepted: 12/07/2006] [Indexed: 12/02/2022] Open

Abstract

Genomics has posed the challenge of determination of protein function from sequence and/or 3-D structure. Functional assignment from sequence relationships can be misleading, and structural similarity does not necessarily imply functional similarity. Proteins in the DJ-1 family, many of which are of unknown function, are examples of proteins with both sequence and fold similarity that span multiple functional classes. THEMATICS (theoretical microscopic titration curves), an electrostatics-based computational approach to functional site prediction, is used to sort proteins in the DJ-1 family into different functional classes. Active site residues are predicted for the eight distinct DJ-1 proteins with available 3-D structures. Placement of the predicted residues onto a structural alignment for six of these proteins reveals three distinct types of active sites. Each type overlaps only partially with the others, with only one residue in common across all six sets of predicted residues. Human DJ-1 and YajL from Escherichia coli have very similar predicted active sites and belong to the same probable functional group. Protease I, a known cysteine protease from Pyrococcus horikoshii, and PfpI/YhbO from E. coli, a hypothetical protein of unknown function, belong to a separate class. THEMATICS predicts a set of residues that is typical of a cysteine protease for Protease I; the prediction for PfpI/YhbO bears some similarity. YDR533Cp from Saccharomyces cerevisiae, of unknown function, and the known chaperone Hsp31 from E. coli constitute a third group with nearly identical predicted active sites. While the first four proteins have predicted active sites at dimer interfaces, YDR533Cp and Hsp31 both have predicted sites contained within each subunit. Although YDR533Cp and Hsp31 form different dimers with different orientations between the subunits, the predicted active sites are superimposable within the monomer structures. 
Thus, the three predicted functional classes form four different types of quaternary structures. The computational prediction of the functional sites for protein structures of unknown function provides valuable clues for functional classification.

Lisewski AM, Lichtarge O. Rapid detection of similarity in protein structure and function through contact metric distances. Nucleic Acids Res 2006;34:e152. [PMID: 17130161 PMCID: PMC1702494 DOI: 10.1093/nar/gkl788] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/14/2022] Open

Lu CH, Lin YS, Chen YC, Yu CS, Chang SY, Hwang JK. The fragment transformation method to detect the protein structural motifs. Proteins 2006;63:636-43. [PMID: 16470805 DOI: 10.1002/prot.20904] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]

Burroughs AM, Allen KN, Dunaway-Mariano D, Aravind L. Evolutionary genomics of the HAD superfamily: understanding the structural adaptations and catalytic diversity in a superfamily of phosphoesterases and allied enzymes. J Mol Biol 2006;361:1003-34. [PMID: 16889794 DOI: 10.1016/j.jmb.2006.06.049] [Citation(s) in RCA: 343] [Impact Index Per Article: 18.1] [Received: 12/01/2005] [Revised: 06/16/2006] [Accepted: 06/20/2006] [Indexed: 11/21/2022]

Abstract

The HAD (haloacid dehalogenase) superfamily includes phosphoesterases, ATPases, phosphonatases, dehalogenases, and sugar phosphomutases acting on a remarkably diverse set of substrates. The availability of numerous crystal structures of representatives belonging to diverse branches of the HAD superfamily provides us with a unique opportunity to reconstruct their evolutionary history and uncover the principal determinants that led to their diversification of structure and function. To this end we present a comprehensive analysis of the HAD superfamily that identifies their unique structural features and provides a detailed classification of the entire superfamily. We show that at the highest level the HAD superfamily is unified with several other superfamilies, namely the DHH, receiver (CheY-like), von Willebrand A, TOPRIM, classical histone deacetylases and PIN/FLAP nuclease domains, all of which contain a specific form of the Rossmannoid fold. These Rossmannoid folds are distinguished from others by the presence of equivalently placed acidic catalytic residues, including one at the end of the first core beta-strand of the central sheet. The HAD domain is distinguished from these related Rossmannoid folds by two key structural signatures, a "squiggle" (a single helical turn) and a "flap" (a beta hairpin motif) located immediately downstream of the first beta-strand of their core Rossmannoid fold. The squiggle and the flap motifs are predicted to provide the necessary mobility to these enzymes for them to alternate between the "open" and "closed" conformations. In addition, most members of the HAD superfamily contain inserts, termed caps, occurring at either of two positions in the core Rossmannoid fold. We show that the cap modules have been independently inserted into these two stereotypic positions on multiple occasions in evolution and display extensive evolutionary diversification independent of the core catalytic domain. 
The first group of caps, the C1 caps, is directly inserted into the flap motif and regulates access of reactants to the active site. The second group, the C2 caps, forms a roof over the active site, and access to their internal cavities might be in part regulated by the movement of the flap. The diversification of the cap module was a major factor in the exploration of a vast substrate space in the course of the evolution of this superfamily. We show that the HAD superfamily contains 33 major families distributed across the three superkingdoms of life. Analysis of the phyletic patterns suggests that at least five distinct HAD proteins are traceable to the last universal common ancestor (LUCA) of all extant organisms. While these prototypes diverged prior to the emergence of the LUCA, the major diversification in terms of both substrate specificity and reaction types occurred after the radiation of the three superkingdoms of life, primarily in bacteria. Most major diversification events appear to correlate with the acquisition of new metabolic capabilities, especially related to the elaboration of carbohydrate metabolism in the bacteria. The newly identified relationships and functional predictions provided here are likely to aid the future exploration of the numerous poorly understood members of this large superfamily of enzymes.

Polacco BJ, Babbitt PC. Automated discovery of 3D motifs for protein function annotation. Bioinformatics 2006;22:723-30. [PMID: 16410325 DOI: 10.1093/bioinformatics/btk038] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.7] [Indexed: 11/13/2022] Open

Tseng YY, Liang J. Automated method for predicting enzyme functional surfaces and locating key residues with accuracy and specificity. Conf Proc IEEE Eng Med Biol Soc 2006;2006:4552-4555. [PMID: 17947099 DOI: 10.1109/iembs.2006.259540] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 05/25/2023]

Torrance JW, Bartlett GJ, Porter CT, Thornton JM. Using a library of structural templates to recognise catalytic sites and explore their evolution in homologous families. J Mol Biol 2005;347:565-81. [PMID: 15755451 DOI: 10.1016/j.jmb.2005.01.044] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.6] [Received: 12/14/2004] [Revised: 01/13/2005] [Accepted: 01/19/2005] [Indexed: 11/20/2022]

Van Lanen SG, Reader JS, Swairjo MA, de Crécy-Lagard V, Lee B, Iwata-Reuyl D. From cyclohydrolase to oxidoreductase: discovery of nitrile reductase activity in a common fold. Proc Natl Acad Sci U S A 2005;102:4264-9. [PMID: 15767583 PMCID: PMC555470 DOI: 10.1073/pnas.0408056102] [Citation(s) in RCA: 92] [Impact Index Per Article: 4.6] [Received: 10/29/2004] [Indexed: 11/18/2022] Open