1
Verma R, Pandit SB. Unraveling the structural landscape of intra-chain domain interfaces: Implication in the evolution of domain-domain interactions. PLoS One 2019; 14:e0220336. [PMID: 31374091 PMCID: PMC6677297 DOI: 10.1371/journal.pone.0220336] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/12/2019] [Accepted: 07/12/2019] [Indexed: 12/22/2022] Open
Abstract
Intra-chain domain interactions are known to play a significant role in the function and stability of multidomain proteins. These interactions are mediated through physical contact at domain-domain interfaces (DDIs). To understand the evolution of interfaces, we have investigated similarities among DDIs. Even though interfaces of protein-protein interactions (PPIs) have been previously studied by structurally aligning interfaces, similar analyses have not yet been performed on DDIs of either multidomain proteins or PPIs. To study the structural landscape of DDIs, we used iAlign to structurally align intra-chain domain interfaces. The interface alignment of spatially constrained domains (due to inter-domain linkers) showed that ~88% of these could identify a structurally matching interface having similar C-alpha geometry and contact pattern, even though the aligned domain pairs are not structurally related. Moreover, the mean interface similarity score (IS-score) is 0.307, which is higher than the average random IS-score (0.207), suggesting that domain interfaces are not random. The structural space of DDIs is highly connected, as ~84% of all possible directed edges among interfaces are found to have a path length of at most 8 when the IS-score threshold is 0.26. At this threshold, ~83% of interfaces form the largest strongly connected component. This suggests that the structural space of intra-chain domain interfaces is degenerate and highly connected, as has been found for PPI interfaces. Interestingly, searching for structural neighbors of inter-chain interfaces among intra-chain interfaces showed that ~86% could find a statistically significant match to an intra-chain interface, with a mean IS-score of 0.311. This implies that domain interfaces are degenerate whether formed within a protein or between proteins. The interface degeneracy is most likely due to the limited number of possible ways of packing secondary structures. In principle, interface similarities can be exploited to accurately model domain interfaces in structure prediction of multidomain proteins.
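The connectivity analysis in this abstract (directed edges at an IS-score threshold, largest strongly connected component) can be illustrated with a minimal sketch. It is not the authors' pipeline: the `scores` dictionary, the interface names, and the choice of `>=` for the threshold comparison are assumptions made for illustration, and the component search uses a generic Kosaraju-style traversal.

```python
from collections import defaultdict

def build_graph(scores, threshold=0.26):
    """Keep a directed edge query -> target when the score reaches the threshold.

    `scores` maps (query, target) pairs to a similarity score; this layout is
    a hypothetical stand-in for real interface-alignment output.
    """
    graph = defaultdict(set)
    nodes = set()
    for (query, target), score in scores.items():
        nodes.update((query, target))
        if score >= threshold:  # assumption: threshold is inclusive
            graph[query].add(target)
    return nodes, graph

def largest_scc(nodes, graph):
    """Return the largest strongly connected component (Kosaraju's algorithm)."""
    order, seen = [], set()
    for start in nodes:
        if start in seen:
            continue
        seen.add(start)
        stack = [(start, iter(graph.get(start, ())))]
        while stack:
            node, neighbours = stack[-1]
            for nxt in neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append((nxt, iter(graph.get(nxt, ()))))
                    break
            else:
                # All neighbours explored: record finishing order.
                order.append(node)
                stack.pop()

    # Second pass runs on the reversed graph in decreasing finishing order.
    reverse = defaultdict(set)
    for u, targets in graph.items():
        for v in targets:
            reverse[v].add(u)

    best, assigned = set(), set()
    for start in reversed(order):
        if start in assigned:
            continue
        component, todo = set(), [start]
        assigned.add(start)
        while todo:
            u = todo.pop()
            component.add(u)
            for v in reverse.get(u, ()):
                if v not in assigned:
                    assigned.add(v)
                    todo.append(v)
        if len(component) > len(best):
            best = component
    return best

# Hypothetical scores for four interfaces A-D.
scores = {("A", "B"): 0.30, ("B", "C"): 0.28, ("C", "A"): 0.27,
          ("D", "A"): 0.40, ("A", "D"): 0.20}
nodes, graph = build_graph(scores)
print(sorted(largest_scc(nodes, graph)))
```

In this toy input, D reaches A but A does not reach D at the threshold, so D stays outside the component formed by A, B and C. Edges are directed because the score for a (query, target) pair need not equal the score for the reversed pair.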
Affiliation(s)
- Rivi Verma
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
- Shashi Bhushan Pandit
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
- * E-mail:
2
Dygut J, Kalinowska B, Banach M, Piwowar M, Konieczny L, Roterman I. Structural Interface Forms and Their Involvement in Stabilization of Multidomain Proteins or Protein Complexes. Int J Mol Sci 2016; 17:ijms17101741. [PMID: 27763556 PMCID: PMC5085769 DOI: 10.3390/ijms17101741] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 08/14/2016] [Revised: 09/30/2016] [Accepted: 10/11/2016] [Indexed: 12/20/2022] Open
Abstract
The presented analysis concerns the inter-domain and inter-protein interface in protein complexes. We propose extending the traditional understanding of the protein domain as a function of local compactness with an additional criterion which refers to the presence of a well-defined hydrophobic core. Interface areas in selected homodimers vary with respect to their contribution to shared as well as individual (domain-specific) hydrophobic cores. The basic definition of a protein domain, i.e., a structural unit characterized by tighter packing than its immediate environment, is extended in order to acknowledge the role of a structured hydrophobic core, which includes the interface area. The hydrophobic properties of interfaces vary depending on the status of interacting domains. In this context we can distinguish: (1) shared hydrophobic cores (spanning the whole dimer); (2) individual hydrophobic cores present in each monomer irrespective of whether the dimer contains a shared core. Analysis of interfaces in dystrophin and utrophin indicates the presence of an additional quasi-domain with a prominent hydrophobic core, consisting of fragments contributed by both monomers. In addition, we have also attempted to determine the relationship between the type of interface (as categorized above) and the biological function of each complex. This analysis is entirely based on the fuzzy oil drop model.
Affiliation(s)
- Jacek Dygut
- Department of Rehabilitation, Hospital in Przemyśl, Monte Cassino 18, 37-700 Przemyśl, Poland.
- Barbara Kalinowska
- Faculty of Physics, Astronomy and Applied Computer Science, Jagiellonian University, Łojasiewicza 11, 30-348 Krakow, Poland.
- Mateusz Banach
- Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Łazarza 16, 31-530 Krakow, Poland.
- Monika Piwowar
- Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Łazarza 16, 31-530 Krakow, Poland.
- Leszek Konieczny
- Chair of Medical Biochemistry, Jagiellonian University-Medical College, Kopernika 7, 31-034 Krakow, Poland.
- Irena Roterman
- Department of Bioinformatics and Telemedicine, Jagiellonian University-Medical College, Łazarza 16, 31-530 Krakow, Poland.
3
Esmaielbeiki R, Krawczyk K, Knapp B, Nebel JC, Deane CM. Progress and challenges in predicting protein interfaces. Brief Bioinform 2016; 17:117-31. [PMID: 25971595 PMCID: PMC4719070 DOI: 10.1093/bib/bbv027] [Citation(s) in RCA: 85] [Impact Index Per Article: 9.4] [Received: 01/29/2015] [Revised: 03/18/2015] [Indexed: 12/31/2022] Open
Abstract
The majority of biological processes are mediated via protein-protein interactions. Determination of residues participating in such interactions improves our understanding of molecular mechanisms and facilitates the development of therapeutics. Experimental approaches to identifying interacting residues, such as mutagenesis, are costly and time-consuming and thus, computational methods for this purpose could streamline conventional pipelines. Here we review the field of computational protein interface prediction. We make a distinction between methods which address proteins in general and those targeted at antibodies, owing to the radically different binding mechanism of antibodies. We organize the multitude of currently available methods hierarchically based on required input and prediction principles to provide an overview of the field.
4
Surfing the Protein-Protein Interaction Surface Using Docking Methods: Application to the Design of PPI Inhibitors. Molecules 2015; 20:11569-603. [PMID: 26111183 PMCID: PMC6272567 DOI: 10.3390/molecules200611569] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.8] [Received: 03/27/2015] [Revised: 06/02/2015] [Accepted: 06/15/2015] [Indexed: 02/06/2023] Open
Abstract
Blocking protein-protein interactions (PPI) using small molecules or peptides modulates biochemical pathways and has therapeutic significance. PPI inhibition for designing drug-like molecules is a new area that has been explored extensively during the last decade. Considering the number of available PPI inhibitor databases and the limited number of 3D structures available for proteins, docking and scoring methods play a major role in designing PPI inhibitors as well as stabilizers. Docking methods are used in the design of PPI inhibitors at several stages of finding a lead compound, including modeling the protein complex, screening for hot spots on the protein-protein interaction interface and screening small molecules or peptides that bind to the PPI interface. There are three major challenges to the use of docking on the relatively flat surfaces of PPI. In this review we will provide some examples of the use of docking in PPI inhibitor design as well as its limitations. The combination of experimental and docking methods with improved scoring function has thus far resulted in few success stories of PPI inhibitors for therapeutic purposes. Docking algorithms used for PPI are in the early stages, however, and as more data are available docking will become a highly promising area in the design of PPI inhibitors or stabilizers.
5
Zhou P, Tian F, Ren Y, Shang Z. Systematic classification and analysis of themes in protein-DNA recognition. J Chem Inf Model 2010; 50:1476-88. [PMID: 20726602 DOI: 10.1021/ci100145d] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.0] [Indexed: 12/30/2022]
Abstract
Protein-DNA recognition plays a central role in the regulation of gene expression. With the rapidly increasing number of protein-DNA complex structures available at atomic resolution in recent years, a systematic, complete, and intuitive framework to clarify the intrinsic relationship between the global binding modes of these complexes is needed. In this work, we modified, extended, and applied previously defined RNA-recognition themes to describe protein-DNA recognition and used a protocol that incorporates automatic methods into manual inspection to plant a comprehensive classification tree for currently available high-quality protein-DNA structures. Further, a nonredundant (representative) data set consisting of 200 thematically diverse complexes was extracted from the leaves of the classification tree by using a locally sensitive interface comparison algorithm. On the basis of the representative data set, various physical and chemical properties associated with protein-DNA interactions were analyzed using empirical or semiempirical methods. We also examined the individual energetic components involved in protein-DNA interactions and highlighted the importance of conformational entropy, which has been almost completely ignored in previous studies of protein-DNA binding energy.
Affiliation(s)
- Peng Zhou
- Department of Chemistry, Zhejiang University, Hangzhou 310027, China, College of Bioengineering, Chongqing University, Chongqing 400044, China
6
Abstract
With the advent of Systems Biology, the prediction of whether two proteins form a complex has become a problem of increased importance. A variety of experimental techniques have been applied to the problem, but three-dimensional structural information has not been widely exploited. Here we explore the range of applicability of such information by analyzing the extent to which the location of binding sites on protein surfaces is conserved among structural neighbors. We find, as expected, that interface conservation is most significant among proteins that have a clear evolutionary relationship, but that there is a significant level of conservation even among remote structural neighbors. This finding is consistent with recent evidence that information available from structural neighbors, independent of classification, should be exploited in the search for functional insights. The value of such structural information is highlighted through the development of a new protein interface prediction method, PredUs, that identifies what residues on protein surfaces are likely to participate in complexes with other proteins. The performance of PredUs, as measured through comparisons with other methods, suggests that relationships across protein structure space can be successfully exploited in the prediction of protein-protein interactions.
7
Chang DTH, Syu YT, Lin PC. Predicting the protein-protein interactions using primary structures with predicted protein surface. BMC Bioinformatics 2010; 11 Suppl 1:S3. [PMID: 20122202 PMCID: PMC3009501 DOI: 10.1186/1471-2105-11-s1-s3] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 12/01/2022] Open
Abstract
BACKGROUND Many biological functions involve various protein-protein interactions (PPIs). Elucidating such interactions is crucial for understanding general principles of cellular systems. Previous studies have shown the potential of predicting PPIs based on only sequence information. Compared to approaches that require other auxiliary information, these sequence-based approaches can be applied to a broader range of applications. RESULTS This study presents a novel sequence-based method based on the assumption that protein-protein interactions are more related to amino acids at the surface than those at the core. The present method considers surface information and maintains the advantage of relying on only sequence data by including an accessible surface area (ASA) predictor recently proposed by the authors. This study also reports the experiments conducted to evaluate a) the performance of PPI prediction achieved by including the predicted surface and b) the quality of the predicted surface in comparison with the surface obtained from structures. The experimental results show that surface information helps to predict interacting protein pairs. Furthermore, the prediction performance achieved by using the surface estimated with the ASA predictor is close to that using the surface obtained from protein structures. CONCLUSION This work presents a sequence-based method that takes into account surface information for predicting PPIs. The proposed procedure of surface identification improves the prediction performance with an F-measure of 5.1%. The extracted surfaces are also valuable in other biomedical applications that require similar information.
Affiliation(s)
- Darby Tien-Hao Chang
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
- Yu-Tang Syu
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
- Po-Chang Lin
- Department of Electrical Engineering, National Cheng Kung University, Tainan, 70101, Taiwan
8
Tyagi M, Shoemaker BA, Bryant SH, Panchenko AR. Exploring functional roles of multibinding protein interfaces. Protein Sci 2009; 18:1674-83. [PMID: 19591200 DOI: 10.1002/pro.181] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 01/21/2023]
Abstract
Cellular processes are highly interconnected and many proteins are shared in different pathways. Some of these shared proteins or protein families may interact with diverse partners using the same interface regions; such multibinding proteins are the subject of our study. The main goal of our study is to attempt to decipher the mechanisms of specific molecular recognition of multiple diverse partners by promiscuous protein regions. To address this, we attempt to analyze the physicochemical properties of multibinding interfaces and highlight the major mechanisms of functional switches realized through multibinding. We find that only 5% of protein families in the structure database have multibinding interfaces, and multibinding interfaces do not show any higher sequence conservation compared with the background interface sites. We highlight several important functional mechanisms utilized by multibinding families. (a) Overlap between different functional pathways can be prevented by the switches involving nearby residues of the same interfacial region. (b) Interfaces can be reused in pathways where the substrate should be passed from one protein to another sequentially. (c) The same protein family can develop different specificities toward different binding partners reusing the same interface; and finally, (d) inhibitors can attach to substrate binding sites as substrate mimicry and thereby prevent substrate binding.
Affiliation(s)
- Manoj Tyagi
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
9
Guda C, King BR, Pal LR, Guda P. A top-down approach to infer and compare domain-domain interactions across eight model organisms. PLoS One 2009; 4:e5096. [PMID: 19333396 PMCID: PMC2659750 DOI: 10.1371/journal.pone.0005096] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 10/22/2008] [Accepted: 02/10/2009] [Indexed: 11/22/2022] Open
Abstract
Knowledge of specific domain-domain interactions (DDIs) is essential to understand the functional significance of protein interaction networks. Despite the availability of an enormous amount of data on protein-protein interactions (PPIs), very little is known about specific DDIs occurring in them. Here, we present a top-down approach to accurately infer functionally relevant DDIs from PPI data. We created a comprehensive, non-redundant dataset of 209,165 experimentally-derived PPIs by combining datasets from five major interaction databases. We introduced an integrated scoring system that uses a novel combination of a set of five orthogonal scoring features covering the probabilistic, evolutionary, evidence-based, spatial and functional properties of interacting domains, which can map the interacting propensity of two domains in many dimensions. This method outperforms similar existing methods both in the accuracy of prediction and in the coverage of domain interaction space. We predicted a set of 52,492 high-confidence DDIs to carry out cross-species comparison of DDI conservation in eight model species including human, mouse, Drosophila, C. elegans, yeast, Plasmodium, E. coli and Arabidopsis. Our results show that only 23% of these DDIs are conserved in at least two species and only 3.8% in at least 4 species, indicating a rather low conservation across species. Pair-wise analysis of DDI conservation revealed a ‘sliding conservation’ pattern between the evolutionarily neighboring species. Our methodology and the high-confidence DDI predictions generated in this study can help to better understand the functional significance of PPIs at the modular level, thus can significantly impact further experimental investigations in systems biology research.
Affiliation(s)
- Chittibabu Guda
- GenNYsis Center for Excellence in Cancer Genomics and Department of Epidemiology & Biostatistics, State University of New York at Albany, Rensselaer, NY, USA.
10
Coevolution at protein complex interfaces can be detected by the complementarity trace with important impact for predictive docking. Proc Natl Acad Sci U S A 2008; 105:7708-13. [PMID: 18511568 DOI: 10.1073/pnas.0707032105] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 11/18/2022] Open
Abstract
Protein surfaces are under significant selection pressure to maintain interactions with their partners throughout evolution. Capturing how selection pressure acts at the interfaces of protein-protein complexes is a fundamental issue with high interest for the structural prediction of macromolecular assemblies. We tackled this issue under the assumption that, throughout evolution, mutations should minimally disrupt the physicochemical compatibility between specific clusters of interacting residues. This constraint drove the development of the so-called Surface COmplementarity Trace in Complex History score (SCOTCH), which was found to discriminate with high efficiency the structure of biological complexes. SCOTCH performances were assessed not only with respect to other evolution-based approaches, such as conservation and coevolution analyses, but also with respect to statistically based scoring methods. Validated on a set of 129 complexes of known structure exhibiting both permanent and transient intermolecular interactions, SCOTCH appears as a robust strategy to guide the prediction of protein-protein complex structures. Of particular interest, it also provides a basic framework to efficiently track how protein surfaces could evolve while keeping their partners in contact.
11
Chang CEA, McLaughlin WA, Baron R, Wang W, McCammon JA. Entropic contributions and the influence of the hydrophobic environment in promiscuous protein-protein association. Proc Natl Acad Sci U S A 2008; 105:7456-61. [PMID: 18495919 PMCID: PMC2391134 DOI: 10.1073/pnas.0800452105] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.7] [Received: 01/24/2008] [Indexed: 11/18/2022] Open
Abstract
The mechanisms by which a promiscuous protein can strongly interact with several different proteins using the same binding interface are not completely understood. An example is protein kinase A (PKA), which uses a single face on its docking/dimerization domain to interact with multiple A-kinase anchoring proteins (AKAP) that localize it to different parts of the cell. In the current study, the configurational entropy contributions to the binding between the AKAP protein HT31 with the D/D domain of RII alpha-regulatory subunit of PKA were examined. The results show that the majority of configurational entropy loss for the interaction was due to decreased fluctuations within rotamer states of the side chains. The result is in contrast to the widely held approximation that the decrease in the number of rotamer states available to the side chains forms the major component. Further analysis showed that there was a direct linear relationship between total configurational entropy and the number of favorable, alternative contacts available within hydrophobic environments. The hydrophobic binding pocket of the D/D domain provides alternative contact points for the side chains of AKAP peptides that allow them to adopt different binding conformations. The increase in binding conformations provides an increase in binding entropy and hence binding affinity. We infer that a general strategy for a promiscuous protein is to provide alternative contact points at its interface to increase binding affinity while the plasticity required for binding to multiple partners is retained. Implications are discussed for understanding and treating diseases in which promiscuous protein interactions are used.
Affiliation(s)
- Chia-en A. Chang
- Departments of *Chemistry and Biochemistry and
- Center for Theoretical Biological Physics
- William A. McLaughlin
- Departments of *Chemistry and Biochemistry and
- Center for Theoretical Biological Physics
- Riccardo Baron
- Departments of *Chemistry and Biochemistry and
- Center for Theoretical Biological Physics
- Wei Wang
- Departments of *Chemistry and Biochemistry and
- Center for Theoretical Biological Physics
- J. Andrew McCammon
- Departments of *Chemistry and Biochemistry and
- Pharmacology
- Center for Theoretical Biological Physics
- Howard Hughes Medical Institute, University of California at San Diego, La Jolla, CA 92093-0365
12
Zhang SW, Chen W, Yang F, Pan Q. Using Chou's pseudo amino acid composition to predict protein quaternary structure: a sequence-segmented PseAAC approach. Amino Acids 2008; 35:591-8. [PMID: 18427713 DOI: 10.1007/s00726-008-0086-x] [Citation(s) in RCA: 71] [Impact Index Per Article: 4.2] [Received: 02/19/2008] [Accepted: 02/28/2008] [Indexed: 12/11/2022]
Abstract
In the protein universe, many proteins are composed of two or more polypeptide chains, generally referred to as subunits, which associate through noncovalent interactions and, occasionally, disulfide bonds to form protein quaternary structures. It has long been known that the functions of proteins are closely related to their quaternary structures; some examples include enzymes, hemoglobin, DNA polymerase, and ion channels. However, it is extremely labor-expensive and even impossible to quickly determine the structures of hundreds of thousands of protein sequences solely from experiments. Since the number of protein sequences entering databanks is increasing rapidly, it is highly desirable to develop computational methods for classifying the quaternary structures of proteins from their primary sequences. Since the concept of Chou's pseudo amino acid composition (PseAAC) was introduced, a variety of approaches, such as residue conservation scores, von Neumann entropy, multiscale energy, autocorrelation function, moment descriptors, and cellular automata, have been utilized to formulate the PseAAC for predicting different attributes of proteins. Here, in a different approach, a sequence-segmented PseAAC is introduced to represent protein samples. Meanwhile, multiclass SVM classifier modules were adopted to classify protein quaternary structures. As a demonstration, the dataset constructed by Chou and Cai [(2003) Proteins 53:282-289] was adopted as a benchmark dataset. The overall jackknife success rates thus obtained were 88.2-89.1%, indicating that the new approach is quite promising for predicting protein quaternary structure.
Affiliation(s)
- Shao-Wu Zhang
- College of Automation, Northwestern Polytechnical University, 710072, Xi'an, China.
13
Teyra J, Paszkowski-Rogacz M, Anders G, Pisabarro MT. SCOWLP classification: structural comparison and analysis of protein binding regions. BMC Bioinformatics 2008; 9:9. [PMID: 18182098 PMCID: PMC2259299 DOI: 10.1186/1471-2105-9-9] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.9] [Received: 05/09/2007] [Accepted: 01/08/2008] [Indexed: 11/10/2022] Open
Abstract
Background Detailed information about protein interactions is critical for our understanding of the principles governing protein recognition mechanisms. The structures of many proteins have been experimentally determined in complex with different ligands bound either in the same or different binding regions. Thus, the structural interactome requires the development of tools to classify protein binding regions. A proper classification may provide a general view of the regions that a protein uses to bind others and also facilitate a detailed comparative analysis of the interacting information for specific protein binding regions at atomic level. Such classification might be of potential use for deciphering protein interaction networks, understanding protein function, rational engineering and design. Description Protein binding regions (PBRs) might be ideally described as well-defined separated regions that share no interacting residues with one another. However, PBRs are often irregular and discontinuous, and can share a wide range of interacting residues among them. The criteria used to define an individual binding region are often arbitrary and may differ from those used for other binding regions within a protein family. Therefore, the rationale behind protein interface classification should aim to fulfil the requirements of the analysis to be performed. We extract detailed interaction information of protein domains, peptides and interfacial solvent from the SCOWLP database and we classify the PBRs of each domain family. For this purpose, we define a similarity index based on the overlap of interacting residues mapped in pair-wise structural alignments. We perform our classification with agglomerative hierarchical clustering using the complete-linkage method. Our classification is calculated at different similarity cut-offs to allow flexibility in the analysis of PBRs, a feature especially interesting for those protein families with conflicting binding regions. The hierarchical classification of PBRs is implemented into the SCOWLP database and extends the SCOP classification with three additional family sub-levels: Binding Region, Interface and Contacting Domains. SCOWLP contains 9,334 binding regions distributed within 2,561 families. In 65% of the cases we observe families containing more than one binding region. Besides, 22% of the regions form complexes with more than one different protein family. Conclusion The current SCOWLP classification and its web application represent a framework for the study of protein interfaces and comparative analysis of protein family binding regions. This comparison can be performed at atomic level and allows the user to study interactome conservation and variability. The new SCOWLP classification may be of great utility for reconstruction of protein complexes, understanding protein networks and ligand design. SCOWLP will be updated with every SCOP release. The web application is available at .
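The clustering step described in this abstract (a residue-overlap similarity index, complete-linkage agglomerative clustering, a similarity cut-off) can be sketched as follows. The abstract does not give the exact similarity index, so the Jaccard overlap used here is an assumption, as are the region names and residue numbers; the residue sets are taken to be already mapped onto a common numbering by structural alignment.

```python
def overlap_similarity(a, b):
    """Jaccard overlap of two sets of interacting residue positions.

    Assumption: this stands in for the similarity index of the abstract,
    whose exact definition is not given there.
    """
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def complete_linkage(regions, cutoff):
    """Agglomerative clustering that merges while similarity stays >= cutoff.

    Under complete linkage the similarity of two clusters is the similarity
    of their least similar pair of members.
    """
    clusters = [[name] for name in regions]

    def linkage(c1, c2):
        return min(overlap_similarity(regions[x], regions[y])
                   for x in c1 for y in c2)

    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                score = linkage(clusters[i], clusters[j])
                if best is None or score > best[0]:
                    best = (score, i, j)
        if best[0] < cutoff:
            break  # no remaining pair is similar enough to merge
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical binding regions of one domain family (aligned residue numbers).
regions = {
    "r1": {10, 11, 12, 13},
    "r2": {11, 12, 13, 14},
    "r3": {40, 41, 42},
}
print(complete_linkage(regions, cutoff=0.5))
```

Running the same function at several cut-offs gives coarser or finer groupings of the same regions, which mirrors the idea of calculating the classification at different similarity cut-offs.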
Affiliation(s)
- Joan Teyra
- Structural Bioinformatics, BIOTEC TU-Dresden, Tatzberg 47-51, 01307 Dresden, Germany.
14
Sommer I, Müller O, Domingues FS, Sander O, Weickert J, Lengauer T. Moment invariants as shape recognition technique for comparing protein binding sites. Bioinformatics 2007; 23:3139-46. [PMID: 17977888 DOI: 10.1093/bioinformatics/btm503] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.5] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION An approach for identifying similarities of protein-protein binding sites is presented. The geometric shape of a binding site is described by computing a feature vector based on moment invariants. In order to search for similarities, feature vectors of binding sites are compared. Similar feature vectors indicate binding sites with similar shapes. RESULTS The approach is validated on a representative set of protein-protein binding sites, extracted from the SCOPPI database. When querying binding sites from a representative set, we search for known similarities among 2819 binding sites. A median area under the ROC curve of 0.98 is observed. For half of the queries, a similar binding site is identified among the first two of 2819 when sorting all binding sites according to the proposed similarity measure. Typical examples identified by this method are analyzed and discussed. The nitrogenase iron protein-like SCOP family is clustered hierarchically according to the proposed similarity measure as a case study. AVAILABILITY Python code is available on request from the authors.
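As a simplified illustration of describing a shape by rotation- and translation-invariant moments and comparing the resulting feature vectors, the sketch below uses only the three classical invariants of the second-order central moment matrix of a point set. This is an assumption made for brevity and is not the feature vector of the paper; the function names and the toy coordinates are hypothetical.

```python
import math

def second_order_invariants(points):
    """Three rotation/translation invariants of a 3D point set.

    They are the trace, the sum of principal 2x2 minors, and the determinant
    of the second-order central moment (covariance-like) matrix.
    """
    n = len(points)
    centroid = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - centroid[k] for k in range(3)] for p in points]
    m = [[sum(q[i] * q[j] for q in centered) / n for j in range(3)]
         for i in range(3)]
    j1 = m[0][0] + m[1][1] + m[2][2]
    j2 = (m[0][0] * m[1][1] + m[1][1] * m[2][2] + m[0][0] * m[2][2]
          - m[0][1] ** 2 - m[1][2] ** 2 - m[0][2] ** 2)
    j3 = (m[0][0] * (m[1][1] * m[2][2] - m[1][2] ** 2)
          - m[0][1] * (m[0][1] * m[2][2] - m[1][2] * m[0][2])
          + m[0][2] * (m[0][1] * m[1][2] - m[1][1] * m[0][2]))
    return (j1, j2, j3)

def rank_by_similarity(query, database):
    """Sort database entries by Euclidean distance of their feature vectors."""
    def distance(f, g):
        return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, g)))
    return sorted(database, key=lambda name: distance(query, database[name]))

# Toy "binding site", a rotated and translated copy, and a stretched variant.
site = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (0.0, 0.0, 3.0)]
moved = [(-y, x, z + 5.0) for x, y, z in site]  # 90 degree turn about z, then shift
stretched = [(2.0 * x, y, z) for x, y, z in site]

database = {
    "moved": second_order_invariants(moved),
    "stretched": second_order_invariants(stretched),
}
print(rank_by_similarity(second_order_invariants(site), database))
```

The three invariants live on different scales, so a real comparison would likely normalise them before measuring distance; that step is omitted here.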
Affiliation(s)
- Ingolf Sommer
- Max-Planck-Institute for Informatics, Stuhlsatzenhausweg 85, 66123 Saarbrücken, Germany.
15
Computational prediction of protein-protein interactions. Mol Biotechnol 2007; 38:1-17. [PMID: 18095187 DOI: 10.1007/s12033-007-0069-2] [Citation(s) in RCA: 169] [Impact Index Per Article: 9.4] [Received: 07/04/2007] [Accepted: 07/16/2007] [Indexed: 01/19/2023]
Abstract
Recently a number of computational approaches have been developed for the prediction of protein-protein interactions. Complete genome sequencing projects have provided the vast amount of information needed for these analyses. These methods utilize the structural, genomic, and biological context of proteins and genes in complete genomes to predict protein interaction networks and functional linkages between proteins. Given that experimental techniques remain expensive, time-consuming, and labor-intensive, these methods represent an important advance in proteomics. Some of these approaches utilize sequence data alone to predict interactions, while others combine multiple computational and experimental datasets to accurately build protein interaction maps for complete genomes. These methods represent a complementary approach to current high-throughput projects whose aim is to delineate protein interaction maps in complete genomes. We will describe a number of computational protocols for protein interaction prediction based on the structural, genomic, and biological context of proteins in complete genomes, and detail methods for protein interaction network visualization and analysis.
|
16
|
Henschel A, Winter C, Kim WK, Schroeder M. Using structural motif descriptors for sequence-based binding site prediction. BMC Bioinformatics 2007; 8 Suppl 4:S5. [PMID: 17570148 PMCID: PMC1892084 DOI: 10.1186/1471-2105-8-s4-s5] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Many protein sequences are still poorly annotated. Functional characterization of a protein is often improved by the identification of its interaction partners. Here, we aim to predict protein-protein interactions (PPI) and protein-ligand interactions (PLI) on sequence level using 3D information. To this end, we use machine learning to compile sequential segments that constitute structural features of an interaction site into one profile Hidden Markov Model descriptor. The resulting collection of descriptors can be used to screen sequence databases in order to predict functional sites. RESULTS: We generate descriptors for 740 classified types of protein-protein binding sites and for more than 3,000 protein-ligand binding sites. Cross validation reveals that two thirds of the PPI descriptors are sufficiently conserved and significant enough to be used for binding site recognition. We further validate 230 PPIs that were extracted from the literature, where we additionally identify the interface residues. Finally we test ligand-binding descriptors for the case of ATP. From sequences with Swiss-Prot annotation "ATP-binding", we achieve a recall of 25% with a precision of 89%, whereas Prosite's P-loop motif recognizes an equal amount of hits at the expense of a much higher number of false positives (precision: 57%). Our method yields 771 hits with a precision of 96% that were not previously picked up by any Prosite-pattern. CONCLUSION: The automatically generated descriptors are a useful complement to known Prosite/InterPro motifs. They serve to predict protein-protein as well as protein-ligand interactions along with their binding site residues for proteins where merely sequence information is available.
Affiliation(s)
- Andreas Henschel
- Biotechnological Center, TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Christof Winter
- Biotechnological Center, TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Wan Kyu Kim
- Biotechnological Center, TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Institute for Cellular and Molecular Biology, University of Texas at Austin, Austin, TX 78712, USA
- Michael Schroeder
- Biotechnological Center, TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
|
17
|
Han JH, Batey S, Nickson AA, Teichmann SA, Clarke J. The folding and evolution of multidomain proteins. Nat Rev Mol Cell Biol 2007; 8:319-30. [PMID: 17356578 DOI: 10.1038/nrm2144] [Citation(s) in RCA: 283] [Impact Index Per Article: 15.7] [Indexed: 11/09/2022]
Abstract
Analyses of genomes show that more than 70% of eukaryotic proteins are composed of multiple domains. However, most studies of protein folding focus on individual domains and do not consider how interactions between domains might affect folding. Here, we address this by analysing the three-dimensional structures of multidomain proteins that have been characterized experimentally and observe that where the interface is small and loosely packed, or unstructured, the folding of the domains is independent. Furthermore, recent studies indicate that multidomain proteins have evolved mechanisms to minimize the problems of interdomain misfolding.
Affiliation(s)
- Jung-Hoon Han
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK
|
18
|
Anashkina A, Kuznetsov E, Esipova N, Tumanyan V. Comprehensive statistical analysis of residues interaction specificity at protein-protein interfaces. Proteins 2007; 67:1060-77. [PMID: 17357164 DOI: 10.1002/prot.21363] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]
Abstract
We calculated interchain contacts on the atomic level for a nonredundant set of 4602 protein-protein interfaces using an unbiased Voronoi-Delaunay tessellation method, and made 20x20 residue contact matrices both for homodimers and heterocomplexes. The area of contacts and the distance distribution for these contacts were calculated on both the residue and the atomic levels. We analyzed residue area distribution and showed the existence of two types of interresidue contacts: stochastic and specific. We also derived formulas describing the distribution of contact area for stochastic and specific interactions in parametric form. Maximum pairing preference index was found for Cys-Cys contacts and for oppositely charged interactions. A significant difference in residue contacts was observed between homodimers and heterocomplexes. Interfaces in homodimers were enriched with contacts between residues of the same type due to the effects of structure symmetry.
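A 20x20 residue contact matrix of the kind mentioned in this abstract can be sketched as follows. The sketch assumes the interchain contacts have already been determined by some other means (the Voronoi-Delaunay tessellation step is not reproduced) and are supplied as pairs of three-letter residue names. It counts contacts only, not contact areas, and the name `contact_matrix` is hypothetical.

```python
# The 20 standard amino acids, in alphabetical order of three-letter code.
AMINO_ACIDS = ["ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY",
               "HIS", "ILE", "LEU", "LYS", "MET", "PHE", "PRO", "SER",
               "THR", "TRP", "TYR", "VAL"]
INDEX = {name: i for i, name in enumerate(AMINO_ACIDS)}

def contact_matrix(contacts):
    """Build a symmetric 20x20 matrix of residue-residue contact counts.

    `contacts` is an iterable of (residue_a, residue_b) pairs, one pair
    per observed interchain contact (assumed to be precomputed).
    """
    matrix = [[0] * 20 for _ in range(20)]
    for res_a, res_b in contacts:
        i, j = INDEX[res_a], INDEX[res_b]
        matrix[i][j] += 1
        if i != j:
            # Mirror off-diagonal entries so the matrix stays symmetric.
            matrix[j][i] += 1
    return matrix

# Hypothetical contacts from a single interface.
example = [("CYS", "CYS"), ("ASP", "LYS"), ("LYS", "ASP"), ("GLU", "ARG")]
counts = contact_matrix(example)
print(counts[INDEX["ASP"]][INDEX["LYS"]])
```

Pairing preferences such as those reported for Cys-Cys would require normalising these counts against an expected background, which is left out of this sketch.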
Affiliation(s)
- Anastasya Anashkina
- Laboratory of bioinformatics and system biology, Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, Moscow, Russia.
|
19
|
Abstract
Protein-protein interactions (PPIs) are key elements for the normal functioning of a living cell. This review gives a broad description of the protein interactomics field and discusses several of its aspects. We first introduce the different large-scale experimental approaches, from yeast two-hybrid to mass spectrometry, used to discover PPIs and build protein interaction maps. Single PPI validation techniques such as co-immunoprecipitation or fluorescence methods are then presented, as they are increasingly integrated into global PPI discovery strategies. Data from different experimental sets are compared and an assessment of the different large-scale technologies is presented. Bioinformatics tools can also predict PPIs in silico with good accuracy, PPI databases are now numerous, and topological analysis has led to interesting insights into the nature of network connection. Finally, PPI, as an association of two proteins, has been structurally characterized for many protein complexes and is discussed at length using existing examples. The results obtained so far already provide the biologist with a large set of structured data from which knowledge on pathways and associated protein function can be extracted.
|
20
|
Jothi R, Cherukuri PF, Tasneem A, Przytycka TM. Co-evolutionary analysis of domains in interacting proteins reveals insights into domain-domain interactions mediating protein-protein interactions. J Mol Biol 2006; 362:861-75. [PMID: 16949097 PMCID: PMC1618801 DOI: 10.1016/j.jmb.2006.07.072] [Citation(s) in RCA: 81] [Impact Index Per Article: 4.3] [Received: 04/25/2006] [Revised: 06/19/2006] [Accepted: 07/14/2006] [Indexed: 11/28/2022]
Abstract
Recent advances in functional genomics have helped generate large-scale high-throughput protein interaction data. Such networks, though extremely valuable towards molecular level understanding of cells, do not provide any direct information about the regions (domains) in the proteins that mediate the interaction. Here, we performed co-evolutionary analysis of domains in interacting proteins in order to understand the degree of co-evolution of interacting and non-interacting domains. Using a combination of sequence and structural analysis, we analyzed protein-protein interactions in F1-ATPase, Sec23p/Sec24p, DNA-directed RNA polymerase and nuclear pore complexes, and found that interacting domain pair(s) for a given interaction exhibits higher level of co-evolution than the non-interacting domain pairs. Motivated by this finding, we developed a computational method to test the generality of the observed trend, and to predict large-scale domain-domain interactions. Given a protein-protein interaction, the proposed method predicts the domain pair(s) that is most likely to mediate the protein interaction. We applied this method on the yeast interactome to predict domain-domain interactions, and used known domain-domain interactions found in PDB crystal structures to validate our predictions. Our results show that the prediction accuracy of the proposed method is statistically significant. Comparison of our prediction results with those from two other methods reveals that only a fraction of predictions are shared by all the three methods, indicating that the proposed method can detect known interactions missed by other methods. We believe that the proposed method can be used with other methods to help identify previously unrecognized domain-domain interactions on a genome scale, and could potentially help reduce the search space for identifying interaction sites.
Affiliation(s)
- Raja Jothi
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- *Corresponding author
- Praveen F. Cherukuri
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- Bioinformatics Program, Boston University, Boston, MA 02215, USA
- Asba Tasneem
- Booz Allen Hamilton Inc., Rockville, MD 20852, USA
- Teresa M. Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
- *Corresponding author
|
21
|
Kim WK, Henschel A, Winter C, Schroeder M. The many faces of protein-protein interactions: A compendium of interface geometry. PLoS Comput Biol 2006; 2:e124. [PMID: 17009862 PMCID: PMC1584320 DOI: 10.1371/journal.pcbi.0020124] [Citation(s) in RCA: 86] [Impact Index Per Article: 4.5] [Received: 04/07/2006] [Accepted: 07/31/2006] [Indexed: 11/18/2022] Open
Abstract
A systematic classification of protein-protein interfaces is a valuable resource for understanding the principles of molecular recognition and for modelling protein complexes. Here, we present a classification of domain interfaces according to their geometry. Our new algorithm uses a hybrid approach of both sequential and structural features. The accuracy is evaluated on a hand-curated dataset of 416 interfaces. Our hybrid procedure achieves 83% precision and 95% recall, which improves the earlier sequence-based method by 5% on both terms. We classify virtually all domain interfaces of known structure, which results in nearly 6,000 distinct types of interfaces. In 40% of the cases, the interacting domain families associate in multiple orientations, suggesting that all the possible binding orientations need to be explored for modelling multidomain proteins and protein complexes. In general, hub proteins are shown to use distinct surface regions (multiple faces) for interactions with different partners. Our classification provides a convenient framework to query genuine gene fusion, which conserves binding orientation in both fused and separate forms. The result suggests that the binding orientations are not conserved in at least one-third of the gene fusion cases detected by a conventional sequence similarity search. We show that any evolutionary analysis on interfaces can be skewed by multiple binding orientations and multiple interaction partners. The taxonomic distribution of interface types suggests that ancient interfaces common to the three major kingdoms of life are enriched by symmetric homodimers. The classification results are online at http://www.scoppi.org.
Affiliation(s)
- Wan Kyu Kim
- Bioinformatics Group, Biotechnological Centre, Technische Universität Dresden, Dresden, Germany
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Cambridge, United Kingdom
- Andreas Henschel
- Bioinformatics Group, Biotechnological Centre, Technische Universität Dresden, Dresden, Germany
- Christof Winter
- Bioinformatics Group, Biotechnological Centre, Technische Universität Dresden, Dresden, Germany
- Michael Schroeder
- Bioinformatics Group, Biotechnological Centre, Technische Universität Dresden, Dresden, Germany
- * To whom correspondence should be addressed.
|
22
|
Winter C, Henschel A, Kim WK, Schroeder M. SCOPPI: a structural classification of protein-protein interfaces. Nucleic Acids Res 2006; 34:D310-4. [PMID: 16381874 PMCID: PMC1347461 DOI: 10.1093/nar/gkj099] [Citation(s) in RCA: 119] [Impact Index Per Article: 6.3] [Indexed: 11/27/2022] Open
Abstract
SCOPPI, the structural classification of protein–protein interfaces, is a comprehensive database that classifies and annotates domain interactions derived from all known protein structures. SCOPPI applies SCOP domain definitions and a distance criterion to determine inter-domain interfaces. Using a novel method based on multiple sequence and structural alignments of SCOP families, SCOPPI presents a comprehensive geometrical classification of domain interfaces. Various interface characteristics such as number, type and position of interacting amino acids, conservation, interface size, and permanent or transient nature of the interaction are further provided. Proteins in SCOPPI are annotated with Gene Ontology terms, and the ontology can be used to quickly browse SCOPPI. Screenshots are available for every interface and its participating domains. Here, we describe contents and features of the web-based user interface as well as the underlying methods used to generate SCOPPI's data. In addition, we present a number of examples where SCOPPI becomes a useful tool to analyze viral mimicry of human interface binding sites, gene fusion events, conservation of interface residues and diversity of interface localizations. SCOPPI is available at .
Affiliation(s)
- Christof Winter
- Biotechnological Centre of TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Andreas Henschel
- Biotechnological Centre of TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Wan Kyu Kim
- Biotechnological Centre of TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- Michael Schroeder
- Biotechnological Centre of TU Dresden, Tatzberg 47-51, 01307 Dresden, Germany
- To whom correspondence should be addressed. Tel: +49 351 463 40062; Fax: +49 351 463 40061
|