1
Pierson E, De Pol F, Fillet M, Wouters J. A morpheein equilibrium regulates catalysis in phosphoserine phosphatase SerB2 from Mycobacterium tuberculosis. Commun Biol 2023; 6:1024. [PMID: 37817000 PMCID: PMC10564941 DOI: 10.1038/s42003-023-05402-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2023] [Accepted: 09/29/2023] [Indexed: 10/12/2023]
Abstract
Mycobacterium tuberculosis phosphoserine phosphatase MtSerB2 is of interest as a new antituberculosis target due to its essential metabolic role in L-serine biosynthesis and effector functions in infected cells. Previous works indicated that MtSerB2 is regulated through an oligomeric transition induced by L-Ser that could serve as a basis for the design of selective allosteric inhibitors. However, the mechanism underlying this transition remains highly elusive due to the lack of experimental structural data. Here we describe a structural, biophysical, and enzymological characterisation of MtSerB2 oligomerisation in the presence and absence of L-Ser. We show that MtSerB2 coexists in dimeric, trimeric, and tetrameric forms of different activity levels interconverting through a conformationally flexible monomeric state, which is not observed in two near-identical mycobacterial orthologs. This morpheein behaviour exhibited by MtSerB2 lays the foundation for future allosteric drug discovery and provides a starting point to the understanding of its peculiar multifunctional moonlighting properties.
Affiliation(s)
- Elise Pierson
- Laboratoire de Chimie Biologique Structurale (CBS), Namur Research Institute for Life Sciences (NARILIS), University of Namur (UNamur), 5000, Namur, Belgium
- Florian De Pol
- Laboratoire de Chimie Biologique Structurale (CBS), Namur Research Institute for Life Sciences (NARILIS), University of Namur (UNamur), 5000, Namur, Belgium
- Marianne Fillet
- Laboratory for the Analysis of Medicines (LAM), Center for Interdisciplinary Research on Medicines (CIRM), University of Liège (ULiège), 4000, Liège, Belgium
- Johan Wouters
- Laboratoire de Chimie Biologique Structurale (CBS), Namur Research Institute for Life Sciences (NARILIS), University of Namur (UNamur), 5000, Namur, Belgium
2
Collins KW, Copeland MM, Kotthoff I, Singh A, Kundrotas PJ, Vakser IA. Dockground resource for protein recognition studies. Protein Sci 2022; 31:e4481. [PMID: 36281025 PMCID: PMC9667896 DOI: 10.1002/pro.4481] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/15/2022] [Revised: 10/19/2022] [Accepted: 10/20/2022] [Indexed: 12/13/2022]
Abstract
Structural information of protein-protein interactions is essential for characterization of life processes at the molecular level. While a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes current status of the datasets, new automated update procedures and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters, such as release date, multimeric state, complex type, structure resolution, and so on, visualization of the search results with a number of customizable parameters, as well as downloadable datasets with predefined levels of sequence and structure redundancy.
Affiliation(s)
- Ian Kotthoff
- Computational Biology Program, The University of Kansas, Kansas, USA
- Amar Singh
- Computational Biology Program, The University of Kansas, Kansas, USA
- Ilya A. Vakser
- Computational Biology Program, The University of Kansas, Kansas, USA
- Department of Molecular Biosciences, The University of Kansas, Kansas, USA
3
Bun JS, Slack MD, Schemenauer DE, Johnson RJ. Comparative analysis of the human serine hydrolase OVCA2 to the model serine hydrolase homolog FSH1 from S. cerevisiae. PLoS One 2020; 15:e0230166. [PMID: 32182256 PMCID: PMC7077851 DOI: 10.1371/journal.pone.0230166] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/19/2019] [Accepted: 02/20/2020] [Indexed: 11/22/2022]
Abstract
Over 100 metabolic serine hydrolases are present in humans with confirmed functions in metabolism, immune response, and neurotransmission. Among potentially clinically relevant but uncharacterized human serine hydrolases is OVCA2, a serine hydrolase that has been linked with a variety of cancer-related processes. Herein, we developed a heterologous expression system for OVCA2 and determined the comprehensive substrate specificity of OVCA2 against two ester substrate libraries. Based on this analysis, OVCA2 was confirmed as a serine hydrolase with a strong preference for long-chain alkyl ester substrates (>10-carbons) and high selectivity against a variety of short, branched, and substituted esters. Substitutional analysis was used to identify the catalytic residues of OVCA2 with a Ser117-His206-Asp179 classic catalytic triad. Comparison of the substrate specificity of OVCA2 to the model homologue FSH1 from Saccharomyces cerevisiae illustrated the tighter substrate selectivity of OVCA2, but also their overlapping substrate preference for extended straight-chain alkyl esters. Confirmation of the overlapping biochemical properties of OVCA2 and FSH1 was used to model structural information about OVCA2. Together, our analysis provides detailed substrate specificity information about a previously uncharacterized human serine hydrolase and begins to define the biological properties of OVCA2.
Affiliation(s)
- Jessica S. Bun
- Department of Chemistry and Biochemistry, Butler University, Indianapolis, Indiana, United States of America
- Michael D. Slack
- Department of Chemistry and Biochemistry, Butler University, Indianapolis, Indiana, United States of America
- Daniel E. Schemenauer
- Department of Chemistry and Biochemistry, Butler University, Indianapolis, Indiana, United States of America
- R. Jeremy Johnson
- Department of Chemistry and Biochemistry, Butler University, Indianapolis, Indiana, United States of America
4
Roel-Touris J, Don CG, V Honorato R, Rodrigues JPGLM, Bonvin AMJJ. Less Is More: Coarse-Grained Integrative Modeling of Large Biomolecular Assemblies with HADDOCK. J Chem Theory Comput 2019; 15:6358-6367. [PMID: 31539250 PMCID: PMC6854652 DOI: 10.1021/acs.jctc.9b00310] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Indexed: 12/30/2022]
Abstract
Predicting the 3D structure of protein interactions remains a challenge in the field of computational structural biology. This is in part due to difficulties in sampling the complex energy landscape of multiple interacting flexible polypeptide chains. Coarse-graining approaches, which reduce the number of degrees of freedom of the system, help address this limitation by smoothing the energy landscape, allowing an easier identification of the global energy minimum. They also accelerate the calculations, allowing for modeling larger assemblies. Here, we present the implementation of the MARTINI coarse-grained force field for proteins into HADDOCK, our integrative modeling platform. Docking and refinement are performed at the coarse-grained level, and the resulting models are then converted back to atomistic resolution through a distance restraints-guided morphing procedure. Our protocol, tested on the largest complexes of the protein docking benchmark 5, shows an overall ∼7-fold speed increase compared to standard all-atom calculations, while maintaining a similar accuracy and yielding substantially more near-native solutions. To showcase the potential of our method, we performed simultaneous 7 body docking to model the 1:6 KaiC-KaiB complex, integrating mutagenesis and hydrogen/deuterium exchange data from mass spectrometry with symmetry restraints, and validated the resulting models against a recently published cryo-EM structure.
Affiliation(s)
- Jorge Roel-Touris
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands
- Charleen G Don
- Department of Pharmaceutical Sciences, University of Basel, 4056 Basel, Switzerland
- Rodrigo V Honorato
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands
- João P G L M Rodrigues
- Department of Structural Biology, Stanford University School of Medicine, Stanford, California 94305, United States
- Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Utrecht 3584CH, The Netherlands
5
Development of a new benchmark for assessing the scoring functions applicable to protein–protein interactions. Future Med Chem 2018; 10:1555-1574. [DOI: 10.4155/fmc-2017-0261] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 12/13/2022]
Abstract
Aim: Scoring functions are an important component of protein–protein docking methods. They need to be evaluated on high-quality benchmarks to reveal their strengths and weaknesses. Evaluation results obtained on such benchmarks can provide valuable guidance for developing more advanced scoring functions. Methodology & results: In our comparative assessment of scoring functions for protein–protein interactions benchmark, the performance of a scoring function was characterized by ‘docking power’ and ‘scoring power’. A high-quality dataset of 273 protein–protein complexes was compiled and employed in both tests. Four scoring functions, FASTCONTACT, ZRANK, dDFIRE and ATTRACT, were tested as a demonstration. ZRANK and ATTRACT exhibited encouraging performance in the docking power test. However, all four scoring functions failed badly in the scoring power test. Conclusion: Our comparative assessment of scoring functions for protein–protein interaction benchmark was created specifically for assessing the scoring functions applicable to protein–protein interactions. It is different from other benchmarks for assessing protein–protein docking methods. Our benchmark is available to the public at www.pdbbind-cn.org/download/CASF-PPI/.
6
Abstract
ExoU is a type III-secreted cytotoxin expressing A2 phospholipase activity when injected into eukaryotic target cells by the bacterium Pseudomonas aeruginosa. The enzymatic activity of ExoU is undetectable in vitro unless ubiquitin, a required cofactor, is added to the reaction. The role of ubiquitin in facilitating ExoU enzymatic activity is poorly understood but of significance for designing inhibitors to prevent tissue injury during infections with strains of P. aeruginosa producing this toxin. Most ubiquitin-binding proteins, including ExoU, demonstrate a low (micromolar) affinity for monoubiquitin (monoUb). Additionally, ExoU is a large and dynamic protein, limiting the applicability of traditional structural techniques such as NMR and X-ray crystallography to define this protein-protein interaction. Recent advancements in computational methods, however, have allowed high-resolution protein modeling using sparse data. In this study, we combine double electron-electron resonance (DEER) spectroscopy and Rosetta modeling to identify potential binding interfaces of ExoU and monoUb. The lowest-energy scoring model was tested using biochemical, biophysical, and biological techniques. To verify the binding interface, Rosetta was used to design a panel of mutations to modulate binding, including one variant with enhanced binding affinity. Our analyses show the utility of computational modeling when combined with sensitive biological assays and biophysical approaches that are exquisitely suited for large dynamic proteins.
7
Neurotoxic Effects of Linalool and β-Pinene on Tribolium castaneum Herbst. Molecules 2017; 22:2052. [PMID: 29186788 PMCID: PMC6149882 DOI: 10.3390/molecules22122052] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.6] [Received: 09/09/2017] [Accepted: 11/21/2017] [Indexed: 11/17/2022]
Abstract
Effective, ethical pest control requires the use of chemicals that are highly specific, safe, and ecofriendly. Linalool and β-pinene occur naturally as major constituents of the essential oils of many plant species distributed throughout the world, and thus meet these requirements. These monoterpenes were tested as repellents against Tribolium castaneum using the area preference method after four hours of exposure, together with their effect on the transcription of genes associated with neurotransmission. Changes in gene expression of acetylcholinesterase (Ace1), GABA-gated anion channel splice variant 3a6a (Rdl), GABA-gated ion channel (Grd), glutamate-gated chloride channel (Glucl), and histamine-gated chloride channel 2 (Hiscl2) were assessed, and the interaction with proteins important for the insect was also studied using in silico methods. For linalool and β-pinene, the repellent concentration 50 (RC50) values were 0.11 µL/cm² and 0.03 µL/cm², respectively. Both compounds induced overexpression of the Hiscl2 gene in adult insects, and β-pinene also promoted the overexpression of the Grd and Ace1 genes. However, β-pinene and linalool had little potential to dock on computer-generated models for GABA-gated ion channel LCCH3, nicotinic acetylcholine receptor subunits alpha1 and alpha2, and putative octopamine/tyramine receptor proteins from T. castaneum, as their respective binding affinities were marginal; the repellent action therefore probably involved mechanisms other than direct interaction with these targets. Results indicated that β-pinene was more potent than linalool in inducing insect repellency, and also had a greater capacity to generate changes in the expression of genes involved in neuronal transmission.
8
Kundrotas PJ, Anishchenko I, Dauzhenka T, Kotthoff I, Mnevets D, Copeland MM, Vakser IA. Dockground: A comprehensive data resource for modeling of protein complexes. Protein Sci 2017; 27:172-181. [PMID: 28891124 DOI: 10.1002/pro.3295] [Citation(s) in RCA: 47] [Impact Index Per Article: 6.7] [Received: 06/30/2017] [Revised: 09/06/2017] [Accepted: 09/07/2017] [Indexed: 12/28/2022]
Abstract
Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.
Affiliation(s)
- Petras J Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ivan Anishchenko
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Taras Dauzhenka
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ian Kotthoff
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Daniil Mnevets
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Matthew M Copeland
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ilya A Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
9
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone. Proc Natl Acad Sci U S A 2016; 113:15018-15023. [PMID: 27965389 DOI: 10.1073/pnas.1611861114] [Citation(s) in RCA: 32] [Impact Index Per Article: 4.0] [Indexed: 11/18/2022]
Abstract
Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
10
Chakravarty D, Janin J, Robert CH, Chakrabarti P. Changes in protein structure at the interface accompanying complex formation. IUCrJ 2015; 2:643-52. [PMID: 26594372 PMCID: PMC4645109 DOI: 10.1107/s2052252515015250] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 06/12/2015] [Accepted: 08/16/2015] [Indexed: 06/05/2023]
Abstract
Protein interactions are essential in all biological processes. The changes brought about in the structure when a free component forms a complex with another molecule need to be characterized for a proper understanding of molecular recognition as well as for the successful implementation of docking algorithms. Here, unbound (U) and bound (B) forms of protein structures from the Protein-Protein Interaction Affinity Database are compared in order to enumerate the changes that occur at the interface atoms/residues in terms of the solvent-accessible surface area (ASA), secondary structure, temperature factors (B factors) and disorder-to-order transitions. It is found that the interface atoms optimize contacts with the atoms in the partner protein, which leads to an increase in their ASA in the bound interface in the majority (69%) of the proteins when compared with the unbound interface, and this is independent of the root-mean-square deviation between the U and B forms. Changes in secondary structure during the transition indicate a likely extension of helices and strands at the expense of turns and coils. A reduction in flexibility during complex formation is reflected in the decrease in B factors of the interface residues on going from the U form to the B form. There is, however, no distinction in flexibility between the interface and the surface in the monomeric structure, thereby highlighting the potential problem of using B factors for the prediction of binding sites in the unbound form for docking another protein. 16% of the proteins have missing (disordered) residues in the U form which are observed (ordered) in the B form, mostly with an irregular conformation; the data set also shows differences in the composition of interface and non-interface residues in the disordered polypeptide segments as well as differences in their surface burial.
Affiliation(s)
- Devlina Chakravarty
- Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700 054, India
- Joël Janin
- IBBMC, CNRS UMR 8619, Universite Paris-Sud 11, Orsay, France
- Charles H Robert
- CNRS Laboratoire de Biochimie Theorique, Institut de Biologie Physico-Chimique (IBPC), Universite Paris Diderot, Sorbonne Paris Cité, 13 Rue Pierre et Marie Curie, 75005 Paris, France
- Pinak Chakrabarti
- Department of Biochemistry, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700 054, India
11
Soner S, Ozbek P, Garzon JI, Ben-Tal N, Haliloglu T. DynaFace: Discrimination between Obligatory and Non-obligatory Protein-Protein Interactions Based on the Complex's Dynamics. PLoS Comput Biol 2015; 11:e1004461. [PMID: 26506003 PMCID: PMC4623975 DOI: 10.1371/journal.pcbi.1004461] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/30/2015] [Accepted: 07/08/2015] [Indexed: 12/31/2022]
Abstract
Protein-protein interfaces have been evolutionarily-designed to enable transduction between the interacting proteins. Thus, we hypothesize that analysis of the dynamics of the complex can reveal details about the nature of the interaction, and in particular whether it is obligatory, i.e., persists throughout the entire lifetime of the proteins, or not. Indeed, normal mode analysis, using the Gaussian network model, shows that for the most part obligatory and non-obligatory complexes differ in their decomposition into dynamic domains, i.e., the mobile elements of the protein complex. The dynamic domains of obligatory complexes often mix segments from the interacting chains, and the hinges between them do not overlap with the interface between the chains. In contrast, in non-obligatory complexes the interface often hinges between dynamic domains, held together through few anchor residues on one side of the interface that interact with their counterpart grooves in the other end. In automatic analysis, 117 of 139 obligatory (84.2%) and 203 of 246 non-obligatory (82.5%) complexes are correctly classified by our method: DynaFace. We further use DynaFace to predict obligatory and non-obligatory interactions among a set of 300 putative protein complexes. DynaFace is available at: http://safir.prc.boun.edu.tr/dynaface.
Affiliation(s)
- Seren Soner
- Department of Computer Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
- Pemra Ozbek
- Department of Bioengineering, Marmara University, Istanbul, Turkey
- Jose Ignacio Garzon
- Departments of Biochemistry and Molecular Biophysics and Systems Biology and Howard Hughes Medical Institute, Columbia University, New York, New York, United States of America
- Nir Ben-Tal
- Department of Biochemistry and Molecular Biology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Tel Aviv, Israel
- Turkan Haliloglu
- Department of Chemical Engineering and Polymer Research Center, Bogazici University, Istanbul, Turkey
12
Frezza E, Lavery R. Internal Normal Mode Analysis (iNMA) Applied to Protein Conformational Flexibility. J Chem Theory Comput 2015; 11:5503-12. [DOI: 10.1021/acs.jctc.5b00724] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/29/2022]
Affiliation(s)
- Elisa Frezza
- BMSSI, UMR 5086 CNRS/Univ. Lyon I, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, Lyon 69367, France
- Richard Lavery
- BMSSI, UMR 5086 CNRS/Univ. Lyon I, Institut de Biologie et Chimie des Protéines, 7 passage du Vercors, Lyon 69367, France
13
Zhang W, Zhang H, Zhang T, Fan H, Hao Q. Protein-complex structure completion using IPCAS (Iterative Protein Crystal structure Automatic Solution). Acta Crystallogr D Biol Crystallogr 2015; 71:1487-92. [PMID: 26143920 DOI: 10.1107/s1399004715008597] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Received: 11/21/2014] [Accepted: 05/02/2015] [Indexed: 11/10/2022]
Abstract
Protein complexes are essential components in many cellular processes. In this study, a procedure to determine the protein-complex structure from a partial molecular-replacement (MR) solution is demonstrated using a direct-method-aided dual-space iterative phasing and model-building program suite, IPCAS (Iterative Protein Crystal structure Automatic Solution). The IPCAS iteration procedure involves (i) real-space model building and refinement, (ii) direct-method-aided reciprocal-space phase refinement and (iii) phase improvement through density modification. The procedure has been tested with four protein complexes, including two previously unknown structures. It was possible to use IPCAS to build the whole complex structure from one or less than one subunit once the molecular-replacement method was able to give a partial solution. In the most challenging case, IPCAS was able to extend to the full length starting from less than 30% of the complex structure, while conventional model-building procedures were unsuccessful.
Affiliation(s)
- Weizhe Zhang
- Department of Physiology, University of Hong Kong, Hong Kong
- Hongmin Zhang
- Department of Physiology, University of Hong Kong, Hong Kong
- Tao Zhang
- Institute of Physics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
- Haifu Fan
- Institute of Physics, Chinese Academy of Sciences, Beijing 100080, People's Republic of China
- Quan Hao
- Department of Physiology, University of Hong Kong, Hong Kong
14
Krull F, Korff G, Elghobashi-Meinhardt N, Knapp EW. ProPairs: A Data Set for Protein–Protein Docking. J Chem Inf Model 2015; 55:1495-507. [DOI: 10.1021/acs.jcim.5b00082] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Indexed: 12/17/2022]
Affiliation(s)
- Florian Krull
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Gerrit Korff
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Nadia Elghobashi-Meinhardt
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Ernst-Walter Knapp
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
15
Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Structural templates for comparative protein docking. Proteins 2015; 83:1563-70. [PMID: 25488330 DOI: 10.1002/prot.24736] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 09/23/2014] [Revised: 11/15/2014] [Accepted: 11/26/2014] [Indexed: 11/07/2022]
Abstract
Structural characterization of protein-protein interactions is important for understanding life processes. Because of the inherent limitations of experimental techniques, such characterization requires computational approaches. Along with the traditional protein-protein docking (free search for a match between two proteins), comparative (template-based) modeling of protein-protein complexes has been gaining popularity. Its development puts an emphasis on full and partial structural similarity between the target protein monomers and the protein-protein complexes previously determined by experimental techniques (templates). The template-based docking relies on the quality and diversity of the template set. We present a carefully curated, nonredundant library of templates containing 4950 full structures of binary complexes and 5936 protein-protein interfaces extracted from the full structures at 12 Å distance cut-off. Redundancy in the libraries was removed by clustering the PDB structures based on structural similarity. The value of the clustering threshold was determined from the analysis of the clusters and the docking performance on a benchmark set. High structural quality of the interfaces in the template and validation sets was achieved by automated procedures and manual curation. The library is included in the Dockground resource for molecular recognition studies at http://dockground.bioinformatics.ku.edu.
Affiliation(s)
- Ivan Anishchenko
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Petras J Kundrotas
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047
- Alexander V Tuzikov
- United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
- Ilya A Vakser
- Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
16
|
Anishchenko I, Kundrotas PJ, Tuzikov AV, Vakser IA. Protein models docking benchmark 2. Proteins 2015; 83:891-7. [PMID: 25712716 DOI: 10.1002/prot.24784] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 10/06/2014] [Revised: 01/30/2015] [Accepted: 02/14/2015] [Indexed: 12/28/2022]
Abstract
Structural characterization of protein-protein interactions is essential for our ability to understand life processes. However, only a fraction of known proteins have experimentally determined structures. Such structures provide templates for modeling of a large part of the proteome, where individual proteins can be docked by template-free or template-based techniques. Still, the sensitivity of the docking methods to the inherent inaccuracies of protein models, as opposed to the experimentally determined high-resolution structures, remains largely untested, primarily due to the absence of appropriate benchmark set(s). Structures in such a set should have predefined inaccuracy levels and, at the same time, resemble actual protein models in terms of structural motifs/packing. The set should also be large enough to ensure statistical reliability of the benchmarking results. We present a major update of the previously developed benchmark set of protein models. For each interactor, six models were generated with the model-to-native Cα RMSD in the 1 to 6 Å range. The models in the set were generated by a new approach, which corresponds to the actual modeling of new protein structures in the "real case scenario," as opposed to the previous set, where a significant number of structures were model-like only. In addition, the larger number of complexes (165 vs. 63 in the previous set) increases the statistical reliability of the benchmarking. We estimated the highest accuracy of the predicted complexes (according to CAPRI criteria), which can be attained using the benchmark structures. The set is available at http://dockground.bioinformatics.ku.edu.
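For readers unfamiliar with the model-to-native Cα RMSD used to grade the models, a minimal sketch of the calculation follows. It assumes the model has already been superposed on the native structure and that the two Cα lists are in the same residue order; the superposition step (for example a Kabsch fit) is omitted, and the function name `ca_rmsd` is introduced only for this example.

```python
import math

def ca_rmsd(model, native):
    """Root-mean-square deviation between paired C-alpha coordinates.

    `model` and `native` are equal-length lists of (x, y, z) tuples in
    angstroms, already superposed (an assumption of this sketch).
    """
    if not model or len(model) != len(native):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(model, native))
    return math.sqrt(squared / len(model))

# Two residues, each displaced by 1 angstrom along z.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
model = [(0.0, 0.0, 1.0), (3.8, 0.0, -1.0)]
print(round(ca_rmsd(model, native), 3))
```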
Affiliation(s)
- Ivan Anishchenko
  - Center for Bioinformatics, The University of Kansas, Lawrence, Kansas, 66047; United Institute of Informatics Problems, National Academy of Sciences, Minsk, 220012, Belarus
|
17
|
Wiederstein M, Gruber M, Frank K, Melo F, Sippl MJ. Structure-based characterization of multiprotein complexes. Structure 2014; 22:1063-70. [PMID: 24954616 PMCID: PMC4087271 DOI: 10.1016/j.str.2014.05.005] [Citation(s) in RCA: 39] [Impact Index Per Article: 3.9] [Received: 03/06/2014] [Revised: 05/02/2014] [Accepted: 05/05/2014] [Indexed: 01/22/2023]
Abstract
Multiprotein complexes govern virtually all cellular processes. Their 3D structures provide important clues to their biological roles, especially through structural correlations among protein molecules and complexes. The detection of such correlations generally requires comprehensive searches in databases of known protein structures by means of appropriate structure-matching techniques. Here, we present a high-speed structure search engine capable of instantly matching large protein oligomers against the complete and up-to-date database of biologically functional assemblies of protein molecules. We use this tool to reveal unseen structural correlations on the level of protein quaternary structure and demonstrate its general usefulness for efficiently exploring complex structural relationships among known protein assemblies.
Affiliation(s)
- Markus Wiederstein
  - Division of Structural Biology & Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria
- Markus Gruber
  - Division of Structural Biology & Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria
- Karl Frank
  - Division of Structural Biology & Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria
- Francisco Melo
  - Departamento de Genetica Molecular y Microbiologia, Facultad de Ciencias Biologicas, Pontificia Universidad Catolica de Chile, Alameda 340, 8320000 Santiago, Chile; Molecular Bioinformatics Laboratory, Millennium Institute on Immunology and Immunotherapy, 8320000 Santiago, Chile
- Manfred J Sippl
  - Division of Structural Biology & Bioinformatics, Department of Molecular Biology, University of Salzburg, Hellbrunnerstraße 34, 5020 Salzburg, Austria
|
18
|
Template-based structure modeling of protein-protein interactions. Curr Opin Struct Biol 2013; 24:10-23. [PMID: 24721449 DOI: 10.1016/j.sbi.2013.11.005] [Citation(s) in RCA: 116] [Impact Index Per Article: 10.5] [Received: 09/14/2013] [Revised: 10/29/2013] [Accepted: 11/21/2013] [Indexed: 01/21/2023]
Abstract
The structure of protein-protein complexes can be constructed by using the known structure of other protein complexes as a template. The complex structure templates are generally detected either by homology-based sequence alignments or, given the structure of monomer components, by structure-based comparisons. Critical improvements have been made in recent years by utilizing interface recognition and by recombining monomer and complex template libraries. Encouraging progress has also been witnessed in genome-wide applications of template-based modeling, with modeling accuracy comparable to high-throughput experimental data. Nevertheless, bottlenecks exist due to the incompleteness of the protein-protein complex structure library and the lack of methods for distant homologous template identification and full-length complex structure refinement.
|
19
|
Abstract
Background: The adaptive immune response is antigen-specific and is triggered by pathogen recognition through T cells. Although the interactions and mechanisms of TCR-peptide-MHC (TCR-pMHC) recognition have been studied for over three decades, the biological basis of these processes remains controversial. With the increasing number of high-throughput binding epitopes and available TCR-pMHC complex structures, fast genome-wide structural modelling of TCR-pMHC interactions is an emerging task for understanding immune interactions and developing peptide vaccines. Results: We first constructed the PPI matrices and iMatrix, using 621 non-redundant PPI interfaces and 398 non-redundant antigen-antibody interfaces, respectively, for modelling the MHC-peptide and TCR-peptide interfaces. The iMatrix consists of four knowledge-based scoring matrices that evaluate the hydrogen bonds and van der Waals forces between side chains or backbones. The energies predicted by iMatrix are highly correlated (Pearson's correlation coefficient of 0.6) with 70 experimental free energies on antigen-antibody interfaces. To further investigate iMatrix and the PPI matrices, we inferred 701,897 statistically significant potential peptide antigens from 389 pathogen genomes and modelled the TCR-pMHC interactions using available TCR-pMHC complex structures. These identified peptide antigens preserve hydrogen-bond energies and consensus interactions, and our TCR-pMHC models provide detailed interaction models and crucial binding regions. Conclusions: Experimental results demonstrate that our method can achieve high precision in predicting binding affinity and potential peptide antigens. We believe that iMatrix and our template-based method can be useful for studying the binding mechanisms of TCR-pMHC complexes and for peptide vaccine design.
|
20
|
Skolnick J, Zhou H, Gao M. Are predicted protein structures of any value for binding site prediction and virtual ligand screening? Curr Opin Struct Biol 2013; 23:191-7. [PMID: 23415854 DOI: 10.1016/j.sbi.2013.01.009] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 12/19/2012] [Revised: 01/04/2013] [Accepted: 01/23/2013] [Indexed: 01/03/2023]
Abstract
The recently developed field of ligand homology modeling (LHM) that extends the ideas of protein homology modeling to the prediction of ligand binding sites and for use in virtual ligand screening has emerged as a powerful new approach. Unlike traditional docking methodologies, LHM can be applied to low-to-moderate resolution predicted as well as experimental structures with little if any diminution in performance; thereby enabling ≈ 75% of an average proteome to have potentially significant virtual screening predictions. In large scale benchmarking, LHM is able to predict off-target ligand binding. Thus, despite the widespread belief to the contrary, low-to-moderate resolution predicted structures have considerable utility for biochemical function prediction.
Affiliation(s)
- Jeffrey Skolnick
  - Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
|
21
|
McDermott JE, Wang J, Mitchell H, Webb-Robertson BJ, Hafen R, Ramey J, Rodland KD. Challenges in Biomarker Discovery: Combining Expert Insights with Statistical Analysis of Complex Omics Data. Expert Opin Med Diagn 2012; 7:37-51. [PMID: 23335946 DOI: 10.1517/17530059.2012.718329] [Citation(s) in RCA: 118] [Impact Index Per Article: 9.8] [Indexed: 12/22/2022]
Abstract
INTRODUCTION: The advent of high throughput technologies capable of comprehensive analysis of genes, transcripts, proteins and other significant biological molecules has provided an unprecedented opportunity for the identification of molecular markers of disease processes. However, it has simultaneously complicated the problem of extracting meaningful molecular signatures of biological processes from these complex datasets. The process of biomarker discovery and characterization provides opportunities for more sophisticated approaches to integrating purely statistical and expert knowledge-based approaches. AREAS COVERED: In this review we will present examples of current practices for biomarker discovery from complex omic datasets and the challenges that have been encountered in deriving valid and useful signatures of disease. We will then present a high-level review of data-driven (statistical) and knowledge-based methods applied to biomarker discovery, highlighting some current efforts to combine the two distinct approaches. EXPERT OPINION: Effective, reproducible and objective tools for combining data-driven and knowledge-based approaches to identify predictive signatures of disease are key to future success in the biomarker field. We will describe our recommendations for possible approaches to this problem including metrics for the evaluation of biomarkers.
|
22
|
Kundrotas PJ, Zhu Z, Vakser IA. GWIDD: a comprehensive resource for genome-wide structural modeling of protein-protein interactions. Hum Genomics 2012; 6:7. [PMID: 23245398 PMCID: PMC3500202 DOI: 10.1186/1479-7364-6-7] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Received: 06/27/2012] [Accepted: 07/11/2012] [Indexed: 11/10/2022] Open
Abstract
Protein-protein interactions are a key component of life processes. The knowledge of the three-dimensional structure of these interactions is important for understanding protein function. Genome-Wide Docking Database (http://gwidd.bioinformatics.ku.edu) offers an extensive source of data for structural studies of protein-protein complexes on a genome scale. The current release of the database combines the available experimental data on the structure and characteristics of protein interactions with structural modeling of protein complexes for 771 organisms spanning the entire universe of life from viruses to humans. The interactions are stored in a relational database with a user-friendly interface that includes various search options. The search results can be interactively previewed, and the structures can be downloaded along with the interaction characteristics.
|
23
|
Venkatraman V, Ritchie DW. Flexible protein docking refinement using pose-dependent normal mode analysis. Proteins 2012; 80:2262-74. [PMID: 22610423 DOI: 10.1002/prot.24115] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 12/01/2011] [Revised: 04/10/2012] [Accepted: 05/12/2012] [Indexed: 11/10/2022]
Abstract
Modeling conformational changes in protein docking calculations is challenging. To make the calculations tractable, most current docking algorithms typically treat proteins as rigid bodies and use soft scoring functions that implicitly accommodate some degree of flexibility. Alternatively, ensembles of structures generated from molecular dynamics (MD) may be cross-docked. However, such combinatorial approaches can produce many thousands or even millions of docking poses, and require fast and sensitive scoring functions to distinguish them. Here, we present a novel approach called "EigenHex," which is based on normal mode analyses (NMAs) of a simple elastic network model of protein flexibility. We initially assume that the proteins to be docked are rigid, and we begin by performing conventional soft docking using the Hex polar Fourier correlation algorithm. We then apply a pose-dependent NMA to each of the top 1000 rigid body docking solutions, and we sample and re-score multiple perturbed docking conformations generated from linear combinations of up to 20 eigenvectors using a multi-threaded particle swarm optimization algorithm. When applied to the 63 "rigid body" targets of the Protein Docking Benchmark version 2.0, our results show that sampling and re-scoring from just one to three eigenvectors gives a modest but consistent improvement for these targets. Thus, pose-dependent NMA avoids the need to sample multiple eigenvectors and it offers a promising alternative to combinatorial cross-docking.
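The idea of generating perturbed conformations from linear combinations of eigenvectors can be sketched as follows. This is not the EigenHex implementation: it assumes the normal-mode eigenvectors have already been computed elsewhere (for instance from an elastic network model) and only shows the displacement step, with the function name `perturb_along_modes` and the data layout chosen for this example.

```python
def perturb_along_modes(coords, modes, amplitudes):
    """Displace each atom by a linear combination of normal-mode vectors.

    coords     : list of (x, y, z) tuples, one per atom
    modes      : list of eigenvectors; each is a list of (dx, dy, dz)
                 displacement tuples, one per atom (assumed precomputed)
    amplitudes : one scalar weight per mode
    """
    if len(modes) != len(amplitudes):
        raise ValueError("need exactly one amplitude per mode")
    perturbed = []
    for i, position in enumerate(coords):
        # new position = old position + sum_k amplitude_k * mode_k[atom i]
        shifted = tuple(
            position[axis] + sum(a * m[i][axis] for a, m in zip(amplitudes, modes))
            for axis in range(3)
        )
        perturbed.append(shifted)
    return perturbed

# Two atoms, two toy modes: a stretch along x and a rigid shift along y.
coords = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
modes = [
    [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0)],
    [(0.0, 1.0, 0.0), (0.0, 1.0, 0.0)],
]
print(perturb_along_modes(coords, modes, [0.5, 2.0]))
```

In a search such as the particle swarm optimization mentioned in the abstract, the amplitude vector would be the quantity being optimized, with each candidate conformation re-scored after displacement.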
|
24
|
Garma L, Mukherjee S, Mitra P, Zhang Y. How many protein-protein interactions types exist in nature? PLoS One 2012; 7:e38913. [PMID: 22719985 PMCID: PMC3374795 DOI: 10.1371/journal.pone.0038913] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.2] [Received: 02/16/2012] [Accepted: 05/14/2012] [Indexed: 11/18/2022] Open
Abstract
“Protein quaternary structure universe” refers to the ensemble of all protein-protein complexes across all organisms in nature. The number of quaternary folds thus corresponds to the number of ways proteins physically interact with other proteins. This study focuses on answering two basic questions: whether the number of protein-protein interactions is limited and, if so, how many different quaternary folds exist in nature. By all-to-all sequence and structure comparisons, we grouped the protein complexes in the Protein Data Bank (PDB) into 3,629 families and 1,761 folds. A statistical model was introduced to obtain the quantitative relation between the numbers of quaternary families and quaternary folds in nature. The total number of possible protein-protein interactions was estimated to be around 4,000, which indicates that the current protein repository contains only 42% of quaternary folds in nature and that full coverage needs approximately a quarter century of experimental effort. The results have important implications for protein complex structural modeling and the structure genomics of protein-protein interactions.
Affiliation(s)
- Leonardo Garma
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
  - Biocenter Oulu and Department of Biochemistry, University of Oulu, Oulu, Finland
- Srayanta Mukherjee
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Pralay Mitra
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
|
25
|
Ceres N, Lavery R. Coarse-grain Protein Models. Innovations in Biomolecular Modeling and Simulations 2012. [DOI: 10.1039/9781849735049-00219] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/30/2023]
Abstract
Coarse-graining is a powerful approach for modeling biomolecules that, over the last few decades, has been extensively applied to proteins. Coarse-grain models offer access to large systems and to slow processes without becoming computationally unmanageable. In addition, they are very versatile, enabling both the protein representation and the energy function to be adapted to the biological problem in hand. This review concentrates on modeling soluble proteins and their assemblies. It presents an overview of the coarse-grain representations, of the associated interaction potentials, and of the optimization procedures used to define them. It then shows how coarse-grain models have been used to understand processes involving proteins, from their initial folding to their functional properties, their binary interactions, and the assembly of large complexes.
Affiliation(s)
- N. Ceres
  - Bases Moléculaires et Structurales des Systèmes Infectieux Université Lyon1/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367, Lyon France
- R. Lavery
  - Bases Moléculaires et Structurales des Systèmes Infectieux Université Lyon1/CNRS UMR 5086, IBCP, 7 Passage du Vercors, 69367, Lyon France
|
26
|
Molecular systems biology of Sic1 in yeast cell cycle regulation through multiscale modeling. Adv Exp Med Biol 2012; 736:135-67. [PMID: 22161326 DOI: 10.1007/978-1-4419-7210-1_7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 12/11/2022]
Abstract
Cell cycle control is highly regulated to guarantee the precise timing of events essential for cell growth, i.e., DNA replication onset and cell division. Failure of this control plays a role in cancer, and molecules called cyclin-dependent kinase (Cdk) inhibitors (Ckis) exert a critical function in cell cycle timing. Here we present a multiscale modeling approach in which experimental and computational studies have been employed to investigate the structure, function and temporal dynamics of the Cki Sic1, which regulates cell cycle progression in Saccharomyces cerevisiae. Structural analyses reveal molecular details of the interaction between Sic1 and Cdk/cyclin complexes, and biochemical investigation reveals Sic1 function in analogy to its human counterpart p27Kip1, whose deregulation leads to failure in timing of kinase activation and, therefore, to cancer. Following these findings, a bottom-up systems biology approach has been developed to characterize modular networks addressing Sic1 regulatory function. Through complementary experimentation and modeling, we suggest a mechanism that underlies Sic1 function in controlling temporal waves of cyclins to ensure correct timing of the phase-specific Cdk activities.
|
27
|
Aung Z, Tan SH, Ng SK, Tan KL. PPiClust: efficient clustering of 3D protein–protein interaction interfaces. J Bioinform Comput Biol 2011; 6:415-33. [DOI: 10.1142/s0219720008003485] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 08/01/2007] [Revised: 12/01/2007] [Accepted: 01/03/2008] [Indexed: 11/18/2022]
Abstract
The biological mechanisms through which proteins interact with one another are best revealed by studying the structural interfaces between interacting proteins. Protein–protein interfaces can be extracted from three-dimensional (3D) structural data of protein complexes and then clustered to derive biological insights. However, conventional protein interface clustering methods lack computational scalability and statistical support. In this work, we present a new method named "PPiClust" to systematically encode, cluster, and analyze similar 3D interface patterns in protein complexes efficiently. Experimental results showed that our method is effective in discovering visually consistent and statistically significant clusters of interfaces, and at the same time sufficiently time-efficient to be performed on a single computer. The interface clusters are also useful for uncovering the structural basis of protein interactions. Analysis of the resulting interface clusters revealed groups of structurally diverse proteins having similar interface patterns. We also found, in some of the interface clusters, the presence of well-known linear binding motifs which were noncontiguous in the primary sequences. These results suggest that PPiClust can discover not only statistically significant, but also biologically significant, protein interface clusters from protein complex structural data.
Affiliation(s)
- Zeyar Aung
  - Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore
- Soon-Heng Tan
  - Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore
- See-Kiong Ng
  - Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613, Singapore
- Kian-Lee Tan
  - School of Computing, National University of Singapore, Law Link, Singapore 117590, Singapore
|
28
|
Liu KP, Hsu KC, Huang JW, Chang LS, Yang JM. ATRIPPI: an atom-residue preference scoring function for protein–protein interactions. Int J Artif Intell T 2011. [DOI: 10.1142/s0218213010000169] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/18/2022]
Abstract
We present the ATRIPPI model for analyzing protein–protein interactions. The model comprises 167-atom-type and residue-specific interaction preferences with distance bins, derived from 641 co-crystallized protein–protein interfaces. The ATRIPPI model captures the physical meaning of hydrogen bonding, disulfide bonding, electrostatic interactions, van der Waals interactions and aromatic–aromatic interactions. We applied this model to identify the native states and near-native complex structures of 17 bound and 17 unbound complexes from thousands of decoy structures. On average, 77.5% (155) of the 200 top-ranked structures are close to the native structure. These results suggest that the ATRIPPI model retains the advantages of both atom–atom and residue–residue interactions and is a potential knowledge-based scoring function for protein–protein docking methods. We believe that our model is robust and provides biological meaning in support of protein–protein interactions.
Affiliation(s)
- Kang-Ping Liu
  - Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Kai-Cheng Hsu
  - Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Jhang-Wei Huang
  - Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Lu-Shian Chang
  - Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
- Jinn-Moon Yang
  - Institute of Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
  - Department of Biological Science and Technology, National Chiao Tung University, Hsinchu, Taiwan
  - Core Facility for Structural Bioinformatics, National Chiao Tung University, Hsinchu, Taiwan
|
29
|
Mukherjee S, Zhang Y. Protein-protein complex structure predictions by multimeric threading and template recombination. Structure 2011; 19:955-66. [PMID: 21742262 DOI: 10.1016/j.str.2011.04.006] [Citation(s) in RCA: 128] [Impact Index Per Article: 9.8] [Received: 11/13/2010] [Revised: 03/30/2011] [Accepted: 04/01/2011] [Indexed: 10/18/2022]
Abstract
The total number of protein-protein complex structures currently available in the Protein Data Bank (PDB) is six times smaller than the total number of tertiary structures in the PDB, which limits the power of homology-based approaches to complex structure modeling. We present a threading-recombination approach, COTH, to boost the protein complex structure library by combining tertiary structure templates with complex alignments. The query sequences are first aligned to complex templates using a modified dynamic programming algorithm, guided by ab initio binding-site predictions. The monomer alignments are then shifted to the multimeric template framework by structural alignments. COTH was tested on 500 nonhomologous dimeric proteins and successfully detected correct templates for 50% of the cases after homologous templates were excluded, significantly outperforming conventional homology modeling algorithms. It also shows a higher accuracy in interface modeling than rigid-body docking of unbound structures with ZDOCK, although with lower coverage. These data demonstrate new avenues to model complex structures from nonhomologous templates.
Affiliation(s)
- Srayanta Mukherjee
  - Center for Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109-2218, USA
|
30
|
Ghoorah AW, Devignes MD, Smaïl-Tabbone M, Ritchie DW. Spatial clustering of protein binding sites for template based protein docking. Bioinformatics 2011; 27:2820-7. [DOI: 10.1093/bioinformatics/btr493] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Indexed: 11/14/2022] Open
|
31
|
Guharoy M, Janin J, Robert CH. Side-chain rotamer transitions at protein-protein interfaces. Proteins 2011; 78:3219-25. [PMID: 20737439 DOI: 10.1002/prot.22821] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Indexed: 11/08/2022]
Abstract
We compare the changes in side chain conformations that accompany the formation of protein-protein complexes, in residues forming either the interface or the remainder of the solvent-accessible surface of the proteins in the Docking Benchmark 3.0. We find that the interface residues undergo significantly more changes than other surface residues, and these changes are more likely to convert them from a high-energy torsion angle state to a lower-energy one than the reverse. Moreover, in both the unbound proteins and the complexes, the interface residues are more frequently found to be in a high-energy torsion angle state than the noninterface residues. As these differences exist before the binding step, they may be relevant to specificity and help in identifying binding sites for docking predictions.
Affiliation(s)
- Mainak Guharoy
  - CNRS Laboratoire de Biochimie Théorique, Institut de Biologie Physico-Chimique (IBPC), Paris, France
|
32
|
Mooney C, Davey N, Martin AJM, Walsh I, Shields DC, Pollastri G. In silico protein motif discovery and structural analysis. Methods Mol Biol 2011; 760:341-53. [PMID: 21780007 DOI: 10.1007/978-1-61779-176-5_21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 12/14/2022]
Abstract
A wealth of in silico tools is available for protein motif discovery and structural analysis. The aim of this chapter is to collect some of the most common and useful tools and to guide the biologist in their use. A detailed explanation is provided for the use of Distill, a suite of web servers for the prediction of protein structural features and the prediction of full-atom 3D models from a protein sequence. Besides this, we also provide pointers to many other tools available for motif discovery and secondary and tertiary structure prediction from a primary amino acid sequence. The prediction of protein intrinsic disorder and the prediction of functional sites and SLiMs are also briefly discussed. Given that user queries vary greatly in size, scope and character, the trade-offs in speed, accuracy and scale need to be considered when choosing which methods to adopt.
Affiliation(s)
- Catherine Mooney
  - Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland
|
33
|
Ma S, Freedman TB, Dukor RK, Nafie LA. Near-infrared and mid-infrared Fourier transform vibrational circular dichroism of proteins in aqueous solution. Appl Spectrosc 2010; 64:615-626. [PMID: 20537229 DOI: 10.1366/000370210791414434] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.2] [Indexed: 05/29/2023]
Abstract
Vibrational circular dichroism (VCD) spectra of a series of proteins with differing secondary structure in H₂O solution are reported for the first time in the near-infrared (NIR) region as well as the NH-stretching region. The Fourier transform (FT) NIR measurements were carried out between 6000 and 4000 cm⁻¹. FT-VCD measurements were simultaneously carried out for the mid-infrared (mid-IR) region from 2000 to 800 cm⁻¹ for direct comparison to VCD in the NIR region. The NIR VCD spectra of proteins show distinct spectral features for different protein structural motifs, indicating a valuable new method to study protein conformations. The principal VCD transitions in the NIR region are two combination bands, the amide A-II and B-II bands, of the amide A and B fundamentals with the amide II fundamental, and the second overtone of the amide II, referred to as the amide 3×II band. VCD spectra in the amide A and B band region, consisting primarily of NH stretching motions, were successfully obtained in H₂O for the first time for an insulin fibril sample. Similar to the enhanced VCD signal observed in the amide I and II regions, the amide A and B VCD of insulin fibril shows strong intensity enhancements, providing an additional valuable probe of protein fibril growth and development in solution. The relative sensitivities of the mid-IR, N-H stretching, and NIR regions are discussed.
Affiliation(s)
- Shengli Ma
  - Department of Chemistry, Syracuse University, Syracuse, New York 13244, USA
|
34
|
Tuncbag N, Kar G, Gursoy A, Keskin O, Nussinov R. Towards inferring time dimensionality in protein-protein interaction networks by integrating structures: the p53 example. Mol Biosyst 2010; 5:1770-8. [PMID: 19585003 PMCID: PMC2898629 DOI: 10.1039/b905661k] [Citation(s) in RCA: 61] [Impact Index Per Article: 4.4] [Indexed: 12/31/2022]
Abstract
Structural data, efficient structural comparison algorithms and appropriate datasets and filters can assist in getting an insight into time dimensionality in interaction networks; in predicting which interactions can and cannot co-exist; and in obtaining concrete predictions consistent with experiment.
Inspection of protein–protein interaction maps illustrates that a hub protein can interact with a very large number of proteins, reaching tens and even hundreds. Since a single protein cannot interact with such a large number of partners at the same time, this presents a challenge: can we figure out which interactions can occur simultaneously and which are mutually exclusive? Addressing this question adds a fourth dimension to interaction maps: that of time. Including the time dimension in structural networks is an immense asset; time dimensionality transforms network node-and-edge maps into cellular processes, assisting in the comprehension of cellular pathways and their regulation. While the time dimensionality can be further enhanced by linking protein complexes to time series of mRNA expression data, robust network-scale experimental data are currently lacking. Here we outline how structural data, efficient structural comparison algorithms and appropriate datasets and filters can assist in getting an insight into time dimensionality in interaction networks; in predicting which interactions can and cannot co-exist; and in obtaining concrete predictions consistent with experiment. As an example, we present p53-linked processes.
Affiliation(s)
- Nurcan Tuncbag
- Koc University, Center for Computational Biology and Bioinformatics, College of Engineering, Rumelifeneri Yolu, 34450 Sariyer Istanbul, Turkey
|
35
|
Abstract
The quaternary structure (QS) of a protein is determined by measuring its molecular weight in solution. The data have to be extracted from the literature, and they may be missing even for proteins that have a crystal structure reported in the Protein Data Bank (PDB). The PDB and other databases derived from it report QS information that either was obtained from the depositors or is based on an analysis of the contacts between polypeptide chains in the crystal, and this frequently differs from the QS determined in solution.
The QS of a protein can be predicted from its sequence using either homology or threading methods. However, a majority of the proteins with less than 30% sequence identity have different QSs. A model of the QS can also be derived by docking the subunits when their 3D structure is independently known, but the model is likely to be incorrect if large conformation changes take place when the oligomer assembles.
Affiliation(s)
- Anne Poupon
- Yeast Structural Genomics, IBBMC UMR 8619 CNRS, Université Paris-Sud, Orsay, France
|
36
|
Kar G, Gursoy A, Keskin O. Human cancer protein-protein interaction network: a structural perspective. PLoS Comput Biol 2009; 5:e1000601. [PMID: 20011507 PMCID: PMC2785480 DOI: 10.1371/journal.pcbi.1000601] [Citation(s) in RCA: 141] [Impact Index Per Article: 9.4] [Received: 06/03/2009] [Accepted: 11/05/2009] [Indexed: 01/12/2023] Open
Abstract
Protein-protein interaction networks provide a global picture of cellular function and biological processes. Some proteins act as hub proteins, highly connected to others, whereas some others have few interactions. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. Similar or overlapping binding sites should be used repeatedly in single interface hub proteins, making them promiscuous. Alternatively, multi-interface hub proteins make use of several distinct binding sites to bind to different partners. We propose a methodology to integrate protein interfaces into cancer interaction networks (ciSPIN, cancer structural protein interface network). The interactions in the human protein interaction network are replaced by interfaces, coming from either known or predicted complexes. We provide a detailed analysis of cancer related human protein-protein interfaces and the topological properties of the cancer network. The results reveal that cancer-related proteins have smaller, more planar, more charged and less hydrophobic binding sites than non-cancer proteins, which may indicate low affinity and high specificity of the cancer-related interactions. We also classified the genes in ciSPIN according to phenotypes. Within phenotypes, for breast cancer, colorectal cancer and leukemia, interface properties were found to be discriminating from non-cancer interfaces with an accuracy of 71%, 67%, 61%, respectively. In addition, cancer-related proteins tend to interact with their partners through distinct interfaces, corresponding mostly to multi-interface hubs, which comprise 56% of cancer-related proteins, and constituting the nodes with higher essentiality in the network (76%). 
We illustrate the interface-related affinity properties of two cancer-related hub proteins: Erbb3, a multi-interface hub, and Raf1, a single-interface hub. The results reveal that the affinity of interactions of the multi-interface hub tends to be higher than that of the single-interface hub. These findings might be important in obtaining new targets in cancer as well as finding the details of specific binding regions of putative cancer drug candidates.
Protein-protein interaction networks provide a global picture of cellular function and biological processes. The dysfunction of some interactions causes many diseases, including cancer. Proteins interact through their interfaces. Therefore, studying the interface properties of cancer-related proteins will help explain their role in the interaction networks. The structural details of interfaces are immensely useful in efforts to answer some fundamental questions such as: (i) what features of cancer-related protein interfaces make them act as hubs; (ii) how hub protein interfaces can interact with tens of other proteins with varying affinities; and (iii) which interactions can occur simultaneously and which are mutually exclusive. Addressing these questions, we propose a method to characterize interactions in a human protein-protein interaction network using three-dimensional protein structures and interfaces. Protein interface analysis shows that the strength and specificity of the interactions of hub proteins and cancer proteins are different from those of non-hub and non-cancer proteins, respectively. In addition, distinguishing overlapping from non-overlapping interfaces, we illustrate with case studies how a fourth dimension, that of the sequence of processes, is integrated into the network. We believe that such an approach should be useful in structural systems biology.
Affiliation(s)
- Gozde Kar
- Center for Computational Biology and Bioinformatics and College of Engineering, Koc University, Rumeli Feneri Yolu, Sariyer Istanbul, Turkey
|
37
|
Abstract
Structural information on interacting proteins is important for understanding life processes at the molecular level. The Genome-Wide Docking Database (GWIDD) is an integrated resource for structural studies of protein-protein interactions on the genome scale, which combines the available experimental data with models obtained by docking techniques. The current database version (August 2009) contains 25 559 experimental and modeled 3D structures for 771 organisms spanning the entire universe of life from viruses to humans. Data are organized in a relational database with a user-friendly search interface allowing exploration of the database content by a number of parameters. Search results can be interactively previewed and downloaded as PDB-formatted files, along with the information relevant to the specified interactions. The resource is freely available at http://gwidd.bioinformatics.ku.edu.
Affiliation(s)
- Petras J Kundrotas
- Center for Bioinformatics and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS 66047, USA
|
38
|
Chen CC, Hwang JK, Yang JM. (PS)2-v2: template-based protein structure prediction server. BMC Bioinformatics 2009; 10:366. [PMID: 19878598 PMCID: PMC2775752 DOI: 10.1186/1471-2105-10-366] [Citation(s) in RCA: 93] [Impact Index Per Article: 6.2] [Received: 06/29/2009] [Accepted: 10/31/2009] [Indexed: 03/11/2024] Open
Abstract
Background: Template selection and target-template alignment are critical steps for template-based modeling (TBM) methods. Identifying the template in the twilight zone of 15-25% sequence similarity between targets and templates is still difficult for template-based protein structure prediction. This study presents the (PS)2-v2 server, based on our original server with numerous enhancements and modifications, to improve reliability and applicability. Results: To detect homologous proteins with remote similarity, the (PS)2-v2 server utilizes the S2A2 matrix, which is a 60 × 60 substitution matrix using the secondary structure propensities of 20 amino acids, and the position-specific sequence profile (PSSM) generated by PSI-BLAST. In addition, our server uses multiple templates and multiple models to build and assess models. Our method was evaluated on the Lindahl benchmark for fold recognition and the ProSup benchmark for sequence alignment. Evaluation results indicated that our method outperforms sequence-profile approaches, and had comparable performance to that of structure-based methods on these benchmarks. Finally, we tested our method using the 154 TBM targets of the CASP8 (Critical Assessment of Techniques for Protein Structure Prediction) dataset. Experimental results show that (PS)2-v2 is ranked 6th among 72 servers and is faster than the top five servers, which utilize ab initio methods. Conclusion: Experimental results demonstrate that (PS)2-v2 with the S2A2 matrix is useful for template selection and target-template alignment by blending the amino acid and structural propensities. The multiple-template and multiple-model strategies are able to significantly improve the accuracy of target-template alignments in the twilight zone. We believe that this server is useful in structure prediction and modeling, especially in detecting homologous templates with sequence similarity in the twilight zone.
Affiliation(s)
- Chih-Chieh Chen
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu 30050, Taiwan, Republic of China.
|
39
|
Mooney C, Pollastri G. Beyond the Twilight Zone: Automated prediction of structural properties of proteins by recursive neural networks and remote homology information. Proteins 2009; 77:181-90. [DOI: 10.1002/prot.22429] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Indexed: 11/06/2022]
|
40
|
Bordoli L, Kiefer F, Arnold K, Benkert P, Battey J, Schwede T. Protein structure homology modeling using SWISS-MODEL workspace. Nat Protoc 2009; 4:1-13. [PMID: 19131951 DOI: 10.1038/nprot.2008.197] [Citation(s) in RCA: 912] [Impact Index Per Article: 60.8] [Indexed: 11/09/2022]
Abstract
Homology modeling aims to build three-dimensional protein structure models using experimentally determined structures of related family members as templates. SWISS-MODEL workspace is an integrated Web-based modeling expert system. For a given target protein, a library of experimental protein structures is searched to identify suitable templates. On the basis of a sequence alignment between the target protein and the template structure, a three-dimensional model for the target protein is generated. Model quality assessment tools are used to estimate the reliability of the resulting models. Homology modeling is currently the most accurate computational method to generate reliable structural models and is routinely used in many biological applications. Typically, the computational effort for a modeling project is less than 2 h. However, this does not include the time required for visualization and interpretation of the model, which may vary depending on personal experience working with protein structures.
Affiliation(s)
- Lorenza Bordoli
- Biozentrum, University of Basel, Klingelbergstrasse 50-70, CH 4056 Basel, Switzerland
|
41
|
Abstract
Protein–protein recognition plays an essential role in structure and function. Specific non-covalent interactions stabilize the structure of macromolecular assemblies, exemplified in this review by oligomeric proteins and the capsids of icosahedral viruses. They also allow proteins to form complexes that have a very wide range of stability and lifetimes and are involved in all cellular processes. We present some of the structure-based computational methods that have been developed to characterize the quaternary structure of oligomeric proteins and other molecular assemblies and analyze the properties of the interfaces between the subunits. We compare the size, the chemical and amino acid compositions and the atomic packing of the subunit interfaces of protein–protein complexes, oligomeric proteins, viral capsids and protein–nucleic acid complexes. These biologically significant interfaces are generally close-packed, whereas the non-specific interfaces between molecules in protein crystals are loosely packed, an observation that gives a structural basis to specific recognition. A distinction is made within each interface between a core that contains buried atoms and a solvent-accessible rim. The core and the rim differ in their amino acid composition and their conservation in evolution, and the distinction helps to correlate the structural data with the results of site-directed mutagenesis and in vitro studies of self-assembly.
|
42
|
Chiu YY, Hwang JK, Yang JM. Soft energy function and generic evolutionary method for discriminating native from nonnative protein conformations. J Comput Chem 2008; 29:1364-73. [PMID: 18181137 DOI: 10.1002/jcc.20897] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
Abstract
We have developed a soft energy function, termed GEMSCORE, for protein structure prediction, which is one of the emerging issues in computational biology. GEMSCORE consists of the van der Waals potential, the hydrogen-bonding potential and the solvent potential, with 12 parameters that are optimized using a generic evolutionary method. GEMSCORE successfully identifies 86 native proteins among 96 target proteins on six decoy sets containing more than 70,000 near-native structures. For these six benchmark datasets, the predictive performance of GEMSCORE, based on native structure ranking and Z-scores, was superior to that of eight other energy functions. Our method is based solely on a simple linear function and thus is considerably faster than other methods that rely on additional complex calculations. In addition, GEMSCORE ranked 17 and 2 native structures first and second, respectively, among 21 targets in CASP6 (Critical Assessment of Techniques for Protein Structure Prediction). These results suggest that GEMSCORE is fast and performs well in discriminating between native and nonnative structures among thousands of protein structure candidates. We believe that GEMSCORE is robust and should be a useful energy function for protein structure prediction.
Affiliation(s)
- Yi-yuan Chiu
- Institute of Bioinformatics, National Chiao Tung University, Hsinchu 30050, Taiwan
|
43
|
Fukuhara N, Kawabata T. HOMCOS: a server to predict interacting protein pairs and interacting sites by homology modeling of complex structures. Nucleic Acids Res 2008; 36:W185-9. [PMID: 18442990 PMCID: PMC2447736 DOI: 10.1093/nar/gkn218] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.8] [Received: 02/02/2008] [Revised: 04/04/2008] [Accepted: 04/09/2008] [Indexed: 11/18/2022] Open
Abstract
As protein-protein interactions are crucial in most biological processes, it is valuable to understand how and where protein pairs interact. We developed a web server, HOMCOS (Homology Modeling of Complex Structure, http://biunit.naist.jp/homcos), to predict interacting protein pairs and interacting sites by homology modeling of complex structures. Our server offers three services. The first is modeling heterodimers from two query amino acid sequences posted by users. The server performs BLAST searches to identify homologous templates in the latest representative dataset of heterodimer structures generated from the PQS database. Structure validity is evaluated by the combination of sequence similarity and knowledge-based contact potential energy as previously described. The server generates a sequence-replaced model PDB file and a MODELLER script to build full atomic models of complex structures. The second service is modeling homodimers from one query sequence. The third service is identification of potentially interacting proteins for one query sequence. The server searches the dataset of heterodimer structures for a homologous template and outputs candidate interacting sequences in the UniProt database that are homologous to the interacting partner template proteins. These features are useful for a wide range of researchers to predict putative interaction sites and interacting proteins.
Affiliation(s)
- Naoshi Fukuhara
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192 and CREST, Japan Science and Technology Agency, Japan
- Takeshi Kawabata
- Graduate School of Information Science, Nara Institute of Science and Technology, 8916-5 Takayama, Ikoma, Nara 630-0192 and CREST, Japan Science and Technology Agency, Japan
|
44
|
Levy ED, Pereira-Leal JB. Evolution and dynamics of protein interactions and networks. Curr Opin Struct Biol 2008; 18:349-57. [DOI: 10.1016/j.sbi.2008.03.003] [Citation(s) in RCA: 77] [Impact Index Per Article: 4.8] [Received: 12/13/2007] [Revised: 03/04/2008] [Accepted: 03/04/2008] [Indexed: 12/29/2022]
|
45
|
Abstract
In a cell, it has been estimated that each protein on average interacts with roughly 10 others, resulting in tens of thousands of proteins known or suspected to have interaction partners; of these, only a tiny fraction have solved protein structures. To partially address this problem, we have developed M-TASSER, a hierarchical method to predict protein quaternary structure from sequence that involves template identification by multimeric threading, followed by multimer model assembly and refinement. The final models are selected by structure clustering. M-TASSER has been tested on a benchmark set comprising 241 dimers having templates with weak sequence similarity and 246 without multimeric templates in the dimer library. Of the total of 207 targets predicted to interact as dimers, 165 (80%) were correctly assigned as interacting, with a true positive rate of 68% and a false positive rate of 17%. The initial best template structures have an average root mean-square deviation to native of 5.3, 6.7, and 7.4 Å for the monomer, interface, and dimer structures, respectively. The final model shows on average a root mean-square deviation improvement of 1.3, 1.3, and 1.5 Å over the initial template structure for the monomer, interface, and dimer structures, respectively, with refinement evident for 87% of the cases. Thus, we have developed a promising approach to predict full-length quaternary structure for proteins that have weak sequence similarity to proteins of solved quaternary structure.
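The root mean-square deviation quoted in this abstract can be illustrated with a minimal sketch. It assumes the two coordinate sets are already superposed and matched atom for atom; the function name `rmsd` and the toy coordinates are hypothetical and are not taken from M-TASSER.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean-square deviation between two equal-length lists of (x, y, z) points.

    Assumes the structures are already optimally superposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Toy example (hypothetical coordinates, in angstroms)
model = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
native = [(0.0, 0.0, 3.0), (1.0, 4.0, 0.0)]
value = rmsd(model, native)
```

In practice a superposition step (for example a least-squares fit) would precede this calculation; it is omitted here to keep the sketch short.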
Affiliation(s)
- Jeffrey Skolnick
- Address reprint requests to Jeffrey Skolnick, Tel.: 404-407-8975; Fax: 404-385-7478.
|
46
|
Brock K, Talley K, Coley K, Kundrotas P, Alexov E. Optimization of electrostatic interactions in protein-protein complexes. Biophys J 2007; 93:3340-52. [PMID: 17693468 PMCID: PMC2072065 DOI: 10.1529/biophysj.107.112367] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 12/21/2022] Open
Abstract
In this article, we present a statistical analysis of the electrostatic properties of 298 protein-protein complexes and 356 domain-domain structures extracted from the previously developed database of protein complexes (ProtCom, http://www.ces.clemson.edu/compbio/protcom). For each structure in the dataset we calculated the total electrostatic energy of binding and its two components, the Coulombic and reaction field energies. It was found that in a vast majority of the cases (>90%), the total electrostatic component of the binding energy was unfavorable. At the same time, the Coulombic component of the binding energy was found to favor complex formation while the reaction field component opposed binding. It was also demonstrated that the components in a wild-type (WT) structure are optimized/anti-optimized with respect to the corresponding distributions arising from random shuffling of the charged side chains. The degree of this optimization was assessed through the Z-score of the WT energy with respect to the random distribution. It was found that the Z-scores of Coulombic interactions peak at a considerably negative value for all 654 cases considered, while the Z-score of the reaction field energy varied among different types of complexes. All these findings indicate that the Coulombic interactions within WT protein-protein complexes are optimized to favor complex formation while the total electrostatic energy predominantly opposes binding. This observation was used to discriminate WT structures among sets of structural decoys and showed that the electrostatic component of the binding energy is not a good discriminator of the WT, whereas the Coulombic or reaction field energies perform better depending upon the decoy set used.
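The Z-score of a wild-type energy against a distribution of randomly shuffled charges, as described in this abstract, can be sketched as follows. This is a minimal illustration under stated assumptions: `toy_coulomb_energy` is a hypothetical one-dimensional stand-in, not the authors' electrostatics calculation, and the charge pattern and shuffle count are invented for the example.

```python
import random
import statistics

def shuffle_z_score(charges, energy_fn, n_shuffles=1000, seed=0):
    """Z-score of the observed energy relative to energies of shuffled charge orderings."""
    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    observed = energy_fn(charges)
    shuffled_energies = []
    for _ in range(n_shuffles):
        permuted = list(charges)
        rng.shuffle(permuted)
        shuffled_energies.append(energy_fn(permuted))
    mean = statistics.fmean(shuffled_energies)
    stdev = statistics.stdev(shuffled_energies)
    return (observed - mean) / stdev

def toy_coulomb_energy(charges):
    """Hypothetical stand-in energy: sum of q_i * q_j / |i - j| for charges on a 1D lattice."""
    total = 0.0
    for i in range(len(charges)):
        for j in range(i + 1, len(charges)):
            total += charges[i] * charges[j] / (j - i)
    return total

# Alternating charges place opposite signs next to each other (an "optimized" toy arrangement)
alternating = [1, -1, 1, -1, 1, -1, 1, -1]
z = shuffle_z_score(alternating, toy_coulomb_energy)
```

A negative Z-score means the observed arrangement has lower energy than typical shuffled arrangements, which is the sense in which the abstract calls the Coulombic interactions optimized.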
Affiliation(s)
- Kelly Brock
- South Carolina Governor School for Science and Mathematics, Hartsville, South Carolina, USA
|
47
|
Kundrotas P, Alexov E. Predicting interacting and interfacial residues using continuous sequence segments. Int J Biol Macromol 2007; 41:615-23. [PMID: 17850859 DOI: 10.1016/j.ijbiomac.2007.08.002] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 07/05/2007] [Revised: 07/31/2007] [Accepted: 08/01/2007] [Indexed: 01/07/2023]
Abstract
Development of sequence-based methods for predicting putative interfacial residues is an extremely important task in modeling 3D structures of protein-protein complexes. In the present paper we used non-gapped sequence segments to predict both interacting and interfacial residues. We demonstrated that continuous sequence segments do occur at protein-protein interfaces and showed that continuous interacting interfacial segments (CIIS) of length nine are present, on average, in approximately 37% of the complexes in our dataset. Our results indicate that CIIS consist mostly of interacting strands and/or loops, while CIIS involving helices are scarce. We performed scoring of CIIS using four different scoring mechanisms and found that scores of CIIS differ significantly from the scores calculated for random stretches of residues. We argue that such a statistical difference, inferred through the corresponding Z-scores, could be used for detecting putative interfacial residue segments without using any structural information. This hypothesis was tested on our dataset, and benchmarking resulted in 10-60% prediction accuracy depending on the type of benchmarking and the scoring scheme used in the calculations. Such predictions, which do not depend on the availability of the 3D structures of monomers, can be quite valuable in modeling 3D structures of obligatory complexes, for which structures of separated monomers do not exist.
Affiliation(s)
- Petras Kundrotas
- Computational Biophysics and Bioinformatics, Department of Physics, Clemson University, Clemson, SC 29634, United States
|
48
|
Musso GA, Zhang Z, Emili A. Experimental and computational procedures for the assessment of protein complexes on a genome-wide scale. Chem Rev 2007; 107:3585-600. [PMID: 17630806 DOI: 10.1021/cr0682857] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Affiliation(s)
- Gabriel A Musso
- Banting and Best Department of Medical Research, Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, 160 College Street, Toronto, Ontario, Canada M5S 3E1
|
49
|
Abstract
MOTIVATION: One of the more challenging problems in biology is to determine the cellular protein interaction network. Progress has been made in predicting protein-protein interactions based on structural information, assuming that structurally similar proteins interact in a similar way. In a previous publication, we determined a genome-wide Ras-effector interaction network based on homology models, with high accuracy in predicting binding and non-binding domains. However, for a prediction on a genome-wide scale, homology modelling is a time-consuming process. Therefore, we here developed a faster method using position energy matrices, where, based on different Ras-effector X-ray template structures, all amino acids in the effector binding domain are sequentially mutated to all other amino acid residues and the effect on binding energy is calculated. Those pre-calculated matrices can then be used to score any Ras or effector sequence for binding. RESULTS: Based on position energy matrices, the sequences of putative Ras-binding domains can be scanned quickly to calculate an energy sum value. By calibrating energy sum values using quantitative experimental binding data, thresholds can be defined and thus non-binding domains can be excluded quickly. Sequences which have energy sum values above this threshold are considered to be potential binding domains, and could be further analysed using homology modelling. This prediction method could be applied to other protein families sharing conserved interaction types, in order to rapidly determine large-scale cellular protein interaction networks. Thus, it could have an important impact on future in silico structural genomics approaches, in particular with regard to increasing structural proteomics efforts, aiming to determine all possible domain folds and interaction types. AVAILABILITY: All matrices are deposited in the ADAN database (http://adan-embl.ibmc.umh.es/).
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
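The scanning step described in this abstract (sum per-position energies for a sequence, then compare the sum with a calibrated threshold) can be sketched as follows. This is only an illustration under stated assumptions: the matrix values, the three-letter alphabet, the threshold, and the function names are hypothetical, and the "at or above threshold" comparison simply follows the wording of the abstract rather than any confirmed sign convention.

```python
def energy_sum(sequence, matrix):
    """Sum the per-position energy contributions of a sequence.

    `matrix` is a list with one dict per position, mapping residue letter to energy.
    """
    if len(sequence) != len(matrix):
        raise ValueError("sequence length must match the number of matrix positions")
    return sum(matrix[pos][residue] for pos, residue in enumerate(sequence))

def candidate_binders(sequences, matrix, threshold):
    """Keep sequences whose energy sum is at or above the threshold (assumed convention)."""
    return [seq for seq in sequences if energy_sum(seq, matrix) >= threshold]

# Hypothetical 3-position matrix over a toy alphabet; real matrices would cover 20 residues
TOY_MATRIX = [
    {"A": 0.0, "K": 1.0, "E": -1.0},
    {"A": 0.5, "K": 0.0, "E": -0.5},
    {"A": 0.0, "K": 2.0, "E": -2.0},
]

hits = candidate_binders(["KAK", "EEE", "AAA"], TOY_MATRIX, threshold=1.0)
```

Because scoring is a table lookup per position, a scan of this kind is linear in sequence length, which is consistent with the speed advantage over homology modelling that the abstract reports.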
Affiliation(s)
- Christina Kiel
- EMBL-CRG Systems Biology Unit, CRG-Centre de Regulacio Genomica, Dr Aiguader 88, 08003 Barcelona, Spain.
|
50
|
Devos D, Russell RB. A more complete, complexed and structured interactome. Curr Opin Struct Biol 2007; 17:370-7. [PMID: 17574831 DOI: 10.1016/j.sbi.2007.05.011] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.6] [Received: 04/18/2007] [Revised: 04/18/2007] [Accepted: 05/31/2007] [Indexed: 11/16/2022]
Abstract
Multiprotein complexes are key players in virtually all important cellular processes. The past year has seen the publication of several papers that have illuminated what we know about the number and composition of these molecular machines, using high-throughput purification methods. Other studies have illuminated structural and functional aspects of protein interactions, networks and molecular assemblies. As a result, we have a more complete view of how many complexes are in living systems, what they look like and the roles they play in the cell.
Affiliation(s)
- Damien Devos
- EMBL, Meyerhofstrasse 1, 69117 Heidelberg, Germany
|