1
Tahti EF, Blount JM, Jackson SN, Gao M, Gill NP, Smith SN, Pederson NJ, Rumph SN, Struyvenberg SA, Mackley IGP, Madden DR, Amacher JF. Additive energetic contributions of multiple peptide positions determine the relative promiscuity of viral and human sequences for PDZ domain targets. Protein Sci 2023; 32:e4611. [PMID: 36851847 PMCID: PMC10022582 DOI: 10.1002/pro.4611] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/31/2022] [Revised: 02/13/2023] [Accepted: 02/23/2023] [Indexed: 03/01/2023]
Abstract
Protein-protein interactions that involve recognition of short peptides are critical in cellular processes. Protein-peptide interaction surface areas are relatively small and shallow, and there are often overlapping specificities in families of peptide-binding domains. Therefore, dissecting selectivity determinants can be challenging. PDZ domains are a family of peptide-binding domains located in several intracellular signaling and trafficking pathways. These domains are also directly targeted by pathogens, and a hallmark of many oncogenic viral proteins is a PDZ-binding motif. However, amidst sequences that target PDZ domains, there is a wide spectrum in relative promiscuity. For example, the viral HPV16 E6 oncoprotein recognizes over double the number of PDZ domain-containing proteins as the cystic fibrosis transmembrane conductance regulator (CFTR) in the cell, despite similar PDZ targeting-sequences and identical motif residues. Here, we determine binding affinities for PDZ domains known to bind either HPV16 E6 alone or both CFTR and HPV16 E6, using peptides matching WT and hybrid sequences. We also use energy minimization to model PDZ-peptide complexes and use sequence analyses to investigate this difference. We find that while the majority of single mutations had marginal effects on overall affinity, the additive effect on the free energy of binding accurately describes the selectivity observed. Taken together, our results describe how complex and differing PDZ interactomes can be programmed in the cell.
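The additivity claim in this abstract can be made concrete with a small calculation. The sketch below is not the authors' analysis: it assumes the standard relation ΔG° = RT ln(Kd) (Kd relative to a 1 M standard state), an assumed temperature of 298.15 K, and entirely hypothetical Kd values and position labels chosen only for illustration. It compares the summed single-substitution ΔΔG values with the affinity expected for a peptide carrying all substitutions at once.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)
T_KELVIN = 298.15     # assumed temperature (25 degrees C)


def delta_g(kd_molar: float, temperature: float = T_KELVIN) -> float:
    """Standard binding free energy (kcal/mol) from a dissociation constant in M."""
    if kd_molar <= 0:
        raise ValueError("Kd must be positive")
    return R_KCAL * temperature * math.log(kd_molar)


def delta_delta_g(kd_variant: float, kd_reference: float) -> float:
    """ddG = dG(variant) - dG(reference); positive means weaker binding."""
    return delta_g(kd_variant) - delta_g(kd_reference)


# Hypothetical Kd values (M) and position labels, for illustration only;
# these are NOT measurements from the paper.
kd_wt = 10e-6
kd_singles = {"P-1": 20e-6, "P-3": 15e-6, "P-5": 25e-6}

# If contributions are additive, the summed single-site ddG values predict
# the ddG (and hence the Kd) of the peptide carrying all substitutions.
ddg_sum = sum(delta_delta_g(kd, kd_wt) for kd in kd_singles.values())
kd_predicted = kd_wt * math.exp(ddg_sum / (R_KCAL * T_KELVIN))

print(f"sum of single-site ddG: {ddg_sum:.2f} kcal/mol")
print(f"predicted Kd of combined variant: {kd_predicted * 1e6:.1f} uM")
```

At this temperature a two-fold loss in affinity costs roughly 0.4 kcal/mol, so several individually marginal substitutions can add up to a substantial difference in selectivity.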
Affiliation(s)
- Elise F. Tahti
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Jadon M. Blount
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Sophie N. Jackson
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Melody Gao
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Nicholas P. Gill
- Department of Biochemistry, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
- Sarah N. Smith
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Nick J. Pederson
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Iain G. P. Mackley
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
- Dean R. Madden
- Department of Biochemistry, Geisel School of Medicine at Dartmouth, Hanover, New Hampshire, USA
- Jeanine F. Amacher
- Department of Chemistry, Western Washington University, Bellingham, Washington, USA
2
Tahti EF, Blount JM, Jackson SN, Gao M, Gill NP, Smith SN, Pederson NJ, Rumph SN, Struyvenberg SA, Mackley IGP, Madden DR, Amacher JF. Additive energetic contributions of multiple peptide positions determine the relative promiscuity of viral and human sequences for PDZ domain targets. bioRxiv: The Preprint Server for Biology 2023:2022.12.31.522388. [PMID: 36711692 PMCID: PMC9881875 DOI: 10.1101/2022.12.31.522388] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/18/2023]
Abstract
Protein-protein interactions that include recognition of short sequences of amino acids, or peptides, are critical in cellular processes. Protein-peptide interaction surface areas are relatively small and shallow, and there are often overlapping specificities in families of peptide-binding domains. Therefore, dissecting selectivity determinants can be challenging. PDZ domains are an example of a peptide-binding domain located in several intracellular signaling and trafficking pathways, which form interactions critical for the regulation of receptor endocytic trafficking, tight junction formation, organization of supramolecular complexes in neurons, and other biological systems. These domains are also directly targeted by pathogens, and a hallmark of many oncogenic viral proteins is a PDZ-binding motif. However, amidst sequences that target PDZ domains, there is a wide spectrum in relative promiscuity. For example, the viral HPV16 E6 oncoprotein recognizes over double the number of PDZ domain-containing proteins as the cystic fibrosis transmembrane conductance regulator (CFTR) in the cell, despite similar PDZ targeting-sequences and identical motif residues. Here, we determine binding affinities for PDZ domains known to bind either HPV16 E6 alone or both CFTR and HPV16 E6, using peptides matching WT and hybrid sequences. We also use energy minimization to model PDZ-peptide complexes and use sequence analyses to investigate this difference. We find that while the majority of single mutations had a marginal effect on overall affinity, the additive effect on the free energy of binding accurately describes the selectivity observed. Taken together, our results describe how complex and differing PDZ interactomes can be programmed in the cell.
Affiliation(s)
- Elise F. Tahti
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Jadon M. Blount
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Sophie N. Jackson
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Melody Gao
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Nicholas P. Gill
- Department of Biochemistry, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Sarah N. Smith
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Nick J. Pederson
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Simone N. Rumph
- Department of Biochemistry, Bowdoin College, Brunswick, ME, USA
- Iain G. P. Mackley
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
- Dean R. Madden
- Department of Biochemistry, Geisel School of Medicine at Dartmouth, Hanover, NH, USA
- Jeanine F. Amacher
- Department of Chemistry, Western Washington University, Bellingham, WA, USA
3
Delaunay M, Ha-Duong T. Computational Tools and Strategies to Develop Peptide-Based Inhibitors of Protein-Protein Interactions. Methods in Molecular Biology (Clifton, N.J.) 2022; 2405:205-230. [PMID: 35298816 DOI: 10.1007/978-1-0716-1855-4_11] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play crucial and subtle roles in many biological processes and modifications of their fine mechanisms generally result in severe diseases. Peptide derivatives are very promising therapeutic agents for modulating protein-protein associations with sizes and specificities between those of small compounds and antibodies. For the same reasons, rational design of peptide-based inhibitors naturally borrows and combines computational methods from both protein-ligand and protein-protein research fields. In this chapter, we aim to provide an overview of computational tools and approaches used for identifying and optimizing peptides that target protein-protein interfaces with high affinity and specificity. We hope that this review will help to implement appropriate in silico strategies for peptide-based drug design that builds on available information for the systems of interest.
Affiliation(s)
- Tâp Ha-Duong
- Université Paris-Saclay, CNRS, BioCIS, Châtenay-Malabry, France.
4
5
Fine MS, Lum PS, Brokaw EB, Caywood MS, Metzger AJ, Libin AV, Terner J, Tsao JW, Norris JN, Milzman D, Williams D, Colombe J, Dromerick AW. Dynamic motor tracking is sensitive to subacute mTBI. Exp Brain Res 2016; 234:3173-3184. [DOI: 10.1007/s00221-016-4714-5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 02/04/2016] [Accepted: 06/27/2016] [Indexed: 11/28/2022]
6
Rosenfeld L, Heyne M, Shifman JM, Papo N. Protein Engineering by Combined Computational and In Vitro Evolution Approaches. Trends Biochem Sci 2016; 41:421-433. [PMID: 27061494 DOI: 10.1016/j.tibs.2016.03.002] [Citation(s) in RCA: 32] [Impact Index Per Article: 3.6] [Received: 01/12/2016] [Revised: 02/29/2016] [Accepted: 03/09/2016] [Indexed: 12/30/2022]
Abstract
Two alternative strategies are commonly used to study protein-protein interactions (PPIs) and to engineer protein-based inhibitors. In one approach, binders are selected experimentally from combinatorial libraries of protein mutants that are displayed on a cell surface. In the other approach, computational modeling is used to explore an astronomically large number of protein sequences to select a small number of sequences for experimental testing. While both approaches have some limitations, their combination produces superior results in various protein engineering applications. Such applications include the design of novel binders and inhibitors, the enhancement of affinity and specificity, and the mapping of binding epitopes. The combination of these approaches also aids in the understanding of the specificity profiles of various PPIs.
Affiliation(s)
- Lior Rosenfeld
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel
- Michael Heyne
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel; Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel
- Julia M Shifman
- Department of Biological Chemistry, The Alexander Silberman Institute of Life Sciences, The Hebrew University of Jerusalem, Jerusalem, Israel.
- Niv Papo
- Department of Biotechnology Engineering and the National Institute of Biotechnology in the Negev, Ben-Gurion University of the Negev, Beer-Sheva, Israel.
7
Wang H, Heilshorn SC. Adaptable hydrogel networks with reversible linkages for tissue engineering. Advanced Materials (Deerfield Beach, Fla.) 2015; 27:3717-36. [PMID: 25989348 PMCID: PMC4528979 DOI: 10.1002/adma.201501558] [Citation(s) in RCA: 460] [Impact Index Per Article: 46.0] [Received: 04/01/2015] [Revised: 04/18/2015] [Indexed: 05/19/2023]
Abstract
Adaptable hydrogels have recently emerged as a promising platform for three-dimensional (3D) cell encapsulation and culture. In conventional, covalently crosslinked hydrogels, degradation is typically required to allow complex cellular functions to occur, leading to bulk material degradation. In contrast, adaptable hydrogels are formed by reversible crosslinks. Through breaking and re-formation of the reversible linkages, adaptable hydrogels can be locally modified to permit complex cellular functions while maintaining their long-term integrity. In addition, these adaptable materials can have biomimetic viscoelastic properties that make them well suited for several biotechnology and medical applications. In this review, an overview of adaptable-hydrogel design considerations and linkage selections is presented, with a focus on various cell-compatible crosslinking mechanisms that can be exploited to form adaptable hydrogels for tissue engineering.
Affiliation(s)
- Huiyuan Wang
- Department of Materials Science & Engineering, Stanford University, Stanford, CA 94305, USA
- Sarah C. Heilshorn
- Department of Materials Science & Engineering, Stanford University, Stanford, CA 94305, USA
8
Kamisetty H, Ghosh B, Langmead CJ, Bailey-Kellogg C. Learning sequence determinants of protein:protein interaction specificity with sparse graphical models. J Comput Biol 2015; 22:474-86. [PMID: 25973864 DOI: 10.1089/cmb.2014.0289] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Indexed: 11/13/2022]
Abstract
In studying the strength and specificity of interaction between members of two protein families, key questions center on which pairs of possible partners actually interact, how well they interact, and why they interact while others do not. The advent of large-scale experimental studies of interactions between members of a target family and a diverse set of possible interaction partners offers the opportunity to address these questions. We develop here a method, DgSpi (data-driven graphical models of specificity in protein:protein interactions), for learning and using graphical models that explicitly represent the amino acid basis for interaction specificity (why) and extend earlier classification-oriented approaches (which) to predict the ΔG of binding (how well). We demonstrate the effectiveness of our approach in analyzing and predicting interactions between a set of 82 PDZ recognition modules against a panel of 217 possible peptide partners, based on data from MacBeath and colleagues. Our predicted ΔG values are highly predictive of the experimentally measured ones, reaching correlation coefficients of 0.69 in 10-fold cross-validation and 0.63 in leave-one-PDZ-out cross-validation. Furthermore, the model serves as a compact representation of amino acid constraints underlying the interactions, enabling protein-level ΔG predictions to be naturally understood in terms of residue-level constraints. Finally, the model DgSpi readily enables the design of new interacting partners, and we demonstrate that designed ligands are novel and diverse.
Affiliation(s)
- Bornika Ghosh
- Department of Computer Science, Dartmouth, Hanover, New Hampshire
9
Daqrouq K, Alhmouz R, Balamesh A, Memic A. Application of wavelet transform for PDZ domain classification. PLoS One 2015; 10:e0122873. [PMID: 25860375 PMCID: PMC4393179 DOI: 10.1371/journal.pone.0122873] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/14/2014] [Accepted: 02/24/2015] [Indexed: 11/18/2022]
Abstract
PDZ domains have been identified as part of an array of signaling proteins that are often unrelated, except for the well-conserved structural PDZ domain they contain. These domains have been linked to many disease processes including common Avian influenza, as well as very rare conditions such as Fraser and Usher syndromes. Historically, based on the interactions and the nature of bonds they form, PDZ domains have most often been classified into one of three classes (class I, class II and others - class III), that is directly dependent on their binding partner. In this study, we report on three unique feature extraction approaches based on the bigram and trigram occurrence and existence rearrangements within the domain's primary amino acid sequences in assisting PDZ domain classification. Wavelet packet transform (WPT) and Shannon entropy denoted by wavelet entropy (WE) feature extraction methods were proposed. Using 115 unique human and mouse PDZ domains, the existence rearrangement approach yielded a high recognition rate (78.34%), which outperformed our occurrence rearrangements based method. The recognition rate was (81.41%) with validation technique. The method reported for PDZ domain classification from primary sequences proved to be an encouraging approach for obtaining consistent classification results. We anticipate that by increasing the database size, we can further improve feature extraction and correct classification.
Affiliation(s)
- Khaled Daqrouq
- Electrical and Computer Engineering Department, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Rami Alhmouz
- Electrical and Computer Engineering Department, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Ahmed Balamesh
- Electrical and Computer Engineering Department, Faculty of Engineering, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
- Adnan Memic
- Center of Nanotechnology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia
10
Potapov V, Kaplan JB, Keating AE. Data-driven prediction and design of bZIP coiled-coil interactions. PLoS Comput Biol 2015; 11:e1004046. [PMID: 25695764 PMCID: PMC4335062 DOI: 10.1371/journal.pcbi.1004046] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 08/29/2014] [Accepted: 11/19/2014] [Indexed: 11/18/2022]
Abstract
Selective dimerization of the basic-region leucine-zipper (bZIP) transcription factors presents a vivid example of how a high degree of interaction specificity can be achieved within a family of structurally similar proteins. The coiled-coil motif that mediates homo- or hetero-dimerization of the bZIP proteins has been intensively studied, and a variety of methods have been proposed to predict these interactions from sequence data. In this work, we used a large quantitative set of 4,549 bZIP coiled-coil interactions to develop a predictive model that exploits knowledge of structurally conserved residue-residue interactions in the coiled-coil motif. Our model, which expresses interaction energies as a sum of interpretable residue-pair and triplet terms, achieves a correlation with experimental binding free energies of R = 0.68 and significantly out-performs other scoring functions. To use our model in protein design applications, we devised a strategy in which synthetic peptides are built by assembling 7-residue native-protein heptad modules into new combinations. An integer linear program was used to find the optimal combination of heptads to bind selectively to a target human bZIP coiled coil, but not to target paralogs. Using this approach, we designed peptides to interact with the bZIP domains from human JUN, XBP1, ATF4 and ATF5. Testing more than 132 candidate protein complexes using a fluorescence resonance energy transfer assay confirmed the formation of tight and selective heterodimers between the designed peptides and their targets. This approach can be used to make inhibitors of native proteins, or to develop novel peptides for applications in synthetic biology or nanotechnology.
Affiliation(s)
- Vladimir Potapov
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Jenifer B. Kaplan
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Amy E. Keating
- Department of Biology, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
- Department of Biological Engineering, Massachusetts Institute of Technology, Cambridge, Massachusetts, United States of America
11
Engelmann BW, Kim Y, Wang M, Peters B, Rock RS, Nash PD. The development and application of a quantitative peptide microarray based approach to protein interaction domain specificity space. Mol Cell Proteomics 2014; 13:3647-62. [PMID: 25135669 DOI: 10.1074/mcp.o114.038695] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Indexed: 02/05/2023]
Abstract
Protein interaction domain (PID) linear peptide motif interactions direct diverse cellular processes in a specific and coordinated fashion. PID specificity, or the interaction selectivity derived from affinity preferences between possible PID-peptide pairs is the basis of this ability. Here, we develop an integrated experimental and computational cellulose peptide conjugate microarray (CPCMA) based approach for the high throughput analysis of PID specificity that provides unprecedented quantitative resolution and reproducibility. As a test system, we quantify the specificity preferences of four Src Homology 2 domains and 124 physiological phosphopeptides to produce a novel quantitative interactome. The quantitative data set covers a broad affinity range, is highly precise, and agrees well with orthogonal biophysical validation, in vivo interactions, and peptide library trained algorithm predictions. In contrast to preceding approaches, the CPCMAs proved capable of confidently assigning interactions into affinity categories, resolving the subtle affinity contributions of residue correlations, and yielded predictive peptide motif affinity matrices. Unique CPCMA enabled modes of systems level analysis reveal a physiological interactome with expected node degree value decreasing as a function of affinity, resulting in minimal high affinity binding overlap between domains; uncover that Src Homology 2 domains bind ligands with a similar average affinity yet strikingly different levels of promiscuity and binding dynamic range; and parse with unprecedented quantitative resolution contextual factors directing specificity. The CPCMA platform promises broad application within the fields of PID specificity, synthetic biology, specificity focused drug design, and network biology.
Affiliation(s)
- Brett W Engelmann
- The Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Yohan Kim
- The La Jolla Institute for Allergy and Immunology, La Jolla, California 92037
- Miaoyan Wang
- The Department of Statistics, The University of Chicago, Chicago, Illinois 60637
- Bjoern Peters
- The La Jolla Institute for Allergy and Immunology, La Jolla, California 92037
- Ronald S Rock
- The Department of Biochemistry and Molecular Biology, The University of Chicago, Chicago, Illinois 60637
- Piers D Nash
- The Ben May Department for Cancer Research, The University of Chicago, Chicago, Illinois 60637
12
Tiwari G, Mohanty D. Structure-based multiscale approach for identification of interaction partners of PDZ domains. J Chem Inf Model 2014; 54:1143-56. [PMID: 24593775 DOI: 10.1021/ci400627y] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 12/15/2022]
Abstract
PDZ domains are peptide recognition modules which mediate specific protein-protein interactions and are known to have a complex specificity landscape. We have developed a novel structure-based multiscale approach which identifies crucial specificity determining residues (SDRs) of PDZ domains from explicit solvent molecular dynamics (MD) simulations on PDZ-peptide complexes and uses these SDRs in combination with knowledge-based scoring functions for proteomewide identification of their interaction partners. Multiple explicit solvent simulations ranging from 5 to 50 ns duration have been carried out on 28 PDZ-peptide complexes with known binding affinities. MM/PBSA binding energy values calculated from these simulations show a correlation coefficient of 0.755 with the experimental binding affinities. On the basis of the SDRs of PDZ domains identified by MD simulations, we have developed a simple scoring scheme for evaluating binding energies for PDZ-peptide complexes using residue based statistical pair potentials. This multiscale approach has been benchmarked on a mouse PDZ proteome array data set by calculating the binding energies for 217 different substrate peptides in binding pockets of 64 different mouse PDZ domains. Receiver operating characteristic (ROC) curve analysis indicates that, the area under curve (AUC) values for binder vs nonbinder classification by our structure based method is 0.780. Our structure based method does not require experimental PDZ-peptide binding data for training.
Affiliation(s)
- Garima Tiwari
- Bioinformatics Center, National Institute of Immunology, Aruna Asaf Ali Marg, New Delhi-110067, India
13
Abstract
Background: PDZ domains are one of the most promiscuous protein recognition modules that bind with short linear peptides and play an important role in cellular signaling. Recently, a few high-throughput techniques (e.g. protein microarray screen, phage display) have been applied to determine in-vitro binding specificity of PDZ domains. Currently, many computational methods are available to predict PDZ-peptide interactions but they often provide domain specific models and/or have a limited domain coverage. Results: Here, we composed the largest set of PDZ domains derived from human, mouse, fly and worm proteomes and defined binding models for PDZ domain families to improve the domain coverage and prediction specificity. For that purpose, we first identified a novel set of 138 PDZ families, comprising 548 PDZ domains from the aforementioned organisms, based on efficient clustering according to their sequence identity. For 43 PDZ families, covering 226 PDZ domains with available interaction data, we built specialized models using a support vector machine approach. The advantage of family-wise models is that they can also be used to determine the binding specificity of a newly characterized PDZ domain with sufficient sequence identity to the known families. Since most current experimental approaches provide only positive data, we have to cope with the class imbalance problem. Thus, to enrich the negative class, we introduced a powerful semi-supervised technique to generate high confidence non-interaction data. We report competitive predictive performance with respect to state-of-the-art approaches. Conclusions: Our approach has several contributions. First, we show that domain coverage can be increased by applying an accurate clustering technique. Second, we developed an approach based on a semi-supervised strategy to get high confidence negative data. Third, we allowed high order correlations between the amino acid positions in the binding peptides. Fourth, our method is general enough and will easily be applicable to other peptide recognition modules such as SH2 domains. Finally, we performed a genome-wide prediction for 101 human and 102 mouse PDZ domains and uncovered novel interactions with biological relevance. We make all the predictive models and genome-wide predictions freely available to the scientific community. Electronic supplementary material: The online version of this article (doi:10.1186/1471-2164-15-S1-S5) contains supplementary material, which is available to authorized users.
14
Nakariyakul S, Liu ZP, Chen L. A sequence-based computational approach to predicting PDZ domain-peptide interactions. Biochimica et Biophysica Acta - Proteins and Proteomics 2014; 1844:165-70. [DOI: 10.1016/j.bbapap.2013.04.008] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 11/30/2012] [Revised: 03/28/2013] [Accepted: 04/11/2013] [Indexed: 10/26/2022]
15
Kamisetty H, Ghosh B, Langmead CJ, Bailey-Kellogg C. Learning Sequence Determinants of Protein:protein Interaction Specificity with Sparse Graphical Models. Research in Computational Molecular Biology : ... Annual International Conference, RECOMB ... : Proceedings. RECOMB (Conference : 2005- ) 2014; 8394:129-143. [PMID: 25414914 DOI: 10.1007/978-3-319-05269-4_10] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 01/05/2023]
Abstract
In studying the strength and specificity of interaction between members of two protein families, key questions center on which pairs of possible partners actually interact, how well they interact, and why they interact while others do not. The advent of large-scale experimental studies of interactions between members of a target family and a diverse set of possible interaction partners offers the opportunity to address these questions. We develop here a method, DgSpi (Data-driven Graphical models of Specificity in Protein:protein Interactions), for learning and using graphical models that explicitly represent the amino acid basis for interaction specificity (why) and extend earlier classification-oriented approaches (which) to predict the ΔG of binding (how well). We demonstrate the effectiveness of our approach in analyzing and predicting interactions between a set of 82 PDZ recognition modules, against a panel of 217 possible peptide partners, based on data from MacBeath and colleagues. Our predicted ΔG values are highly predictive of the experimentally measured ones, reaching correlation coefficients of 0.69 in 10-fold cross-validation and 0.63 in leave-one-PDZ-out cross-validation. Furthermore, the model serves as a compact representation of amino acid constraints underlying the interactions, enabling protein-level ΔG predictions to be naturally understood in terms of residue-level constraints. Finally, as a generative model, DgSpi readily enables the design of new interacting partners, and we demonstrate that designed ligands are novel and diverse.
16
Crivelli JJ, Lemmon G, Kaufmann KW, Meiler J. Simultaneous prediction of binding free energy and specificity for PDZ domain-peptide interactions. J Comput Aided Mol Des 2013; 27:1051-65. [PMID: 24305904 DOI: 10.1007/s10822-013-9696-9] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 10/06/2013] [Accepted: 11/29/2013] [Indexed: 12/20/2022]
Abstract
Interactions between protein domains and linear peptides underlie many biological processes. Among these interactions, the recognition of C-terminal peptides by PDZ domains is one of the most ubiquitous. In this work, we present a mathematical model for PDZ domain-peptide interactions capable of predicting both affinity and specificity of binding based on X-ray crystal structures and comparative modeling with ROSETTA. We developed our mathematical model using a large phage display dataset describing binding specificity for a wild type PDZ domain and 91 single mutants, as well as binding affinity data for a wild type PDZ domain binding to 28 different peptides. Structural refinement was carried out through several ROSETTA protocols, the most accurate of which included flexible peptide docking and several iterations of side chain repacking and backbone minimization. Our findings emphasize the importance of backbone flexibility and the energetic contributions of side chain-side chain hydrogen bonds in accurately predicting interactions. We also determined that predicting PDZ domain-peptide interactions became increasingly challenging as the length of the peptide increased in the N-terminal direction. In the training dataset, predicted binding energies correlated with those derived through calorimetry and specificity switches introduced through single mutations at interface positions were recapitulated. In independent tests, our best performing protocol was capable of predicting dissociation constants well within one order of magnitude of the experimental values and specificity profiles at the level of accuracy of previous studies. To our knowledge, this approach represents the first integrated protocol for predicting both affinity and specificity for PDZ domain-peptide interactions.
Affiliation(s)
- Joseph J Crivelli
- Department of Chemistry, Vanderbilt University, Station B #351822, Nashville, TN, 37235, USA
|
17
|
Tian F, Tan R, Guo T, Zhou P, Yang L. Fast and reliable prediction of domain–peptide binding affinity using coarse-grained structure models. Biosystems 2013; 113:40-9. [DOI: 10.1016/j.biosystems.2013.04.004] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Received: 01/04/2013] [Revised: 04/15/2013] [Accepted: 04/20/2013] [Indexed: 10/26/2022]
|
18
|
Predicting PDZ domain mediated protein interactions from structure. BMC Bioinformatics 2013; 14:27. [PMID: 23336252 PMCID: PMC3602153 DOI: 10.1186/1471-2105-14-27] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.4] [Received: 02/15/2012] [Accepted: 12/19/2012] [Indexed: 12/03/2022] [Open Access]
Abstract
Background: PDZ domains are structural protein domains that recognize simple linear amino acid motifs, often at protein C-termini, and mediate protein-protein interactions (PPIs) in important biological processes, such as ion channel regulation, cell polarity and neural development. PDZ domain-peptide interaction predictors have been developed based on domain and peptide sequence information. Since domain structure is known to influence binding specificity, we hypothesized that structural information could be used to predict new interactions compared to sequence-based predictors. Results: We developed a novel computational predictor of PDZ domain and C-terminal peptide interactions using a support vector machine trained with PDZ domain structure and peptide sequence information. Performance was estimated using extensive cross validation testing. We used the structure-based predictor to scan the human proteome for ligands of 218 PDZ domains and show that the predictions correspond to known PDZ domain-peptide interactions and PPIs in curated databases. The structure-based predictor is complementary to the sequence-based predictor, finding unique known and novel PPIs, and is less dependent on training–testing domain sequence similarity. We used a functional enrichment analysis of our hits to create a predicted map of PDZ domain biology. This map highlights PDZ domain involvement in diverse biological processes, some only found by the structure-based predictor. Based on this analysis, we predict novel PDZ domain involvement in xenobiotic metabolism and suggest new interactions for other processes including wound healing and Wnt signalling. Conclusions: We built a structure-based predictor of PDZ domain-peptide interactions, which can be used to scan C-terminal proteomes for PDZ interactions. We also show that the structure-based predictor finds many known PDZ mediated PPIs in human that were not found by our previous sequence-based predictor and is less dependent on training–testing domain sequence similarity. Using both predictors, we defined a functional map of human PDZ domain biology and predict novel PDZ domain function. Users may access our structure-based and previous sequence-based predictors at http://webservice.baderlab.org/domains/POW.
|
19
|
Hawkins JC, Zhu H, Teyra J, Pisabarro MT. Reduced false positives in PDZ binding prediction using sequence and structural descriptors. IEEE/ACM Trans Comput Biol Bioinform 2012; 9:1492-1503. [PMID: 22508908 DOI: 10.1109/tcbb.2012.54] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Indexed: 05/31/2023]
Abstract
Identifying the binding partners of proteins is a problem of fundamental importance in computational biology. The PDZ is one of the most common and well-studied protein binding domains, hence it is a perfect model system for designing protein binding predictors. The standard approach to identifying the binding partners of PDZ domains uses multiple sequence alignments to infer the set of contact residues that are used in a predictive model. We expand on the sequence alignment approach by incorporating structural information to generate descriptors of the binding site geometry. Furthermore, we generate a real-value score for binary predictions by applying a filter based on models that predict the probability distributions of contact residues at each of the canonical PDZ ligand binding positions. Under training cross validation, our model produced an order of magnitude more predictions at a false positive proportion (FPP) of 10 percent than our benchmark model chosen from the literature. Evaluated using an independent cross validation, with computationally predicted structures, our model was able to make five times as many predictions as the benchmark model, with a Matthews' correlation coefficient (MCC) of 0.33. In addition, our model achieved a false positive proportion of 0.14, while the benchmark model had a 0.25 false positive proportion.
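The two metrics quoted in this abstract can be computed from a binary confusion matrix. The sketch below uses the standard MCC formula; the abstract does not define FPP, so reading it as the fraction of positive predictions that are false is an assumption, and the function names are illustrative:

```python
import math


def matthews_corrcoef(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the MCC is reported as 0 when any marginal is empty.
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def false_positive_proportion(tp, fp):
    """ASSUMED definition: share of positive predictions that are wrong."""
    predicted_positive = tp + fp
    return 0.0 if predicted_positive == 0 else fp / predicted_positive


# Hypothetical counts, chosen only to illustrate the calls.
print(matthews_corrcoef(tp=50, fp=0, tn=50, fn=0))
print(false_positive_proportion(tp=86, fp=14))
```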
Affiliation(s)
- John C Hawkins
- Structural Bioinformatics, BIOTEC TU Dresden, Dresden, Germany.
|
20
|
Teyra J, Sidhu SS, Kim PM. Elucidation of the binding preferences of peptide recognition modules: SH3 and PDZ domains. FEBS Lett 2012; 586:2631-7. [PMID: 22691579 DOI: 10.1016/j.febslet.2012.05.043] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 03/29/2012] [Accepted: 05/15/2012] [Indexed: 12/20/2022]
Abstract
Peptide-binding domains play a critical role in regulation of cellular processes by mediating protein interactions involved in signalling. In recent years, the development of large-scale technologies has enabled exhaustive studies on the peptide recognition preferences for a number of peptide-binding domain families. These efforts have provided significant insights into the binding specificities of these modular domains. Many research groups have taken advantage of this unprecedented volume of specificity data and have developed a variety of new algorithms for the prediction of binding specificities of peptide-binding domains and for the prediction of their natural binding targets. This knowledge has also been applied to the design of synthetic peptide-binding domains in order to rewire protein-protein interaction networks. Here, we describe how these experimental technologies have impacted on our understanding of peptide-binding domain specificities and on the elucidation of their natural ligands. We discuss SH3 and PDZ domains as well characterized examples, and we explore the feasibility of expanding high-throughput experiments to other peptide-binding domains.
Affiliation(s)
- Joan Teyra
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Canada ON M5S 3E1
|
21
|
Reimand J, Hui S, Jain S, Law B, Bader GD. Domain-mediated protein interaction prediction: From genome to network. FEBS Lett 2012; 586:2751-63. [PMID: 22561014 DOI: 10.1016/j.febslet.2012.04.027] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.9] [Received: 03/25/2012] [Accepted: 04/17/2012] [Indexed: 11/19/2022]
Abstract
Protein-protein interactions (PPIs), involved in many biological processes such as cellular signaling, are ultimately encoded in the genome. Solving the problem of predicting protein interactions from the genome sequence will lead to increased understanding of complex networks, evolution and human disease. We can learn the relationship between genomes and networks by focusing on an easily approachable subset of high-resolution protein interactions that are mediated by peptide recognition modules (PRMs) such as PDZ, WW and SH3 domains. This review focuses on computational prediction and analysis of PRM-mediated networks and discusses sequence- and structure-based interaction predictors, techniques and datasets for identifying physiologically relevant PPIs, and interpreting high-resolution interaction networks in the context of evolution and human disease.
Affiliation(s)
- Jüri Reimand
- The Donnelly Centre, University of Toronto, 160 College Street, Toronto, Ontario, Canada.
|
22
|
Garcia-Garcia J, Bonet J, Guney E, Fornes O, Planas J, Oliva B. Networks of Protein-Protein Interactions: From Uncertainty to Molecular Details. Mol Inform 2012; 31:342-62. [PMID: 27477264 DOI: 10.1002/minf.201200005] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Received: 01/16/2012] [Accepted: 03/09/2012] [Indexed: 11/08/2022]
Abstract
Proteins are the bricks and mortar of cells. The work of proteins is structural and functional, as they are the principal element of the organization of the cell architecture, but they also play a relevant role in its metabolism and regulation. To perform all these functions, proteins need to interact with each other and with other bio-molecules, either to form complexes or to recognize precise targets of their action. For instance, a particular transcription factor may activate one gene or another depending on its interactions with other proteins and not only with DNA. Hence, the ability of a protein to interact with other bio-molecules, and the partners they have at each particular time and location can be crucial to characterize the role of a protein. Proteins rarely act alone; they rather constitute a mingled network of physical interactions or other types of relationships (such as metabolic and regulatory) or signaling cascades. In this context, understanding the function of a protein implies to recognize the members of its neighborhood and to grasp how they associate, both at the systemic and atomic level. The network of physical interactions between the proteins of a system, cell or organism, is defined as the interactome. The purpose of this review is to deepen the description of interactomes at different levels of detail: from the molecular structure of complexes to the global topology of the network of interactions. The approaches and techniques applied experimentally and computationally to attain each level are depicted. The limits of each technique and its integration into a model network, the challenges and actual problems of completeness of an interactome, and the reliability of the interactions are reviewed and summarized. Finally, the application of the current knowledge of protein-protein interactions on modern network medicine and protein function annotation is also explored.
Affiliation(s)
- Javier Garcia-Garcia
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Jaume Bonet
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Emre Guney
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Oriol Fornes
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Joan Planas
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain
- Baldo Oliva
- Structural Bioinformatics Group, GRIB-IMIM, Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Catalonia, Spain.
|
23
|
Gfeller D. Uncovering new aspects of protein interactions through analysis of specificity landscapes in peptide recognition domains. FEBS Lett 2012; 586:2764-72. [PMID: 22710167 DOI: 10.1016/j.febslet.2012.03.054] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 02/29/2012] [Revised: 03/27/2012] [Accepted: 03/27/2012] [Indexed: 12/20/2022]
Abstract
Protein interactions underlie all biological processes. An important class of protein interactions, often observed in signaling pathways, consists of peptide recognition domains binding short protein segments on the surface of their target proteins. Recent developments in experimental techniques have uncovered many such interactions and shed new lights on their specificity. To analyze these data, novel computational methods have been introduced that can accurately describe the specificity landscape of peptide recognition domains and predict new interactions. Combining large-scale analysis of binding specificity data with structure-based modeling can further reveal new biological insights into the molecular recognition events underlying signaling pathways.
Affiliation(s)
- David Gfeller
- Swiss Institute of Bioinformatics, Quartier Sorge, Bâtiment Génopode, CH-1015 Lausanne, Switzerland.
|
24
|
He L, Friedman AM, Bailey-Kellogg C. A divide-and-conquer approach to determine the Pareto frontier for optimization of protein engineering experiments. Proteins 2012; 80:790-806. [PMID: 22180081 PMCID: PMC4939273 DOI: 10.1002/prot.23237] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Received: 07/07/2011] [Revised: 10/06/2011] [Accepted: 10/21/2011] [Indexed: 01/07/2023]
Abstract
In developing improved protein variants by site-directed mutagenesis or recombination, there are often competing objectives that must be considered in designing an experiment (selecting mutations or breakpoints): stability versus novelty, affinity versus specificity, activity versus immunogenicity, and so forth. Pareto optimal experimental designs make the best trade-offs between competing objectives. Such designs are not "dominated"; that is, no other design is better than a Pareto optimal design for one objective without being worse for another objective. Our goal is to produce all the Pareto optimal designs (the Pareto frontier), to characterize the trade-offs and suggest designs most worth considering, but to avoid explicitly considering the large number of dominated designs. To do so, we develop a divide-and-conquer algorithm, Protein Engineering Pareto FRontier (PEPFR), that hierarchically subdivides the objective space, using appropriate dynamic programming or integer programming methods to optimize designs in different regions. This divide-and-conquer approach is efficient in that the number of divisions (and thus calls to the optimizer) is directly proportional to the number of Pareto optimal designs. We demonstrate PEPFR with three protein engineering case studies: site-directed recombination for stability and diversity via dynamic programming, site-directed mutagenesis of interacting proteins for affinity and specificity via integer programming, and site-directed mutagenesis of a therapeutic protein for activity and immunogenicity via integer programming. We show that PEPFR is able to effectively produce all the Pareto optimal designs, discovering many more designs than previous methods. The characterization of the Pareto frontier provides additional insights into the local stability of design choices as well as global trends leading to trade-offs between competing criteria.
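The dominance criterion stated in this abstract can be made concrete with a brute-force filter. This is a minimal sketch of Pareto filtering over already-scored designs, not the PEPFR divide-and-conquer algorithm, which avoids enumerating dominated designs in the first place. The score tuples are hypothetical, and both objectives are assumed to be maximized:

```python
def dominates(a, b):
    """True if design a is at least as good as b everywhere and better somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_frontier(designs):
    """Return the designs that no other design dominates (O(n^2) scan)."""
    return [d for d in designs if not any(dominates(other, d) for other in designs)]


# Hypothetical (affinity score, specificity score) pairs.
scored = [(1, 5), (2, 4), (3, 3), (2, 2), (1, 1)]
print(pareto_frontier(scored))
```

The quadratic scan is acceptable only for small candidate sets. The point of the cited work is that design spaces are too large to enumerate this way.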
Affiliation(s)
- Lu He
- Department of Computer Science, Dartmouth College, Hanover NH 03755
- Alan M. Friedman
- Department of Biological Sciences, Markey Center for Structural Biology, Purdue Cancer Center, and Bindley Bioscience Center, Purdue University
|
25
|
Luck K, Fournane S, Kieffer B, Masson M, Nominé Y, Travé G. Putting into practice domain-linear motif interaction predictions for exploration of protein networks. PLoS One 2011; 6:e25376. [PMID: 22069443 PMCID: PMC3206016 DOI: 10.1371/journal.pone.0025376] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.6] [Received: 05/11/2011] [Accepted: 09/02/2011] [Indexed: 12/22/2022] [Open Access]
Abstract
PDZ domains recognise short sequence motifs at the extreme C-termini of proteins. A model based on microarray data has been recently published for predicting the binding preferences of PDZ domains to five residue long C-terminal sequences. Here we investigated the potential of this predictor for discovering novel protein interactions that involve PDZ domains. When tested on real negative data assembled from published literature, the predictor displayed a high false positive rate (FPR). We predicted and experimentally validated interactions between four PDZ domains derived from the human proteins MAGI1 and SCRIB and 19 peptides derived from human and viral C-termini of proteins. Measured binding intensities did not correlate with prediction scores, and the high FPR of the predictor was confirmed. Results indicate that limitations of the predictor may arise from an incomplete model definition and improper training of the model. Taking into account these limitations, we identified several novel putative interactions between PDZ domains of MAGI1 and SCRIB and the C-termini of the proteins FZD4, ARHGAP6, NET1, TANC1, GLUT7, MARCH3, MAS, ABC1, DLL1, TMEM215 and CYSLTR2. These proteins are localised to the membrane or suggested to act close to it and are often involved in G protein signalling. Furthermore, we showed that, while extension of minimal interacting domains or peptides toward tandem constructs or longer peptides never suppressed their ability to interact, the measured affinities and inferred specificity patterns often changed significantly. This suggests that if protein fragments interact, the full length proteins are also likely to interact, albeit possibly with altered affinities and specificities. Therefore, predictors dealing with protein fragments are promising tools for discovering protein interaction networks but their application to predict binding preferences within networks may be limited.
Affiliation(s)
- Katja Luck
- Group Onco-Proteins, Institut de Recherche de l'Ecole de Biotechnologie de Strasbourg, 1, BP 10413, Illkirch, France
- Sadek Fournane
- Group Onco-Proteins, Institut de Recherche de l'Ecole de Biotechnologie de Strasbourg, 1, BP 10413, Illkirch, France
- Bruno Kieffer
- Biomolecular NMR group, Institut de Génétique et de Biologie Moléculaire et Cellulaire, 1, BP 10413, Illkirch, France
- Murielle Masson
- Group Onco-Proteins, Institut de Recherche de l'Ecole de Biotechnologie de Strasbourg, 1, BP 10413, Illkirch, France
- Yves Nominé
- Group Onco-Proteins, Institut de Recherche de l'Ecole de Biotechnologie de Strasbourg, 1, BP 10413, Illkirch, France
- Gilles Travé
- Group Onco-Proteins, Institut de Recherche de l'Ecole de Biotechnologie de Strasbourg, 1, BP 10413, Illkirch, France
|
26
|
Gfeller D, Butty F, Wierzbicka M, Verschueren E, Vanhee P, Huang H, Ernst A, Dar N, Stagljar I, Serrano L, Sidhu SS, Bader GD, Kim PM. The multiple-specificity landscape of modular peptide recognition domains. Mol Syst Biol 2011; 7:484. [PMID: 21525870 PMCID: PMC3097085 DOI: 10.1038/msb.2011.18] [Citation(s) in RCA: 68] [Impact Index Per Article: 4.9] [Received: 09/27/2010] [Accepted: 03/11/2011] [Indexed: 12/17/2022] [Open Access]
Abstract
Using large scale experimental datasets, the authors show how modular protein interaction domains such as PDZ, SH3 or WW domains, frequently display unexpected multiple binding specificity. The observed multiple specificity leads to new structural insights and accurately predicts new protein interactions. Modular protein domains interacting with short linear peptides, such as PDZ, SH3 or WW domains, display a rich binding specificity with significant interplay (or correlation) between ligand residues. The binding specificity of these domains is more accurately described with a multiple specificity model. The multiple specificity reveals new structural insights and predicts new protein interactions.
Modular protein domains have a central role in the complex network of signaling pathways that governs cellular processes. Many of them, called peptide recognition domains, bind short linear regions in their target proteins, such as the well-known SH3 or PDZ domains. These domain–peptide interactions are the predominant form of protein interaction in signaling pathways. Because of the relative simplicity of the interaction, their binding specificity is generally represented using a simple model, analogous to transcription factor binding: the domain binds a short stretch of amino acids and at each position some amino acids are preferred over other ones. Thus, for each position, a probability can be assigned to each amino acid and these probabilities are often grouped into a matrix called position weight matrix (PWM) or position-specific scoring matrix. Such a matrix can then be represented in a highly intuitive manner as a so-called sequence logo (see Figure 1). A main shortcoming of this specificity model is that, although intuitive and interpretable, it inherently assumes that all residues in the peptide contribute independently to binding. On the basis of statistical analyses of large data sets of peptides binding to PDZ, SH3 and WW domains, we show that for most domains, this is not the case. Indeed, there is complex and highly significant interplay between the ligand residues. To overcome this issue, we develop a computational model that can both take into account such correlations and also preserve the advantages of PWMs, namely its straightforward interpretability. Briefly, our method detects whether the domain is capable of binding its targets not only with a single specificity but also with multiple specificities. If so, it will determine all the relevant specificities (see Figure 1). This is accomplished by using a machine learning algorithm based on mixture models, and the results can be effectively visualized as multiple sequence logos. 
In other words, based on experimentally derived data sets of binding peptides, we determine for every domain, in addition to the known specificity, one or more new specificities. As such, we capture more real information, and our model performs better than previous models of binding specificity. A crucial question is what these new specificities correspond to: are they simply mathematical artifacts coming out of some algorithm or do they represent something we can understand on a biophysical or structural level? Overall, the new specificities provide us with substantial new intuitive insight about the structural basis of binding for these domains. We can roughly identify two cases. First, we have neighboring (or very close in sequence) amino acids in the ligand that show significant correlations. These usually correspond to amino acids whose side chains point in the same directions and often occupy the same physical space, and therefore can directly influence each other. In other cases, we observe that multiple specificities found for a single domain are very different from each other. They correspond to different ways that the domain accommodates its binders. Often, conformational changes are required to switch from one binding mode to another. In almost all cases, only one canonical binding mode was previously known, and our analysis enables us to predict several interesting non-canonical ones. Specifically, we discuss one example in detail in Figure 5. In a PDZ domain of DLG1, we identify a novel binding specificity that differs from the canonical one by the presence of an additional tryptophan at the C terminus of the ligand. From a structural point of view, this would require a flexible loop to move out of the way to accommodate this rather large side chain. We find evidence of this predicted new binding mode based on both existing crystal structures and structural modeling. 
Finally, our model of binding specificity leads to predictions of many new and previously unknown protein interactions. We validate a number of these using the membrane yeast two-hybrid approach. In summary, we show here that multiple specificity is a general and underappreciated phenomenon for modular peptide recognition domains and that it leads to substantial new insight into the basis of protein interactions. Modular protein interaction domains form the building blocks of eukaryotic signaling pathways. Many of them, known as peptide recognition domains, mediate protein interactions by recognizing short, linear amino acid stretches on the surface of their cognate partners with high specificity. Residues in these stretches are usually assumed to contribute independently to binding, which has led to a simplified understanding of protein interactions. Conversely, we observe in large binding peptide data sets that different residue positions display highly significant correlations for many domains in three distinct families (PDZ, SH3 and WW). These correlation patterns reveal a widespread occurrence of multiple binding specificities and give novel structural insights into protein interactions. For example, we predict a new binding mode of PDZ domains and structurally rationalize it for DLG1 PDZ1. We show that multiple specificity more accurately predicts protein interactions and experimentally validate some of the predictions for the human proteins DLG1 and SCRIB. Overall, our results reveal a rich specificity landscape in peptide recognition domains, suggesting new ways of encoding specificity in protein interaction networks.
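The single-specificity model that this abstract argues against, a position weight matrix (PWM) with independent positions, can be sketched as follows. This illustrates only the baseline, not the authors' mixture-model method. The peptides, the pseudocount, and the uniform 1/20 background are assumptions chosen for illustration:

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pwm(peptides, pseudocount=1.0):
    """Per-position amino acid probabilities from equal-length aligned peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("peptides must all have the same length")
    total = len(peptides) + pseudocount * len(AMINO_ACIDS)
    pwm = []
    for pos in range(length):
        counts = Counter(p[pos] for p in peptides)
        pwm.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return pwm


def log_odds_score(pwm, peptide, background=1.0 / 20):
    """Sum of per-position log2 odds; summing is the independence assumption."""
    return sum(math.log2(col[aa] / background) for col, aa in zip(pwm, peptide))


# Hypothetical binding peptides (last four residues), for illustration only.
binders = ["ETSV", "ETDV", "QTSL", "ETSV"]
pwm = build_pwm(binders)
print(log_odds_score(pwm, "ETSV") > log_odds_score(pwm, "AAAA"))
```

Because the score is a plain sum over positions, a PWM cannot express the correlations between ligand residues that the abstract reports. A mixture of several such matrices is one way to relax that assumption.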
Affiliation(s)
- David Gfeller
- Banting and Best Department of Medical Research, The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
|
27
|
Chimura T, Launey T, Ito M. Evolutionarily conserved bias of amino-acid usage refines the definition of PDZ-binding motif. BMC Genomics 2011; 12:300. [PMID: 21649932 PMCID: PMC3138430 DOI: 10.1186/1471-2164-12-300] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 01/19/2011] [Accepted: 06/08/2011] [Indexed: 11/18/2022] [Open Access]
Abstract
Background: The interactions between PDZ (PSD-95, Dlg, ZO-1) domains and PDZ-binding motifs play central roles in signal transductions within cells. Proteins with PDZ domains bind to PDZ-binding motifs almost exclusively when the motifs are located at the carboxyl (C-) terminal ends of their binding partners. However, it remains little explored whether PDZ-binding motifs show any preferential location at the C-terminal ends of proteins, at genome-level. Results: Here, we examined the distribution of the type-I (x-x-S/T-x-I/L/V) or type-II (x-x-V-x-I/V) PDZ-binding motifs in proteins encoded in the genomes of five different species (human, mouse, zebrafish, fruit fly and nematode). We first established that these PDZ-binding motifs are indeed preferentially present at their C-terminal ends. Moreover, we found specific amino acid (AA) bias for the 'x' positions in the motifs at the C-terminal ends. In general, hydrophilic AAs were favored. Our genomics-based findings confirm and largely extend the results of previous interaction-based studies, allowing us to propose refined consensus sequences for all of the examined PDZ-binding motifs. An ontological analysis revealed that the refined motifs are functionally relevant since a large fraction of the proteins bearing the motif appear to be involved in signal transduction. Furthermore, co-precipitation experiments confirmed two new protein interactions predicted by our genomics-based approach. Finally, we show that influenza virus pathogenicity can be correlated with PDZ-binding motif, with high-virulence viral proteins bearing a refined PDZ-binding motif. Conclusions: Our refined definition of PDZ-binding motifs should provide important clues for identifying functional PDZ-binding motifs and proteins involved in signal transduction.
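The two motif definitions quoted in this abstract translate directly into anchored regular expressions. This is a minimal sketch of a C-terminal scan, assuming one-letter amino acid codes. The function name and test sequences are hypothetical, and the refined consensus sequences proposed in the paper are not reproduced here:

```python
import re

# Motifs as quoted in the abstract, anchored at the C-terminus with "$".
TYPE_I = re.compile(r"..[ST].[ILV]$")   # x-x-S/T-x-I/L/V
TYPE_II = re.compile(r"..V.[IV]$")      # x-x-V-x-I/V


def classify_c_terminus(sequence):
    """Return 'type-I', 'type-II', or None for a protein sequence."""
    seq = sequence.strip().upper()
    if TYPE_I.search(seq):
        return "type-I"
    if TYPE_II.search(seq):
        return "type-II"
    return None


# Hypothetical sequences, for illustration only.
for seq in ["MKAAETSV", "MKAAEVDI", "MKAAEGGG"]:
    print(seq, classify_c_terminus(seq))
```

The two classes cannot overlap, because the third-from-last residue must be S or T in one and V in the other.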
Affiliation(s)
- Takahiko Chimura
- Laboratory for Memory and Learning, RIKEN Brain Science Institute, Wako, Saitama 351-0198, Japan.
|