1. Mosca R, Céol A, Aloy P. Interactome3D: adding structural details to protein networks. Nat Methods 2013; 10:47-53. [PMID: 23399932 DOI: 10.1038/nmeth.2289] [Citation(s) in RCA: 337] [Impact Index Per Article: 28.1] [Received: 07/18/2012] [Accepted: 10/30/2012] [Indexed: 01/13/2023]
2. Lee HS, Im W. Identification of ligand templates using local structure alignment for structure-based drug design. J Chem Inf Model 2012; 52:2784-95. [PMID: 22978550 DOI: 10.1021/ci300178e] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Indexed: 11/30/2022]
Abstract
With a rapid increase in the number of high-resolution protein-ligand structures, the known protein-ligand structures can be used to gain insight into ligand-binding modes in a target protein. On the basis of the fact that structurally similar binding sites share information about their ligands, we have developed a local structure alignment tool, G-LoSA (graph-based local structure alignment). The known protein-ligand binding-site structure library is searched by G-LoSA to detect binding-site structures with similar geometry and physicochemical properties to a query binding-site structure regardless of sequence continuity and protein fold. Then, the ligands in the identified complexes are used as templates (i.e., template ligands) to predict/design a ligand for the target protein. The performance of G-LoSA is validated against 76 benchmark targets from the Astex diverse set. Using the currently available protein-ligand structure library, G-LoSA is able to identify a single template ligand (from a nonhomologous protein complex) that is highly similar to the target ligand in more than half of the benchmark targets. In addition, our benchmark analyses show that an assembly of structural fragments from multiple template ligands with partial similarity to the target ligand can be used to design novel ligand structures specific to the target protein. This study clearly indicates that template-based ligand modeling has potential for de novo ligand design and can be a complementary approach to receptor-structure-based methods.
Affiliation(s)
- Hui Sun Lee
- Department of Molecular Biosciences and Center for Bioinformatics, The University of Kansas, 2030 Becker Drive, Lawrence, Kansas 66047, USA.
3. Overton IM, Barton GJ. Computational approaches to selecting and optimising targets for structural biology. Methods 2011; 55:3-11. [PMID: 21906678 PMCID: PMC3202631 DOI: 10.1016/j.ymeth.2011.08.014] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 06/27/2011] [Revised: 08/18/2011] [Accepted: 08/22/2011] [Indexed: 11/29/2022] Open
Abstract
Selection of protein targets for study is central to structural biology and may be influenced by numerous factors. A key aim is to maximise returns for effort invested by identifying proteins with the balance of biophysical properties that are conducive to success at all stages (e.g. solubility, crystallisation) in the route towards a high resolution structural model. Selected targets can be optimised through construct design (e.g. to minimise protein disorder), switching to a homologous protein, and selection of experimental methodology (e.g. choice of expression system) to prime for efficient progress through the structural proteomics pipeline. Here we discuss computational techniques in target selection and optimisation, with more detailed focus on tools developed within the Scottish Structural Proteomics Facility (SSPF); namely XANNpred, ParCrys, OB-Score (target selection) and TarO (target optimisation). TarO runs a large number of algorithms, searching for homologues and annotating the pool of possible alternative targets. This pool of putative homologues is presented in a ranked, tabulated format and results are also visualised as an automatically generated and annotated multiple sequence alignment. The target selection algorithms each predict the propensity of a selected protein target to progress through the experimental stages leading to diffracting crystals. This single predictor approach has advantages for target selection, when compared with an approach using two or more predictors that each predict for success at a single experimental stage. The tools described here helped SSPF achieve a high (21%) success rate in progressing cloned targets to diffraction-quality crystals.
Affiliation(s)
- Ian M Overton
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, Western General Hospital, Crewe Road, Edinburgh EH4 2XU, United Kingdom.
4. Feliu E, Aloy P, Oliva B. On the analysis of protein-protein interactions via knowledge-based potentials for the prediction of protein-protein docking. Protein Sci 2011; 20:529-41. [PMID: 21432933 DOI: 10.1002/pro.585] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.1] [Indexed: 01/08/2023]
Abstract
Development of effective methods to screen binary interactions obtained by rigid-body protein-protein docking is key for structure prediction of complexes and for elucidating physicochemical principles of protein-protein binding. We have derived empirical knowledge-based potential functions for selecting rigid-body docking poses. These potentials include the energetic component that provides the residues with a particular secondary structure and surface accessibility. These scoring functions have been tested on a state-of-the-art benchmark dataset and on a decoy dataset of permanent interactions. Our results were compared with a residue-pair potential scoring function (RPScore) and an atomic-detailed scoring function (Zrank). We have combined knowledge-based potentials to score protein-protein poses of decoys of complexes classified either as transient or as permanent protein-protein interactions. Being defined from residue-pair statistical potentials and not requiring an atomic-level description, our method surpassed Zrank for scoring rigid-docking decoys where the unbound partners of an interaction have to endure conformational changes upon binding. However, when only moderate conformational changes are required (in rigid docking) or when the right conformational changes are ensured (in flexible docking), Zrank is the most successful scoring function. Finally, our study suggests that the physicochemical properties necessary for binding are allocated on the proteins prior to binding and independently of the partner. This information is encoded at the residue level and could be easily incorporated in the initial grid scoring for Fast Fourier Transform rigid-body docking methods.
Affiliation(s)
- Elisenda Feliu
- Algebra and Geometry Department, Mathematics Faculty, Universitat de Barcelona, Spain
5. Perkins JR, Diboun I, Dessailly BH, Lees JG, Orengo C. Transient protein-protein interactions: structural, functional, and network properties. Structure 2010; 18:1233-43. [PMID: 20947012 DOI: 10.1016/j.str.2010.08.007] [Citation(s) in RCA: 386] [Impact Index Per Article: 27.6] [Received: 04/08/2010] [Revised: 07/13/2010] [Accepted: 08/02/2010] [Indexed: 11/28/2022]
Abstract
Transient interactions, which involve protein interactions that are formed and broken easily, are important in many aspects of cellular function. Here we describe structural and functional properties of transient interactions between globular domains and between globular domains, short peptides, and disordered regions. The importance of posttranslational modifications in transient interactions is also considered. We review techniques used in the detection of the different types of transient protein-protein interactions. We also look at the role of transient interactions within protein-protein interaction networks and consider their contribution to different aspects of these networks.
Affiliation(s)
- James R Perkins
- Department of Structural and Molecular Biology, University College of London, Gower Street, WC1E 6BT London, UK.
6. Geppert T, Hoy B, Wessler S, Schneider G. Context-Based Identification of Protein-Protein Interfaces and “Hot-Spot” Residues. Chem Biol 2011; 18:344-53. [DOI: 10.1016/j.chembiol.2011.01.005] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.1] [Received: 10/08/2010] [Revised: 12/03/2010] [Accepted: 01/05/2011] [Indexed: 02/07/2023]
7. Stein A, Mosca R, Aloy P. Three-dimensional modeling of protein interactions and complexes is going 'omics. Curr Opin Struct Biol 2011; 21:200-8. [PMID: 21320770 DOI: 10.1016/j.sbi.2011.01.005] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.4] [Received: 11/10/2010] [Revised: 01/11/2011] [Accepted: 01/13/2011] [Indexed: 10/18/2022]
Abstract
High-throughput interaction discovery initiatives have revealed the existence of hundreds of multiprotein complexes whose functions are regulated through thousands of protein-protein interactions (PPIs). However, the structural details of these interactions, often necessary to understand their function, are only available for a tiny fraction, and the experimental difficulties surrounding complex structure determination make computational modeling techniques paramount. In this manuscript, we critically review some of the most recent developments in the field of structural bioinformatics applied to the modeling of protein interactions and complexes, from large macromolecular machines to domain-domain and peptide-mediated interactions. In particular, we place a special emphasis on those methods that can be applied in a proteome-wide manner, and discuss how they will help in the ultimate objective of building 3D interactome networks.
Affiliation(s)
- Amelie Stein
- Institute for Research in Biomedicine (IRB Barcelona), Joint IRB-BSC Program in Computational Biology, c/Baldiri i Reixac 10-12, 08028 Barcelona, Spain
8. Brooks MA, Gewartowski K, Mitsiki E, Létoquart J, Pache RA, Billier Y, Bertero M, Corréa M, Czarnocki-Cieciura M, Dadlez M, Henriot V, Lazar N, Delbos L, Lebert D, Piwowarski J, Rochaix P, Böttcher B, Serrano L, Séraphin B, van Tilbeurgh H, Aloy P, Perrakis A, Dziembowski A. Systematic bioinformatics and experimental validation of yeast complexes reduces the rate of attrition during structural investigations. Structure 2010; 18:1075-82. [PMID: 20826334 DOI: 10.1016/j.str.2010.08.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Received: 02/21/2010] [Revised: 06/30/2010] [Accepted: 08/07/2010] [Indexed: 10/19/2022]
Abstract
For high-throughput structural studies of protein complexes of composition inferred from proteomics data, it is crucial that candidate complexes are selected accurately. Herein, we exemplify a procedure that combines a bioinformatics tool for complex selection with in vivo validation, to deliver structural results in a medium-throughput manner. We have selected a set of 20 yeast complexes, which were predicted to be feasible by an automated bioinformatics algorithm, by manual inspection of primary data, or by literature searches. These complexes were validated with two straightforward and efficient biochemical assays, and heterologous expression technologies of complex components were then used to produce the complexes to assess their feasibility experimentally. Approximately one-half of the selected complexes were useful for structural studies, and we detail one particular success story. Our results underscore the importance of accurate target selection and validation in avoiding transient, unstable, or simply nonexistent complexes from the outset.
Affiliation(s)
- Mark A Brooks
- IBBMC-CNRS UMR8619, IFR 115, Bât. 430, Université Paris-Sud, 91405 Orsay, France
9. Geppert T, Proschak E, Schneider G. Protein-protein docking by shape-complementarity and property matching. J Comput Chem 2010; 31:1919-28. [PMID: 20087900 DOI: 10.1002/jcc.21479] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Indexed: 11/11/2022]
Abstract
We present a computational approach to protein-protein docking based on surface shape complementarity ("ProBinder"). Within this docking approach, we implemented a new surface decomposition method that considers local shape features on the protein surface. This new surface shape decomposition results in a deterministic representation of curvature features on the protein surface, such as "knobs," "holes," and "flats" together with their point normals. For the actual docking procedure, we used geometric hashing, which allows for the rapid, translation- and rotation-free comparison of point coordinates. Candidate solutions were scored based on knowledge-based potentials and steric criteria. The potentials included electrostatic complementarity, desolvation energy, amino acid contact preferences, and a van der Waals potential. We applied ProBinder to a diverse test set of 68 bound and 30 unbound test cases compiled from the Dockground database. Sixty-four percent of the protein-protein test complexes were ranked with a root-mean-square deviation (RMSD) < 5 Å to the target solution among the top 10 predictions for the bound data set. In 82% of the unbound samples, docking poses were ranked within the top ten solutions with an RMSD < 10 Å to the target solution.
Affiliation(s)
- Tim Geppert
- Department of Biochemistry, Chemistry and Pharmacy, Institute of Organic Chemistry and Chemical Biology, LiFF/ZAFES, Johann Wolfgang Goethe-University, Frankfurt am Main, Germany
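The top-10 success rates quoted in the ProBinder abstract above are a simple statistic over ranked docking poses, and a short Python sketch may make the criterion concrete. Everything here is illustrative: the function name `top_n_success_rate` and the RMSD values are hypothetical, and the sketch assumes each test case is supplied as a list of pose RMSDs (in Å) already sorted by score rank.

```python
def top_n_success_rate(ranked_rmsds_per_case, n=10, cutoff=5.0):
    """Fraction of test cases with at least one pose below `cutoff` (in Å)
    among the first `n` ranked poses."""
    if not ranked_rmsds_per_case:
        raise ValueError("need at least one test case")
    hits = sum(
        any(rmsd < cutoff for rmsd in rmsds[:n])
        for rmsds in ranked_rmsds_per_case
    )
    return hits / len(ranked_rmsds_per_case)


# Hypothetical ranked RMSDs (Å) for three docking test cases.
cases = [
    [12.1, 4.3, 9.8],        # near-native pose at rank 2
    [15.0, 11.2, 8.7, 6.1],  # no pose under 5 Å
    [2.9, 20.4],             # near-native pose at rank 1
]

print(top_n_success_rate(cases, n=10, cutoff=5.0))   # stricter 5 Å criterion
print(top_n_success_rate(cases, n=10, cutoff=10.0))  # looser 10 Å criterion
```

Changing `n` and `cutoff` shows how strongly a reported success rate depends on the chosen criterion, which is why the bound (5 Å) and unbound (10 Å) figures in the abstract are not directly comparable.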
10. Gao M, Skolnick J. iAlign: a method for the structural comparison of protein-protein interfaces. Bioinformatics 2010; 26:2259-65. [PMID: 20624782 DOI: 10.1093/bioinformatics/btq404] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Indexed: 11/12/2022]
Abstract
MOTIVATION: Protein-protein interactions play an essential role in many cellular processes. The rapid accumulation of protein-protein complex structures provides an unprecedented opportunity for comparative studies of protein-protein interactions. To facilitate such studies, it is necessary to develop an accurate and efficient computational algorithm for the comparison of protein-protein interaction modes. While there are many structural comparison approaches developed for individual proteins, very few methods are available for protein-protein complexes. RESULTS: We present a novel interface alignment method, iAlign, for the structural alignment of protein-protein interfaces. New scoring schemes for measuring interface similarity are introduced, and an iterative dynamic programming algorithm is implemented. We find that the similarity scores follow extreme value distributions. Using statistical models, we empirically estimate their statistical significance, which is in good agreement with manual classifications by human experts. Large-scale tests of iAlign were conducted on both artificial docking models and experimental structures. In a benchmark test on 1517 dimers, iAlign successfully detects biologically related, structurally similar protein-protein interfaces at a coverage percentage of 90% and an error per query of 0.05. When compared against previously published methods, iAlign is substantially more accurate and efficient. AVAILABILITY: The iAlign software package is freely available at http://cssb.biology.gatech.edu/iAlign.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
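The iAlign abstract above states that similarity scores follow extreme value distributions and that significance is estimated from them. As a rough illustration of that idea, the Python sketch below converts a score into a P-value under a Gumbel (type I extreme value) model. The location and scale parameters `MU` and `BETA` are hypothetical placeholders, not the values fitted by the authors, and the function name is invented for this example.

```python
import math


def gumbel_p_value(score: float, mu: float, beta: float) -> float:
    """P(S >= score) when S follows a Gumbel distribution
    with location `mu` and scale `beta`."""
    if beta <= 0:
        raise ValueError("beta must be positive")
    z = (score - mu) / beta
    # Survival function 1 - exp(-exp(-z)); expm1 keeps precision for tiny p.
    return -math.expm1(-math.exp(-z))


# Hypothetical parameters for illustration only.
MU, BETA = 0.30, 0.05

for s in (0.30, 0.50, 0.70):
    print(f"score={s:.2f}  p={gumbel_p_value(s, MU, BETA):.3g}")
```

In practice the two parameters would be fitted to scores from alignments of unrelated interfaces; the sketch only shows how a fitted model turns a raw score into a significance estimate.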
11. Sawyer TK. A Pursuit of Smart Chemistry Tackling Complex Biology by Way of Inventive Drug Design. Chem Biol Drug Des 2010; 75:1-2. [DOI: 10.1111/j.1747-0285.2009.00918.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 01/18/2023]
12. Sawyer TK. The Race for Chemical and Biological Space: Drug Discovery and Innovative Technologies. Chem Biol Drug Des 2009; 73:1-2. [DOI: 10.1111/j.1747-0285.2008.00760.x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/27/2022]
13. Pache RA, Aloy P. Incorporating high-throughput proteomics experiments into structural biology pipelines: identification of the low-hanging fruits. Proteomics 2008; 8:1959-64. [PMID: 18491310 DOI: 10.1002/pmic.200700966] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
The last years have seen the emergence of many large-scale proteomics initiatives that have identified thousands of new protein interactions and macromolecular assemblies. However, unfortunately, only a few among the discovered complexes meet the high-quality standards required to be promptly used in structural studies. This has thus created an increasing gap between the number of known protein interactions and complexes and those for which a high-resolution 3-D structure is available. Here, we present and validate a computational strategy to distinguish those complexes found in high-throughput affinity purification experiments that will stand the best chances to successfully express, purify and crystallize with little further intervention. Our method suggests that there are some 50 complexes recently discovered in yeast that could readily enter the structural biology pipelines.
Affiliation(s)
- Roland A Pache
- Institute for Research in Biomedicine (IRB) and Barcelona Supercomputing Center (BSC), Barcelona, Spain
14.
Abstract
The success of the whole genome sequencing projects brought considerable credence to the belief that high-throughput approaches, rather than traditional hypothesis-driven research, would be essential to structurally and functionally annotate the rapid growth in available sequence data within a reasonable time frame. Such observations supported the emerging field of structural genomics, which is now faced with the task of providing a library of protein structures that represent the biological diversity of the protein universe. To run efficiently, structural genomics projects aim to define a set of targets that maximize the potential of each structure discovery whether it represents a novel structure, novel function, or missing evolutionary link. However, not all protein sequences make suitable structural genomics targets: It takes considerably more effort to determine the structure of a protein than the sequence of its gene because of the increased complexity of the methods involved and also because the behavior of targeted proteins can be extremely variable at the different stages in the structural genomics "pipeline." Therefore, structural genomics target selection must identify and prioritize the most suitable candidate proteins for structure determination, avoiding "problematic" proteins while also ensuring the ultimate goals of the project are followed.
15. Protein interactions: analysis using allele libraries. Adv Biochem Eng Biotechnol 2008; 110:47-66. [PMID: 18528666 DOI: 10.1007/10_2008_102] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/22/2022]
Abstract
Interaction defective alleles (IDAs) are alleles that contain mutations affecting their ability to interact with their wild type binding partners. The locations of the mutations may lead to the identification of protein interaction domains and interaction interfaces. IDAs may also distinguish different binding interfaces of multidomain proteins that are part of large complexes, thus shedding light on large protein structures that have yet to be determined. IDAs may also be used in conjunction with RNAi to dissect protein interaction networks. Here, the wild type allele is knocked down and replaced with an IDA that has lost the ability to interact with a specific binding partner. As a result, interactions are disrupted rather than knocking out the entire gene. Thus, IDAs have the potential to be extremely valuable tools in protein interaction network analysis. IDAs can be isolated by reverse two-hybrid analysis, which was demonstrated over a decade ago, but high background levels caused by truncated IDAs have prevented its widespread adoption. We recently described a novel method for full-length allele library generation that eliminates this background and increases the efficiency of the reverse two-hybrid protocol (and IDA isolation) significantly. Here we discuss our strategy for allele library generation, the potential uses of IDAs as outlined above, and additional applications of allele libraries.
16. Huang YJ, Hang D, Lu LJ, Tong L, Gerstein MB, Montelione GT. Targeting the human cancer pathway protein interaction network by structural genomics. Mol Cell Proteomics 2008; 7:2048-60. [PMID: 18487680 DOI: 10.1074/mcp.m700550-mcp200] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.1] [Indexed: 11/06/2022] Open
Abstract
Structural genomics provides an important approach for characterizing and understanding systems biology. As a step toward better integrating protein three-dimensional (3D) structural information in cancer systems biology, we have constructed a Human Cancer Pathway Protein Interaction Network (HCPIN) by analysis of several classical cancer-associated signaling pathways and their physical protein-protein interactions. Many well-known cancer-associated proteins play central roles as "hubs" or "bottlenecks" in the HCPIN. At least half of HCPIN proteins are either directly associated with or interact with multiple signaling pathways. Although some 45% of residues in these proteins are in sequence segments that meet criteria sufficient for approximate homology modeling (Basic Local Alignment Search Tool (BLAST) E-value < 10⁻⁶), only approximately 20% of residues in these proteins are structurally covered using high accuracy homology modeling criteria (i.e. BLAST E-value < 10⁻⁶ and at least 80% sequence identity) or by actual experimental structures. The HCPIN Website provides a comprehensive description of this biomedically important multipathway network together with experimental and homology models of HCPIN proteins useful for cancer biology research. To complement and enrich cancer systems biology, the Northeast Structural Genomics Consortium is targeting >1000 human proteins and protein domains from the HCPIN for sample production and 3D structure determination. The long-range goal of this effort is to provide a comprehensive 3D structure-function database for human cancer-associated proteins and protein complexes in the context of their interaction networks. The network-based target selection (BioNet) approach described here is an example of a general strategy for targeting co-functioning proteins by structural genomics projects.
Affiliation(s)
- Yuanpeng Janet Huang
- Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Rutgers University, Piscataway, New Jersey 08854, USA
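The two residue-coverage figures in the HCPIN abstract above (about 45% under the approximate criterion, about 20% under the high-accuracy one) come from applying different thresholds to the same set of alignments. The Python sketch below shows one way such a per-residue coverage fraction could be computed. The hit tuples, the function name `covered_fraction`, and the example numbers are hypothetical; the sketch assumes hits are already parsed into (start, end, E-value, fractional identity) with 1-based inclusive coordinates and does not read real BLAST output.

```python
def covered_fraction(length, hits, max_evalue=1e-6, min_identity=0.0):
    """Fraction of residues in a protein of `length` residues covered by
    at least one hit that passes the E-value and identity thresholds.

    Each hit is (start, end, evalue, identity), 1-based inclusive.
    """
    covered = [False] * length
    for start, end, evalue, identity in hits:
        if evalue < max_evalue and identity >= min_identity:
            # Clip the hit to the protein and mark its residues as covered.
            for pos in range(max(start, 1), min(end, length) + 1):
                covered[pos - 1] = True
    return sum(covered) / length


# Hypothetical hits against structures for a 100-residue protein.
hits = [
    (1, 40, 1e-20, 0.35),   # significant but distant homologue
    (30, 60, 1e-10, 0.85),  # significant and close homologue
    (70, 90, 1e-3, 0.95),   # fails the E-value threshold
]

approx = covered_fraction(100, hits, max_evalue=1e-6)
high_accuracy = covered_fraction(100, hits, max_evalue=1e-6, min_identity=0.80)
print(approx, high_accuracy)
```

Overlapping hits are counted once per residue, so coverage never exceeds 1 regardless of how many templates map to the same segment.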
17. Launay G, Mendez R, Wodak S, Simonson T. Recognizing protein-protein interfaces with empirical potentials and reduced amino acid alphabets. BMC Bioinformatics 2007; 8:270. [PMID: 17662112 PMCID: PMC2034607 DOI: 10.1186/1471-2105-8-270] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Received: 01/26/2007] [Accepted: 07/27/2007] [Indexed: 11/25/2022] Open
Abstract
Background: In structural genomics, an important goal is the detection and classification of protein–protein interactions, given the structures of the interacting partners. We have developed empirical energy functions to identify native structures of protein–protein complexes among sets of decoy structures. To understand the role of amino acid diversity, we parameterized a series of functions, using a hierarchy of amino acid alphabets of increasing complexity, with 2, 3, 4, 6, and 20 amino acid groups. Compared to previous work, we used the simplest possible functional form, with residue–residue interactions and a stepwise distance-dependence. We used increased computational resources, however, constructing 290,000 decoys for 219 protein–protein complexes, with a realistic docking protocol where the protein partners are flexible and interact through a molecular mechanics energy function. The energy parameters were optimized to correctly assign as many native complexes as possible. To resolve the multiple minimum problem in parameter space, over 64,000 starting parameter guesses were tried for each energy function. The optimized functions were tested by cross validation on subsets of our native and decoy structures, by blind tests on series of native and decoy structures available on the Web, and on models for 13 complexes submitted to the CAPRI structure prediction experiment. Results: Performance is similar to several other statistical potentials of the same complexity. For example, the CAPRI target structure is correctly ranked ahead of 90% of its decoys in 6 cases out of 13. The hierarchy of amino acid alphabets leads to a coherent hierarchy of energy functions, with qualitatively similar parameters for similar amino acid types at all levels. Most remarkably, the performance with six amino acid classes is equivalent to that of the most detailed, 20-class energy function.
Conclusion: This suggests that six carefully chosen amino acid classes are sufficient to encode specificity in protein–protein interactions, and provide a starting point to develop more complicated energy functions.
Affiliation(s)
- Guillaume Launay
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France
- Raul Mendez
- Service de Conformation de Macromolécules Biologiques et Bioinformatique, Centre de Biologie Structurale et Bioinformatique, Université Libre de Bruxelles, Belgium
- Shoshana Wodak
- Structural Biology Program, Hospital for Sick Children, Toronto, Canada
- Thomas Simonson
- Laboratoire de Biochimie (UMR CNRS 7654), Department of Biology, Ecole Polytechnique, 91128, Palaiseau, France
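The functional form described in the abstract above (residue-residue interactions over a reduced amino acid alphabet, with a stepwise distance dependence) is compact enough to sketch in Python. The sketch below is only an illustration of that form under stated assumptions: the two-class hydrophobic/polar grouping, the distance bins, and all energy values are hypothetical and are not the optimized parameters from the paper.

```python
# Hypothetical two-class alphabet: hydrophobic (H) versus everything else (P).
HYDROPHOBIC = set("AVLIMFWC")


def reduce_residue(aa: str) -> str:
    """Map a one-letter amino acid code onto the reduced alphabet."""
    return "H" if aa in HYDROPHOBIC else "P"


# Hypothetical stepwise energies (arbitrary units) per class pair.
# Distance bins: [0, 5) Å and [5, 8) Å; pairs beyond 8 Å contribute zero.
BIN_EDGES = (5.0, 8.0)
ENERGY = {
    ("H", "H"): (-1.0, -0.5),
    ("H", "P"): (0.3, 0.1),
    ("P", "P"): (-0.2, 0.0),
}


def pair_energy(aa_a: str, aa_b: str, distance: float) -> float:
    """Step-function energy for one inter-chain residue pair."""
    key = tuple(sorted((reduce_residue(aa_a), reduce_residue(aa_b))))
    for edge, energy in zip(BIN_EDGES, ENERGY[key]):
        if distance < edge:
            return energy
    return 0.0


def interface_score(contacts) -> float:
    """Sum pair energies over (aa_a, aa_b, distance_in_angstrom) tuples."""
    return sum(pair_energy(a, b, d) for a, b, d in contacts)


# Hypothetical interface contacts for a native-like pose and a decoy.
native = [("L", "V", 4.2), ("K", "E", 6.0), ("F", "S", 7.5)]
decoy = [("L", "K", 4.0), ("V", "D", 4.5)]
print(interface_score(native), interface_score(decoy))
```

Moving to a finer alphabet only changes `reduce_residue` and the `ENERGY` table, which is what makes a hierarchy of alphabets straightforward to compare under the same functional form.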
18. Schuster-Böckler B, Bateman A. Reuse of structural domain-domain interactions in protein networks. BMC Bioinformatics 2007; 8:259. [PMID: 17640363 PMCID: PMC1940023 DOI: 10.1186/1471-2105-8-259] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Received: 02/23/2007] [Accepted: 07/18/2007] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Protein interactions are thought to be largely mediated by interactions between structural domains. Databases such as iPfam relate interactions in protein structures to known domain families. Here, we investigate how the domain interactions from the iPfam database are distributed in protein interactions taken from the HPRD, MPact, BioGRID, DIP and IntAct databases. RESULTS: We find that known structural domain interactions can only explain a subset of 4-19% of the available protein interactions; nevertheless, this fraction is still significantly bigger than expected by chance. There is a correlation between the frequency of a domain interaction and the connectivity of the proteins it occurs in. Furthermore, a large proportion of protein interactions can be attributed to a small number of domain interactions. We conclude that many, but not all, domain interactions constitute reusable modules of molecular recognition. A substantial proportion of domain interactions are conserved between E. coli, S. cerevisiae and H. sapiens. These domains are related to essential cellular functions, suggesting that many domain interactions were already present in the last universal common ancestor. CONCLUSION: Our results support the concept of domain interactions as reusable, conserved building blocks of protein interactions, but also highlight the limitations currently imposed by the small number of available protein structures.
Affiliation(s)
- Alex Bateman
- Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, UK
19. Jenney FE, Adams MWW. The impact of extremophiles on structural genomics (and vice versa). Extremophiles 2007; 12:39-50. [PMID: 17563834 DOI: 10.1007/s00792-007-0087-9] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.3] [Received: 12/22/2006] [Accepted: 04/19/2007] [Indexed: 11/24/2022]
Abstract
The advent of the complete genome sequences of various organisms in the mid-1990s raised the issue of how one could determine the function of hypothetical proteins. While insight might be obtained from a 3D structure, the chances of being able to predict such a structure are limited for the deduced amino acid sequence of any uncharacterized gene. A template for modeling is required, but there was only a low probability of finding a protein closely related in sequence with an available structure. Thus, in the late 1990s, an international effort known as structural genomics (SG) was initiated, its primary goal to "fill sequence-structure space" by determining the 3D structures of representatives of all known protein families. This was to be achieved mainly by X-ray crystallography and it was estimated that at least 5,000 new structures would be required. While the proteins (genes) for SG have subsequently been derived from hundreds of different organisms, extremophiles and particularly thermophiles have been specifically targeted due to the increased stability and ease of handling of their proteins, relative to those from mesophiles. This review summarizes the significant impact that extremophiles and proteins derived from them have had on SG projects worldwide. To what extent SG has influenced the field of extremophile research is also discussed.
Affiliation(s)
- Francis E Jenney
- Department of Biochemistry and Molecular Biology, University of Georgia, Davison Life Sciences Complex, Green Street, Athens, GA 30602-7229, USA
20. Belda I, Madurga S, Tarragó T, Llorà X, Giralt E. Evolutionary computation and multimodal search: a good combination to tackle molecular diversity in the field of peptide design. Mol Divers 2006; 11:7-21. [PMID: 17165156 DOI: 10.1007/s11030-006-9053-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 05/03/2006] [Accepted: 09/24/2006] [Indexed: 10/23/2022]
Abstract
The awesome degree of structural diversity accessible in peptide design has created a demand for computational resources that can evaluate a multitude of candidate structures. In our specific case, we translate the peptide design problem to an optimization problem, and use evolutionary computation (EC) in tandem with docking to carry out a combinatorial search. However, the use of EC in huge search spaces with different optima may pose certain drawbacks. For example, EC is prone to focus a search in the first good region found. This is a problem not only because of the undesirable and automatic rejection of potentially good search space regions, but also because the found solution may be extremely difficult to synthesize chemically or may even be a false docking positive. In order to avoid rejecting potentially good solutions and to maximize the molecular diversity of the search, we have implemented evolutionary multimodal search techniques, as well as the molecular diversity metric needed by the multimodal algorithms to measure differences between various regions of the search space.
Affiliation(s)
- Ignasi Belda
- Institut de Recerca Biomèdica, Parc Científic de Barcelona, Universitat de Barcelona, Josep Samitier, Barcelona, Spain
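The abstract above notes that evolutionary computation tends to converge on the first good region it finds, and that multimodal techniques plus a diversity metric counter this. One classic multimodal technique is fitness sharing, sketched below in Python. This is a stand-in chosen for illustration, not necessarily the method the authors implemented: the Hamming distance between equal-length peptide strings replaces their molecular diversity metric, and the peptides, fitness values, and niche radius `sigma` are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def shared_fitness(population, raw_fitness, sigma=2.0):
    """Fitness sharing: divide each raw fitness by its niche count,
    computed with a triangular kernel of radius `sigma`."""
    shared = []
    for i, p in enumerate(population):
        niche_count = 0.0
        for q in population:
            d = hamming(p, q)
            if d < sigma:
                niche_count += 1.0 - d / sigma
        # niche_count >= 1 because every individual is at distance 0 from itself.
        shared.append(raw_fitness[i] / niche_count)
    return shared


# Hypothetical peptides: two near-duplicates and one distinct candidate.
population = ["ACDE", "ACDF", "WYKR"]
raw = [10.0, 10.0, 8.0]
print(shared_fitness(population, raw, sigma=2.0))
```

After sharing, the isolated peptide outranks the two crowded ones even though its raw fitness is lower, which is the mechanism that keeps several regions of the search space alive during selection.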