Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Fidelis K, Stern PS, Bacon D, Moult J. Comparison of systematic search and database methods for constructing segments of protein structure. Protein Eng 1994;7:953-60. [PMID: 7809034 DOI: 10.1093/protein/7.8.953] [Citation(s) in RCA: 101] [Impact Index Per Article: 3.4] [Indexed: 01/27/2023]

Cited by Other Article(s)

Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)-Round XIV. Proteins 2021;89:1607-1617. [PMID: 34533838 PMCID: PMC8726744 DOI: 10.1002/prot.26237] [Citation(s) in RCA: 198] [Impact Index Per Article: 66.0] [Received: 07/23/2021] [Accepted: 07/28/2021] [Indexed: 01/14/2023]

Regep C, Georges G, Shi J, Popovic B, Deane CM. The H3 loop of antibodies shows unique structural characteristics. Proteins 2017;85:1311-1318. [PMID: 28342222 PMCID: PMC5535007 DOI: 10.1002/prot.25291] [Citation(s) in RCA: 56] [Impact Index Per Article: 8.0] [Received: 11/15/2016] [Revised: 03/06/2017] [Accepted: 03/17/2017] [Indexed: 12/12/2022]

Critical Features of Fragment Libraries for Protein Structure Prediction. PLoS One 2017;12:e0170131. [PMID: 28085928 PMCID: PMC5235372 DOI: 10.1371/journal.pone.0170131] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.6] [Received: 08/29/2016] [Accepted: 12/29/2016] [Indexed: 11/19/2022]

Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Protein Sci 2016;86:2.9.1-2.9.37. [PMID: 27801516 DOI: 10.1002/cpps.20] [Citation(s) in RCA: 367] [Impact Index Per Article: 45.9] [Indexed: 12/12/2022]

Kolodny R, Guibas L, Levitt M, Koehl P. Inverse Kinematics in Biology: The Protein Loop Closure Problem. Int J Rob Res 2016. [DOI: 10.1177/0278364905050352] [Citation(s) in RCA: 52] [Impact Index Per Article: 6.5] [Indexed: 11/15/2022]

Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 2016;54:5.6.1-5.6.37. [PMID: 27322406 PMCID: PMC5031415 DOI: 10.1002/cpbi.3] [Citation(s) in RCA: 1820] [Impact Index Per Article: 227.5] [Indexed: 12/22/2022]

Peng X, He J, Niemi AJ. Clustering and percolation in protein loop structures. BMC Struct Biol 2015;15:22. [PMID: 26510704 PMCID: PMC4625449 DOI: 10.1186/s12900-015-0049-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 04/07/2015] [Accepted: 10/13/2015] [Indexed: 11/24/2022]

Abstract

Background

High precision protein loop modelling remains a challenge, both in template based and template independent approaches to protein structure prediction.

Method

We introduce the concepts of protein loop clustering and percolation, to develop a quantitative approach to systematically classify the modular building blocks of loops in crystallographic folded proteins. These fragments are all different parameterisations of a unique kink solution to a generalised discrete nonlinear Schrödinger (DNLS) equation. Accordingly, the fragments are also local energy minima of the ensuing energy function.

Results

We show how the loop fragments cover practically all ultrahigh resolution crystallographic protein structures in Protein Data Bank (PDB), with a 0.2 Ångström root-mean-square (RMS) precision. We find that no more than 12 different loop fragments are needed, to describe around 38 % of ultrahigh resolution loops in PDB. But there is also a large number of loop fragments that are either unique, or very rare, and examples of unique fragments are found even in the structure of a myoglobin.

Conclusions

Protein loops are built in a modular fashion. The loops are composed of fragments that can be modelled by the kink of the DNLS equation. The majority of loop fragments are common and are shared by many proteins. These common fragments are probably important for supporting the overall protein conformation. But there are also several fragments that are either unique to a given protein, or very rare. Such fragments are probably related to the function of the protein. Furthermore, we have found that the amino acid sequence does not determine the structure in a unique fashion. There are many examples of loop fragments with an identical amino acid sequence, but with a very different structure.

Electronic supplementary material

The online version of this article (doi:10.1186/s12900-015-0049-x) contains supplementary material, which is available to authorized users.

Kim H, Kihara D. Detecting local residue environment similarity for recognizing near-native structure models. Proteins 2014;82:3255-72. [PMID: 25132526 DOI: 10.1002/prot.24658] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Received: 01/15/2014] [Revised: 06/10/2014] [Accepted: 07/21/2014] [Indexed: 12/14/2022]

Webb B, Sali A. Comparative Protein Structure Modeling Using MODELLER. Curr Protoc Bioinformatics 2014;47:5.6.1-32. [PMID: 25199792 DOI: 10.1002/0471250953.bi0506s47] [Citation(s) in RCA: 757] [Impact Index Per Article: 75.7] [Indexed: 12/25/2022]

Fiser A. Protein structure modeling in the proteomics era. Expert Rev Proteomics 2014;1:97-110. [PMID: 15966803 DOI: 10.1586/14789450.1.1.97] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.6] [Indexed: 11/08/2022]

Webb B, Eswar N, Fan H, Khuri N, Pieper U, Dong G, Sali A. Comparative Modeling of Drug Target Proteins. Reference Module in Chemistry, Molecular Sciences and Chemical Engineering 2014. [PMCID: PMC7157477 DOI: 10.1016/b978-0-12-409547-2.11133-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/26/2022]

Mishra S, Saxena A, Sangwan RS. Fundamentals of Homology Modeling Steps and Comparison among Important Bioinformatics Tools: An Overview. Science International 2013. [DOI: 10.17311/sciintl.2013.237.252] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Indexed: 11/03/2022]

Fernandez-Fuentes N, Fiser A. A modular perspective of protein structures: application to fragment based loop modeling. Methods Mol Biol 2013;932:141-58. [PMID: 22987351 PMCID: PMC3635063 DOI: 10.1007/978-1-62703-065-6_9] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 01/30/2023]

Porta JM, Jaillet L. Exploring the energy landscapes of flexible molecular loops using higher-dimensional continuation. J Comput Chem 2012;34:234-44. [PMID: 23015474 DOI: 10.1002/jcc.23128] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/07/2012] [Revised: 08/06/2012] [Accepted: 08/30/2012] [Indexed: 12/27/2022]

Modeling Structures and Motions of Loops in Protein Molecules. Entropy 2012. [DOI: 10.3390/e14020252] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Indexed: 11/16/2022]

di Luccio E, Koehl P. A quality metric for homology modeling: the H-factor. BMC Bioinformatics 2011;12:48. [PMID: 21291572 PMCID: PMC3213331 DOI: 10.1186/1471-2105-12-48] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 08/03/2010] [Accepted: 02/04/2011] [Indexed: 11/10/2022]

Totrov M. Loop simulations. Methods Mol Biol 2011;857:207-29. [PMID: 22323223 DOI: 10.1007/978-1-61779-588-6_9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 12/25/2022]

Clinciu DL, Chen YF, Ko CN, Lo CC, Yang JM. TSCC: Two-Stage Combinatorial Clustering for virtual screening using protein-ligand interactions and physicochemical features. BMC Genomics 2010;11 Suppl 4:S26. [PMID: 21143810 PMCID: PMC3005922 DOI: 10.1186/1471-2164-11-s4-s26] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/10/2022]

Choi Y, Deane CM. FREAD revisited: Accurate loop structure prediction using a database search algorithm. Proteins 2010;78:1431-40. [PMID: 20034110 DOI: 10.1002/prot.22658] [Citation(s) in RCA: 121] [Impact Index Per Article: 8.6] [Indexed: 12/25/2022]

Al Nasr K, Sun W, He J. Structure prediction for the helical skeletons detected from the low resolution protein density map. BMC Bioinformatics 2010;11 Suppl 1:S44. [PMID: 20122218 PMCID: PMC3009517 DOI: 10.1186/1471-2105-11-s1-s44] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Indexed: 11/19/2022]

Fiser A. Template-based protein structure modeling. Methods Mol Biol 2010;673:73-94. [PMID: 20835794 DOI: 10.1007/978-1-60761-842-3_6] [Citation(s) in RCA: 127] [Impact Index Per Article: 9.1] [Indexed: 12/11/2022]

Hvidsten TR, Kryshtafovych A, Fidelis K. Local descriptors of protein structure: a systematic analysis of the sequence-structure relationship in proteins using short- and long-range interactions. Proteins 2009;75:870-84. [PMID: 19025980 DOI: 10.1002/prot.22296] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.9] [Indexed: 11/10/2022]

Abstract

Local protein structure representations that incorporate long-range contacts between residues are often considered in protein structure comparison but have found relatively little use in structure prediction where assembly from single backbone fragments dominates. Here, we introduce the concept of local descriptors of protein structure to characterize local neighborhoods of amino acids including short- and long-range interactions. We build a library of recurring local descriptors and show that this library is general enough to allow assembly of unseen protein structures. The library could on average re-assemble 83% of 119 unseen structures, and showed little or no performance decrease between homologous targets and targets with folds not represented among domains used to build it. We then systematically evaluate the descriptor library to establish the level of the sequence signal in sets of protein fragments of similar geometrical conformation. In particular, we test whether that signal is strong enough to facilitate correct assignment and alignment of these local geometries to new sequences. We use the signal to assign descriptors to a test set of 479 sequences with less than 40% sequence identity to any domain used to build the library, and show that on average more than 50% of the backbone fragments constituting descriptors can be correctly aligned. We also use the assigned descriptors to infer SCOP folds, and show that correct predictions can be made in many of the 151 cases where PSI-BLAST was unable to detect significant sequence similarity to proteins in the library. Although the combinatorial problem of simultaneously aligning several fragments to sequence is a major bottleneck compared with single fragment methods, the advantage of the current approach is that correct alignments imply correct long range distance constraints. 
The lack of these constraints is most likely the major reason why structure prediction methods fail to consistently produce adequate models when good templates are unavailable or undetectable. Thus, we believe that the current study offers new and valuable insight into the prediction of sequence-structure relationships in proteins.

Ikeda K, Hirokawa T, Higo J, Tomii K. Protein-segment universe exhibiting transitions at intermediate segment length in conformational subspaces. BMC Struct Biol 2008;8:37. [PMID: 18700043 PMCID: PMC2529298 DOI: 10.1186/1472-6807-8-37] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/25/2008] [Accepted: 08/13/2008] [Indexed: 12/05/2022]

Abstract

Background

Many studies have examined rules governing two aspects of protein structures: short segments and proteins' structural domains. Nevertheless, the organization and nature of the conformational space of segments with intermediate length between short segments and domains remain unclear. Conformational spaces of intermediate length segments probably differ from those of short segments. We investigated the identification and characterization of the boundary(s) between peptide-like (short segment) and protein-like (long segment) distributions. We generated ensembles embedded in globular proteins comprising segments 10–50 residues long. We explored the relationships between the conformational distribution of segments and their lengths, and also protein structural classes using principal component analysis based on the intra-segment Cα-Cα atomic distances.

Results

Our statistical analyses of segment conformations and length revealed critical dual transitions in their conformational distribution with segments derived from all four structural classes. Dual transitions were identified with the intermediate phase between the short segments and domains. Consequently, protein segment universes were categorized. i) Short segments (10–22 residues) showed a distribution with a high frequency of secondary structure clusters. ii) Medium segments (23–26 residues) showed a distribution corresponding to an intermediate state of transitions. iii) Long segments (27–50 residues) showed a distribution converging on one huge cluster containing compact conformations with a smaller radius of gyration. This distribution reflects the protein structures' organization and protein domains' origin. Three major conformational components (radius of gyration, structural symmetry with respect to the N-terminal and C-terminal halves, and single-turn/two-turn structure) well define most of the segment universes. Furthermore, we identified several conformational components that were unique to each structural class. Those characteristics suggest that protein segment conformation is described by compositions of the three common structural variables with large contributions and specific structural variables with small contributions.

Conclusion

The present results of the analyses of four protein structural classes show the universal role of three major components as segment conformational descriptors. The obtained perspectives of distribution changes related to the segment lengths using the three key components suggest both the adequacy and the possibility of further progress on the prediction strategies used in the recent de novo structure-prediction methods.
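The radius of gyration, named in this abstract as one of the three major conformational components, can be computed directly from Cα coordinates. The following is a minimal sketch, assuming an unweighted definition in which every atom counts equally; the function name `radius_of_gyration` and the toy coordinates are illustrative and are not taken from the article.

```python
import math


def radius_of_gyration(coords):
    """Unweighted radius of gyration of a list of (x, y, z) points.

    Defined here as the root-mean-square distance of the points
    from their centroid (equal mass assumed for every point).
    """
    if not coords:
        raise ValueError("coords must not be empty")
    n = len(coords)
    # Centroid of the segment, one component per axis.
    centroid = [sum(p[k] for p in coords) / n for k in range(3)]
    # Mean squared distance of each point from the centroid.
    mean_sq = sum(
        sum((p[k] - centroid[k]) ** 2 for k in range(3)) for p in coords
    ) / n
    return math.sqrt(mean_sq)


# Hypothetical two-point "segment", symmetric about the origin.
toy_segment = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
print(radius_of_gyration(toy_segment))
```

A compact segment gives a small value and an extended one a large value, which is why the quantity separates the long-segment cluster described in the Results.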

Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using MODELLER. Curr Protoc Protein Sci 2008;Chapter 2:Unit 2.9. [PMID: 18429317 DOI: 10.1002/0471140864.ps0209s50] [Citation(s) in RCA: 750] [Impact Index Per Article: 46.9] [Indexed: 12/25/2022]

Li SC, Bu D, Gao X, Xu J, Li M. Designing succinct structural alphabets. Bioinformatics 2008;24:i182-9. [PMID: 18586712 PMCID: PMC2718643 DOI: 10.1093/bioinformatics/btn165] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/17/2022]

Reconstruction of protein backbones from the BriX collection of canonical protein fragments. PLoS Comput Biol 2008;4:e1000083. [PMID: 18483555 PMCID: PMC2367438 DOI: 10.1371/journal.pcbi.1000083] [Citation(s) in RCA: 39] [Impact Index Per Article: 2.4] [Received: 09/21/2007] [Accepted: 04/07/2008] [Indexed: 12/23/2022]

Abstract

As modeling of changes in backbone conformation still lacks a computationally efficient solution, we developed a discretisation of the conformational states accessible to the protein backbone similar to the successful rotamer approach in side chains. The BriX fragment database, consisting of fragments from 4 to 14 residues long, was realized through identification of recurrent backbone fragments from a non-redundant set of high-resolution protein structures. BriX contains an alphabet of more than 1,000 frequently observed conformations per peptide length for 6 different variation levels. Analysis of the performance of BriX revealed an average structural coverage of protein structures of more than 99% within a root mean square distance (RMSD) of 1 Angstrom. Globally, we are able to reconstruct protein structures with an average accuracy of 0.48 Angstrom RMSD. As expected, regular structures are well covered, but, interestingly, many loop regions that appear irregular at first glance are also found to form a recurrent structural motif, albeit with lower frequency of occurrence than regular secondary structures. Larger loop regions could be completely reconstructed from smaller recurrent elements, between 4 and 8 residues long. Finally, we observed that a significant amount of short sequences tend to display strong structural ambiguity between alpha helix and extended conformations. When the sequence length increases, this so-called sequence plasticity is no longer observed, illustrating the context dependency of polypeptide structures.

Large-scale DNA sequencing efforts produce large amounts of protein sequence data. However, in order to understand the function of a protein, its tertiary three-dimensional structure is required. Despite worldwide efforts in structural biology, experimental protein structures are determined at a significantly slower pace. As a result, computational methods for protein structure prediction receive significant attention. A large part of the structure prediction problem lies in the enormous size of the problem: proteins seem to occur in an infinite variety of shapes. Here, we propose that this huge complexity may be overcome by identifying recurrent protein fragments, which are frequently reused as building blocks to construct proteins that were hitherto thought to be unrelated. The BriX database is the outcome of identifying about 2,000 canonical shapes among 1,261 protein structures. We show any given protein can be reconstructed from this library of building blocks at a very high resolution, suggesting that the modelling of protein backbones may be greatly aided by our database.
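The coverage figure quoted in this abstract (the share of structure matched by a library fragment within a given RMSD) rests on two simple calculations. The sketch below illustrates them under stated assumptions: the two coordinate sets are taken to be already superposed, and `rmsd`, `coverage`, and the sample values are hypothetical names and data, not the BriX implementation.

```python
import math


def rmsd(a, b):
    """Root-mean-square distance between two equally long lists of
    (x, y, z) points that are assumed to be already superposed."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = sum(
        (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
        for p, q in zip(a, b)
    )
    return math.sqrt(total / len(a))


def coverage(best_rmsds, cutoff=1.0):
    """Fraction of query fragments whose best library match lies
    within `cutoff` (same length unit as the RMSD values)."""
    if not best_rmsds:
        raise ValueError("best_rmsds must not be empty")
    return sum(1 for r in best_rmsds if r <= cutoff) / len(best_rmsds)


# Hypothetical best-match RMSD values (in Angstrom) for four fragments.
print(coverage([0.3, 0.9, 1.4, 0.5], cutoff=1.0))
```

In practice the best-match RMSD for each fragment would come from a superposition step followed by a search over the library; both are omitted here.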

Eswar N, Webb B, Marti-Renom MA, Madhusudhan MS, Eramian D, Shen MY, Pieper U, Sali A. Comparative protein structure modeling using Modeller. Curr Protoc Bioinformatics 2008;Chapter 5:Unit-5.6. [PMID: 18428767 DOI: 10.1002/0471250953.bi0506s15] [Citation(s) in RCA: 1758] [Impact Index Per Article: 109.9] [Indexed: 12/28/2022]

A historical perspective of template-based protein structure prediction. Methods Mol Biol 2008;413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]

Wu Y, Lu M, Chen M, Li J, Ma J. OPUS-Ca: a knowledge-based potential function requiring only Calpha positions. Protein Sci 2007;16:1449-63. [PMID: 17586777 PMCID: PMC2206690 DOI: 10.1110/ps.072796107] [Citation(s) in RCA: 52] [Impact Index Per Article: 3.1] [Indexed: 10/23/2022]

Eswar N, Sali A. Comparative Modeling of Drug Target Proteins. Comprehensive Medicinal Chemistry II 2007. [PMCID: PMC7151936 DOI: 10.1016/b0-08-045044-x/00251-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/25/2022]

Fernandez-Fuentes N, Fiser A. Saturating representation of loop conformational fragments in structure databanks. BMC Struct Biol 2006;6:15. [PMID: 16820050 PMCID: PMC1574324 DOI: 10.1186/1472-6807-6-15] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 01/27/2006] [Accepted: 07/04/2006] [Indexed: 11/30/2022]

Abstract

Background

Short fragments of proteins are fundamental starting points in various structure prediction applications, such as in fragment based loop modeling methods but also in various full structure build-up procedures. The applicability and performance of these approaches depend on the availability of short fragments in structure databanks.

Results

We studied the representation of protein loop fragments up to 14 residues in length. All possible query fragments found in sequence databases (Sequence Space) were clustered and cross referenced with available structural fragments in Protein Data Bank (Structure Space). We found that the expansion of PDB in the last few years resulted in a dense coverage of loop conformational fragments. For each loop of length 8 in the current Sequence Space there is at least one loop in Structure Space with 50% or higher sequence identity. By correlating sequence and structure clusters of loops we found that a 50% sequence identity generally guarantees structural similarity. These percentages of coverage at 50% sequence cutoff drop to 96, 94, 68, 53, 33 and 13% for loops of length 9, 10, 11, 12, 13, and 14, respectively. There is not a single loop in the current Sequence Space at any length up to 14 residues that is not matched with a conformational segment that shares at least 20% sequence identity. This minimum observed identity is 40% for loops of 12 residues or shorter and is as high as 50% for loops of 10 residues or shorter. We also assessed the impact of rapidly growing sequence databanks on the estimated number of new loop conformations and found that while the number of sequentially unique sequence segments increased about sixfold during the last five years, there are almost no unique conformational segments among these fragments of up to 12 residues.

Conclusion

The results suggest that fragment based prediction approaches are not limited any more by the completeness of fragments in databanks but rather by the effective scoring and search algorithms to locate them. The current favorable coverage and trends observed will be further accentuated with the progress of Protein Structure Initiative that targets new protein folds and ultimately aims at providing an exhaustive coverage of the structure space.
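The 50% sequence-identity cutoff used throughout this abstract can be illustrated with a position-by-position comparison of two loops of equal length. This is a minimal sketch, assuming ungapped alignment of equal-length segments; the function name `sequence_identity` and the example sequences are hypothetical and do not reproduce the clustering procedure of the study.

```python
def sequence_identity(loop_a, loop_b):
    """Percent of positions with identical residues in two
    equal-length, ungapped loop sequences (one-letter codes)."""
    if len(loop_a) != len(loop_b) or not loop_a:
        raise ValueError("loops must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(loop_a, loop_b) if x == y)
    return 100.0 * matches / len(loop_a)


def passes_cutoff(loop_a, loop_b, cutoff=50.0):
    """True if the two loops share at least `cutoff` percent identity."""
    return sequence_identity(loop_a, loop_b) >= cutoff


# Two hypothetical 8-residue loops differing at two positions.
print(sequence_identity("ACDEFGHI", "ACDEYGHK"))
print(passes_cutoff("ACDEFGHI", "ACDEYGHK"))
```

Under this definition, a query loop counts as covered when at least one loop of the same length in the structural set passes the cutoff.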

Xiang Z. Advances in homology protein structure modeling. Curr Protein Pept Sci 2006;7:217-27. [PMID: 16787261 PMCID: PMC1839925 DOI: 10.2174/138920306777452312] [Citation(s) in RCA: 302] [Impact Index Per Article: 16.8] [Indexed: 12/30/2022]

Fernandez-Fuentes N, Oliva B, Fiser A. A supersecondary structure library and search algorithm for modeling loops in protein structures. Nucleic Acids Res 2006;34:2085-97. [PMID: 16617149 PMCID: PMC1440879 DOI: 10.1093/nar/gkl156] [Citation(s) in RCA: 61] [Impact Index Per Article: 3.4] [Indexed: 11/14/2022]

Abstract

We present a fragment-search based method for predicting loop conformations in protein models. A hierarchical and multidimensional database has been set up that currently classifies 105 950 loop fragments and loop flanking secondary structures. Besides the length of the loops and types of bracing secondary structures the database is organized along four internal coordinates, a distance and three types of angles characterizing the geometry of stem regions. Candidate fragments are selected from this library by matching the length, the types of bracing secondary structures of the query and satisfying the geometrical restraints of the stems and subsequently inserted in the query protein framework where their fit is assessed by the root mean square deviation (r.m.s.d.) of stem regions and by the number of rigid body clashes with the environment. In the final step remaining candidate loops are ranked by a Z-score that combines information on sequence similarity and fit of predicted and observed ϕ/ψ main chain dihedral angle propensities. Confidence Z-score cut-offs were determined for each loop length that identify those predicted fragments that outperform a competitive ab initio method. A web server implements the method, regularly updates the fragment library and performs prediction. Predicted segments are returned, or optionally, these can be completed with side chain reconstruction and subsequently annealed in the environment of the query protein by conjugate gradient minimization. The prediction method was tested on artificially prepared search datasets where all trivial sequence similarities on the SCOP superfamily level were removed. Under these conditions it is possible to predict loops of length 4, 8 and 12 with coverage of 98, 78 and 28% with at least of 0.22, 1.38 and 2.47 Å of r.m.s.d. accuracy, respectively. 
In a head-to-head comparison on loops extracted from freshly deposited new protein folds the current method outperformed in a ∼5:1 ratio an earlier developed database search method.

Fernandez-Fuentes N, Querol E, Aviles FX, Sternberg MJE, Oliva B. Prediction of the conformation and geometry of loops in globular proteins: testing ArchDB, a structural classification of loops. Proteins 2006;60:746-57. [PMID: 16021623 DOI: 10.1002/prot.20516] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]

Szarecka A, Meirovitch H. Optimization of the GB/SA solvation model for predicting the structure of surface loops in proteins. J Phys Chem B 2006;110:2869-80. [PMID: 16471897 PMCID: PMC1945207 DOI: 10.1021/jp055771+] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/29/2022]

Abstract

Implicit solvation models are commonly optimized with respect to experimental data or Poisson-Boltzmann (PB) results obtained for small molecules, where the force field is sometimes not considered. In previous studies, we have developed an optimization procedure for cyclic peptides and surface loops in proteins based on the entire system studied and the specific force field used. Thus, the loop has been modeled by the simplified solvation function $E_{\mathrm{tot}} = E_{\mathrm{FF}}(\varepsilon = 2r) + \sum_i \sigma_i A_i$, where $E_{\mathrm{FF}}(\varepsilon = nr)$ is the AMBER force field energy with a distance-dependent dielectric function, $\varepsilon = nr$, $A_i$ is the solvent accessible surface area of atom $i$, and $\sigma_i$ is its atomic solvation parameter. During the optimization process, the loop is free to move while the protein template is held fixed in its X-ray structure. To improve on the results of this model, in the present work we apply our optimization procedure to the physically more rigorous solvation model, the generalized Born with surface area (GB/SA) (together with the all-atom AMBER force field) as suggested by Still and co-workers (J. Phys. Chem. A 1997, 101, 3005). The six parameters of the GB/SA model, namely, $P_1$-$P_5$ and the surface area parameter, $\sigma$ (programmed in the TINKER package) are reoptimized for a "training" group of nine loops, and a best-fit set is defined from the individual sets of optimized parameters. The best-fit set and Still's original set of parameters (where Lys, Arg, His, Glu, and Asp are charged or neutralized) were applied to the training group as well as to a "test" group of seven loops, and the energy gaps and the corresponding RMSD values were calculated. These GB/SA results based on the three sets of parameters have been found to be comparable; surprisingly, however, they are somewhat inferior (e.g., of larger energy gaps) to those obtained previously from the simplified model described above. 
We discuss recent results for loops obtained by other solvation models and potential directions for future studies.
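The simplified solvation function quoted in this abstract is a force-field energy plus a sum of per-atom surface terms. The sketch below shows only that arithmetic, assuming the force-field energy and the per-atom areas have already been computed elsewhere; `total_energy` and all numerical values are hypothetical and carry no physical meaning.

```python
def total_energy(e_ff, sigmas, areas):
    """Force-field energy plus the surface-area solvation term.

    e_ff   : force-field energy, assumed precomputed
    sigmas : atomic solvation parameters, one per atom
    areas  : solvent accessible surface areas, one per atom
    """
    if len(sigmas) != len(areas):
        raise ValueError("sigmas and areas must have the same length")
    # Solvation term: sum over atoms of sigma_i * A_i.
    solvation = sum(s * a for s, a in zip(sigmas, areas))
    return e_ff + solvation


# Hypothetical two-atom example with made-up parameters.
print(total_energy(-100.0, [0.5, -0.25], [10.0, 8.0]))
```

A positive parameter penalises exposure of that atom to solvent and a negative one rewards it, so optimising the parameters changes which loop conformations score lowest.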

Watanabe YS, Fukunishi Y, Nakamura H. Generation of a flexible loop structural ensemble and its application to induced-fit structural changes following ligand binding. Biophysics (Nagoya-shi) 2006;2:1-12. [PMID: 27857555 PMCID: PMC5036648 DOI: 10.2142/biophysics.2.1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 07/19/2005] [Accepted: 12/05/2005] [Indexed: 12/04/2022]

Jiang H, Blouin C. Ab initio construction of all-atom loop conformations. J Mol Model 2005;12:221-8. [PMID: 16247602 DOI: 10.1007/s00894-005-0030-x] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2004] [Accepted: 06/23/2005] [Indexed: 11/28/2022]

Boomsma W, Hamelryck T. Full cyclic coordinate descent: solving the protein loop closure problem in Calpha space. BMC Bioinformatics 2005;6:159. [PMID: 15985178 PMCID: PMC1192790 DOI: 10.1186/1471-2105-6-159] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/25/2005] [Accepted: 06/28/2005] [Indexed: 11/10/2022] Open

Abstract

BACKGROUND

Various forms of the so-called loop closure problem are crucial to protein structure prediction methods. Given an N- and a C-terminal end, the problem consists of finding a suitable segment of a certain length that bridges the ends seamlessly. In homology modelling, the problem arises in predicting loop regions. In de novo protein structure prediction, the problem is encountered when implementing local moves for Markov Chain Monte Carlo simulations. Most loop closure algorithms keep the bond angles fixed or semi-fixed, and only vary the dihedral angles. This is appropriate for a full-atom protein backbone, since the bond angles can be considered as fixed, while the (phi, psi) dihedral angles are variable. However, many de novo structure prediction methods use protein models that only consist of Calpha atoms, or otherwise do not make use of all backbone atoms. These models require a closure method that alters both bond and dihedral angles, since the pseudo bond angle between three consecutive Calpha atoms also varies considerably.

RESULTS

Here we present a method that solves the loop closure problem for Calpha-only protein models. We developed a variant of Cyclic Coordinate Descent (CCD), an inverse kinematics method from the field of robotics, which was recently applied to the loop closure problem. Since the method alters both bond and dihedral angles, which is equivalent to applying a full rotation matrix, we call our method Full CCD (FCCD). FCCD replaces CCD's vector-based optimization of a rotation around an axis with a singular value decomposition-based optimization of a general rotation matrix. The method is easy to implement and numerically stable.

CONCLUSION

We tested the method's performance on sets of random protein Calpha segments between 5 and 30 amino acids long, and a number of loops of length 4, 8 and 12. FCCD is fast, has a high success rate and readily generates conformations close to those of real loops. The presence of constraints on the angles has only a small effect on the performance. A reference implementation of FCCD in Python is available as supplementary information.
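
The idea summarized in this abstract, replacing a single-axis rotation with an SVD-derived general rotation at each pivot, can be sketched roughly as follows. This is not the authors' reference implementation. The function names, the choice to hold only the first atom fixed, the use of the last three atoms as the moving end, and the default tolerance are all assumptions made for illustration, and angle constraints are omitted.

```python
import numpy as np


def optimal_rotation(moving: np.ndarray, target: np.ndarray) -> np.ndarray:
    """Rotation R about the origin minimising sum_i ||R @ moving_i - target_i||^2."""
    h = moving.T @ target                      # 3x3 covariance matrix
    u, _, vt = np.linalg.svd(h)
    # Flip the last axis if needed so that R is a proper rotation, not a reflection.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    return vt.T @ np.diag([1.0, 1.0, d]) @ u.T


def end_rmsd(chain: np.ndarray, target: np.ndarray) -> float:
    """RMSD between the last three chain atoms and their target positions."""
    return float(np.sqrt(np.mean(np.sum((chain[-3:] - target) ** 2, axis=1))))


def close_chain(chain, target, max_sweeps=200, tol=0.1):
    """Sweep over pivot atoms, rotating everything downstream of each pivot.

    chain  -- (N, 3) Calpha coordinates; atom 0 is held fixed (assumption)
    target -- (3, 3) desired positions of the last three atoms
    """
    chain = np.array(chain, dtype=float)
    rmsd = end_rmsd(chain, target)
    for _ in range(max_sweeps):
        for pivot in range(len(chain) - 3):
            if rmsd < tol:
                return chain, rmsd
            origin = chain[pivot].copy()
            r = optimal_rotation(chain[-3:] - origin, target - origin)
            # A rigid rotation about the pivot preserves all Calpha-Calpha distances.
            chain[pivot + 1:] = (chain[pivot + 1:] - origin) @ r.T + origin
            rmsd = end_rmsd(chain, target)
    return chain, rmsd
```

Because the identity is always among the candidate rotations, each pivot update cannot increase the end-atom RMSD, so the distance to the target is non-increasing over a sweep. Convergence below `tol` is not guaranteed for every target.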

Kuznetsov IB, Rackovsky S. Comparative computational analysis of prion proteins reveals two fragments with unusual structural properties and a pattern of increase in hydrophobicity associated with disease-promoting mutations. Protein Sci 2004;13:3230-44. [PMID: 15557265 PMCID: PMC2287303 DOI: 10.1110/ps.04833404] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Kuznetsov IB, Rackovsky S. On the properties and sequence context of structurally ambivalent fragments in proteins. Protein Sci 2003;12:2420-33. [PMID: 14573856 PMCID: PMC2366964 DOI: 10.1110/ps.03209703] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/26/2022]

Rohl CA, Strauss CEM, Chivian D, Baker D. Modeling structurally variable regions in homologous proteins with rosetta. Proteins 2004;55:656-77. [PMID: 15103629 DOI: 10.1002/prot.10629] [Citation(s) in RCA: 242] [Impact Index Per Article: 12.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]

Abstract

A major limitation of current comparative modeling methods is the accuracy with which regions that are structurally divergent from homologues of known structure can be modeled. Because structural differences between homologous proteins are responsible for variations in protein function and specificity, the ability to model these differences has important functional consequences. Although existing methods can provide reasonably accurate models of short loop regions, modeling longer structurally divergent regions is an unsolved problem. Here we describe a method based on the de novo structure prediction algorithm, Rosetta, for predicting conformations of structurally divergent regions in comparative models. Initial conformations for short segments are selected from the protein structure database, whereas longer segments are built up by using three- and nine-residue fragments drawn from the database and combined by using the Rosetta algorithm. A gap closure term in the potential in combination with modified Newton's method for gradient descent minimization is used to ensure continuity of the peptide backbone. Conformations of variable regions are refined in the context of a fixed template structure using Monte Carlo minimization together with rapid repacking of side-chains to iteratively optimize backbone torsion angles and side-chain rotamers. For short loops, mean accuracies of 0.69, 1.45, and 3.62 Å are obtained for 4, 8, and 12 residue loops, respectively. In addition, the method can provide reasonable models of conformations of longer protein segments: predicted conformations of 3 Å root-mean-square deviation or better were obtained for 5 of 10 examples of segments ranging from 13 to 34 residues. In combination with a sequence alignment algorithm, this method generates complete, ungapped models of protein structures, including regions both similar to and divergent from a homologous structure. This combined method was used to make predictions for 28 protein domains in the Critical Assessment of Protein Structure 4 (CASP 4) and 59 domains in CASP 5, where the method ranked highly among comparative modeling and fold recognition methods. Model accuracy in these blind predictions is dominated by alignment quality, but in the context of accurate alignments, long protein segments can be accurately modeled. Notably, the method correctly predicted the local structure of a 39-residue insertion into a TIM barrel in CASP 5 target T0186.
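
The refinement step mentioned in this abstract relies on Monte Carlo sampling of backbone torsion angles. The sketch below shows only a generic Metropolis acceptance loop over a list of torsions. It is not Rosetta's algorithm: fragment insertion, the gap closure term, the minimization after each move, and side-chain repacking are all left out, and `toy_energy` is a hypothetical stand-in for a real scoring function.

```python
import math
import random
from typing import Callable, List, Sequence, Tuple


def metropolis_accept(delta_e: float, kt: float, rng: random.Random) -> bool:
    """Accept downhill moves always, uphill moves with probability exp(-delta_e / kt)."""
    return delta_e <= 0.0 or rng.random() < math.exp(-delta_e / kt)


def monte_carlo(
    torsions: Sequence[float],
    energy: Callable[[List[float]], float],
    steps: int = 1000,
    max_shift: float = 10.0,
    kt: float = 1.0,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Perturb one torsion angle (degrees) per step and keep the best state seen."""
    rng = random.Random(seed)
    current = list(torsions)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(steps):
        trial = list(current)
        i = rng.randrange(len(trial))
        trial[i] += rng.uniform(-max_shift, max_shift)
        e_trial = energy(trial)
        if metropolis_accept(e_trial - e_current, kt, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


def toy_energy(torsions: List[float]) -> float:
    """Hypothetical energy: squared deviation of every torsion from 60 degrees."""
    return sum((t - 60.0) ** 2 for t in torsions) / 100.0
```

Calling `monte_carlo([0.0, 180.0, -90.0], toy_energy)` returns the lowest-energy torsion set encountered together with its energy; the fixed `seed` makes a run repeatable.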

Li X, Jacobson MP, Friesner RA. High-resolution prediction of protein helix positions and orientations. Proteins 2004;55:368-82. [PMID: 15048828 DOI: 10.1002/prot.20014] [Citation(s) in RCA: 50] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Comparative Protein Structure Modeling and its Applications to Drug Discovery. ANNUAL REPORTS IN MEDICINAL CHEMISTRY 2004. [DOI: 10.1016/s0065-7743(04)39020-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 02/16/2023]

Marti‐Renom MA, Madhusudhan M, Eswar N, Pieper U, Shen M, Sali A, Fiser A, Mirkovic N, John B, Stuart A. Modeling Protein Structure from its Sequence. Curr Protoc Bioinformatics 2003. [DOI: 10.1002/0471250953.bi0501s03] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]

Bastard K, Thureau A, Lavery R, Prévost C. Docking macromolecules with flexible segments. J Comput Chem 2003;24:1910-20. [PMID: 14515373 DOI: 10.1002/jcc.10329] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]

DePristo MA, De Bakker PIW, Shetty RP, Blundell TL. Discrete restraint-based protein modeling and the Calpha-trace problem. Protein Sci 2003;12:2032-46. [PMID: 12931001 PMCID: PMC2323999 DOI: 10.1110/ps.0386903] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/27/2022]

de Bakker PIW, DePristo MA, Burke DF, Blundell TL. Ab initio construction of polypeptide fragments: Accuracy of loop decoy discrimination by an all-atom statistical potential and the AMBER force field with the Generalized Born solvation model. Proteins 2003;51:21-40. [PMID: 12596261 DOI: 10.1002/prot.10235] [Citation(s) in RCA: 120] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/24/2022]

Abstract

The accuracy of model selection from decoy ensembles of protein loop conformations was explored by comparing the performance of the Samudrala-Moult all-atom statistical potential (RAPDF) and the AMBER molecular mechanics force field, including the Generalized Born/surface area solvation model. Large ensembles of consistent loop conformations, represented at atomic detail with idealized geometry, were generated for a large test set of protein loops 2 to 12 residues long by a novel ab initio method called RAPPER that relies on fine-grained residue-specific phi/psi propensity tables for conformational sampling. Ranking the conformers on the basis of RAPDF scores resulted in selected conformers that had an average global, non-superimposed RMSD for all heavy mainchain atoms ranging from 1.2 Å for 4-mers to 2.9 Å for 8-mers to 6.2 Å for 12-mers. After filtering on the basis of anchor geometry and RAPDF scores, ranking by energy minimization of the AMBER/GBSA potential energy function selected conformers that had global RMSD values of 0.5 Å for 4-mers, 2.3 Å for 8-mers, and 5.0 Å for 12-mers. Minimized fragments had, on average, consistently lower RMSD values (by 0.1 Å) than their initial conformations. The importance of the Generalized Born solvation energy term is reflected by the observation that the average RMSD accuracy for all loop lengths was worse when this term was omitted. There are, however, still many cases where the AMBER gas-phase minimization selected conformers of lower RMSD than the AMBER/GBSA minimization. The AMBER/GBSA energy function had better correlation with RMSD to native than the RAPDF. When the ensembles were supplemented with conformations extracted from experimental structures, a dramatic improvement in selection accuracy was observed at longer lengths (average RMSD of 1.3 Å for 8-mers) when scoring with the AMBER/GBSA force field. This work provides the basis for a promising hybrid approach of ab initio and knowledge-based methods for loop modeling.
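
The evaluation described in this abstract, picking one conformer per ensemble by score and reporting its global, non-superimposed RMSD to the native loop, can be sketched as below. The function names are hypothetical, the coordinates are assumed to be mainchain heavy atoms already expressed in the frame of the fixed protein, and the sketch assumes that a lower score is better.

```python
import numpy as np


def global_rmsd(model, native) -> float:
    """RMSD with no superposition; both arrays are (n_atoms, 3) in the same frame."""
    model = np.asarray(model, dtype=float)
    native = np.asarray(native, dtype=float)
    if model.shape != native.shape:
        raise ValueError("model and native must have the same shape")
    return float(np.sqrt(np.mean(np.sum((model - native) ** 2, axis=1))))


def select_best(scores) -> int:
    """Index of the lowest-scoring decoy (lower is assumed to be better)."""
    return int(np.argmin(np.asarray(scores, dtype=float)))


def selected_rmsd(decoys, scores, native) -> float:
    """Global RMSD of the decoy chosen by score, as a measure of selection accuracy."""
    return global_rmsd(decoys[select_best(scores)], native)
```

Averaging `selected_rmsd` over many loops of one length would give a per-length figure of the kind quoted in the abstract; skipping the superposition step means that errors in loop placement relative to the rest of the protein count toward the RMSD.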

DePristo MA, de Bakker PIW, Lovell SC, Blundell TL. Ab initio construction of polypeptide fragments: efficient generation of accurate, representative ensembles. Proteins 2003;51:41-55. [PMID: 12596262 DOI: 10.1002/prot.10285] [Citation(s) in RCA: 116] [Impact Index Per Article: 5.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/27/2022]

Hunter CG, Subramaniam S. Protein local structure prediction from sequence. Proteins 2003;50:572-9. [PMID: 12577263 DOI: 10.1002/prot.10310] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]

Hunter CG, Subramaniam S. Protein fragment clustering and canonical local shapes. Proteins 2003;50:580-8. [PMID: 12577264 DOI: 10.1002/prot.10309] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]