1
|
Helix-Coil Transition at a Glycine Following a Nascent α-Helix: A Synergetic Guidance Mechanism for Helix Growth. J Phys Chem A 2020; 124:7478-7490. [PMID: 32877193 DOI: 10.1021/acs.jpca.0c05489] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 12/18/2022]
Abstract
A detailed understanding of forces guiding the rapid folding of a polypeptide from an apparently random coil state to an ordered α-helical structure following the rate-limiting preorganization of the initial three residue backbones into helical conformation is imperative to comprehending and regulating protein folding and for the rational design of biological mimetics. However, several details of this process are still unknown. First, although the helix-coil transition was proposed to originate at the residue level (J. Chem. Phys. 1959, 31, 526-535; J. Chem. Phys. 1961, 34, 1963-1974), all helix-folding studies have only established it between time-averaged bulk states of a long-lived helix and several transiently populated random coils, along the whole helix model sequence. Second, the predominant thermodynamic forces driving either this two-state transition or the faster helix growth following helix nucleation are still unclear. Third, the conformational space of the random coil state is not well-defined unlike its corresponding α-helix. Here we investigate the restrictions placed on the conformational space of a Gly residue backbone, as a result of it immediately succeeding a nascent α-helical turn. Analyses of the temperature-dependent 1D-, 2D-NMR, FT-IR, and CD spectra and GROMACS MD simulation trajectory of a Gly residue backbone following a model α-helical turn, which is artificially rigidified by a covalent hydrogen bond surrogate, reveal that: (i) the α-helical turn guides the ϕ torsion of the Gly exclusively into either a predominantly populated entropically favored α-helical (α-ϕ) state or a scarcely populated random coil (RC-ϕ) state; (ii) the α-ϕ state of Gly in turn favors the stability of the preceding α-helical turn, while the RC-ϕ state disrupts it, revealing an entropy-driven synergetic guidance for helix growth in the residue following helix nucleation. 
The applicability of the current synergetic guidance mechanism to explain rapid helix growth in folded and unfolded states of proteins and helical peptides is discussed.
|
2
|
Fast and flexible coarse-grained prediction of protein folding routes using ensemble modeling and evolutionary sequence variation. Bioinformatics 2020; 36:1420-1428. [PMID: 31584628 DOI: 10.1093/bioinformatics/btz743] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/02/2019] [Revised: 09/22/2019] [Accepted: 09/28/2019] [Indexed: 11/15/2022] Open
Abstract
MOTIVATION: Protein folding is a dynamic process through which polypeptide chains reach their native 3D structures. Although the importance of this mechanism is widely acknowledged, very few high-throughput computational methods have been developed to study it. RESULTS: In this paper, we report a computational platform named P3Fold that combines statistical and evolutionary information for predicting and analyzing protein folding routes. P3Fold uses coarse-grained modeling and efficient combinatorial schemes to predict residue contacts and evaluate the folding routes of a protein sequence within minutes or hours. To facilitate access to this technology, we devise graphical representations and implement an interactive web interface that allows end-users to leverage P3Fold predictions. Finally, we use P3Fold to conduct large- and small-scale experiments on the human proteome that reveal the broad conservation and variations of structural intermediates within protein families. AVAILABILITY AND IMPLEMENTATION: A Web server of P3Fold is freely available at http://csb.cs.mcgill.ca/P3Fold. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
3
|
Conformational and functional characterization of artificially conjugated non-canonical ubiquitin dimers. Sci Rep 2019; 9:19991. [PMID: 31882959 PMCID: PMC6934565 DOI: 10.1038/s41598-019-56458-z] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/04/2019] [Accepted: 12/03/2019] [Indexed: 11/30/2022] Open
Abstract
Ubiquitylation is a prominent posttranslational modification in which single ubiquitin molecules or polyubiquitin chains are covalently attached to a target protein, dictating the fate of the labeled polypeptide chain. Here, we have biochemically produced artificial Lys11-, Lys27-, and Lys63-linked ubiquitin dimers based on click chemistry, generating milligram quantities in high purity. We show that the artificial linkage used for the conjugation of two ubiquitin moieties represents a fully reliable surrogate of the natural isopeptide bond by acquiring highly resolved nuclear magnetic resonance (NMR) spectroscopic data, including ligand binding studies. Extensive coarse-grained and atomistic molecular dynamics (MD) simulations allow the extraction of structures representing the ensemble of domain-domain conformations used to verify the experimental data. Advantageously, this methodology does not require individual isotopic labeling of both ubiquitin moieties, as NMR data have been acquired on the isotopically labeled proximal moiety and complementary MD simulations have been used to fully interpret the experimental data in terms of domain-domain conformation. This combined approach intertwining NMR spectroscopy with MD simulations makes it possible to describe the conformational space that the non-canonical Lys11- and Lys27-linked ubiquitin dimers occupy in a solution-averaged ensemble by taking into account atomically resolved information representing all residues in the ubiquitin dimers.
|
4
|
Complex Folding Landscape of Apomyoglobin at Acidic pH Revealed by Ultrafast Kinetic Analysis of Core Mutants. J Phys Chem B 2018; 122:11228-11239. [PMID: 30133301 DOI: 10.1021/acs.jpcb.8b06895] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 11/30/2022]
Abstract
Under mildly acidic conditions (pH 4-4.5) apomyoglobin (apoMb) adopts a partially structured equilibrium state (M-state) that structurally resembles a kinetic intermediate encountered at a late stage of folding to the native structure at neutral pH. We have previously reported that the M-state is formed rapidly (<1 ms) via a multistate process and thus offers a unique opportunity for exploring early stages of folding by both experimental and computational techniques. In order to gain structural insight into intermediates and barriers at the residue level, we studied the folding/unfolding kinetics of 12 apoMb mutants at pH 4.2 using fluorescence-detected ultrafast mixing techniques. Global analysis of the submillisecond folding/unfolding kinetics vs urea concentration for each variant, based on a sequential four-state mechanism (U ⇔ I ⇔ L ⇔ M), allowed us to determine elementary rate constants and their dependence on urea concentration for most transitions. Comparison of the free energy diagrams constructed from the kinetic data of the mutants with that of wild-type apoMb yielded quantitative information on the effects of mutations on the free energy (ΔΔG) of both intermediates and the first two kinetic barriers encountered during folding. Truncation of conserved aliphatic side chains on helices A, G, and H gives rise to a stepwise increase in ΔΔG as the protein advances from U toward M, consistent with progressive stabilization of native-like contacts within the primary core of apoMb. Helix-helix contacts in the primary core contribute little to the first folding barrier (U ⇔ I) and thus are not required for folding initiation but are critical for the stability of the late intermediate, L, and the M-state.
Alanine substitution of hydrophobic residues at more peripheral helix-helix contact sites of the native structure, which are still absent or unstable in the M-state, shows both positive (destabilizing) and negative (stabilizing) ΔΔG, indicating that non-native contacts are formed initially and weakened or lost as a result of subsequent structural rearrangement steps.
|
5
|
Biophysical Models of Protein Evolution: Understanding the Patterns of Evolutionary Sequence Divergence. Annu Rev Biophys 2017; 46:85-103. [PMID: 28301766 DOI: 10.1146/annurev-biophys-070816-033819] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Indexed: 12/30/2022]
Abstract
For decades, rates of protein evolution have been interpreted in terms of the vague concept of functional importance. Slowly evolving proteins or sites within proteins were assumed to be more functionally important and thus subject to stronger selection pressure. More recently, biophysical models of protein evolution, which combine evolutionary theory with protein biophysics, have completely revolutionized our view of the forces that shape sequence divergence. Slowly evolving proteins have been found to evolve slowly because of selection against toxic misfolding and misinteractions, linking their rate of evolution primarily to their abundance. Similarly, most slowly evolving sites in proteins are not directly involved in function, but mutating these sites has a large impact on protein structure and stability. In this article, we review the studies in the emerging field of biophysical protein evolution that have shaped our current understanding of sequence divergence patterns. We also propose future research directions to develop this nascent field.
|
6
|
Fold and flexibility: what can proteins' mechanical properties tell us about their folding nucleus? J R Soc Interface 2016; 12:rsif.2015.0876. [PMID: 26577596 DOI: 10.1098/rsif.2015.0876] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Indexed: 01/31/2023] Open
Abstract
The determination of a protein's folding nucleus, i.e. a set of native contacts playing an important role during its folding process, remains an elusive yet essential problem in biochemistry. In this work, we investigate the mechanical properties of 70 protein structures belonging to 14 protein families presenting various folds using coarse-grained Brownian dynamics simulations. The resulting rigidity profiles combined with multiple sequence alignments show that a limited set of rigid residues, which we call the consensus nucleus, occupy conserved positions along the protein sequence. These residues' side chains form a tight interaction network within the protein's core, thus making our consensus nuclei potential folding nuclei. A review of experimental and theoretical literature shows that most (above 80%) of these residues were indeed identified as folding nucleus members in earlier studies.
|
7
|
The interface of protein structure, protein biophysics, and molecular evolution. Protein Sci 2012; 21:769-85. [PMID: 22528593 PMCID: PMC3403413 DOI: 10.1002/pro.2071] [Citation(s) in RCA: 140] [Impact Index Per Article: 11.7] [Received: 03/02/2012] [Revised: 03/22/2012] [Accepted: 03/23/2012] [Indexed: 12/20/2022]
Abstract
The interface of protein structural biology, protein biophysics, molecular evolution, and molecular population genetics forms the foundations for a mechanistic understanding of many aspects of protein biochemistry. Current efforts in interdisciplinary protein modeling are in their infancy and the state of the art of such models is described. Beyond the relationship between amino acid substitution and static protein structure, protein function, and corresponding organismal fitness, other considerations are also discussed. More complex mutational processes such as insertion and deletion and domain rearrangements and even circular permutations should be evaluated. The role of intrinsically disordered proteins is still controversial, but may be increasingly important to consider. Protein geometry and protein dynamics as a deviation from static considerations of protein structure are also important. Protein expression level is known to be a major determinant of evolutionary rate and several considerations including selection at the mRNA level and the role of interaction specificity are discussed. Lastly, the relationship between modeling and needed high-throughput experimental data as well as experimental examination of protein evolution using ancestral sequence resurrection and in vitro biochemistry are presented, towards an aim of ultimately generating better models for biological inference and prediction.
|
8
|
SEARCH FOR FOLDING INITIATION SITES FROM AMINO ACID SEQUENCE. J Bioinform Comput Biol 2011; 6:681-91. [DOI: 10.1142/s021972000800362x] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 10/13/2007] [Revised: 01/02/2008] [Accepted: 01/04/2008] [Indexed: 11/18/2022]
Abstract
A crucial event in protein folding is the formation of a folding nucleus, which is a structured part of the protein chain in the transition state. We demonstrate a correlation between locations of residues involved in the folding nuclei and locations of predicted amyloidogenic regions. The average Φ-values are significantly greater inside amyloidogenic regions than outside them. We have found that fibril formation and normal folding involve many of the same key residues, giving an opportunity to outline the folding initiation site in protein chains. The search for folding initiation sites for apomyoglobin and ribonuclease A coincides with the predictions made by other approaches.
|
9
|
Abstract
Aromatic residues are key, widespread elements of protein structures and have been shown to be important for structure stability, folding, protein-protein recognition, and ligand binding. The interactions of pairs of aromatic residues (aromatic dimers) have been extensively studied in protein structures. Isolated aromatic molecules tend to form higher order clusters, like trimers, tetramers, and pentamers, that adopt particular well-defined structures. Taking this into account, we have surveyed protein structures deposited in the Protein Data Bank in order to find clusters of aromatic residues larger than dimers in proteins and have characterized them. Our results show that larger clusters are found in one of every two unique proteins crystallized so far, that the clusters are built adopting the same trimer motifs found for benzene clusters in vacuum, and that they are clearly nonlocal, bringing sites that are distant in the primary structure together. We extensively analyze the trimer and tetramer conformations and find two main cluster types: a symmetric cluster and an extended ladder. Finally, using calmodulin as a test case, we show the possible role of aromatic clusters in folding and protein-protein interactions. Altogether, our study highlights the relevance of aromatic clusters beyond the dimer in protein function, stability, and ligand recognition.
|
10
|
pH-induced equilibrium unfolding of apomyoglobin: substitutions at conserved Trp14 and Met131 and non-conserved Val17 positions. BIOCHEMISTRY (MOSCOW) 2008; 73:693-701. [PMID: 18620536 DOI: 10.1134/s0006297908060102] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Indexed: 11/23/2022]
Abstract
A number of residues in the globin family are well conserved but are not directly involved in the primary oxygen-carrying function of these proteins. A possible role for these conserved, non-functional residues has been suggested in promoting a rapid and correct folding process to the native tertiary structure. To test this hypothesis, we have studied pH-induced equilibrium unfolding of mutant apomyoglobins with substitutions of the conserved residues Trp14 and Met131, which are not involved in the function of myoglobin, by various amino acids. This allowed estimating their impact on the stability of various conformational states of the proteins and selecting conditions for a folding kinetics study. The results obtained from circular dichroism, tryptophan fluorescence, and differential scanning microcalorimetry for these mutant proteins were compared with those for the wild type protein and for a mutant with the non-conserved Val17 substituted by Ala. In the native folded state, all of the mutant apoproteins have a compact globular structure, but are destabilized in comparison to the wild type protein. The pH-induced denaturation of the mutant proteins occurs through the formation of a molten globule-like intermediate similar to that of the wild type protein. Thermodynamic parameters for all of the proteins were calculated using the three-state model. Stability of equilibrium intermediates at pH ~4.0 was shown to be slightly affected by the mutations. Thus, all of the above substitutions influence the stability of the native state of these proteins. The cooperativity of conformational transitions and the solvent-exposed protein surface were also changed, but not for the substitution at Val17.
|
11
|
Accurate structural correlations from maximum likelihood superpositions. PLoS Comput Biol 2008; 4:e43. [PMID: 18282091 PMCID: PMC2242818 DOI: 10.1371/journal.pcbi.0040043] [Citation(s) in RCA: 74] [Impact Index Per Article: 4.6] [Received: 09/20/2007] [Accepted: 01/11/2008] [Indexed: 11/19/2022] Open
Abstract
The cores of globular proteins are densely packed, resulting in complicated networks of structural interactions. These interactions in turn give rise to dynamic structural correlations over a wide range of time scales. Accurate analysis of these complex correlations is crucial for understanding biomolecular mechanisms and for relating structure to function. Here we report a highly accurate technique for inferring the major modes of structural correlation in macromolecules using likelihood-based statistical analysis of sets of structures. This method is generally applicable to any ensemble of related molecules, including families of nuclear magnetic resonance (NMR) models, different crystal forms of a protein, and structural alignments of homologous proteins, as well as molecular dynamics trajectories. Dominant modes of structural correlation are determined using principal components analysis (PCA) of the maximum likelihood estimate of the correlation matrix. The correlations we identify are inherently independent of the statistical uncertainty and dynamic heterogeneity associated with the structural coordinates. We additionally present an easily interpretable method (“PCA plots”) for displaying these positional correlations by color-coding them onto a macromolecular structure. Maximum likelihood PCA of structural superpositions, and the structural PCA plots that illustrate the results, will facilitate the accurate determination of dynamic structural correlations analyzed in diverse fields of structural biology. Biological macromolecules comprise extensive networks of interconnected atoms. These complex coupled networks result in correlated structural dynamics, where atoms and residues move and evolve together as concerted conformational changes. The availability of a wealth of macromolecular structures necessitates the use of robust strategies for analyzing the correlated modes of motion found in molecular ensembles. 
Current strategies use a combination of least-squares superpositions and statistical analysis of the structural covariance matrix. However, the least-squares treatment implicitly requires that atoms are uncorrelated and that each atom has the same positional uncertainty, two assumptions which are violated in structural ensembles. For example, the atoms in the proteins are connected by chemical bonds, covalent and non-covalent, resulting in strong correlations. Furthermore, different atoms have different variances, because some atoms are known with less precision or have greater mobility. Using maximum likelihood (ML) analysis, we have developed a technique that is markedly more accurate than the classical least-squares approach by accounting for both correlations and heterogeneous variances. The improved ability to accurately analyze the major modes of dynamic structural correlations will benefit a diverse range of biological disciplines, including nuclear magnetic resonance (NMR) spectroscopy, crystallography, molecular dynamics, and molecular evolution.
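As a rough illustration of the PCA step described in this abstract, the sketch below finds the dominant mode of a correlation matrix for a toy ensemble. It is only a stand-in under stated assumptions: it uses the ordinary sample correlation matrix rather than the maximum likelihood estimate the authors describe, it skips the superposition step entirely, and the function names and toy coordinates are hypothetical.

```python
import math


def correlation_matrix(samples):
    """Sample correlation matrix; each row of `samples` is one structure."""
    n, d = len(samples), len(samples[0])
    means = [sum(row[j] for row in samples) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in samples]
    cov = [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    sd = [math.sqrt(cov[i][i]) for i in range(d)]
    return [[cov[i][j] / (sd[i] * sd[j]) for j in range(d)] for i in range(d)]


def dominant_mode(matrix, iterations=500):
    """Leading eigenvalue and eigenvector of a symmetric matrix (power iteration)."""
    d = len(matrix)
    vec = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        nxt = [sum(matrix[i][j] * vec[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in nxt))
        vec = [x / norm for x in nxt]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(vec[i] * sum(matrix[i][j] * vec[j] for j in range(d))
                     for i in range(d))
    return eigenvalue, vec


# Hypothetical toy ensemble: 8 "structures", 3 coordinates.
# Coordinates 0 and 1 move together; coordinate 2 just alternates in sign.
ensemble = [[float(k), 2.0 * k + 1.0, (-1.0) ** k] for k in range(8)]

corr = correlation_matrix(ensemble)
eigenvalue, mode = dominant_mode(corr)
print("leading eigenvalue:", round(eigenvalue, 3))
print("dominant mode:", [round(x, 3) for x in mode])
```

In this toy case the dominant mode loads almost entirely, and equally, on the two coupled coordinates, which is the kind of concerted motion a structural PCA is meant to expose.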
|
12
|
In silico protein fragmentation reveals the importance of critical nuclei on domain reassembly. Biophys J 2007; 94:1575-88. [PMID: 17993485 DOI: 10.1529/biophysj.107.119651] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/18/2022] Open
Abstract
Protein complementation assays (PCAs) based on split protein fragments have become powerful tools that facilitate the study and engineering of intracellular protein-protein interactions. These assays are based on the observation that a given protein can be split into two inactive fragments and these fragments can reassemble into the original properly folded and functional structure. However, one experimentally observed limitation of PCA systems is that the folding of a protein from its fragments is dramatically slower relative to that of the unsplit parent protein. This is due in part to a poor understanding of how PCA design parameters such as split site position in the primary sequence and size of the resulting fragments contribute to the efficiency of protein reassembly. We used a minimalist on-lattice model to analyze how the dynamics of the reassembly process for two model proteins was affected by the location of the split site. Our results demonstrate that the balanced distribution of the "folding nucleus," a subset of residues that are critical to the formation of the transition state leading to productive folding, between protein fragments is key to their reassembly.
|
13
|
Multivariate Analysis of Conserved Sequence–Structure Relationships in Kinesins: Coupling of the Active Site and a Tubulin-binding Sub-domain. J Mol Biol 2007; 368:1231-48. [PMID: 17399740 DOI: 10.1016/j.jmb.2007.02.049] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Received: 12/13/2006] [Revised: 01/30/2007] [Accepted: 02/06/2007] [Indexed: 11/17/2022]
Abstract
An extensive computational analysis of available sequence and crystal structure data was used to identify functionally important residue interactions within the motor domain of the kinesin molecular motor. Principal component analysis revealed that all current kinesin crystal structures reside in one of two main conformations, which differ at the active site, and in the position of a microtubule-binding sub-domain relative to a rigid central core. This sub-domain consists of secondary structure elements alpha4-loop12-alpha5-loop13 and contains a conserved hydrophilic surface patch that may be involved in strong binding to microtubules. A hinge point for the sub-domain motion lies near a conserved glycine at position 292. Statistical coupling analysis revealed a network of co-evolving positions that link this region to the nucleotide-binding site, via a highly conserved histidine in the switch I loop. The data are consistent with a model in which the nucleotide status of the active site shifts kinesin between weak and strong binding conformations via reconfiguration of the identified sub-domain. Our data provide a statistically supported framework for further examination of this and other structure-function relationships in the kinesin family.
|
14
|
Abstract
The authors studied the temperature-induced unfolding of ubiquitin by all-atom Monte Carlo simulations. The unfolding behavior is compared with that seen in previous simulations of the mechanical unfolding of this protein, based on the same model. In mechanical unfolding, secondary-structure elements were found to break in a quite well-defined order. In thermal unfolding, the authors saw somewhat larger event-to-event fluctuations, but the unfolding pathway was still far from random. Two long-lived secondary-structure elements could be identified in the simulations. These two elements have been found experimentally to be the thermally most stable ones. Interestingly, one of these long-lived elements, the first beta-hairpin, was found to break early in the mechanical unfolding simulations. Their combined simulation results thus enable the authors to predict in detail important differences between the thermal and mechanical unfolding behaviors of ubiquitin.
|
15
|
Massive sequence perturbation of the Raf ras binding domain reveals relationships between sequence conservation, secondary structure propensity, hydrophobic core organization and stability. J Mol Biol 2006; 362:151-71. [PMID: 16916524 DOI: 10.1016/j.jmb.2006.06.061] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 01/23/2006] [Revised: 05/23/2006] [Accepted: 06/21/2006] [Indexed: 11/25/2022]
Abstract
The contributions of specific residues to the delicate balance between function, stability and folding rates could be determined, in part by [corrected] comparing the sequences of structures having identical folds, but insignificant sequence homology. Recently, we have devised an experimental strategy to thoroughly explore residue substitutions consistent with a specific class of structure. Using this approach, the amino acids tolerated at virtually all residues of the c-Raf/Raf1 ras binding domain (Raf RBD), an exemplar of the common β-grasp ubiquitin-like topology, were obtained and used to define the sequence determinants of this fold. Herein, we present analyses suggesting that more subtle sequence selection pressure, including propensity for secondary structure, the hydrophobic core organization and charge distribution are imposed on the Raf RBD sequence. Secondly, using the Gibbs free energies (ΔG(F-U)) obtained for 51 mutants of Raf RBD, we demonstrate a strong correlation between amino acid conservation and the destabilization induced by truncating mutants. In addition, four mutants are shown to significantly stabilize Raf RBD native structure. Two of these mutations, including the well-studied R89L, are known to severely compromise binding affinity for ras. Another stabilized mutant consisted of a deletion of amino acid residues E104-K106. This deletion naturally occurs in the homologues a-Raf and b-Raf and could indicate functional divergence. Finally, the combination of mutations affecting five of 78 residues of Raf RBD results in stabilization of the structure by approximately 12 kJ mol⁻¹ (ΔG(F-U) is -22 and -34 kJ mol⁻¹ for wt and mutant, respectively).
The sequence perturbation approach combined with sequence/structure analysis of the ubiquitin-like fold provides a basis for the identification of sequence-specific requirements for function, stability and folding rate of the Raf RBD and structural analogues, highlighting the utility of conservation profiles as predictive tools of structural organization.
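The stabilization quoted in this abstract follows directly from the two reported folding free energies. The short sketch below redoes that subtraction and, as an added illustration, converts it to an approximate fold-change in the folding equilibrium constant. The temperature of 298.15 K is an assumption that the abstract does not state, and the variable names are hypothetical.

```python
import math

R_KJ_PER_MOL_K = 8.314462618e-3  # gas constant in kJ mol^-1 K^-1
ASSUMED_T_K = 298.15             # ASSUMPTION: temperature is not given in the abstract

dg_wt = -22.0   # ΔG(F-U) of wild type, kJ/mol (from the abstract)
dg_mut = -34.0  # ΔG(F-U) of the five-residue mutant, kJ/mol (from the abstract)

# Negative ΔΔG means the mutant's folded state is more stable than wild type.
ddg = dg_mut - dg_wt

# With ΔG(F-U) = -RT ln K, where K = [F]/[U], the ratio of equilibrium
# constants is K_mut / K_wt = exp(-ΔΔG / RT).
fold_change = math.exp(-ddg / (R_KJ_PER_MOL_K * ASSUMED_T_K))

print(f"ΔΔG = {ddg:.1f} kJ/mol")
print(f"K_mut / K_wt ≈ {fold_change:.0f} at {ASSUMED_T_K} K")
```

Under that assumed temperature, a 12 kJ/mol stabilization corresponds to a folding equilibrium constant roughly two orders of magnitude larger than wild type.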
|
16
|
Abstract
Based on the Cα Go-type model, the folding kinetics and mechanisms of the protein ubiquitin, which has a mixed α/β topology, are studied by molecular dynamics simulations. The relaxation kinetics show three phases, namely the major phase, the intermediate phase and the slowest minor phase. The existence of these three phases is relevant to the phenomenon found in experiments. According to our simulations, folding at high temperatures around the folding transition temperature T(f) is a two-state process, and the folding nucleus consists of contacts between the front end of the α-helix and turn(4). Folding at low temperature (approximately T = 0.8) is also studied, where an A-state-like structure is found lying on the major folding pathway. The appearance of this structure is related to the stability of the first part (residues 1-51) of ubiquitin. As the temperature decreases, the formation of secondary structures, the formation of tertiary structures and the collapse of the protein are found to decouple gradually, and the folding mechanism changes from nucleation-condensation to diffusion-collision. This feature indicates a unifying common folding mechanism for proteins. The intermediate phase is also studied and is found to represent a folding process via a long-lived intermediate state which is stabilized by strong interactions between the β(1) and β(5) strands. These strong interactions are important for the function of ubiquitin as a molecular chaperone. Thus the intermediate phase is assumed to be a byproduct of the requirements of protein function. In addition, the validity of the current Go-model is also investigated, and a lower temperature limit for ubiquitin, T(limit) = 0.8, is proposed.
At temperatures higher than this value, the kinetic traps due to glass dynamics cannot be significantly populated and the intermediate states can be reliably identified, although there is a slight chevron rollover in the folding rates. At temperatures lower than T(limit), however, the traps due to glass dynamics become dominant and may be mistaken for real intermediate states. This limit on the valid temperature range prevents us from revealing the burst-phase intermediate in the major folding phase, since according to experiments it might only be stabilized at temperatures lower than T(limit). Our work shows that caution must be taken when studying low-temperature intermediate states using Cα Go-models.
|
17
|
|
18
|
Abstract
The small α/β protein ubiquitin has been used as a model system for experimental and computational studies on protein folding for many years. Here, we present a comprehensive phi-value analysis and characterize the structure and energetics of the transition state ensemble (TSE). Twenty-seven non-disruptive mutations are made throughout the structure and a range of phi-values from zero to one is observed. The values cluster such that medium and high values are found only in the N-terminal region of the protein, whilst the C-terminal region has consistently low phi-values. In the TSE, the main α-helix appears to be fully formed (two phi-values which specifically probe helical structure are one) and the helix is stabilized by packing against the first β-turn, which is partially structured. In striking contrast, the phi-values in the C-terminal region are all very low, suggesting that this region of the protein is largely unstructured in the TSE. Data are consistent with a nucleation-condensation mechanism in which there is a highly polarized folding nucleus comprising the first β-hairpin and the α-helix. Data presented from the protein engineering study and phi-value analysis are compared with results from other experimental studies and also computational studies.
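For readers unfamiliar with the notation, a phi-value (Φ) is conventionally defined as the ratio of a mutation's effect on the transition-state free energy to its effect on the folded-state free energy, both measured relative to the unfolded state. The symbols below follow that common convention and are not taken from this abstract.

```latex
\Phi = \frac{\Delta\Delta G_{\ddagger-U}}{\Delta\Delta G_{F-U}},
\qquad
\Delta\Delta G_{X-U} = \Delta G_{X-U}^{\mathrm{mutant}} - \Delta G_{X-U}^{\mathrm{wt}}
```

On this reading, Φ near one suggests the mutated site is about as structured in the transition state as in the folded protein, while Φ near zero suggests it is about as unstructured as in the unfolded state, which is how the N-terminal and C-terminal values in this abstract are interpreted.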
|
19
|
Conformational Analysis of Invariant Peptide Sequences in Bacterial Genomes. J Mol Biol 2005; 345:937-55. [PMID: 15644196 DOI: 10.1016/j.jmb.2004.11.008] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/09/2004] [Revised: 10/26/2004] [Accepted: 11/05/2004] [Indexed: 10/26/2022]
Abstract
The functional significance of evolutionarily conserved motifs/patterns of short regions in proteins is well documented. Although a large number of sequences are conserved, only a small fraction of these are invariant across several organisms. Here, we have examined the structural features of the functionally important peptide sequences, which have been found invariant across diverse bacterial genera. Ramachandran angles (phi,psi) have been used to analyze the conformation, folding patterns and geometrical location (buried/exposed) of these invariant peptides in different crystal structures harboring these sequences. The analysis indicates that the peptides preferred a single conformation in different protein structures, with the exception of only a few longer peptides that exhibited some conformational variability. In addition, the variability of conformation occurs mainly due to flipping of peptide units about the virtual C(alpha)...C(alpha) bond. However, for a given invariant peptide, the folding patterns are found to be similar in almost all the cases. Moreover, such peptides are found to be buried in the protein core. Thus, we can safely conclude that these invariant peptides are structurally important for the proteins, since they acquire unique structures across different proteins and can act as structural determinants (SD) of the proteins. The location of these SD peptides on the protein chain indicated that most of them are clustered towards the N-terminal and middle region of the protein with the C-terminal region exhibiting low preference. Another feature that emerges from this study is that some of these SD peptides can also play the roles of "fold boundaries" or "hinge nucleus" in the protein structure. The study indicates that these SD peptides may act as chain-reversal signatures, guiding the proteins to adopt appropriate folds.
In some cases the invariant signature peptides may also act as folding nuclei (FN) of the proteins.
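The Ramachandran angles mentioned above are backbone dihedral angles: phi is defined by the atoms C(i-1), N(i), C(alpha)(i), C(i), and psi by N(i), C(alpha)(i), C(i), N(i+1). The following is a minimal sketch of a dihedral calculation from four Cartesian points; the helper names and the test coordinates are illustrative assumptions, not code or data from the study.

```python
import math

def _sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle (degrees, in [-180, 180]) defined by four 3D points."""
    b1 = _sub(p1, p0)          # first bond vector
    b2 = _sub(p2, p1)          # central bond vector (rotation axis)
    b3 = _sub(p3, p2)          # last bond vector
    n1 = _cross(b1, b2)        # normal of the plane through p0, p1, p2
    n2 = _cross(b2, b3)        # normal of the plane through p1, p2, p3
    b2_len = math.sqrt(_dot(b2, b2))
    # atan2 form avoids the numerical problems of acos near 0 and 180 degrees.
    y = b2_len * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))
```

For phi, the four points would be the coordinates of C(i-1), N(i), C(alpha)(i), and C(i) read from a structure file; parsing the file is outside the scope of this sketch.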
|
20
|
Evolutionarily conserved regions and hydrophobic contacts at the superfamily level: The case of the fold-type I, pyridoxal-5'-phosphate-dependent enzymes. Protein Sci 2004; 13:2992-3005. [PMID: 15498941 PMCID: PMC2286575 DOI: 10.1110/ps.04938104] [Citation(s) in RCA: 33] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/17/2004] [Revised: 07/30/2004] [Accepted: 08/02/2004] [Indexed: 10/26/2022]
Abstract
The wealth of biological information provided by structural and genomic projects opens new prospects of understanding life and evolution at the molecular level. In this work, it is shown how computational approaches can be exploited to pinpoint protein structural features that remain invariant over long evolutionary periods in the fold-type I, PLP-dependent enzymes. A nonredundant set of 23 superposed crystallographic structures belonging to this superfamily was built. Members of this family typically display high structural conservation despite low sequence identity. For each structure, a multiple-sequence alignment of orthologous sequences was obtained, and the 23 alignments were merged using the structural information to obtain a comprehensive multiple alignment of 921 sequences of fold-type I enzymes. The structurally conserved regions (SCRs), the evolutionarily conserved residues, and the conserved hydrophobic contacts (CHCs) were extracted from this data set, using both sequence and structural information. The results of this study identified a structural pattern of hydrophobic contacts shared by all of the superfamily members of fold-type I enzymes and involved in native interactions. This profile highlights the presence of a nucleus for this fold, in which residues participating in the most conserved native interactions exhibit preferential evolutionary conservation that correlates significantly (r = 0.70) with the extent of mean hydrophobic contact value of their apolar fraction.
|
21
|
Protein-protein interactions; coupling of structurally conserved residues and of hot spots across interfaces. Implications for docking. Structure 2004; 12:1027-38. [PMID: 15274922 DOI: 10.1016/j.str.2004.04.009] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/30/2003] [Revised: 04/01/2004] [Accepted: 04/01/2004] [Indexed: 11/30/2022]
Abstract
Hot spot residues contribute dominantly to protein-protein interactions. Statistically, conserved residues correlate with hot spots, and their occurrence can distinguish between binding sites and the remainder of the protein surface. The hot spot and conservation analyses have been carried out on one side of the interface. Here, we show that both experimental hot spots and conserved residues tend to couple across two-chain interfaces. Intriguingly, the local packing density around both hot spots and conserved residues is higher than expected. We further observe a correlation between local packing density and experimental ΔΔG. Favorable conserved pairs include Gly coupled with aromatics, charged and polar residues, as well as aromatic residue coupling. Remarkably, charged residue couples are underrepresented. Overall, protein-protein interactions appear to consist of regions of high and low packing density, with the hot spots organized in the former. The high local packing density in binding interfaces is reminiscent of protein cores.
|
22
|
Abstract
Protein is the working molecule of the cell, and evolution is the hallmark of life. It is important to understand how protein folding and evolution influence each other. Several studies correlating experimental measurement of residue participation in the folding nucleus and sequence conservation have reached different conclusions. These studies are based on assessment of sequence conservation at folding nucleus sites using entropy or relative entropy measurement derived from multiple sequence alignment. Here we report analysis of conservation of the folding nucleus using an evolutionary model alternative to entropy-based approaches. We employ a continuous time Markov model of codon substitution to distinguish mutations fixed by evolution from mutations fixed by chance. This model takes into account bias in codon frequency, bias favoring transitions over transversions, as well as explicit phylogenetic information. We measure selection pressure using the ratio omega of non-synonymous to synonymous substitution rates at individual residue sites. The omega-values are estimated using the PAML method, a maximum-likelihood estimator. Our results show that there is little correlation between the extent of kinetic participation in the protein folding nucleus as measured by experimental phi-value and selection pressure as measured by omega-value. In addition, two randomization tests failed to show that folding nucleus residues are significantly more conserved than the whole protein, or the median omega value of all residues in the protein. These results suggest that at the level of codon substitution, there is no indication that folding nucleus residues are significantly more conserved than other residues. We further reconstruct candidate ancestral residues of the folding nucleus and suggest possible test tube mutation studies for testing the folding behavior of the ancient folding nucleus.
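For context, the entropy-based conservation measure that this abstract contrasts with its codon model can be sketched as the Shannon entropy of one alignment column. This is a generic illustration, not the procedure of any cited study; the function name is hypothetical, and the treatment of gaps (none, every character counts as a symbol) is a simplifying assumption.

```python
import math
from collections import Counter

def column_entropy(column: str) -> float:
    """Shannon entropy (bits) of the residues in one alignment column.

    Lower entropy means a more conserved position. Every character in
    `column` is counted as a symbol; gap handling is deliberately omitted.
    """
    if not column:
        raise ValueError("column must contain at least one residue")
    total = len(column)
    counts = Counter(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())
```

A fully conserved column such as `"AAAA"` has zero entropy, while a column split evenly between two residues, such as `"AVAV"`, has one bit.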
|
23
|
Analysis of the differences in the folding kinetics of structurally homologous proteins based on predictions of the gross features of residue contacts. Proteins 2003; 51:515-30. [PMID: 12784211 DOI: 10.1002/prot.10378] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
It is a general notion that proteins with very similar three-dimensional structures would show very similar folding kinetics. However, recent studies reveal that the folding kinetic properties of some proteins contradict this notion (i.e., members of the same protein family fold through different pathways). For example, it has been reported that some beta-proteins in the intracellular lipid-binding protein family fold through quite different pathways (Burns et al., Proteins 1998;33:107-118). Similar differences in folding kinetics are also observed in the members of the globin family (Nishimura et al., Nat Struct Biol 2000;7:679-686). In our study, we examine the possibility of predicting qualitative differences in folding kinetics of the intracellular lipid-binding proteins and two globin proteins (i.e., myoglobin and leghemoglobin). The problem is tackled by means of a contact map based on the average distance statistics between residues, the Average Distance Map (ADM), as constructed from sequence. The ADMs for the three proteins show overall similarity, but some local differences among maps are also observed. Our results demonstrate that some properties of the protein folding kinetics are consistent with local differences in the ADMs. We also discuss the general possibility of predicting folding kinetics from sequence information.
|
24
|
The ensemble folding kinetics of protein G from an all-atom Monte Carlo simulation. Proc Natl Acad Sci U S A 2002; 99:11175-80. [PMID: 12165568 PMCID: PMC123229 DOI: 10.1073/pnas.162268099] [Citation(s) in RCA: 134] [Impact Index Per Article: 6.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/04/2002] [Indexed: 12/17/2022] Open
Abstract
Protein G is folded with an all-atom Monte Carlo simulation by using a Gō potential. When folding is monitored by using burial of the lone tryptophan in protein G as the reaction coordinate, the ensemble kinetics is single exponential. Other experimental observations, such as the burst phase and mutational data, are also reproduced. However, more detailed analysis reveals that folding occurs over three distinct, three-state pathways. We show that, because of this tryptophan's asymmetric location in the tertiary fold, its burial (i) does not detect certain intermediates and (ii) may not correspond to the folding event. This finding demonstrates that ensemble averaging can disguise the presence of multiple pathways and intermediates when a non-ideal reaction coordinate is used. Finally, all observed folding pathways eventually converge to a common rate-limiting step, which is the formation of a specific nucleus involving hydrophobic core residues. These residues are conserved in the ubiquitin superfamily and in a phage display experiment, suggesting that fold topology is a strong determinant of the transition state.
|
25
|
Abstract
By using three-dimensional (3D) structure alignments and a previously published method to determine Conserved Key Amino Acid Positions (CKAAPs), we propose a theoretical method to design mutations that can be used to morph protein folds. The original Paracelsus challenge, met by several groups, called for the engineering of a stable but different structure by modifying less than 50% of the amino acid residues. We have used the sequences from the Protein Data Bank (PDB) identifiers 1ROP and 2CRO, which were previously used in the Paracelsus challenge by those groups, and suggest mutations to CKAAPs to morph the protein fold. The total number of mutations suggested is less than 40% of the starting sequence, theoretically improving the challenge results. From secondary structure prediction experiments of the proposed mutant sequence structures, we observe that each of the suggested mutant protein sequences likely folds to a different, non-native, potentially stable target structure. These results are an early indicator that analyses using structure alignments leading to CKAAPs of a given structure are of value in protein engineering experiments.
|
26
|
Abstract
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. CKAAPs may be important in protein folding and structural stability and function, and hence useful for protein engineering studies. This paper provides an update to the initial report of CKAAPs DB [Li et al. (2001) Nucleic Acids Res., 29, 329-331]. CKAAPs DB contains CKAAPs for the representative set of polypeptide chains derived from the CE and FSSP databases, as well as subdomains (conserved regions of the order of 100 residues within a domain) identified by CE. The new version now offers different perspectives on the CKAAPs. First, CKAAPs are mapped onto their respective Protein Data Bank (PDB) structures rendered by Molscript, providing a spatial context for the CKAAPs. Secondly, CKAAPs may be highlighted within a structure-based sequence alignment, as well as secondary structure alignment. Thirdly, the resulting sequence homologs from the structure alignment may be viewed in alignments colorized based on identities and property groups using Mview. New search capabilities have also been provided for searching by keyword combinations, PDB IDs, EC numbers, GI numbers, LocusLink ID, taxonomy, gene ontology and pathways. A new custom CKAAPs analysis interface has been implemented where a user may change the criteria for inclusion of chains, initiate CKAAPs analysis and retrieve results. CKAAPs DB is accessible through the web at http://ckaaps.sdsc.edu/. Plain text analysis results are available by FTP at ftp://ftp.sdsc.edu/pub/sdsc/biology/ckaap.
|
27
|
Protein folding theory: from lattice to all-atom models. ANNUAL REVIEW OF BIOPHYSICS AND BIOMOLECULAR STRUCTURE 2001; 30:361-96. [PMID: 11340064 DOI: 10.1146/annurev.biophys.30.1.361] [Citation(s) in RCA: 232] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
This review focuses on recent advances in understanding protein folding kinetics in the context of nucleation theory. We present basic concepts such as nucleation, folding nucleus, and transition state ensemble and then discuss recent advances and challenges in theoretical understanding of several key aspects of protein folding kinetics. We cover recent topology-based approaches as well as evolutionary studies and molecular dynamics approaches to determine protein folding nucleus and analyze other aspects of folding kinetics. Finally, we briefly discuss successful all-atom Monte-Carlo simulations of protein folding and conclude with a brief outlook for the future.
|
28
|
Increasing protein stability using a rational approach combining sequence homology and structural alignment: Stabilizing the WW domain. Protein Sci 2001; 10:1454-65. [PMID: 11420447 PMCID: PMC2374112 DOI: 10.1110/ps.640101] [Citation(s) in RCA: 39] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/14/2022]
Abstract
This study shows that a combination of sequence homology and structural information can be used to increase the stability of the WW domain by 2.5 kcal/mol and increase the T(m) by 28 °C. Previous homology-based protein design efforts typically investigate positions with low sequence identity, whereas this study focuses on semi-conserved core residues and proximal residues, exploring their role(s) in mediating stabilizing interactions on the basis of structural considerations. The A20R and L30Y mutations allow increased hydrophobic interactions because of complementary surfaces and an electrostatic interaction with a third residue adjacent to the ligand-binding hydrophobic cluster, increasing stability significantly beyond what additivity would predict for the single mutations. The D34T mutation situated in a pi-turn possibly disengages Asn31, allowing it to make up to three hydrogen bonds with the backbone in strand 1 and loop 2. The synergistic mutations A20R/L30Y in combination with the remotely located mutation D34T add together to create a hYap WW domain that is significantly more stable than any of the protein structures on which the design was based (Pin and FBP28 WW domains).
|
29
|
Abstract
The folding mechanisms of cellular retinol binding protein II (CRBP II), cellular retinoic acid binding protein I (CRABP I), and cellular retinoic acid binding protein II (CRABP II) were examined. These beta-sheet proteins have very similar structures and higher sequence homologies than most proteins in this diverse family. They have similar stabilities and show completely reversible folding at equilibrium with urea as a denaturant. The kinetics of these proteins were monitored during folding and unfolding by circular dichroism (CD) and fluorescence. During unfolding, CRABP II showed no intermediates, CRABP I had an intermediate with nativelike secondary structure, and CRBP II had an intermediate that lacked secondary structure. The refolding kinetics of these proteins were more similar. Each protein showed a burst-phase change in intensity by both CD and fluorescence, followed by a single observed phase by both CD and fluorescence and one or two additional refolding phases by fluorescence. The fluorescence spectral properties of the intermediate states were similar and suggested a gradual increase in the amount of native tertiary structure present for each step in a sequential path. However, the rates of folding differed by as much as 3 orders of magnitude and were slower than those expected from the contact order and topology of these proteins. As such, proteins with the same final structure may not follow the same route to the native state.
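The contact order mentioned above is commonly computed as the relative contact order: the mean sequence separation between contacting residues, divided by chain length. The sketch below assumes the list of contacting residue pairs has already been derived from a structure; the function name, the contact definition, and the example values are hypothetical and do not come from the study.

```python
def relative_contact_order(contacts, chain_length):
    """Relative contact order = sum(|j - i|) / (n_contacts * chain_length).

    contacts:     iterable of (i, j) residue-index pairs that are in contact
                  in the native structure (how contacts are defined, e.g. a
                  distance cutoff, is left to the caller).
    chain_length: total number of residues in the chain.
    """
    pairs = list(contacts)
    if not pairs or chain_length <= 0:
        raise ValueError("need at least one contact and a positive chain length")
    total_separation = sum(abs(j - i) for i, j in pairs)
    return total_separation / (len(pairs) * chain_length)
```

For two illustrative contacts `(1, 5)` and `(2, 10)` in a 20-residue chain, the separations are 4 and 8, giving 12 / (2 × 20) = 0.3. Proteins dominated by long-range contacts have higher values and are generally expected to fold more slowly.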
|
30
|
Abstract
Here, we present a statistical analysis of conservation profiles in families of homologous sequences for nine proteins whose folding nucleus was determined by protein engineering methods. We show that in all but one protein (AcP) folding nucleus residues are significantly more conserved than the rest of the protein. Two aspects of our study are especially important: (i) grouping of amino acid residues into classes according to their physical-chemical properties and (ii) proper normalization of amino acid probabilities that reflects the fact that evolutionary pressure to conserve some amino acid types may itself affect the concentration of various amino acid types in protein families. Neglecting either of these two factors may make physical and biological "signals" from conservation profiles disappear.
|
31
|
Abstract
Fifty-five molecular dynamics runs of two three-stranded antiparallel beta-sheet peptides were performed to investigate the relative importance of amino acid sequence and native topology. The two peptides consist of 20 residues each and have a sequence identity of 15%. One peptide has Gly-Ser (GS) at both turns, while the other has d-Pro-Gly ((D)PG). The simulations successfully reproduce the NMR solution conformations, irrespective of the starting structure. The large number of folding events sampled along the trajectories at 360 K (total simulation time of about 5 μs) yield a projection of the free-energy landscape onto two significant progress variables. The two peptides have compact denatured states, similar free-energy surfaces, and folding pathways that involve the formation of a beta-hairpin followed by consolidation of the unstructured strand. For the GS peptide, there are 33 folding events that start by the formation of the 2-3 beta-hairpin and 17 with first the 1-2 beta-hairpin. For the (D)PG peptide, the statistical predominance is opposite: 16 and 47 folding events start from the 2-3 beta-hairpin and the 1-2 beta-hairpin, respectively. These simulation results indicate that the overall shape of the free-energy surface is defined primarily by the native-state topology, in agreement with an ever-increasing amount of experimental and theoretical evidence, while the amino acid sequence determines the statistically predominant order of the events.
|
32
|
|
33
|
Abstract
An all-against-all protein structure comparison using the Combinatorial Extension (CE) algorithm applied to a representative set of PDB structures revealed a gallery of common substructures in proteins (http://cl.sdsc.edu/ce.html). These substructures represent commonly identified folds, domains, or components thereof. Most of the subsequences forming these similar substructures have no significant sequence similarity. We present a method to identify conserved amino acid positions and residue-dependent property clusters within these subsequences starting with structure alignments. Each of the subsequences is aligned to its homologues in SWALL, a nonredundant protein sequence database. The most similar sequences are purged into a common frequency matrix, and weighted homologues of each one of the subsequences are used in scoring for conserved key amino acid positions (CKAAPs). We have set the top 20% of the high-scoring positions in each substructure to be CKAAPs. It is hypothesized that CKAAPs may be responsible for the common folding patterns in either a local or global view of the protein-folding pathway. Where a significant number of structures exist, CKAAPs have also been identified in structure alignments of complete polypeptide chains from the same protein family or superfamily. Evidence to support the presence of CKAAPs comes from other computational approaches and experimental studies of mutation and protein-folding experiments, notably the Paracelsus challenge. Finally, the structural environment of CKAAPs versus non-CKAAPs is examined for solvent accessibility, hydrogen bonding, and secondary structure. The identification of CKAAPs has important implications for protein engineering, fold recognition, modeling, and structure prediction studies and is dependent on the availability of structures and an accurate structure alignment methodology. Proteins 2001;42:148-163.
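The selection rule stated above (the top 20% of high-scoring positions in each substructure are taken as CKAAPs) can be illustrated with a short sketch. The scoring itself is not reproduced here; the function name, the rounding of the cutoff, the tie-breaking order, and the example scores are all assumptions made for illustration.

```python
import math

def top_fraction_positions(scores, fraction=0.2):
    """Return the indices of the highest-scoring positions.

    scores:   per-position conservation scores for one substructure.
    fraction: share of positions to keep (0.2 mirrors the 20% threshold
              in the text). The count is rounded up and at least one
              position is kept; both choices are assumptions.
    """
    if not scores:
        return []
    keep = max(1, math.ceil(fraction * len(scores)))
    # Rank indices by descending score; ties keep the earlier position first
    # because Python's sort is stable.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:keep])
```

For ten hypothetical scores `[0.1, 0.9, 0.3, 0.8, 0.2, 0.4, 0.5, 0.6, 0.7, 0.05]`, the function keeps two positions, indices 1 and 3.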
|
34
|
Abstract
The Conserved Key Amino Acid Positions DataBase (CKAAPs DB) provides access to an analysis of structurally similar proteins with dissimilar sequences where key residues within a common fold are identified. The derivation and significance of CKAAPs starting from pairwise structure alignments is described fully in Reddy et al. [Reddy,B.V.B., Li,W.W., Shindyalov,I.N. and Bourne,P.E. (2000) PROTEINS:, in press]. The CKAAPs identified from this theoretical analysis are provided to experimentalists and theoreticians for potential use in protein engineering and modeling. It has been suggested that CKAAPs may be crucial features for protein folding, structural stability and function. Over 170 substructures, as defined by the Combinatorial Extension (CE) database, which are found in approximately 3000 representative polypeptide chains have been analyzed and are available in the CKAAPs DB. CKAAPs DB also provides CKAAPs of the representative set of proteins derived from the CE and FSSP databases. Thus the database contains over 5000 representative polypeptide chains, covering all known structures in the PDB. A web interface to a relational database permits fast retrieval of structure-sequence alignments, CKAAPs and associated statistics. Users may query by PDB ID, protein name, function and Enzyme Classification number. Users may also submit protein alignments of their own to obtain CKAAPs. An interface to display CKAAPs on each structure from a web browser is also being implemented. CKAAPs DB is maintained by the San Diego Supercomputer Center and accessible at the URL http://ckaaps.sdsc.edu.
|
35
|
Abstract
The high structural resolution of the main transition states for the formation of native structure for the six small proteins of which Phi-values for a large set of mutants have become available, barstar, barnase, chymotrypsin inhibitor 2, Arc repressor, the src SH3 domain, and a tetrameric p53 domain reveals that for the first 5 of these proteins: (1) Residues that belong to regular secondary structure have a significantly larger average fraction of native structural consolidation than residues in loops; (2) on the other hand, secondary and tertiary structures have built up to the same degree, or at least a high degree, but nonuniformly distributed over the molecule; (3) the most consolidated parts of each protein molecule in the transition state cluster together, and these clusters contain a significantly higher percentage of residues that belong to regular secondary structure than the rest of the molecule. These observations further reconcile the framework model with the nucleation-condensation mechanism for folding: The amazing speed of protein folding can be understood as caused by the catalytic effect of the formation of clusters of residues which have particularly high preferences for the early formation of regular secondary structure in the presence of significant amounts of tertiary structure interactions.
|
36
|
The identification of conserved interactions within the SH3 domain by alignment of sequences and structures. Protein Sci 2000; 9:2170-80. [PMID: 11152127 PMCID: PMC2144485 DOI: 10.1110/ps.9.11.2170] [Citation(s) in RCA: 127] [Impact Index Per Article: 5.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]
Abstract
The SH3 domain, comprised of approximately 60 residues, is found within a wide variety of proteins, and is a mediator of protein-protein interactions. Due to the large number of SH3 domain sequences and structures in the databases, this domain provides one of the best available systems for the examination of sequence and structural conservation within a protein family. In this study, a large and diverse alignment of SH3 domain sequences was constructed, and the pattern of conservation within this alignment was compared to conserved structural features, as deduced from analysis of eighteen different SH3 domain structures. Seventeen SH3 domain structures solved in the presence of bound peptide were also examined to identify positions that are consistently most important in mediating the peptide-binding function of this domain. Although residues at the two most conserved positions in the alignment are directly involved in peptide binding, residues at most other conserved positions play structural roles, such as stabilizing turns or comprising the hydrophobic core. Surprisingly, several highly conserved side-chain to main-chain hydrogen bonds were observed in the functionally crucial RT-Src loop between residues with little direct involvement in peptide binding. These hydrogen bonds may be important for maintaining this region in the precise conformation necessary for specific peptide recognition. In addition, a previously unrecognized yet highly conserved beta-bulge was identified in the second beta-strand of the domain, which appears to provide a necessary kink in this strand, allowing it to hydrogen bond to both sheets comprising the fold.
|
37
|
Abstract
The sequence and structural conservation of folding transition states have been predicted on theoretical grounds. Using homologous sequence alignments of proteins previously characterized via coupled mutagenesis/kinetics studies, we tested these predictions experimentally. Only one of the six appropriately characterized proteins exhibits a statistically significant correlation between residues' roles in transition state structure and their evolutionary conservation. However, a significant correlation is observed between the contributions of individual sequence positions to the transition state structure across a set of homologous proteins. Thus the structure of the folding transition state ensemble appears to be more highly conserved than the specific interactions that stabilize it.
|
38
|
A theoretical search for folding/unfolding nuclei in three-dimensional protein structures. Proc Natl Acad Sci U S A 1999; 96:11299-304. [PMID: 10500171 PMCID: PMC18028 DOI: 10.1073/pnas.96.20.11299] [Citation(s) in RCA: 278] [Impact Index Per Article: 11.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
When a protein folds or unfolds, it has to pass through many half-folded microstates. Only a few of them can be seen experimentally. In a two-state transition proceeding with no accumulation of metastable intermediates [Fersht, A. R. (1995) Curr. Opin. Struct. Biol. 5, 79-84], only the semifolded microstates corresponding to the transition state can be outlined; they influence the folding/unfolding kinetics. Our aim is to calculate them, provided the three-dimensional protein structure is given. The presented approach follows from the capillarity theory of protein folding and unfolding [Wolynes, P. G. (1997) Proc. Natl. Acad. Sci. USA 94, 6170-6175]. The approach is based on a search for free-energy saddle point(s) on a network of protein unfolding pathways. Under some approximations, this search is rapidly performed by dynamic programming and, despite its relative simplicity, gives a good correlation with experiment. The computed folding nuclei look like ensembles of those compact and closely packed parts of the three-dimensional native folds that contain a small number of disordered protruding loops. Their estimated free energy is consistent with the rapid (within seconds) folding and unfolding of small proteins at the point of thermodynamic equilibrium between the native fold and the coil.
|
39
|
Abstract
Understanding the mechanism of protein folding would allow prediction of the three-dimensional structure from sequence data alone. It has been shown that small proteins fold in a small number of kinetic steps and that significantly populated intermediate states exist for some of them. Studies of these intermediates have demonstrated the existence of specific interactions established during the initial stages of folding. Comparison of the amino acids participating in these specific and essential interactions and constituting the folding nucleus with conserved hydrophobic positions of a given fold shows a striking correspondence. This finding opens the perspective of predicting the folding nucleus knowing only a set of divergent sequences of a protein family.
|