51
Parente DJ, Swint-Kruse L. Multiple co-evolutionary networks are supported by the common tertiary scaffold of the LacI/GalR proteins. PLoS One 2013; 8:e84398. [PMID: 24391951 PMCID: PMC3877293 DOI: 10.1371/journal.pone.0084398] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.6] [Received: 09/19/2013] [Accepted: 11/15/2013] [Indexed: 11/18/2022]
Abstract
Protein families might evolve paralogous functions on their common tertiary scaffold in two ways. First, the locations of functionally-important sites might be "hard-wired" into the structure, with novel functions evolved by altering the amino acid (e.g. Ala vs Ser) at these positions. Alternatively, the tertiary scaffold might be adaptable, accommodating a unique set of functionally important sites for each paralogous function. To discriminate between these possibilities, we compared the set of functionally important sites in the six largest paralogous subfamilies of the LacI/GalR transcription repressor family. LacI/GalR paralogs share a common tertiary structure, but have low sequence identity (≤ 30%), and regulate a variety of metabolic processes. Functionally important positions were identified by conservation and co-evolutionary sequence analyses. Results showed that conserved positions use a mixture of the "hard-wired" and "accommodating" scaffold frameworks, but that the co-evolution networks were highly dissimilar between any pair of subfamilies. Therefore, the tertiary structure can accommodate multiple networks of functionally important positions. This possibility should be included when designing and interpreting sequence analyses of other protein families. Software implementing conservation and co-evolution analyses is available at https://sourceforge.net/projects/coevolutils/.
Affiliation(s)
- Daniel J. Parente
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, United States of America
- Liskin Swint-Kruse
- Department of Biochemistry and Molecular Biology, The University of Kansas Medical Center, Kansas City, Kansas, United States of America
52
The Global Sequence Signature algorithm unveils a structural network surrounding heavy chain CDR3 loop in Camelidae variable domains. Biochim Biophys Acta Gen Subj 2013; 1830:3373-81. [DOI: 10.1016/j.bbagen.2013.02.014] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 01/15/2013] [Revised: 02/13/2013] [Accepted: 02/15/2013] [Indexed: 11/16/2022]
53
Xie L, Ng C, Ali T, Valencia R, Ferreira BL, Xue V, Tanweer M, Zhou D, Haddad GG, Bourne PE, Xie L. Multiscale modeling of the causal functional roles of nsSNPs in a genome-wide association study: application to hypoxia. BMC Genomics 2013; 14 Suppl 3:S9. [PMID: 23819581 PMCID: PMC3665574 DOI: 10.1186/1471-2164-14-s3-s9] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 02/04/2023]
Abstract
BACKGROUND It is a great challenge of modern biology to determine the functional roles of non-synonymous Single Nucleotide Polymorphisms (nsSNPs) on complex phenotypes. Statistical and machine learning techniques establish correlations between genotype and phenotype, but may fail to infer the biologically relevant mechanisms. The emerging paradigm of Network-based Association Studies aims to address this problem of statistical analysis. However, a mechanistic understanding of how individual molecular components work together in a system requires knowledge of molecular structures, and their interactions. RESULTS To address the challenge of understanding the genetic, molecular, and cellular basis of complex phenotypes, we have, for the first time, developed a structural systems biology approach for genome-wide multiscale modeling of nsSNPs--from the atomic details of molecular interactions to the emergent properties of biological networks. We apply our approach to determine the functional roles of nsSNPs associated with hypoxia tolerance in Drosophila melanogaster. The integrated view of the functional roles of nsSNP at both molecular and network levels allows us to identify driver mutations and their interactions (epistasis) in H, Rad51D, Ulp1, Wnt5, HDAC4, Sol, Dys, GalNAc-T2, and CG33714 genes, all of which are involved in the up-regulation of Notch and Gurken/EGFR signaling pathways. Moreover, we find that a large fraction of the driver mutations are neither located in conserved functional sites, nor responsible for structural stability, but rather regulate protein activity through allosteric transitions, protein-protein interactions, or protein-nucleic acid interactions. This finding should impact future Genome-Wide Association Studies. 
CONCLUSIONS Our studies demonstrate that the consolidation of statistical, structural, and network views of biomolecules and their interactions can provide new insight into the functional role of nsSNPs in Genome-Wide Association Studies, in a way that neither the knowledge of molecular structures nor biological networks alone could achieve. Thus, multiscale modeling of nsSNPs may prove to be a powerful tool for establishing the functional roles of sequence variants in a wide array of applications.
Affiliation(s)
- Li Xie
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA
54
Abstract
Co-evolution is a fundamental component of the theory of evolution and is essential for understanding the relationships between species in complex ecological networks. A wide range of co-evolution-inspired computational methods has been designed to predict molecular interactions, but it is only recently that important advances have been made. Breakthroughs in the handling of phylogenetic information and in disentangling indirect relationships have resulted in an improved capacity to predict interactions between proteins and contacts between different protein residues. Here, we review the main co-evolution-based computational approaches, their theoretical basis, potential applications and foreseeable developments.
Affiliation(s)
- David de Juan
- Structural Biology and Biocomputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
55
Bécu JM, Pelé J, Rodien P, Abdi H, Chabbert M. Structural evolution of G-protein-coupled receptors: a sequence space approach. Methods Enzymol 2013; 520:49-66. [PMID: 23332695 DOI: 10.1016/b978-0-12-391861-1.00003-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/11/2023]
Abstract
Class A G-protein-coupled receptors (GPCRs) provide a fascinating example of evolutionary success. In this review, we discuss how metric multidimensional scaling (MDS), a multivariate analysis method, complements traditional tree-based phylogenetic methods and helps decipher the mechanisms that drove the evolution of class A GPCRs. MDS provides low-dimensional representations of a distance matrix. Applied to a multiple sequence alignment, MDS represents the sequences in a Euclidean space as points whose interdistances are as close as possible to the distances in the alignment (the so-called sequence space). We detail how to perform the MDS analysis of a multiple sequence alignment and how to analyze and interpret the resulting sequence space. We also show that the projection of supplementary data (a property of the MDS method) can be used to straightforwardly monitor the evolutionary drift of specific subfamilies. The sequence space of class A GPCRs reveals the key role of mutations at the level of the TM2 and TM5 proline residues in the evolution of class A GPCRs.
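The MDS procedure summarized in this abstract can be sketched in a few lines. What follows is a minimal illustration under stated assumptions, not the authors' protocol: it assumes the p-distance (fraction of differing aligned positions) as the distance measure, uses classical (Torgerson) metric MDS, and finds eigenvectors by power iteration so that only the standard library is needed. All function names and the toy sequences are hypothetical.

```python
import math

def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    return sum(x != y for x, y in zip(a, b)) / len(a)

def double_center(d):
    """Gram matrix B = -1/2 * J * D^2 * J used by classical MDS."""
    n = len(d)
    sq = [[d[i][j] ** 2 for j in range(n)] for i in range(n)]
    row = [sum(r) / n for r in sq]   # row means of the squared distances
    total = sum(row) / n             # grand mean
    return [[-0.5 * (sq[i][j] - row[i] - row[j] + total) for j in range(n)]
            for i in range(n)]

def power_iteration(b, iters=500):
    """Dominant eigenpair (largest magnitude) of a symmetric matrix."""
    n = len(b)
    v = [1.0 + 0.1 * i for i in range(n)]  # arbitrary non-constant start vector
    for _ in range(iters):
        w = [sum(b[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:                    # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    lam = sum(v[i] * sum(b[i][j] * v[j] for j in range(n)) for i in range(n))
    return lam, v

def classical_mds(d, k=2):
    """Embed n objects in k dimensions from an n x n distance matrix."""
    b = double_center(d)
    n = len(b)
    coords = [[0.0] * k for _ in range(n)]
    for c in range(k):
        lam, v = power_iteration(b)
        # A negative eigenvalue (non-Euclidean distances) yields a flat axis.
        scale = math.sqrt(max(lam, 0.0))
        for i in range(n):
            coords[i][c] = scale * v[i]
        # Deflate so the next pass finds the next eigenpair.
        b = [[b[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
    return coords

# Toy alignment (hypothetical): each sequence becomes a point in "sequence space".
seqs = ["AKLV", "AKLI", "GELV", "GEMI"]
dist = [[p_distance(a, b) for b in seqs] for a in seqs]
space = classical_mds(dist, k=2)
```

Power iteration is chosen here only to keep the sketch dependency-free; a numerical library's symmetric eigensolver would be the more usual choice. The sketch does not cover the projection of supplementary sequences mentioned in the abstract.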
56
57
Durani V, Magliery TJ. Protein engineering and stabilization from sequence statistics: variation and covariation analysis. Methods Enzymol 2013; 523:237-56. [PMID: 23422433 DOI: 10.1016/b978-0-12-394292-0.00011-4] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.8] [Indexed: 01/20/2023]
Abstract
The concepts of consensus and correlation in multiple sequence alignments (MSAs) have been used in the past to understand and engineer proteins. However, there are multiple ways of acquiring MSA databases and also numerous mathematical metrics that can be applied to calculate each of the parameters. This chapter describes an overall methodology that we have chosen to employ for acquiring and statistically analyzing MSAs. We have provided a step-by-step protocol for calculating relative entropy and mutual information metrics and describe how they can be used to predict mutations that have a high probability of stabilizing a protein. This protocol allows for flexibility for modification of formulae and parameters without using anything more complicated than Microsoft Excel. We have also demonstrated various aspects of data analysis by carrying out a sample analysis on the BPTI-Kunitz family of proteins and identified mutations that would be predicted to stabilize this protein based on consensus and correlation values.
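The two metrics named in this abstract, relative entropy and mutual information, can be illustrated with a short sketch. The chapter itself describes a spreadsheet-based protocol; the Python below is only an assumed equivalent using the textbook definitions, with a uniform background distribution and a toy alignment that are both hypothetical. Gaps, sequence weighting, and pseudocounts are deliberately left out.

```python
import math
from collections import Counter

def column(msa, i):
    """Return alignment column i as a list of residues."""
    return [seq[i] for seq in msa]

def relative_entropy(msa, i, background):
    """Kullback-Leibler divergence (bits) of column i from a background distribution."""
    counts = Counter(column(msa, i))
    n = len(msa)
    return sum((c / n) * math.log2((c / n) / background[a])
               for a, c in counts.items())

def mutual_information(msa, i, j):
    """Mutual information (bits) between alignment columns i and j."""
    n = len(msa)
    ci = Counter(column(msa, i))
    cj = Counter(column(msa, j))
    cij = Counter(zip(column(msa, i), column(msa, j)))
    return sum((c / n) * math.log2((c / n) / ((ci[a] / n) * (cj[b] / n)))
               for (a, b), c in cij.items())

# Toy alignment (hypothetical): columns 0 and 1 vary together, column 2 is mostly conserved.
msa = ["AKL", "AKL", "GEL", "GEV"]
# Assumed background: uniform over the 20 standard amino acids.
uniform = {a: 1 / 20 for a in "ACDEFGHIKLMNPQRSTVWY"}

conservation = [relative_entropy(msa, i, uniform) for i in range(3)]
covariation = mutual_information(msa, 0, 1)
```

In this toy case columns 0 and 1 are perfectly coupled, so their mutual information is one bit, while the mostly conserved column 2 scores highest on relative entropy.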
Affiliation(s)
- Venuka Durani
- Department of Chemistry, The Ohio State University, Columbus, Ohio, USA
58
Parnas A, Nisemblat S, Weiss C, Levy-Rimler G, Pri-Or A, Zor T, Lund PA, Bross P, Azem A. Identification of elements that dictate the specificity of mitochondrial Hsp60 for its co-chaperonin. PLoS One 2012; 7:e50318. [PMID: 23226518 PMCID: PMC3514286 DOI: 10.1371/journal.pone.0050318] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Received: 07/19/2012] [Accepted: 10/18/2012] [Indexed: 01/28/2023]
Abstract
Type I chaperonins (cpn60/Hsp60) are essential proteins that mediate the folding of proteins in bacteria, chloroplast and mitochondria. Despite the high sequence homology among chaperonins, the mitochondrial chaperonin system has developed unique properties that distinguish it from the widely-studied bacterial system (GroEL and GroES). The most relevant difference to this study is that mitochondrial chaperonins are able to refold denatured proteins only with the assistance of the mitochondrial co-chaperonin. This is in contrast to the bacterial chaperonin, which is able to function with the help of co-chaperonin from any source. The goal of our work was to determine structural elements that govern the specificity between chaperonin and co-chaperonin pairs using mitochondrial Hsp60 as model system. We used a mutagenesis approach to obtain human mitochondrial Hsp60 mutants that are able to function with the bacterial co-chaperonin, GroES. We isolated two mutants, a single mutant (E321K) and a double mutant (R264K/E358K) that, together with GroES, were able to rescue an E. coli strain, in which the endogenous chaperonin system was silenced. Although the mutations are located in the apical domain of the chaperonin, where the interaction with co-chaperonin takes place, none of the residues are located in positions that are directly responsible for co-chaperonin binding. Moreover, while both mutants were able to function with GroES, they showed distinct functional and structural properties. Our results indicate that the phenotype of the E321K mutant is caused mainly by a profound increase in the binding affinity to all co-chaperonins, while the phenotype of R264K/E358K is caused by a slight increase in affinity toward co-chaperonins that is accompanied by an alteration in the allosteric signal transmitted upon nucleotide binding. 
The latter changes lead to a great increase in affinity for GroES, with only a minor increase in affinity toward the mammalian mitochondrial co-chaperonin.
Affiliation(s)
- Avital Parnas
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Shahar Nisemblat
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Celeste Weiss
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Galit Levy-Rimler
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Amir Pri-Or
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Tsaffrir Zor
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
- Peter A. Lund
- School of Biosciences, University of Birmingham, Birmingham, United Kingdom
- Peter Bross
- Research Unit for Molecular Medicine, Aarhus University Hospital, Aarhus, Denmark
- Abdussalam Azem
- Department of Biochemistry and Molecular Biology, Tel Aviv University, Tel Aviv, Israel
59
Li X, Zhang Z, Song J. Computational enzyme design approaches with significant biological outcomes: progress and challenges. Comput Struct Biotechnol J 2012; 2:e201209007. [PMID: 24688648 PMCID: PMC3962085 DOI: 10.5936/csbj.201209007] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 07/24/2012] [Revised: 09/27/2012] [Accepted: 10/04/2012] [Indexed: 11/29/2022]
Abstract
Enzymes are powerful biocatalysts; however, there is still a large gap between the number of enzyme-based practical applications and the number of naturally occurring enzymes. Multiple experimental approaches have been applied to generate nearly all possible mutations of target enzymes, allowing the identification of desirable variants with improved properties to meet practical needs. Meanwhile, an increasing number of computational methods have been developed during the past few decades to assist in the modification of enzymes. With the development of bioinformatic algorithms, computational approaches are now able to provide more precise guidance for enzyme engineering and make it more efficient and less laborious. In this review, we summarize the recent advances in method development with significant biological outcomes to provide important insights into successful computational protein designs. We also discuss the limitations and challenges of existing methods and the future directions that should improve them.
Affiliation(s)
- Xiaoman Li
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, Tianjin 300308, China
- Ziding Zhang
- State Key Laboratory of Agrobiotechnology, College of Biological Sciences, China Agricultural University, Beijing 100193, China
- Jiangning Song
- National Engineering Laboratory for Industrial Enzymes and Key Laboratory of Systems Microbial Biotechnology, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, Tianjin, Tianjin 300308, China; Department of Biochemistry and Molecular Biology and ARC Centre of Excellence in Structural and Functional Microbial Genomics, Monash University, Melbourne, VIC 3800, Australia
60
Accurate simulation and detection of coevolution signals in multiple sequence alignments. PLoS One 2012; 7:e47108. [PMID: 23091608 PMCID: PMC3473043 DOI: 10.1371/journal.pone.0047108] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 04/24/2012] [Accepted: 09/10/2012] [Indexed: 11/19/2022]
Abstract
BACKGROUND While the conserved positions of a multiple sequence alignment (MSA) are clearly of interest, non-conserved positions can also be important because, for example, destabilizing effects at one position can be compensated by stabilizing effects at another position. Different methods have been developed to recognize the evolutionary relationship between amino acid sites, and to disentangle functional/structural dependencies from historical/phylogenetic ones. METHODOLOGY/PRINCIPAL FINDINGS We have used two complementary approaches to test the efficacy of these methods. In the first approach, we have used a new program, MSAvolve, for the in silico evolution of MSAs, which records a detailed history of all covarying positions, and builds a global coevolution matrix as the accumulated sum of individual matrices for the positions forced to co-vary, the recombinant coevolution, and the stochastic coevolution. We have simulated over 1600 MSAs for 8 protein families, which reflect sequences of different sizes and proteins with widely different functions. The calculated coevolution matrices were compared with the coevolution matrices obtained for the same evolved MSAs with different coevolution detection methods. In a second approach we have evaluated the capacity of the different methods to predict close contacts in the representative X-ray structures of an additional 150 protein families using only experimental MSAs. CONCLUSIONS/SIGNIFICANCE Methods based on the identification of global correlations between pairs were found to be generally superior to methods based only on local correlations in their capacity to identify coevolving residues using either simulated or experimental MSAs. 
However, the significant variability in the performance of different methods with different proteins suggests that the simulation of MSAs that replicate the statistical properties of the experimental MSA can be a valuable tool to identify the coevolution detection method that is most effective in each case.
61
The emergence of protein complexes: quaternary structure, dynamics and allostery. Colworth Medal Lecture. Biochem Soc Trans 2012; 40:475-91. [PMID: 22616857 DOI: 10.1042/bst20120056] [Citation(s) in RCA: 64] [Impact Index Per Article: 4.9] [Indexed: 01/27/2023]
Abstract
All proteins require physical interactions with other proteins in order to perform their functions. Most of them oligomerize into homomers, and a vast majority of these homomers interact with other proteins, at least part of the time, forming transient or obligate heteromers. In the present paper, we review the structural, biophysical and evolutionary aspects of these protein interactions. We discuss how protein function and stability benefit from oligomerization, as well as evolutionary pathways by which oligomers emerge, mostly from the perspective of homomers. Finally, we emphasize the specificities of heteromeric complexes and their structure and evolution. We also discuss two analytical approaches increasingly being used to study protein structures as well as their interactions. First, we review the use of the biological networks and graph theory for analysis of protein interactions and structure. Secondly, we discuss recent advances in techniques for detecting correlated mutations, with the emphasis on their role in identifying pathways of allosteric communication.
62
Gomes M, Hamer R, Reinert G, Deane CM. Mutual information and variants for protein domain-domain contact prediction. BMC Res Notes 2012; 5:472. [PMID: 23244412 PMCID: PMC3532072 DOI: 10.1186/1756-0500-5-472] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 05/15/2012] [Accepted: 08/10/2012] [Indexed: 01/20/2023]
Abstract
BACKGROUND Predicting protein contacts solely based on sequence information remains a challenging problem, despite the huge amount of sequence data at our disposal. Mutual Information (MI), an information theory measure, has been extensively employed and modified to identify residues within a protein (intra-protein) that are in contact. More recently MI and its variants have also been used in the prediction of contacts between proteins (inter-protein). METHODS Here we assess the predictive power of MI and variants for domain-domain contact prediction. We test original MI and these variants, which are called MIp, MIc and ZNMI, on 40 domain-domain test cases containing 10,753 sequences. We also propose and evaluate two new versions of MI that consider triangles of residues and the physicochemical properties of the amino acids, respectively. RESULTS We found that all versions of MI are skewed towards predicting surface residues. Since domain-domain contacts are on the surface of each domain, we considered only surface residues when attempting to predict contacts. Our analysis shows that MIc is the best current MI domain-domain contact predictor. At 20% recall MIc achieved a precision of 44.9% when only surface residues were considered. Our triangle and reduced alphabet variants of MI highlight the delicate trade-off between signal and noise in the use of MI for domain-domain contact prediction. We also examine a specific "successful" case study and demonstrate that here, when considering surface residues, even the most accurate domain-domain contact predictor, MIc, performs no better than random. CONCLUSIONS All tested variants of MI are skewed towards predicting surface residues. When considering surface residues only, we find MIc to be the best current MI domain-domain contact predictor. Its performance, however, is not as good as a non-MI based contact predictor, i-Patch.
Additionally, the intra-protein contact prediction capabilities of MIc outperform its domain-domain contact prediction abilities.
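One of the MI variants named above, MIp, is usually understood as MI with an average product correction (APC), which subtracts the background signal each column shares with all others. The abstract does not spell out the formula, so the sketch below assumes the common definition MIp(i, j) = MI(i, j) - mean_i * mean_j / overall_mean; the function name and the example matrix are hypothetical, and the input is assumed to have at least two columns and a nonzero overall mean.

```python
def apc_correct(mi):
    """Apply the average product correction to a symmetric MI matrix.

    mi[i][j] holds the mutual information between columns i and j;
    diagonal entries are ignored. Returns a new matrix with a zero diagonal.
    """
    n = len(mi)
    # Mean MI of each column with every other column.
    col_mean = [sum(mi[i][j] for j in range(n) if j != i) / (n - 1)
                for i in range(n)]
    # Mean MI over all distinct column pairs.
    overall = (sum(mi[i][j] for i in range(n) for j in range(i + 1, n))
               / (n * (n - 1) / 2))
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                out[i][j] = mi[i][j] - col_mean[i] * col_mean[j] / overall
    return out

# Hypothetical MI values for three alignment columns.
mi = [[0.0, 0.9, 0.3],
      [0.9, 0.0, 0.3],
      [0.3, 0.3, 0.0]]
mip = apc_correct(mi)
```

After correction only the pair (0, 1) keeps a positive score in this toy matrix, which is the intended effect: pairs whose raw MI is explained by each column's general background level are pushed toward or below zero.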
Affiliation(s)
- Mireille Gomes
- Department of Statistics, University of Oxford, Oxford, UK
63
Arnold R, Boonen K, Sun MG, Kim PM. Computational analysis of interactomes: current and future perspectives for bioinformatics approaches to model the host-pathogen interaction space. Methods 2012; 57:508-18. [PMID: 22750305 PMCID: PMC7128575 DOI: 10.1016/j.ymeth.2012.06.011] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.5] [Received: 03/13/2012] [Revised: 06/20/2012] [Accepted: 06/21/2012] [Indexed: 11/05/2022]
Abstract
Bacterial and viral pathogens affect their eukaryotic host partly by interacting with proteins of the host cell. Hence, to investigate infection from a systems' perspective we need to construct complete and accurate host-pathogen protein-protein interaction networks. Because of the paucity of available data and the cost associated with experimental approaches, any construction and analysis of such a network in the near future has to rely on computational predictions. Specifically, this challenge consists of a number of sub-problems: First, prediction of possible pathogen interactors (e.g. effector proteins) is necessary for bacteria and protozoa. Second, the prospective host binding partners have to be determined and finally, the impact on the host cell analyzed. This review gives an overview of current bioinformatics approaches to obtain and understand host-pathogen interactions. As an application example of the methods covered, we predict host-pathogen interactions of Salmonella and discuss the value of these predictions as a prospective for further research.
Affiliation(s)
- Roland Arnold
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Kurt Boonen
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Mark G.F. Sun
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Philip M. Kim
- Terrence Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Banting and Best Department of Medical Research, University of Toronto, Toronto, ON, Canada M5S 3E1
- Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada M5S 3E1
- Department of Computer Science, University of Toronto, Toronto, ON, Canada M5S 3E1
64
Shang L, Xu W, Ozer S, Gutell RR. Structural constraints identified with covariation analysis in ribosomal RNA. PLoS One 2012; 7:e39383. [PMID: 22724009 PMCID: PMC3378556 DOI: 10.1371/journal.pone.0039383] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.6] [Received: 12/13/2011] [Accepted: 05/24/2012] [Indexed: 11/19/2022]
Abstract
Covariation analysis is used to identify those positions with similar patterns of sequence variation in an alignment of RNA sequences. These constraints on the evolution of two positions are usually associated with a base pair in a helix. While mutual information (MI) has been used to accurately predict an RNA secondary structure and a few of its tertiary interactions, early studies revealed that phylogenetic event counting methods are more sensitive and provide extra confidence in the prediction of base pairs. We developed a novel and powerful phylogenetic events counting method (PEC) for quantifying positional covariation with the Gutell lab’s new RNA Comparative Analysis Database (rCAD). The PEC and MI-based methods each identify unique base pairs, and jointly identify many other base pairs. In total, both methods in combination with an N-best and helix-extension strategy identify the maximal number of base pairs. While covariation methods have effectively and accurately predicted RNA secondary structure, only a few tertiary structure base pairs have been identified. Analyses presented herein and at the Gutell lab’s Comparative RNA Web (CRW) Site reveal that the majority of these latter base pairs do not covary with one another. However, covariation analysis does reveal a weaker although significant covariation between sets of nucleotides that are in proximity in the three-dimensional RNA structure. This reveals that covariation analysis identifies other types of structural constraints beyond the two nucleotides that form a base pair.
MESH Headings
- Algorithms
- Base Pairing
- Computational Biology/methods
- Nucleic Acid Conformation
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Ribosomal/chemistry
- RNA, Ribosomal/genetics
- RNA, Ribosomal, 16S/chemistry
- RNA, Ribosomal, 16S/genetics
- RNA, Ribosomal, 23S/chemistry
- RNA, Ribosomal, 23S/genetics
- RNA, Ribosomal, 5S/chemistry
- RNA, Ribosomal, 5S/genetics
Affiliation(s)
- Lei Shang
- Institute for Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, United States of America
- Weijia Xu
- Texas Advanced Computing Center, The University of Texas at Austin, Austin, Texas, United States of America
- Stuart Ozer
- Microsoft Corporation, Redmond, Washington, United States of America
- Robin R. Gutell
- Institute for Cellular and Molecular Biology, Center for Computational Biology and Bioinformatics, The University of Texas at Austin, Austin, Texas, United States of America
65
Dickson RJ, Gloor GB. Protein sequence alignment analysis by local covariation: coevolution statistics detect benchmark alignment errors. PLoS One 2012; 7:e37645. [PMID: 22715369 PMCID: PMC3371027 DOI: 10.1371/journal.pone.0037645] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 12/30/2011] [Accepted: 04/26/2012] [Indexed: 11/19/2022]
Abstract
The use of sequence alignments to understand protein families is ubiquitous in molecular biology. High quality alignments are difficult to build and protein alignment remains one of the largest open problems in computational biology. Misalignments can lead to inferential errors about protein structure, folding, function, phylogeny, and residue importance. Identifying alignment errors is difficult because alignments are built and validated on the same primary criteria: sequence conservation. Local covariation identifies systematic misalignments and is independent of conservation. We demonstrate an alignment curation tool, LoCo, that integrates local covariation scores with the Jalview alignment editor. Using LoCo, we illustrate how local covariation is capable of identifying alignment errors due to the reduction of positional independence in the region of misalignment. We highlight three alignments from the benchmark database, BAliBASE 3, that contain regions of high local covariation, and investigate the causes to illustrate these types of scenarios. Two alignments contain sequential and structural shifts that cause elevated local covariation. Realignment of these misaligned segments reduces local covariation; these alternative alignments are supported with structural evidence. We also show that local covariation identifies active site residues in a validated alignment of paralogous structures. LoCo is available at https://sourceforge.net/projects/locoprotein/files/
Affiliation(s)
- Gregory B. Gloor
- Department of Biochemistry, The University of Western Ontario, London, Canada
66
Lambert C, Till R, Hobley L, Sockett RE. Mutagenesis of RpoE-like sigma factor genes in Bdellovibrio reveals differential control of groEL and two groES genes. BMC Microbiol 2012; 12:99. [PMID: 22676653 PMCID: PMC3464611 DOI: 10.1186/1471-2180-12-99] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 05/09/2012] [Accepted: 06/07/2012] [Indexed: 12/30/2022]
Abstract
BACKGROUND Bdellovibrio bacteriovorus HD100 must regulate genes in response to a variety of environmental conditions as it enters, preys upon and leaves other bacteria, or grows axenically without prey. In addition to "housekeeping" sigma factors, its genome encodes several alternate sigma factors, including 2 Group IV-RpoE-like proteins, which may be involved in the complex regulation of its predatory lifestyle. RESULTS We find that one sigma factor gene, bd3314, cannot be deleted from Bdellovibrio in either predatory or prey-independent growth states, and is therefore possibly essential, likely being an alternate sigma 70. Deletion of one of two Group IV-like sigma factor genes, bd0881, affects flagellar gene regulation and results in less efficient predation, although not due to motility changes; deletion of the second, bd0743, showed that it normally represses chaperone gene expression and intriguingly we find an alternative groES gene is expressed at timepoints in the predatory cycle where intensive protein synthesis at Bdellovibrio septation, prior to prey lysis, will be occurring. CONCLUSIONS We have taken the first step in understanding how alternate sigma factors regulate different processes in the predatory lifecycle of Bdellovibrio and discovered that alternate chaperones regulated by one of them are expressed at different stages of the lifecycle.
Affiliation(s)
- Carey Lambert
- Centre for Genetics and Genomics, School of Biology, University of Nottingham Medical School, QMC, Nottingham, NG7 2UH, UK
|
67
|
Kozma D, Simon I, Tusnády GE. CMWeb: an interactive on-line tool for analysing residue-residue contacts and contact prediction methods. Nucleic Acids Res 2012; 40:W329-33. [PMID: 22669913 PMCID: PMC3394325 DOI: 10.1093/nar/gks488] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/18/2022] Open
Abstract
A contact map is a 2D derivative of the 3D structure of proteins, containing various residue–residue (RR) contacts within the structure. Contact maps can be used for the reconstruction of structure with high accuracy and can be predicted from the amino acid sequence. Therefore understanding the various properties of contact maps is an important step in protein structure prediction. For investigating basic properties of contact formation and contact clusters we set up an integrated system called Contact Map Web Viewer, or CMWeb for short. The server can be used to visualize contact maps, to link contacts and to show them both in 3D structures and in multiple sequence alignments and to calculate various statistics on contacts. Moreover, we have implemented five contact prediction methods in the CMWeb server to visualize the predicted and real RR contacts in one contact map. The results of other RR contact prediction methods can be uploaded as a benchmark test onto the server as well. All of this functionality is provided through a web server, so using the application requires only a Java-capable web browser; no further program installation is needed. The CMWeb is freely accessible at http://cmweb.enzim.hu.
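As a rough illustration of the contact-map idea described in this abstract, the sketch below derives a boolean residue-residue contact matrix from 3D coordinates. It is not CMWeb's implementation: the function name `contact_map`, the 8 Å distance cutoff, the use of one representative point per residue, and the toy coordinates are all assumptions made for the example.

```python
import math


def contact_map(coords, cutoff=8.0, min_separation=1):
    """Return a symmetric boolean matrix marking residue pairs in contact.

    coords: one (x, y, z) tuple per residue (assumed here to be a single
            representative atom per residue, e.g. C-alpha).
    cutoff: distance threshold in angstroms (8.0 is an assumed, commonly
            used value, not one taken from the abstract).
    min_separation: minimum sequence separation between paired residues.
    """
    n = len(coords)
    cmap = [[False] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + min_separation, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                cmap[i][j] = cmap[j][i] = True
    return cmap


# Hypothetical four-residue chain laid out along the x axis.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
toy_map = contact_map(toy_coords)
```

In this toy chain the first three residues fall within the cutoff of one another, while the fourth is too distant to contact any of them.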
Affiliation(s)
- Dániel Kozma
- Institute of Enzymology, Research Centre for Natural Sciences, Hungarian Academy of Sciences, PO Box 7, H-1518 Budapest, Hungary
|
68
|
Mendoza JL, Schmidt A, Li Q, Nuvaga E, Barrett T, Bridges RJ, Feranchak AP, Brautigam CA, Thomas PJ. Requirements for efficient correction of ΔF508 CFTR revealed by analyses of evolved sequences. Cell 2012; 148:164-74. [PMID: 22265409 DOI: 10.1016/j.cell.2011.11.023] [Citation(s) in RCA: 219] [Impact Index Per Article: 16.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/06/2011] [Revised: 10/20/2011] [Accepted: 11/03/2011] [Indexed: 12/14/2022]
Abstract
Misfolding of ΔF508 cystic fibrosis (CF) transmembrane conductance regulator (CFTR) underlies pathology in most CF patients. F508 resides in the first nucleotide-binding domain (NBD1) of CFTR near a predicted interface with the fourth intracellular loop (ICL4). Efforts to identify small molecules that restore function by correcting the folding defect have revealed an apparent efficacy ceiling. To understand the mechanistic basis of this obstacle, positions statistically coupled to 508, in evolved sequences, were identified and assessed for their impact on both NBD1 and CFTR folding. The results indicate that both NBD1 folding and interaction with ICL4 are altered by the ΔF508 mutation and that correction of either individual process is only partially effective. By contrast, combination of mutations that counteract both defects restores ΔF508 maturation and function to wild-type levels. These results provide a mechanistic rationale for the limited efficacy of extant corrector compounds and suggest approaches for identifying compounds that correct both defective steps.
Affiliation(s)
- Juan L Mendoza
- Molecular Biophysics Program, and Department of Physiology, University of Texas Southwestern Medical Center, Dallas, TX 75390-9040, USA
|
69
|
Dey SS, Xue Y, Joachimiak MP, Friedland GD, Burnett JC, Zhou Q, Arkin AP, Schaffer DV. Mutual information analysis reveals coevolving residues in Tat that compensate for two distinct functions in HIV-1 gene expression. J Biol Chem 2012; 287:7945-55. [PMID: 22253435 DOI: 10.1074/jbc.m111.302653] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Viral genomes are continually subjected to mutations, and functionally deleterious ones can be rescued by reversion or additional mutations that restore fitness. The error prone nature of HIV-1 replication has resulted in highly diverse viral sequences, and it is not clear how viral proteins such as Tat, which plays a critical role in viral gene expression and replication, retain their complex functions. Although several important amino acid positions in Tat are conserved, we hypothesized that it may also harbor functionally important residues that may not be individually conserved yet appear as correlated pairs, whose analysis could yield new mechanistic insights into Tat function and evolution. To identify such sites, we combined mutual information analysis and experimentation to identify coevolving positions and found that residues 35 and 39 are strongly correlated. Mutation of either residue of this pair into amino acids that appear in numerous viral isolates yields a defective virus; however, simultaneous introduction of both mutations into the heterologous Tat sequence restores gene expression close to wild-type Tat. Furthermore, in contrast to most coevolving protein residues that contribute to the same function, structural modeling and biochemical studies showed that these two residues contribute to two mechanistically distinct steps in gene expression: binding P-TEFb and promoting P-TEFb phosphorylation of the C-terminal domain in RNAPII. Moreover, Tat variants that mimic HIV-1 subtypes B or C at sites 35 and 39 have evolved orthogonal strengths of P-TEFb binding versus RNAPII phosphorylation, suggesting that subtypes have evolved alternate transcriptional strategies to achieve similar gene expression levels.
Affiliation(s)
- Siddharth S Dey
- Department of Chemical and Biomolecular Engineering and the Helen Wills Neuroscience Institute, University of California, Berkeley, California 94720, USA
|
70
|
Livesay DR, Kreth KE, Fodor AA. A critical evaluation of correlated mutation algorithms and coevolution within allosteric mechanisms. Methods Mol Biol 2012; 796:385-398. [PMID: 22052502 DOI: 10.1007/978-1-61779-334-9_21] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/31/2023]
Abstract
The notion of using the evolutionary history encoded within multiple sequence alignments to predict allosteric mechanisms is appealing. In this approach, correlated mutations are expected to reflect coordinated changes that maintain intramolecular coupling between residue pairs. Despite much early fanfare, the general suitability of correlated mutations to predict allosteric couplings has not yet been established. Lack of progress along these lines has been hindered by several algorithmic limitations including phylogenetic artifacts within alignments masking true covariance and the computational intractability of consideration of more than two correlated residues at a time. Recent progress in algorithm development, however, has been substantial with a new generation of correlated mutation algorithms that have made fundamental progress toward solving these difficult problems. Despite these encouraging results, there remains little evidence to suggest that the evolutionary constraints acting on allosteric couplings are sufficient to be recovered from multiple sequence alignments. In this review, we argue that due to the exquisite sensitivity of protein dynamics, and hence that of allosteric mechanisms, the latter vary widely within protein families. If it turns out to be generally true that even very similar homologs display a wide divergence of allosteric mechanisms, then even a perfect correlated mutation algorithm could not be reliably used as a general mechanism for discovery of allosteric pathways.
Affiliation(s)
- Dennis R Livesay
- Department of Bioinformatics and Genomics, University of North Carolina at Charlotte, Charlotte, NC, USA
|
71
|
Henriksen SB, Mortensen RJ, Geertz-Hansen HM, Neves-Petersen MT, Arnason O, Söring J, Petersen SB. Hyperdimensional analysis of amino acid pair distributions in proteins. PLoS One 2011; 6:e25638. [PMID: 22174733 PMCID: PMC3235099 DOI: 10.1371/journal.pone.0025638] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/26/2011] [Accepted: 09/08/2011] [Indexed: 01/06/2023] Open
Abstract
Our manuscript presents a novel approach to protein structure analyses. We have organized an 8-dimensional data cube with protein 3D-structural information from 8706 high-resolution non-redundant protein chains with the aim of identifying packing rules at the amino acid pair level. The cube contains information about amino acid type, solvent accessibility, spatial and sequence distance, secondary structure and sequence length. We are able to pose structural queries to the data cube using the program ProPack. The response is a 1D, 2D or 3D graph. Whereas the response is of a statistical nature, the user can obtain an instant list of all PDB structures where such a pair is found. The user may select a particular structure, which is displayed highlighting the pair in question. The user may pose millions of different queries and will receive the answer to each one in a few seconds. In order to demonstrate the capabilities of the data cube as well as the programs, we have selected well-known structural features, disulphide bridges and salt bridges, where we illustrate how the queries are posed, and how answers are given. Motifs involving cysteines such as disulphide bridges, zinc-fingers and iron-sulfur clusters are clearly identified and differentiated. ProPack also reveals that whereas pairs of Lys residues virtually never appear in close spatial proximity, pairs of Arg are abundant and appear at close spatial distance, contrasting the belief that electrostatic repulsion would prevent this juxtaposition and that Arg-Lys is perceived as a conservative mutation. The presented programs can find and visualize novel packing preferences in protein structures, allowing the user to unravel correlations between pairs of amino acids. The new tools allow the user to view statistical information and visualize instantly the structures that underpin the statistical information, which is far from trivial with most other software tools for protein structure analysis.
Affiliation(s)
- Svend B. Henriksen
- NanoBiotechnology Group, Department of Physics and Nanotechnology, Aalborg University, Aalborg, Denmark
- Rasmus J. Mortensen
- NanoBiotechnology Group, Department of Physics and Nanotechnology, Aalborg University, Aalborg, Denmark
- Henrik M. Geertz-Hansen
- NanoBiotechnology Group, Department of Physics and Nanotechnology, Aalborg University, Aalborg, Denmark
- Maria Teresa Neves-Petersen
- International Iberian Nanotechnol Lab (INL), Braga, Portugal
- Nanobiotechnology Group, Department of Biotechnology, Chemistry and Environmental Sciences, University of Aalborg, Aalborg, Denmark
- Omar Arnason
- NanoBiotechnology Group, Department of Physics and Nanotechnology, Aalborg University, Aalborg, Denmark
- Jón Söring
- NanoBiotechnology Group, Department of Physics and Nanotechnology, Aalborg University, Aalborg, Denmark
- Steffen B. Petersen
- Nanobiotechnology Group, Department of Health Science and Technology, Aalborg University, Aalborg, Denmark
- The Institute for Lasers, Photonics and Biophotonics, University at Buffalo, The State University of New York, Buffalo, New York, United States of America
|
72
|
Ghosh A, Sakaguchi R, Liu C, Vishveshwara S, Hou YM. Allosteric communication in cysteinyl tRNA synthetase: a network of direct and indirect readout. J Biol Chem 2011; 286:37721-31. [PMID: 21890630 DOI: 10.1074/jbc.m111.246702] [Citation(s) in RCA: 58] [Impact Index Per Article: 4.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022] Open
Abstract
Protein structure networks are constructed for the identification of long-range signaling pathways in cysteinyl tRNA synthetase (CysRS). Molecular dynamics simulation trajectories of CysRS-ligand complexes were used to determine conformational ensembles in order to gain insight into the allosteric signaling paths. Communication paths between the anticodon binding region and the aminoacylation region have been identified. Extensive interaction between the helix bundle domain and the anticodon binding domain, resulting in structural rigidity in the presence of tRNA, has been detected. Based on the predicted model, six residues along the communication paths have been examined by mutations (single and double) and shown to mediate a coordinated coupling between anticodon recognition and activation of amino acid at the active site. This study on CysRS clearly shows that specific key residues, which are involved in communication between distal sites in allosteric proteins but may be elusive in direct structure analysis, can be identified from dynamics of protein structure networks.
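The notion of a communication path through a protein structure network can be sketched as a shortest-path search over a graph whose nodes are residues and whose edges are contacts. This is only an illustrative simplification and not the method used in the study: the unweighted breadth-first search, the function name `shortest_path`, and the residue numbers in the toy edge list are all assumptions.

```python
from collections import deque


def shortest_path(edges, source, target):
    """Breadth-first search for a shortest path in an undirected residue graph.

    edges: iterable of (residue_i, residue_j) contact pairs.
    Returns the list of residues from source to target, or None if the
    two residues are not connected.
    """
    graph = {}
    for u, v in edges:
        graph.setdefault(u, set()).add(v)
        graph.setdefault(v, set()).add(u)

    previous = {source: None}  # maps each visited node to its predecessor
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        # Sorted iteration keeps the result deterministic when ties exist.
        for neighbor in sorted(graph.get(node, ())):
            if neighbor not in previous:
                previous[neighbor] = node
                queue.append(neighbor)
    return None


# Hypothetical contact network; the residue numbers are made up.
toy_edges = [(1, 2), (2, 3), (3, 4), (2, 5), (5, 4), (4, 6)]
toy_path = shortest_path(toy_edges, 1, 6)
```

A dynamics-based analysis would typically weight edges by interaction strength or persistence across a trajectory; the unweighted search here only shows the graph-traversal step.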
Affiliation(s)
- Amit Ghosh
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
|
73
|
|
74
|
Pelé J, Abdi H, Moreau M, Thybert D, Chabbert M. Multidimensional scaling reveals the main evolutionary pathways of class A G-protein-coupled receptors. PLoS One 2011; 6:e19094. [PMID: 21544207 PMCID: PMC3081337 DOI: 10.1371/journal.pone.0019094] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/04/2010] [Accepted: 03/16/2011] [Indexed: 11/21/2022] Open
Abstract
Class A G-protein-coupled receptors (GPCRs) constitute the largest family of transmembrane receptors in the human genome. Understanding the mechanisms which drove the evolution of such a large family would help understand the specificity of each GPCR sub-family with applications to drug design. To gain evolutionary information on class A GPCRs, we explored their sequence space by metric multidimensional scaling analysis (MDS). Three-dimensional mapping of human sequences shows a non-uniform distribution of GPCRs, organized in clusters that lie along four privileged directions. To interpret these directions, we projected supplementary sequences from different species onto the human space used as a reference. With this technique, we can easily monitor the evolutionary drift of several GPCR sub-families from cnidarians to humans. Results support a model of radiative evolution of class A GPCRs from a central node formed by peptide receptors. The privileged directions obtained from the MDS analysis are interpretable in terms of three main evolutionary pathways related to specific sequence determinants. The first pathway was initiated by a deletion in transmembrane helix 2 (TM2) and led to three sub-families by divergent evolution. The second pathway corresponds to the differentiation of the amine receptors. The third pathway corresponds to parallel evolution of several sub-families in relation to a covarion process involving proline residues in TM2 and TM5. As exemplified with GPCRs, the MDS projection technique is an important tool to compare orthologous sequence sets and to help decipher the mutational events that drove the evolution of protein families.
Affiliation(s)
- Julien Pelé
- CNRS UMR 6214 – INSERM 771, Faculté de Médecine, Angers, France
- Hervé Abdi
- School of Behavioral and Brain Sciences, The University of Texas at Dallas, Richardson, Texas, United States of America
- Matthieu Moreau
- CNRS UMR 6214 – INSERM 771, Faculté de Médecine, Angers, France
- David Thybert
- CNRS UMR 6214 – INSERM 771, Faculté de Médecine, Angers, France
- Marie Chabbert
- CNRS UMR 6214 – INSERM 771, Faculté de Médecine, Angers, France
|
75
|
Use of mutual information arrays to predict coevolving sites in the full length HIV gp120 protein for subtypes B and C. Virol Sin 2011; 26:95-104. [PMID: 21468932 DOI: 10.1007/s12250-011-3188-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2011] [Accepted: 02/22/2011] [Indexed: 10/18/2022] Open
Abstract
It is well established that different sites within a protein evolve at different rates according to their role within the protein; identification of these correlated mutations can aid in tasks such as ab initio protein structure prediction, structure-function analysis or sequence alignment. Mutual information is a standard measure for coevolution between two sites, but its application is limited by the signal-to-noise ratio. In this work we report a preliminary study to investigate whether larger sequence sets could circumvent this problem by calculating mutual information arrays for two sets of drug-naïve sequences from the HIV gp120 protein for the B and C subtypes. Our results suggest that while the larger sequence sets can improve the signal-to-noise ratio, the gain is offset by the high mutation rate of HIV, which makes it more difficult to achieve consistent alignments. Nevertheless, we were able to predict a number of coevolving sites that were supported by previous experimental studies as well as a region close to the C terminus of the protein that was highly variable in the C subtype but highly conserved in the B subtype.
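For readers unfamiliar with the measure, mutual information between two alignment columns can be sketched as below. This is a minimal, uncorrected estimate in bits and not the procedure used in the study: the function names, the toy alignment, and the absence of any gap handling, sequence weighting or background correction are assumptions of the example.

```python
import math
from collections import Counter


def column(alignment, index):
    """Extract one column (one site) from a list of equal-length aligned sequences."""
    return [sequence[index] for sequence in alignment]


def mutual_information(col_a, col_b):
    """Mutual information (in bits) between two alignment columns.

    MI = sum over observed pairs (a, b) of p(a, b) * log2(p(a, b) / (p(a) * p(b))),
    with probabilities estimated as plain observed frequencies.
    """
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        p_ab = n_ab / n
        p_a = count_a[a] / n
        p_b = count_b[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical four-sequence alignment: sites 0 and 1 covary perfectly,
# site 2 is fully conserved, and site 3 varies independently of site 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]
mi_covarying = mutual_information(column(toy_alignment, 0), column(toy_alignment, 1))
mi_conserved = mutual_information(column(toy_alignment, 0), column(toy_alignment, 2))
mi_independent = mutual_information(column(toy_alignment, 0), column(toy_alignment, 3))
```

With so few sequences the estimate is dominated by sampling noise in practice, which is the signal-to-noise limitation the abstract refers to; the toy alignment is constructed so that the values are exact.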
|
76
|
Jeon J, Nam HJ, Choi YS, Yang JS, Hwang J, Kim S. Molecular evolution of protein conformational changes revealed by a network of evolutionarily coupled residues. Mol Biol Evol 2011; 28:2675-85. [PMID: 21470969 DOI: 10.1093/molbev/msr094] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/25/2022] Open
Abstract
An improved understanding of protein conformational changes has broad implications for elucidating the mechanisms of various biological processes and for the design of protein engineering experiments. Understanding rearrangements of residue interactions is a key component in the challenge of describing structural transitions. Evolutionary properties of protein sequences and structures are extensively studied; however, evolution of protein motions, especially with respect to interaction rearrangements, has yet to be explored. Here, we investigated the relationship between sequence evolution and protein conformational changes and discovered that structural transitions are encoded in amino acid sequences as coevolving residue pairs. Furthermore, we found that highly coevolving residues are clustered in the flexible regions of proteins and facilitate structural transitions by forming and disrupting their interactions cooperatively. Our results provide insight into the evolution of protein conformational changes and help to identify residues important for structural transitions.
Affiliation(s)
- Jouhyun Jeon
- Division of Molecular and Life Science, Pohang University of Science and Technology, Pohang, Korea
|
77
|
Abstract
The development of peptides with therapeutic activities can be based on naturally occurring peptides or alternatively on de novo design. The discovery of natural peptides is often a matter of serendipity. In part, this is because natural peptides are typically proteolytically cleaved out from precursor proteins, a feature that averts the direct benefits of the genomic revolution. The first part of this review describes attempts to create a more systematic identification of natural peptides relying on a two step process. In the initial step, an in silico peptidome is predicted through the use of machine learning. Then, various computational biology tools are tailored to focus on peptides predicted to have the desired biological activity; for example, activating a GPCR or modulating the cellular arm of the immune system. The second part of the review is devoted to de novo peptide design and focuses on arguably the simplest scenario in which the designed peptide corresponds to a contiguous protein subsequence. Amongst these peptides, those corresponding to helical segments are prominent, mainly due to their relative ability to fold independently. Inspired by the clinical success of viral entry inhibitors, which are peptides corresponding to helical segments in viral envelope proteins, a computational tool for the identification of intramolecular helix-helix interactions was developed. Using this approach, peptides having anti-cancer, anti-angiogenic, and anti-inflammatory activities have been recently rationally designed and biologically characterized.
Affiliation(s)
- Yossef Kliger
- Compugen LTD, 72 Pinchas Rosen, Tel Aviv 69512, Israel.
|
78
|
Skjaerven L, Grant B, Muga A, Teigen K, McCammon JA, Reuter N, Martinez A. Conformational sampling and nucleotide-dependent transitions of the GroEL subunit probed by unbiased molecular dynamics simulations. PLoS Comput Biol 2011; 7:e1002004. [PMID: 21423709 PMCID: PMC3053311 DOI: 10.1371/journal.pcbi.1002004] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2010] [Accepted: 12/09/2010] [Indexed: 12/01/2022] Open
Abstract
GroEL is an ATP-dependent molecular chaperone that promotes the folding of a large number of substrate proteins in E. coli. Large-scale conformational transitions occurring during the reaction cycle have been characterized from extensive crystallographic studies. However, the link between the observed conformations and the mechanisms involved in the allosteric response to ATP and the nucleotide-driven reaction cycle is not completely established. Here we describe extensive (in total long) unbiased molecular dynamics (MD) simulations that probe the response of GroEL subunits to ATP binding. We observe nucleotide-dependent conformational transitions, and show with multiple 100 ns long simulations that the ligand-induced shift in the conformational populations is intrinsically coded in the structure-dynamics relationship of the protein subunit. Thus, these simulations reveal a stabilization of the equatorial domain upon nucleotide binding and a concomitant “opening” of the subunit, which reaches a conformation close to that observed in the crystal structure of the subunits within the ADP-bound oligomer. Moreover, we identify changes in a set of unique intrasubunit interactions potentially important for the conformational transition. Molecular machines convert chemical energy to mechanical work in the process of carrying out their specific tasks. Often these proteins are fueled by ATP binding and hydrolysis, enabling switching between different conformations. The ATP-dependent chaperone GroEL is a molecular machine that opens and closes its barrel-like structure in order to provide a folding cage for unfolded proteins. The quest to fully understand and control GroEL and other molecular machines is enhanced by complementing experimental work with computational approaches. Here, we provide a description of the molecular basis for the conformational changes in the GroEL subunit by performing extensive molecular dynamics simulations.
The simulations sample the conformational population for the different nucleotide-free and bound states in the isolated subunit. The results reveal that the conformations of the subunit when isolated resemble those of the subunit integrated in the GroEL complex. Moreover, the molecular dynamics simulations allow following detailed changes in individual interatomic interactions brought about by ATP-binding.
Affiliation(s)
- Lars Skjaerven
- Department of Biomedicine, University of Bergen, Bergen, Norway
|
79
|
Ackerman SH, Gatti DL. The contribution of coevolving residues to the stability of KDO8P synthase. PLoS One 2011; 6:e17459. [PMID: 21408011 PMCID: PMC3052366 DOI: 10.1371/journal.pone.0017459] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/06/2010] [Accepted: 02/03/2011] [Indexed: 12/03/2022] Open
Abstract
Background The evolutionary tree of 3-deoxy-D-manno-octulosonate 8-phosphate (KDO8P) synthase (KDO8PS), a bacterial enzyme that catalyzes a key step in the biosynthesis of bacterial endotoxin, is evenly divided between metal and non-metal forms, both having similar structures, but diverging in various degrees in amino acid sequence. Mutagenesis, crystallographic and computational studies have established that only a few residues determine whether or not KDO8PS requires a metal for function. The remaining divergence in the amino acid sequence of KDO8PSs is apparently unrelated to the underlying catalytic mechanism. Methodology/Principal Findings The multiple alignment of all known KDO8PS sequences reveals that several residue pairs coevolved, an indication of their possible linkage to a structural constraint. In this study we investigated by computational means the contribution of coevolving residues to the stability of KDO8PS. We found that about 1/4 of all strongly coevolving pairs probably originated from cycles of mutation (decreasing stability) and suppression (restoring it), while the remaining pairs are best explained by a succession of neutral or nearly neutral covarions. Conclusions/Significance Both sequence conservation and coevolution are involved in the preservation of the core structure of KDO8PS, but the contribution of coevolving residues is, in proportion, smaller. This is because small stability gains or losses associated with selection of certain residues in some regions of the stability landscape of KDO8PS are easily offset by a large number of possible changes in other regions. While this effect increases the tolerance of KDO8PS to deleterious mutations, it also decreases the probability that specific pairs of residues could have a strong contribution to the thermodynamic stability of the protein.
Affiliation(s)
- Sharon H. Ackerman
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Domenico L. Gatti
- Department of Biochemistry and Molecular Biology, Wayne State University School of Medicine, Detroit, Michigan, United States of America
- Cardiovascular Research Institute, Wayne State University School of Medicine, Detroit, Michigan, United States of America
|
80
|
Hamer R, Luo Q, Armitage JP, Reinert G, Deane CM. i-Patch: interprotein contact prediction using local network information. Proteins 2011; 78:2781-97. [PMID: 20635422 DOI: 10.1002/prot.22792] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/07/2022]
Abstract
Biological processes are commonly controlled by precise protein-protein interactions. These connections rely on specific amino acids at the binding interfaces. Here we predict the binding residues of such interprotein complexes. We have developed a suite of methods, i-Patch, which predict the interprotein contact sites by considering the two proteins as a network, with residues as nodes and contacts as edges. i-Patch starts with two proteins, A and B, which are assumed to interact, but for which the structure of the complex is not available. However, we assume that for each protein, we have a reference structure and a multiple sequence alignment of homologues. i-Patch then uses the propensities of patches of residues to interact, to predict interprotein contact sites. i-Patch outperforms several other tested algorithms for prediction of interprotein contact sites. It gives 59% precision with 20% recall on a blind test set of 31 protein pairs. Combining the i-Patch scores with an existing correlated mutation algorithm, McBASC, using a logistic model gave little improvement. Results from a case study, on bacterial chemotaxis protein complexes, demonstrate that our predictions can identify contact residues, as well as suggesting unknown interfaces in multiprotein complexes.
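The precision and recall figures quoted in this abstract follow the standard definitions, which can be sketched for a set of predicted contacts as follows. The function name `precision_recall` and the toy residue pairs are hypothetical and are not taken from i-Patch or its test set.

```python
def precision_recall(predicted, actual):
    """Compute precision and recall for predicted contact sites.

    precision = true positives / number of predictions
    recall    = true positives / number of actual contacts
    Either value is reported as 0.0 when its denominator is zero.
    """
    predicted, actual = set(predicted), set(actual)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical contacts, each written as (residue in protein A, residue in protein B).
toy_actual = {(1, 10), (2, 11), (3, 12), (4, 13), (5, 14)}
toy_predicted = {(1, 10), (2, 11), (9, 9)}
toy_precision, toy_recall = precision_recall(toy_predicted, toy_actual)
```

In the toy case two of three predictions are correct and two of five actual contacts are recovered. A predictor with high precision and low recall, as reported in the abstract, makes few predictions that are mostly right.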
Affiliation(s)
- Rebecca Hamer
- Oxford Centre for Integrative Systems Biology, Department of Biochemistry, University of Oxford, Oxford, United Kingdom
|
81
|
Smock RG, Rivoire O, Russ WP, Swain JF, Leibler S, Ranganathan R, Gierasch LM. An interdomain sector mediating allostery in Hsp70 molecular chaperones. Mol Syst Biol 2011; 6:414. [PMID: 20865007 PMCID: PMC2964120 DOI: 10.1038/msb.2010.65] [Citation(s) in RCA: 109] [Impact Index Per Article: 7.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2010] [Accepted: 05/20/2010] [Indexed: 11/09/2022] Open
Abstract
Allosteric coupling between protein domains is fundamental to many cellular processes. For example, Hsp70 molecular chaperones use ATP binding by their actin-like N-terminal ATPase domain to control substrate interactions in their C-terminal substrate-binding domain, a reaction that is critical for protein folding in cells. Here, we generalize the statistical coupling analysis to simultaneously evaluate co-evolution between protein residues and functional divergence between sequences in protein sub-families. Applying this method in the Hsp70/110 protein family, we identify a sparse but structurally contiguous group of co-evolving residues called a 'sector', which is an attribute of the allosteric Hsp70 sub-family that links the functional sites of the two domains across a specific interdomain interface. Mutagenesis of Escherichia coli DnaK supports the conclusion that this interdomain sector underlies the allosteric coupling in this protein family. The identification of the Hsp70 sector provides a basis for further experiments to understand the mechanism of allostery and introduces the idea that cooperativity between interacting proteins or protein domains can be mediated by shared sectors.
Affiliation(s)
- Robert G Smock
- Department of Biochemistry and Molecular Biology, University of Massachusetts, Amherst, MA 01003, USA
|
82
|
Csanády L, Vergani P, Gulyás-Kovács A, Gadsby DC. Electrophysiological, biochemical, and bioinformatic methods for studying CFTR channel gating and its regulation. Methods Mol Biol 2011; 741:443-469. [PMID: 21594801 DOI: 10.1007/978-1-61779-117-8_28] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 05/30/2023]
Abstract
CFTR is the only member of the ABC (ATP-binding cassette) protein superfamily known to function as an ion channel. Most other ABC proteins are ATP-driven transporters, in which a cycle of ATP binding and hydrolysis, at intracellular nucleotide binding domains (NBDs), powers uphill substrate translocation across the membrane. In CFTR, this same ATP-driven cycle opens and closes a transmembrane pore through which chloride ions flow rapidly down their electrochemical gradient. Detailed analysis of the pattern of gating of CFTR channels thus offers the opportunity to learn about mechanisms of function not only of CFTR channels but also of their ABC transporter ancestors. In addition, CFTR channel gating is subject to complex regulation by kinase-mediated phosphorylation at multiple consensus sites in a cytoplasmic regulatory domain that is unique to CFTR. Here we offer a practical guide to extract useful information about the mechanisms that control opening and closing of CFTR channels: on how to plan (including information obtained from analysis of multiple sequence alignments), carry out, and analyze electrophysiological and biochemical experiments, as well as on how to circumvent potential pitfalls.
Affiliation(s)
- László Csanády
- Department of Medical Biochemistry, Semmelweis University, Budapest, Hungary.
|
83
|
Abstract
Domain Interaction MAp (DIMA, available at http://webclu.bio.wzw.tum.de/dima) is a database of predicted and known interactions between protein domains. It integrates 5807 structurally known interactions imported from the iPfam and 3did databases and 46 900 domain interactions predicted by four computational methods: domain phylogenetic profiling, the domain pair exclusion algorithm, correlated mutations, and domain interaction prediction in a discriminative way. Additionally, predictions are filtered to exclude those domain pairs that are reported as non-interacting by the Negatome database. The DIMA Web site allows users to calculate domain interaction networks either for a domain of interest or for entire organisms, and to explore them interactively using the Flash-based Cytoscape Web software.
Affiliation(s)
- Qibin Luo
- Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, 85350 Freising, Germany
|
84
|
Kowarsch A, Fuchs A, Frishman D, Pagel P. Correlated mutations: a hallmark of phenotypic amino acid substitutions. PLoS Comput Biol 2010; 6. [PMID: 20862353 PMCID: PMC2940720 DOI: 10.1371/journal.pcbi.1000923] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.5] [Received: 09/25/2009] [Accepted: 08/09/2010] [Indexed: 11/18/2022]
Abstract
Point mutations resulting in the substitution of a single amino acid can cause severe functional consequences, but can also be completely harmless. Understanding what determines the phenotypical impact is important both for planning targeted mutation experiments in the laboratory and for analyzing naturally occurring mutations found in patients. Common wisdom suggests using the extent of evolutionary conservation of a residue or a sequence motif as an indicator of its functional importance and thus vulnerability in case of mutation. In this work, we put forward the hypothesis that in addition to conservation, co-evolution of residues in a protein influences the likelihood that a residue is functionally important and thus associated with disease. While the basic idea of a relation between co-evolution and functional sites has been explored before, we have conducted the first systematic and comprehensive analysis of point mutations causing disease in humans with respect to correlated mutations. We included 14,211 distinct positions with known disease-causing point mutations in 1,153 human proteins in our analysis. Our data show that (1) correlated positions are significantly more likely to be disease-associated than expected by chance, and that (2) this signal cannot be explained by conservation patterns of individual sequence positions. Although correlated residues have primarily been used to predict contact sites, our data are in agreement with previous observations that (3) many such correlations do not relate to physical contacts between amino acid residues. Access to our analysis results is provided at http://webclu.bio.wzw.tum.de/~pagel/supplements/correlated-positions/.
Point mutations (i.e., changes of a single sequence element) can have a severe impact on protein function. Many diseases are caused by such minute defects. On the other hand, the majority of such mutations do not lead to noticeable effects. Although previous research has revealed important aspects that influence or predict the chance that a mutation causes disease, much remains to be learned before we fully understand this complex problem. In our work, we use the observation that sometimes certain positions in a protein mutate in an apparently correlated fashion and analyze this correlation with respect to mutation vulnerability. Our results show that positions exhibiting evolutionary correlation are significantly more likely to be vulnerable to mutation than average positions. On one hand, our data further support the concept that correlated positions are associated not only with protein contacts but also with functional sites and/or disease positions (as introduced by others). On the other hand, this could be useful to further improve the understanding and prediction of the consequences of mutations. Our work is the first to attempt a large-scale quantitation of this relationship.
Affiliation(s)
- Andreas Kowarsch
- Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Institut für Bioinformatik und Systembiologie/MIPS, Helmholtz Zentrum München – Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Angelika Fuchs
- Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Dmitrij Frishman
- Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Institut für Bioinformatik und Systembiologie/MIPS, Helmholtz Zentrum München – Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
- Philipp Pagel
- Lehrstuhl für Genomorientierte Bioinformatik, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Institut für Bioinformatik und Systembiologie/MIPS, Helmholtz Zentrum München – Deutsches Forschungszentrum für Gesundheit und Umwelt, Neuherberg, Germany
|
85
|
Barash D, Churkin A. Mutational analysis in RNAs: comparing programs for RNA deleterious mutation prediction. Brief Bioinform 2010; 12:104-14. [DOI: 10.1093/bib/bbq059] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 11/12/2022]
|
86
|
Brown CA, Brown KS. Validation of coevolving residue algorithms via pipeline sensitivity analysis: ELSC and OMES and ZNMI, oh my! PLoS One 2010; 5:e10779. [PMID: 20531955 PMCID: PMC2879359 DOI: 10.1371/journal.pone.0010779] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.7] [Received: 01/27/2010] [Accepted: 04/25/2010] [Indexed: 11/26/2022]
Abstract
Correlated amino acid substitution algorithms attempt to discover groups of residues that co-fluctuate due to either structural or functional constraints. Although these algorithms could inform both ab initio protein folding calculations and evolutionary studies, their utility for these purposes has been hindered by a lack of confidence in their predictions due to hard to control sources of error. To complicate matters further, naive users are confronted with a multitude of methods to choose from, in addition to the mechanics of assembling and pruning a dataset. We first introduce a new pair scoring method, called ZNMI (Z-scored-product Normalized Mutual Information), which drastically improves the performance of mutual information for co-fluctuating residue prediction. Second and more important, we recast the process of finding coevolving residues in proteins as a data-processing pipeline inspired by the medical imaging literature. We construct an ensemble of alignment partitions that can be used in a cross-validation scheme to assess the effects of choices made during the procedure on the resulting predictions. This pipeline sensitivity study gives a measure of reproducibility (how similar are the predictions given perturbations to the pipeline?) and accuracy (are residue pairs with large couplings on average close in tertiary structure?). We choose a handful of published methods, along with ZNMI, and compare their reproducibility and accuracy on three diverse protein families. We find that (i) of the algorithms tested, while none appear to be both highly reproducible and accurate, ZNMI is one of the most accurate by far and (ii) while users should be wary of predictions drawn from a single alignment, considering an ensemble of sub-alignments can help to determine both highly accurate and reproducible couplings. Our cross-validation approach should be of interest both to developers and end users of algorithms that try to detect correlated amino acid substitutions.
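The reproducibility question posed in this abstract (how similar are the predictions when the pipeline is perturbed?) can be illustrated with a small sketch. The abstract does not say which similarity measure the authors used, so the Jaccard overlap of the top-k scoring pairs below is an assumption chosen for illustration. The function names `top_pairs` and `jaccard_overlap` and the toy scores are hypothetical.

```python
def top_pairs(scores, k):
    """Return the k residue pairs with the highest coupling scores."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:k])


def jaccard_overlap(scores_a, scores_b, k):
    """Jaccard overlap between the top-k pairs of two score tables.

    1.0 means both sub-alignments predict the same top-k pairs,
    0.0 means they share none.
    """
    a = top_pairs(scores_a, k)
    b = top_pairs(scores_b, k)
    union = a | b
    if not union:  # k == 0 or empty score tables
        return 0.0
    return len(a & b) / len(union)


# Toy coupling scores from two hypothetical sub-alignments,
# keyed by (position_i, position_j).
sub_alignment_1 = {(1, 5): 0.9, (2, 7): 0.8, (3, 4): 0.1}
sub_alignment_2 = {(1, 5): 0.7, (2, 7): 0.2, (3, 4): 0.6}

overlap = jaccard_overlap(sub_alignment_1, sub_alignment_2, k=2)
print(f"top-2 overlap: {overlap:.3f}")
```

Averaging this overlap over many pairs of sub-alignments would give one possible reproducibility score for a given scoring method.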
Affiliation(s)
- Christopher A. Brown
- Department of Chemistry and Chemical Biology, Harvard University, Cambridge, Massachusetts, United States of America
- FAS Center for Systems Biology, Harvard University, Cambridge, Massachusetts, United States of America
- Kevin S. Brown
- Department of Physics, University of California Santa Barbara, Santa Barbara, California, United States of America
- Institute for Collaborative Biotechnologies, University of California Santa Barbara, Santa Barbara, California, United States of America
|
87
|
Ashkenazy H, Kliger Y. Reducing phylogenetic bias in correlated mutation analysis. Protein Eng Des Sel 2010; 23:321-6. [PMID: 20067922 DOI: 10.1093/protein/gzp078] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022]
Abstract
Correlated mutation analysis (CMA) is a sequence-based approach for ab initio protein contact map prediction. The basis of this approach is the observed correlation between mutations in interacting amino acid residues. These correlations are often estimated by either calculating the Pearson's correlation coefficient (PCC) or the mutual information (MI) between columns in a multiple sequence alignment (MSA) of the protein of interest and its homologs. A major challenge of CMA is to filter out the background noise originating from phylogenetic relatedness between sequences included in the MSA. Recently, a procedure to reduce this background noise was demonstrated to improve an MI-based predictor. Herein, we tested whether a similar approach can also improve the performance of the classical PCC-based method. Indeed, performance improvements were achieved for all four major SCOP classes. Furthermore, the results reveal that the improved PCC-based method is superior to MI-based methods for proteins having MSAs of up to 100 sequences.
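As a rough illustration of the mutual information (MI) between two alignment columns mentioned in this abstract, the sketch below computes plain MI in bits from observed residue frequencies. It is a minimal sketch only: it applies none of the phylogenetic background correction the abstract discusses, and the function name `mutual_information` and the four-sequence toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (bits) between two equal-length MSA columns."""
    if len(col_a) != len(col_b):
        raise ValueError("columns must have the same number of sequences")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), c in count_ab.items():
        p_joint = c / n
        # p_joint / (p_a * p_b), written with raw counts
        ratio = p_joint * n * n / (count_a[a] * count_b[b])
        mi += p_joint * math.log2(ratio)
    return mi


# Toy alignment: one string per sequence, one character per position.
msa = ["ACDK", "ACEK", "GCDR", "GCER"]
columns = ["".join(seq[i] for seq in msa) for i in range(len(msa[0]))]

# Columns 0 and 3 change together; columns 0 and 2 vary independently.
print(mutual_information(columns[0], columns[3]))
print(mutual_information(columns[0], columns[2]))
```

With so few sequences the estimate is dominated by sampling noise, which is one reason background corrections matter for small alignments.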
|
88
|
Lunt B, Szurmant H, Procaccini A, Hoch JA, Hwa T, Weigt M. Inference of Direct Residue Contacts in Two-Component Signaling. Methods Enzymol 2010; 471:17-41. [DOI: 10.1016/s0076-6879(10)71002-8] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.4] [Indexed: 12/12/2022]
|
89
|
Noivirt-Brik O, Horovitz A, Unger R. Trade-off between positive and negative design of protein stability: from lattice models to real proteins. PLoS Comput Biol 2009; 5:e1000592. [PMID: 20011105 PMCID: PMC2781108 DOI: 10.1371/journal.pcbi.1000592] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.9] [Received: 05/20/2009] [Accepted: 11/03/2009] [Indexed: 11/18/2022]
Abstract
Two different strategies for stabilizing proteins are (i) positive design in which the native state is stabilized and (ii) negative design in which competing non-native conformations are destabilized. Here, the circumstances under which one strategy might be favored over the other are explored in the case of lattice models of proteins and then generalized and discussed with regard to real proteins. The balance between positive and negative design of proteins is found to be determined by their average "contact-frequency", a property that corresponds to the fraction of states in the conformational ensemble of the sequence in which a pair of residues is in contact. Lattice model proteins with a high average contact-frequency are found to use negative design more than model proteins with a low average contact-frequency. A mathematical derivation of this result indicates that it is general and likely to hold also for real proteins. Comparison of the results of correlated mutation analysis for real proteins with typical contact-frequencies to those of proteins likely to have high contact-frequencies (such as disordered proteins and proteins that are dependent on chaperonins for their folding) indicates that the latter tend to have stronger interactions between residues that are not in contact in their native conformation. Hence, our work indicates that negative design is employed when insufficient stabilization is achieved via positive design owing to high contact-frequencies.
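The "contact-frequency" defined in this abstract (the fraction of states in the conformational ensemble in which a pair of residues is in contact) can be sketched for a toy ensemble as follows. This is an illustrative sketch only: each conformation is represented simply as a set of contacting residue pairs, averaging over all residue pairs is an assumption about how the "average contact-frequency" is formed, and the names `contact_frequencies` and `average_contact_frequency` are hypothetical.

```python
from itertools import combinations


def contact_frequencies(ensemble, n_residues):
    """Fraction of conformations in which each residue pair is in contact.

    ensemble: list of sets of (i, j) tuples with i < j, one set per state.
    """
    if not ensemble:
        raise ValueError("ensemble must contain at least one conformation")
    counts = {pair: 0 for pair in combinations(range(n_residues), 2)}
    for contacts in ensemble:
        for pair in contacts:
            counts[pair] += 1
    return {pair: c / len(ensemble) for pair, c in counts.items()}


def average_contact_frequency(freqs):
    """Mean contact-frequency over all residue pairs (assumed definition)."""
    return sum(freqs.values()) / len(freqs)


# Toy ensemble of four conformations of a 4-residue chain.
ensemble = [
    {(0, 2), (1, 3)},
    {(0, 2)},
    {(0, 3)},
    set(),  # fully extended state with no contacts
]

freqs = contact_frequencies(ensemble, n_residues=4)
print(freqs[(0, 2)])
print(average_contact_frequency(freqs))
```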
Affiliation(s)
- Orly Noivirt-Brik
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Amnon Horovitz
- Department of Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Ron Unger
- The Mina and Everard Goodman Faculty of Life Sciences, Bar-Ilan University, Ramat-Gan, Israel
|
90
|
Tori K, Dassa B, Johnson MA, Southworth MW, Brace LE, Ishino Y, Pietrokovski S, Perler FB. Splicing of the mycobacteriophage Bethlehem DnaB intein: identification of a new mechanistic class of inteins that contain an obligate block F nucleophile. J Biol Chem 2009; 285:2515-26. [PMID: 19940146 PMCID: PMC2807308 DOI: 10.1074/jbc.m109.069567] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.8] [Indexed: 01/26/2023]
Abstract
Inteins are single turnover enzymes that splice out of protein precursors during maturation of the host protein (extein). The Cys or Ser at the N terminus of most inteins initiates a four-step protein splicing reaction by forming a (thio)ester bond at the N-terminal splice junction. Several recently identified inteins cannot perform this acyl rearrangement because they do not begin with Cys, Thr, or Ser. This study analyzes one of these, the mycobacteriophage Bethlehem DnaB intein, which we describe here as the prototype for a new class of inteins based on sequence comparisons, reactivity, and mechanism. These Class 3 inteins are characterized by a non-nucleophilic N-terminal residue that co-varies with a non-contiguous Trp, Cys, Thr triplet (WCT) and a Thr or Ser as the first C-extein residue. Several mechanistic differences were observed when compared with standard inteins or previously studied atypical KlbA Ala1 inteins: (a) cleavage at the N-terminal splice junction in the absence of all standard N- and C-terminal splice junction nucleophiles, (b) activation of the N-terminal splice junction by a variant Block B motif that includes the WCT triplet Trp, (c) decay of the branched intermediate by thiols or Cys despite an ester linkage at the C-extein branch point, and (d) an absolute requirement for the WCT triplet Block F Cys. Based on biochemical data and confirmed by molecular modeling, we propose roles for these newly identified conserved residues, a novel protein splicing mechanism that includes a second branched intermediate, and an intein classification with three mechanistic categories.
Affiliation(s)
- Kazuo Tori
- New England Biolabs, Ipswich, Massachusetts 01938, USA
|
91
|
Lu HM, Liang J. Perturbation-based Markovian transmission model for probing allosteric dynamics of large macromolecular assembling: a study of GroEL-GroES. PLoS Comput Biol 2009; 5:e1000526. [PMID: 19798437 PMCID: PMC2741606 DOI: 10.1371/journal.pcbi.1000526] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Received: 02/11/2009] [Accepted: 08/31/2009] [Indexed: 11/19/2022]
Abstract
Large macromolecular assemblies are often important for biological processes in cells. Allosteric communications between different parts of these molecular machines play critical roles in cellular signaling. Although studies of the topology and fluctuation dynamics of coarse-grained residue networks can yield important insights, they do not provide characterization of the time-dependent dynamic behavior of these macromolecular assemblies. Here we develop a novel approach called Perturbation-based Markovian Transmission (PMT) model to study globally the dynamic responses of the macromolecular assemblies. By monitoring simultaneous responses of all residues (>8,000) across many (>6) decades of time spanning from the initial perturbation until reaching equilibrium using a Krylov subspace projection method, we show that this approach can yield rich information. With criteria based on quantitative measurements of relaxation half-time, flow amplitude change, and oscillation dynamics, this approach can identify pivot residues that are important for macromolecular movement, messenger residues that are key to signal mediating, and anchor residues important for binding interactions. Based on a detailed analysis of the GroEL-GroES chaperone system, we found that our predictions have an accuracy of 71-84% judged by independent experimental studies reported in the literature. This approach is general and can be applied to other large macromolecular machineries such as the virus capsid and ribosomal complex.
Affiliation(s)
- Hsiao-Mei Lu
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
- Jie Liang
- Department of Bioengineering, University of Illinois at Chicago, Chicago, Illinois, United States of America
|
92
|
Protein sectors: evolutionary units of three-dimensional structure. Cell 2009; 138:774-86. [PMID: 19703402 DOI: 10.1016/j.cell.2009.07.038] [Citation(s) in RCA: 523] [Impact Index Per Article: 32.7] [Received: 03/09/2009] [Revised: 07/03/2009] [Accepted: 07/30/2009] [Indexed: 11/23/2022]
Abstract
Proteins display a hierarchy of structural features at primary, secondary, tertiary, and higher-order levels, an organization that guides our current understanding of their biological properties and evolutionary origins. Here, we reveal a structural organization distinct from this traditional hierarchy by statistical analysis of correlated evolution between amino acids. Applied to the S1A serine proteases, the analysis indicates a decomposition of the protein into three quasi-independent groups of correlated amino acids that we term "protein sectors." Each sector is physically connected in the tertiary structure, has a distinct functional role, and constitutes an independent mode of sequence divergence in the protein family. Functionally relevant sectors are evident in other protein families as well, suggesting that they may be general features of proteins. We propose that sectors represent a structural organization of proteins that reflects their evolutionary histories.
|
93
|
Lahti JL, Silverman AP, Cochran JR. Interrogating and predicting tolerated sequence diversity in protein folds: application to E. elaterium trypsin inhibitor-II cystine-knot miniprotein. PLoS Comput Biol 2009; 5:e1000499. [PMID: 19730675 PMCID: PMC2725296 DOI: 10.1371/journal.pcbi.1000499] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 05/01/2009] [Accepted: 08/04/2009] [Indexed: 11/18/2022]
Abstract
Cystine-knot miniproteins (knottins) are promising molecular scaffolds for protein engineering applications. Members of the knottin family have multiple loops capable of displaying conformationally constrained polypeptides for molecular recognition. While previous studies have illustrated the potential of engineering knottins with modified loop sequences, a thorough exploration into the tolerated loop lengths and sequence space of a knottin scaffold has not been performed. In this work, we used the Ecballium elaterium trypsin inhibitor II (EETI) as a model member of the knottin family and constructed libraries of EETI loop-substituted variants with diversity in both amino acid sequence and loop length. Using yeast surface display, we isolated properly folded EETI loop-substituted clones and applied sequence analysis tools to assess the tolerated diversity of both amino acid sequence and loop length. In addition, we used covariance analysis to study the relationships between individual positions in the substituted loops, based on the expectation that correlated amino acid substitutions will occur between interacting residue pairs. We then used the results of our sequence and covariance analyses to successfully predict loop sequences that facilitated proper folding of the knottin when substituted into EETI loop 3. The sequence trends we observed in properly folded EETI loop-substituted clones will be useful for guiding future protein engineering efforts with this knottin scaffold. Furthermore, our findings demonstrate that the combination of directed evolution with sequence and covariance analyses can be a powerful tool for rational protein engineering.
Affiliation(s)
- Jennifer L. Lahti
- Department of Bioengineering, Cancer Center, Bio-X Program, Stanford University, Stanford, California, United States of America
- Adam P. Silverman
- Department of Bioengineering, Cancer Center, Bio-X Program, Stanford University, Stanford, California, United States of America
- Jennifer R. Cochran
- Department of Bioengineering, Cancer Center, Bio-X Program, Stanford University, Stanford, California, United States of America
|
94
|
Computation of conformational coupling in allosteric proteins. PLoS Comput Biol 2009; 5:e1000484. [PMID: 19714199 PMCID: PMC2720451 DOI: 10.1371/journal.pcbi.1000484] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 02/02/2009] [Accepted: 07/23/2009] [Indexed: 11/19/2022]
Abstract
In allosteric regulation, an effector molecule binding a protein at one site induces conformational changes, which alter structure and function at a distant active site. Two key challenges in the computational modeling of allostery are the prediction of the structure of one allosteric state starting from the structure of the other, and elucidating the mechanisms underlying the conformational coupling of the effector and active sites. Here we approach these two challenges using the Rosetta high-resolution structure prediction methodology. We find that the method can recapitulate the relaxation of effector-bound forms of single domain allosteric proteins into the corresponding ligand-free states, particularly when sampling is focused on regions known to change conformation most significantly. Analysis of the coupling between contacting pairs of residues in large ensembles of conformations spread throughout the landscape between and around the two allosteric states suggests that the transitions are built up from blocks of tightly coupled interacting sets of residues that are more loosely coupled to one another.
|
95
|
Large-scale evaluation of dynamically important residues in proteins predicted by the perturbation analysis of a coarse-grained elastic model. BMC Struct Biol 2009; 9:45. [PMID: 19591676 PMCID: PMC2719638 DOI: 10.1186/1472-6807-9-45] [Citation(s) in RCA: 51] [Impact Index Per Article: 3.2] [Received: 02/16/2009] [Accepted: 07/10/2009] [Indexed: 11/10/2022]
Abstract
Background: It is increasingly recognized that protein functions often require intricate conformational dynamics, which involve a network of key amino acid residues that couple spatially separated functional sites. Tremendous efforts have been made to identify these key residues by experimental and computational means. Results: We have performed a large-scale evaluation of the predictions of dynamically important residues by a variety of computational protocols, including three based on the perturbation and correlation analysis of a coarse-grained elastic model. This study is performed for two lists of test cases with >500 pairs of protein structures. The dynamically important residues predicted by the perturbation and correlation analysis are found to be strongly or moderately conserved in >67% of test cases. They form a sparse network of residues that are clustered both in 3D space and along the protein sequence. Their overall conservation is attributed to their dynamic role rather than ligand binding or high network connectivity. Conclusion: By modeling how the protein structural fluctuations respond to residue-position-specific perturbations, our highly efficient perturbation and correlation analysis can be used to dissect the functional conformational changes in various proteins with a residue level of detail. The predictions of dynamically important residues serve as promising targets for mutational and functional studies.
|
96
|
Liu Z, Chen J, Thirumalai D. On the accuracy of inferring energetic coupling between distant sites in protein families from evolutionary imprints: Illustrations using lattice model. Proteins 2009; 77:823-31. [DOI: 10.1002/prot.22498] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Indexed: 12/12/2022]
|
97
|
Xu F, Du P, Shen H, Hu H, Wu Q, Xie J, Yu L. Correlated mutation analysis on the catalytic domains of serine/threonine protein kinases. PLoS One 2009; 4:e5913. [PMID: 19526051 PMCID: PMC2690836 DOI: 10.1371/journal.pone.0005913] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/01/2009] [Accepted: 05/11/2009] [Indexed: 01/15/2023]
Abstract
Background: Protein kinases (PKs) have emerged as the largest family of signaling proteins in eukaryotic cells and are involved in every aspect of cellular regulation. Great progress has been made in understanding the mechanisms by which PKs phosphorylate their substrates, but the detailed mechanisms by which PKs ensure their substrate specificity with their structurally conserved catalytic domains are still not adequately understood. Correlated mutation analysis based on large sets of diverse sequence data may provide new insights into this question. Methodology/Principal Findings: Statistical coupling, residue correlation and mutual information analyses, along with clustering, were applied to analyze the structure-based multiple sequence alignment of the catalytic domains of the Ser/Thr PK family. Two clusters of highly coupled sites were identified. Mapping these positions onto the 3D structure of the PK catalytic domain showed that these two groups of positions form two physically close networks. We named these the θ-shaped and γ-shaped networks, respectively. Conclusions/Significance: The θ-shaped network links the active site cleft and the substrate binding regions, and might participate in PKs recognizing and interacting with their substrates. The γ-shaped network is mainly situated on one side of the substrate binding regions, linking the activation loop and the substrate binding regions. It might play a role in supporting the activation loop and substrate binding regions before catalysis, and participate in product release after phosphoryl transfer. Our results exhibit significant correlations with experimental observations, and can be used as a guide to further experimental and theoretical studies on the mechanisms of PKs interacting with their substrates.
Affiliation(s)
- Feng Xu
- State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Pan Du
- Biomedical Informatics Center, Northwestern University, Chicago, Illinois, United States of America
- Hongbo Shen
- State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Hairong Hu
- State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Qi Wu
- State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Jun Xie
- State Key Laboratory of Genetic Engineering, Institute of Genetics, School of Life Sciences, Fudan University, Shanghai, China
- Long Yu
- Institute of Biomedical Sciences, Fudan University, Shanghai, China
|
98
|
Abstract
Covariation between sites can arise due to a common evolutionary history. At the same time, the structure and function of proteins play a significant role in the evolvability of different sites that are not directly connected with the common ancestry. The nature of the forces that cause residues to coevolve is still not thoroughly understood; it is especially unclear how coevolutionary processes are related to functional diversification within protein families. We analyzed both functional and structural factors that might cause covariation of specificity determinants and showed that they more often participate in coevolutionary relationships with each other and other sites compared with functional sites and those sites that are not under strong functional constraints. We also found that protein sites with a higher number of coevolutionary connections with other sites tend to evolve more slowly. Our results indicate that in some cases coevolutionary connections exist between specificity sites that are located far away in space but are under similar functional constraints. Such correlated changes and compensations can be realized through stepwise coevolutionary processes, which in turn can shed light on the mechanisms of functional diversification.
Affiliation(s)
- Saikat Chakrabarti
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA.
|
99
|
Qiu P, Sanfiorenzo V, Curry S, Guo Z, Liu S, Skelton A, Xia E, Cullen C, Ralston R, Greene J, Tong X. Identification of HCV protease inhibitor resistance mutations by selection pressure-based method. Nucleic Acids Res 2009; 37:e74. [PMID: 19395595 PMCID: PMC2691846 DOI: 10.1093/nar/gkp251] [Citation(s) in RCA: 36] [Impact Index Per Article: 2.3] [Indexed: 12/11/2022] Open
Abstract
A major challenge to successful antiviral therapy is the emergence of drug-resistant viruses. Recent studies have developed several automated analyses of HIV sequence polymorphism based on calculations of selection pressure (Ka/Ks) to predict drug resistance mutations. Similar resistance analysis programs for HCV inhibitors are not currently available. Taking advantage of the recently available sequence data of patient HCV samples from a Phase II clinical study of the protease inhibitor boceprevir, we calculated the selection pressure for all codons in the HCV protease region (amino acids 1–181) to identify potential resistance mutations. The correlation between mutations was also calculated to evaluate linkage between any two mutations. Using this approach, we identified previously known major resistance mutations, including a recently reported mutation, V55A. In addition, a novel mutation, V158I, was identified, and we further confirmed its resistance to boceprevir in protease enzyme and replicon assays. We also extended the approach to analyze potential interactions between individual mutations and identified three pairs of correlated changes. Our data suggest that selection pressure-based analysis and correlation mapping could provide useful tools to analyze large amounts of sequencing data from clinical samples and to identify new drug resistance mutations as well as their linkage and correlations.
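As a rough illustration of the pairwise correlation step mentioned in this abstract, the sketch below computes a phi coefficient between two mutations scored as present (1) or absent (0) in each sample. The abstract does not say which correlation statistic the authors used, so the phi coefficient, the function name `phi_coefficient`, and the sample data are all assumptions made for illustration only.

```python
from math import sqrt


def phi_coefficient(a, b):
    """Phi (Pearson) correlation between two equal-length 0/1 sequences.

    Assumption: each sequence marks presence (1) or absence (0) of one
    mutation across the same ordered set of samples.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    # Cells of the 2x2 contingency table
    n11 = sum(1 for x, y in zip(a, b) if x and y)
    n10 = sum(1 for x, y in zip(a, b) if x and not y)
    n01 = sum(1 for x, y in zip(a, b) if not x and y)
    n00 = sum(1 for x, y in zip(a, b) if not x and not y)
    denom = sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denom == 0:
        # One mutation is absent or present in every sample, so the
        # correlation is undefined; this sketch reports 0.0 in that case.
        return 0.0
    return (n11 * n00 - n10 * n01) / denom


# Hypothetical presence/absence calls for two mutations in six samples
mut_a = [1, 1, 0, 0, 1, 0]
mut_b = [1, 1, 0, 0, 0, 0]
print(round(phi_coefficient(mut_a, mut_b), 3))
```

Values near 1 would indicate that two mutations tend to occur in the same samples, and values near -1 that they tend to exclude each other; a real analysis would also need a significance test and a correction for the number of pairs examined.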
Affiliation(s)
- Ping Qiu
- Molecular Design and Informatics, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, NJ 07033, USA.
|
100
|
Allosteric transitions of supramolecular systems explored by network models: application to chaperonin GroEL. PLoS Comput Biol 2009; 5:e1000360. [PMID: 19381265 PMCID: PMC2664929 DOI: 10.1371/journal.pcbi.1000360] [Citation(s) in RCA: 108] [Impact Index Per Article: 6.8] [Received: 09/29/2008] [Accepted: 03/13/2009] [Indexed: 11/19/2022] Open
Abstract
Identification of pathways involved in the structural transitions of biomolecular systems is often complicated by the transient nature of the conformations visited across energy barriers and the multiplicity of paths accessible in the multidimensional energy landscape. This task becomes even more challenging in exploring molecular systems on the order of megadaltons. Coarse-grained models that lend themselves to analytical solutions appear to be the only possible means of approaching such cases. Motivated by the utility of elastic network models for describing the collective dynamics of biomolecular systems and by the growing theoretical and experimental evidence in support of the intrinsic accessibility of functional substates, we introduce a new method, adaptive anisotropic network model (aANM), for exploring functional transitions. Application to bacterial chaperonin GroEL and comparisons with experimental data, results from action minimization algorithm, and previous simulations support the utility of aANM as a computationally efficient, yet physically plausible, tool for unraveling potential transition pathways sampled by large complexes/assemblies. An important outcome is the assessment of the critical inter-residue interactions formed/broken near the transition state(s), most of which involve conserved residues.
Most proteins are biomolecular machines. They perform their function by undergoing changes between different structures. Understanding the mechanism of transition between these structures is of major importance to design methods for controlling such transitions, and thereby modulating protein function. Although there are many computational methods for exploring the transitions of small proteins, the task of exploring the transition pathways becomes prohibitively expensive in the case of supramolecular systems. The bacterial chaperonin GroEL is such a supramolecular machine. It plays an important role in assisting protein folding. During its function, GroEL undergoes structural transitions between multiple forms. Here, we are introducing a new methodology, based on elastic network models, for elucidating the transition mechanisms in such supramolecular systems. Application to GroEL provides us with biologically significant information on critical interactions and sequence of events occurring during the chaperonin machinery and key contacts that make and break at the transition. The method can be readily applied to other supramolecular machines.
|