1
Jarnot P, Ziemska-Legiecka J, Dobson L, Merski M, Mier P, Andrade-Navarro MA, Hancock JM, Dosztányi Z, Paladin L, Necci M, Piovesan D, Tosatto SCE, Promponas VJ, Grynberg M, Gruca A. PlaToLoCo: the first web meta-server for visualization and annotation of low complexity regions in proteins. Nucleic Acids Res 2020; 48:W77-W84. [PMID: 32421769 PMCID: PMC7319588 DOI: 10.1093/nar/gkaa339] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 03/07/2020] [Revised: 04/08/2020] [Accepted: 05/01/2020] [Indexed: 12/25/2022] Open
Abstract
Low complexity regions (LCRs) in protein sequences are characterized by a less diverse amino acid composition than is typically observed. Recent studies have shown that LCRs may co-occur with intrinsically disordered regions, are highly conserved in many organisms, and often play important roles in protein functions and in diseases. In previous decades, several methods have been developed to identify regions with LCRs or amino acid bias, but most are stand-alone applications, and there is currently no web-based tool that allows users to explore LCRs in protein sequences with additional functional annotations. We aim to fill this gap by providing PlaToLoCo (PLAtform of TOols for LOw COmplexity), a meta-server that integrates and collects the output of five different state-of-the-art tools for discovering LCRs and provides functional annotations such as domain detection, transmembrane segment prediction, and calculation of amino acid frequencies. In addition, the union or intersection of the results of the search on a query sequence can be obtained. By developing the PlaToLoCo meta-server, we provide the community with a fast and easily accessible tool for the analysis of LCRs with additional information included to aid the interpretation of the results. The PlaToLoCo platform is available at: http://platoloco.aei.polsl.pl/.
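The union and intersection of per-tool results mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes each tool's output has been reduced to a set of residue positions; the entropy-window detector, its `window` and `threshold` values, and the example sequence are hypothetical stand-ins, not PlaToLoCo's actual tools or API.

```python
import math
from collections import Counter


def shannon_entropy(segment):
    """Shannon entropy (bits) of the amino acid composition of a segment."""
    counts = Counter(segment)
    n = len(segment)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def low_entropy_positions(seq, window=12, threshold=2.2):
    """Return the set of 0-based positions covered by any low-entropy window."""
    covered = set()
    for start in range(len(seq) - window + 1):
        if shannon_entropy(seq[start:start + window]) <= threshold:
            covered.update(range(start, start + window))
    return covered


def to_intervals(positions):
    """Collapse a set of positions into sorted, inclusive (start, end) intervals."""
    intervals = []
    for p in sorted(positions):
        if intervals and p == intervals[-1][1] + 1:
            intervals[-1] = (intervals[-1][0], p)
        else:
            intervals.append((p, p))
    return intervals


# Hypothetical query with a poly-Q tract; two parameter settings stand in
# for two independent LCR detectors.
seq = "MKTAYIAKQRQQQQQQQQQQQQGSPLVEHDWF"
tool_a = low_entropy_positions(seq, window=12, threshold=2.2)
tool_b = low_entropy_positions(seq, window=8, threshold=1.0)

union = to_intervals(tool_a | tool_b)          # residues flagged by either tool
intersection = to_intervals(tool_a & tool_b)   # residues flagged by both tools
frequencies = {aa: n / len(seq) for aa, n in Counter(seq).items()}
```

Working on position sets keeps the union and intersection trivial; intervals are only rebuilt at the end for display.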
Affiliation(s)
- Patryk Jarnot
  - Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
- Laszlo Dobson
  - Faculty of Information Technology and Bionics, Pázmány Péter Catholic University, Práter u. 50/A, 1083 Budapest, Hungary; Research Centre for Natural Sciences, Magyar Tudósok Körútja 2, 1117 Budapest, Hungary
- Matthew Merski
  - Structural Biology Group, Biological and Chemical Research Centre, Department of Chemistry, University of Warsaw, Żwirki i Wigury 101, 02-089 Warsaw, Poland
- Pablo Mier
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- Miguel A Andrade-Navarro
  - Faculty of Biology, Johannes Gutenberg University Mainz, Hans-Dieter-Hüsch-Weg 15, 55128 Mainz, Germany
- John M Hancock
  - ELIXIR, Wellcome Genome Campus, Hinxton, Cambridgeshire CB10 1SD, UK
- Zsuzsanna Dosztányi
  - Department of Biochemistry, ELTE Eötvös Loránd University, Budapest, Pázmány Péter stny 1/c 1117, Budapest, Hungary
- Lisanna Paladin
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Marco Necci
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Damiano Piovesan
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Silvio C E Tosatto
  - Department of Biomedical Sciences, University of Padova, Via Ugo Bassi 58/B, 35131 Padova, Italy
- Vasilis J Promponas
  - Bioinformatics Research Laboratory, Department of Biological Sciences, University of Cyprus, P.O. Box 20537, Nicosia, CY 1678, Cyprus
- Marcin Grynberg
  - Institute of Biochemistry and Biophysics PAS, Pawinskiego 5A, 02-106 Warsaw, Poland
- Aleksandra Gruca
  - Department of Computer Networks and Systems, Silesian University of Technology, Akademicka 16, 44-100 Gliwice, Poland
2
NKNK: a New Essential Motif in the C-Terminal Domain of HIV-1 Group M Integrases. J Virol 2020; 94:JVI.01035-20. [PMID: 32727879 DOI: 10.1128/jvi.01035-20] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 05/26/2020] [Accepted: 07/17/2020] [Indexed: 11/20/2022] Open
Abstract
Using coevolution network interference based on comparison of two phylogenetically distantly related isolates, one from the main group M and the other from the minor group O of HIV-1, we identify, in the C-terminal domain (CTD) of integrase, a new functional motif constituted by four noncontiguous amino acids (N222K240N254K273). Mutating the lysines abolishes integration through decreased 3' processing and inefficient nuclear import of reverse-transcribed genomes. Solution of the crystal structures of wild-type (wt) and mutated CTDs shows that the motif generates a positive surface potential that is important for integration. The number of charges in the motif appears more crucial than their position within the motif. Indeed, the positions of the K's could be permutated or additional K's could be inserted in the motif, generally without affecting integration per se. Despite this potential genetic flexibility, the NKNK arrangement is strictly conserved in natural sequences, indicative of an effective purifying selection exerted at steps other than integration. Accordingly, reverse transcription was reduced even in the mutants that retained wt integration levels, indicating that specifically the wt sequence is optimal for carrying out the multiple functions that integrase exerts. We propose that the existence of several amino acid arrangements within the motif, with comparable efficiencies of integration per se, might have constituted an asset for the acquisition of additional functions during viral evolution.
IMPORTANCE Intensive studies of HIV-1 have revealed its extraordinary ability to adapt to environmental and immunological challenges, an ability that is also at the basis of antiviral treatment escape. Here, by deconvoluting the different roles of the viral integrase in the various steps of the infectious cycle, we report how the existence of alternative, equally efficient structural arrangements for carrying out one function opens up the possibility of adapting to the optimization of further functionalities exerted by the same protein. Such a property provides an asset to increase the efficiency of the infectious process. On the other hand, though, the identification of this new motif provides a potential target for interfering simultaneously with multiple functions of the protein.
3
Zhang R, Pan Y, Ahmed L, Block E, Zhang Y, Batista VS, Zhuang H. A Multispecific Investigation of the Metal Effect in Mammalian Odorant Receptors for Sulfur-Containing Compounds. Chem Senses 2019; 43:357-366. [PMID: 29659735 DOI: 10.1093/chemse/bjy022] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Indexed: 11/14/2022] Open
Abstract
Metal-coordinating compounds are generally known to have strong smells, a phenomenon that can be attributed to the fact that odorant receptors for intense-smelling compounds, such as those containing sulfur, may be metalloproteins. We previously identified a mouse odorant receptor (OR), Olfr1509, that requires copper ions for sensitive detection of a series of metal-coordinating odorants, including (methylthio)methanethiol (MTMT), a strong-smelling component of male mouse urine that attracts female mice. By combining mutagenesis and quantum mechanics/molecular mechanics (QM/MM) modeling, we identified candidate binding sites in Olfr1509 that may bind to the copper-MTMT complex. However, whether there are other receptors utilizing metal ions for ligand-binding and other sites important for receptor activation is still unknown. In this study, we describe a second mouse OR for MTMT with a copper effect, namely Olfr1019. In an attempt to investigate the functional changes of metal-coordinating ORs in multiple species and to decipher additional sites involved in the metal effect, we cloned various mammalian orthologs of the 2 mouse MTMT receptors, and a third mouse MTMT receptor, Olfr15, that does not have a copper effect. We found that the function of all 3 MTMT receptors varies greatly among species and that the response to MTMT always co-occurred with the copper effect. Furthermore, using ancestral reconstruction and QM/MM modeling combined with receptor functional assay, we found that the amino acid residue R260 in Olfr1509 and the respective R261 site in Olfr1019 may be important for receptor activation.
Affiliation(s)
- Ruina Zhang
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of the Chinese Ministry of Education, Shanghai Jiaotong University School of Medicine, Huangpu District, Shanghai, P. R. China
- Yi Pan
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of the Chinese Ministry of Education, Shanghai Jiaotong University School of Medicine, Huangpu District, Shanghai, P. R. China
- Lucky Ahmed
  - Department of Chemistry, Yale University, New Haven, CT, USA
- Eric Block
  - Department of Chemistry, University at Albany, State University of New York, NY, USA
- Yuetian Zhang
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of the Chinese Ministry of Education, Shanghai Jiaotong University School of Medicine, Huangpu District, Shanghai, P. R. China
- Hanyi Zhuang
  - Department of Pathophysiology, Key Laboratory of Cell Differentiation and Apoptosis of the Chinese Ministry of Education, Shanghai Jiaotong University School of Medicine, Huangpu District, Shanghai, P. R. China
  - Institute of Health Sciences, Shanghai Jiaotong University School of Medicine/Shanghai Institutes for Biological Sciences of Chinese Academy of Sciences, Xuhui District, Shanghai, P. R. China
4
Neuwald AF, Aravind L, Altschul SF. Inferring joint sequence-structural determinants of protein functional specificity. eLife 2018; 7. [PMID: 29336305 PMCID: PMC5770160 DOI: 10.7554/elife.29880] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Received: 06/23/2017] [Accepted: 12/22/2017] [Indexed: 01/05/2023] Open
Abstract
Residues responsible for allostery, cooperativity, and other subtle but functionally important interactions remain difficult to detect. To aid such detection, we employ statistical inference based on the assumption that residues distinguishing a protein subgroup from evolutionarily divergent subgroups often constitute an interacting functional network. We identify such networks with the aid of two measures of statistical significance. One measure aids identification of divergent subgroups based on distinguishing residue patterns. For each subgroup, a second measure identifies structural interactions involving pattern residues. Such interactions are derived either from atomic coordinates or from Direct Coupling Analysis scores, used as surrogates for structural distances. Applying this approach to N-acetyltransferases, P-loop GTPases, RNA helicases, synaptojanin-superfamily phosphatases and nucleases, and thymine/uracil DNA glycosylases yielded results congruent with biochemical understanding of these proteins, and also revealed striking sequence-structural features overlooked by other methods. These and similar analyses can aid the design of drugs targeting allosteric sites.
Affiliation(s)
- Andrew F Neuwald
  - Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, United States; Department of Biochemistry and Molecular Biology, University of Maryland School of Medicine, Baltimore, United States
- L Aravind
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
- Stephen F Altschul
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, United States
5
Screening of nucleotide variations in genomic sequences encoding charged protein regions in the human genome. BMC Genomics 2017; 18:588. [PMID: 28789634 PMCID: PMC5549384 DOI: 10.1186/s12864-017-4000-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2017] [Accepted: 08/01/2017] [Indexed: 11/24/2022] Open
Abstract
Background: Studying the distribution of genetic variation in proteins containing charged regions, called charge clusters (CCs), is of great interest for unravelling their functional role. Charge clusters are 20 to 75 residue segments with high net positive charge, high net negative charge, or high total charge relative to the overall charge composition of the protein. We previously developed a bioinformatics tool (FCCP) to detect charge clusters in proteomes and scanned the human proteome for the occurrence of CCs. In this paper we investigate the genetic variations in the human proteins harbouring CCs. Results: We studied the coding regions of 317 positively charged clusters and 1020 negatively charged ones previously detected in human proteins. Results revealed that coding parts of CCs are richer in sequence variants than their corresponding genes, full mRNAs, and exonic + intronic sequences, and that these variants are predominantly rare (minor allele frequency < 0.005). Furthermore, variants occurring in the coding parts of positively charged regions of proteins are more often pathogenic than those occurring in negatively charged ones. Classification of variants according to their types showed that substitution is the major type, followed by indels (insertions-deletions). Concerning substitutions, it was found that within clusters of both charges, charged amino acids were the group lost most often, whereas polar residues were the group gained most often. Conclusions: Our findings highlight the prominent features of the human charged regions from the DNA up to the protein sequence, which might provide potential clues to improve the current understanding of those charged regions and their implication in the emergence of diseases.
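The charge-cluster definition above (20 to 75 residue segments with high net charge) can be made concrete with a simple sliding-window scan. This is only an illustrative sketch and not the FCCP scoring method: the charge table (K and R as +1, D and E as -1, histidine ignored), the fixed `window` length, and the `min_fraction` cutoff are all assumptions chosen for the example.

```python
# Assumed residue charges at neutral pH; histidine is treated as uncharged.
CHARGE = {"K": 1, "R": 1, "D": -1, "E": -1}


def net_charge(segment):
    """Sum of residue charges over a segment."""
    return sum(CHARGE.get(aa, 0) for aa in segment)


def charged_windows(seq, window=20, min_fraction=0.5):
    """Return (start, net_charge) for every window whose absolute net
    charge per residue is at least min_fraction (hypothetical cutoff)."""
    hits = []
    if len(seq) < window:
        return hits
    charge = net_charge(seq[:window])
    for start in range(len(seq) - window + 1):
        if start > 0:
            # Slide by one residue: add the entering one, drop the leaving one.
            charge += CHARGE.get(seq[start + window - 1], 0)
            charge -= CHARGE.get(seq[start - 1], 0)
        if abs(charge) / window >= min_fraction:
            hits.append((start, charge))
    return hits


# Toy sequence with a 20-residue lysine tract flanked by glycines.
hits = charged_windows("G" * 10 + "K" * 20 + "G" * 10)
```

A positive net charge in a hit corresponds to a candidate positive cluster and a negative one to a candidate negative cluster; a real tool would additionally normalize against the protein's overall charge composition, as the definition requires.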
6
Abstract
We study a simple abstract problem motivated by a variety of applications in protein sequence analysis. Consider a string of 0s and 1s of length L, containing D 1s. If we believe that some or all of the 1s may be clustered near the start of the sequence, which subset is the most significantly so clustered, and how significant is this clustering? We approach this question using the minimum description length principle and illustrate its application by analyzing residues that distinguish translational initiation and elongation factor guanosine triphosphatases (GTPases) from other P-loop GTPases. Within a structure of yeast elongation factor 1α, these residues form a significant cluster centered on a region implicated in guanine nucleotide exchange. Various biomedical questions may be cast as the abstract problem considered here.
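The abstract problem can be sketched with a simple two-part code. This is an illustrative encoding and not necessarily the formulation used by the authors: the null model spends log2 C(L, D) bits to place the D ones, while the cluster model pays an assumed overhead of log2 L + log2 (D + 1) bits to state a cut point and the number of ones before it, then encodes the two pieces separately. The function name and the overhead choice are hypothetical.

```python
import math


def log2_binom(n, k):
    """log2 of the binomial coefficient C(n, k)."""
    return math.log2(math.comb(n, k))


def best_initial_cluster(bits):
    """bits: string of '0'/'1'. Return (cut, ones_in_prefix, bits_saved) for
    the prefix whose separate encoding saves the most bits over the null
    model, or (0, 0, 0.0) if no prefix pays for its own description."""
    L = len(bits)
    if L == 0:
        return (0, 0, 0.0)
    D = bits.count("1")
    null_cost = log2_binom(L, D)
    # Assumed model cost: name the cut point and the count of ones before it.
    overhead = math.log2(L) + math.log2(D + 1)
    best = (0, 0, 0.0)
    d = 0
    for cut in range(1, L):
        d += bits[cut - 1] == "1"
        cost = overhead + log2_binom(cut, d) + log2_binom(L - cut, D - d)
        saved = null_cost - cost
        if saved > best[2]:
            best = (cut, d, saved)
    return best


clustered = best_initial_cluster("1" * 8 + "0" * 56)
scattered = best_initial_cluster("10" * 8)
```

The number of bits saved acts as the significance measure: a prefix is reported only when describing it separately shortens the total description, so evenly scattered 1s yield no cluster.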
Affiliation(s)
- Stephen F. Altschul
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland
- Andrew F. Neuwald
  - Department of Biochemistry and Molecular Biology, Institute for Genome Sciences, University of Maryland School of Medicine, Baltimore, Maryland
7
Kharrat N, Belmabrouk S, Abdelhedi R, Benmarzoug R, Assidi M, Al Qahtani MH, Rebai A. Screening for clusters of charge in human virus proteomes. BMC Genomics 2016; 17:758. [PMID: 27766959 PMCID: PMC5073957 DOI: 10.1186/s12864-016-3086-3] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/30/2022] Open
Abstract
Background: The identification of charge clusters (runs of charged residues) in proteins and their mapping within the protein structure sequence is an important step toward a comprehensive analysis of how these particular motifs mediate, via electrostatic interactions, various molecular processes such as protein sorting, translocation, docking, orientation and binding to DNA and to other proteins. Few algorithms that specifically identify these charge clusters have been designed and described in the literature. In this study, 197 distinctive human viral proteomes were screened for the occurrence of charge clusters (CCs) using a new computational approach. Results: Three hundred and seventy-three CCs were identified within the 2549 viral protein sequences screened. The number of protein sequences that are CC-free is 2176 (85.3%), while 150 and 180 proteins contained positive charge clusters (PCCs) and negative charge clusters (NCCs), respectively. NCCs (211 detected) were more prevalent than PCCs (162). PCC-containing proteins are significantly longer than those having NCCs (p = 2 × 10⁻¹⁶). The most prevalent virus families having PCCs and NCCs were Herpesviridae followed by Papillomaviridae. However, the single-strand RNA group has on average three times more NCCs than PCCs. According to the functional domain classification, a significant difference in distribution was observed between PCCs and NCCs (p = 2 × 10⁻⁸), with NCCs occurring more frequently in the C-terminal region while PCCs more often fall within functional domains. Only 29 protein sequences contained both NCCs and PCCs. Moreover, 101 NCCs were conserved in 84 proteins while only 62 PCCs were conserved in 60 protein sequences. To understand the mechanism by which the membrane translocation functionalities are embedded in viral proteins, we screened our PCCs for sequences corresponding to cell-penetrating peptides (CPPs) using two online databases: CellPPd and CPPpred. We found that all our PCCs, with lengths varying from 7 to 30 amino acids, were predicted as CPPs. Experimental validation is required to improve our understanding of the role of these PCCs in the viral infection process. Conclusions: Screening distinctive charge clusters in viral proteomes suggested a functional role for these protein regions and might provide potential clues to improve the current understanding of viral diseases in order to tailor better preventive and therapeutic approaches.
Affiliation(s)
- Najla Kharrat
  - Centre of Biotechnology of Sfax, Laboratory of Molecular and Cellular Screening Processes, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Sabrine Belmabrouk
  - Centre of Biotechnology of Sfax, Laboratory of Molecular and Cellular Screening Processes, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Rania Abdelhedi
  - Centre of Biotechnology of Sfax, Laboratory of Molecular and Cellular Screening Processes, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Riadh Benmarzoug
  - Centre of Biotechnology of Sfax, Laboratory of Molecular and Cellular Screening Processes, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Mourad Assidi
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia; Center of Innovation in Personalized Medicine, King Abdulaziz University, Jeddah, Saudi Arabia
- Mohammed H Al Qahtani
  - Center of Excellence in Genomic Medicine Research, King Abdulaziz University, Jeddah, Saudi Arabia
- Ahmed Rebai
  - Centre of Biotechnology of Sfax, Laboratory of Molecular and Cellular Screening Processes, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
8
Gamaleldin Elsadig Karar M, Matei MF, Jaiswal R, Illenberger S, Kuhnert N. Neuraminidase inhibition of Dietary chlorogenic acids and derivatives – potential antivirals from dietary sources. Food Funct 2016; 7:2052-9. [DOI: 10.1039/c5fo01412c] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.9] [Indexed: 12/12/2022]
Abstract
Plants rich in chlorogenic acids (CGAs), caffeic acids and their derivatives have been found to exert antiviral effects against influenza virus neuraminidase.
Affiliation(s)
- Marius-Febi Matei
  - Department of Life Sciences and Chemistry, Jacobs University Bremen, 28759 Bremen, Germany
- Rakesh Jaiswal
  - Department of Life Sciences and Chemistry, Jacobs University Bremen, 28759 Bremen, Germany
- Susanne Illenberger
  - Department of Life Sciences and Chemistry, Jacobs University Bremen, 28759 Bremen, Germany
- Nikolai Kuhnert
  - Department of Life Sciences and Chemistry, Jacobs University Bremen, 28759 Bremen, Germany
9
Belmabrouk S, Kharrat N, Benmarzoug R, Rebai A. Exploring proteome-wide occurrence of clusters of charged residues in eukaryotes. Proteins 2015; 83:1252-61. [PMID: 25963617 DOI: 10.1002/prot.24823] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Received: 03/09/2015] [Revised: 04/17/2015] [Accepted: 04/29/2015] [Indexed: 11/09/2022]
Abstract
Clusters of charged residues are one of the key features of protein primary structure, since they have been associated with important functions of proteins. Here, we present a proteome-wide scan for the occurrence of charge clusters in protein sequences using a new search tool (FCCP) based on a score-based methodology. FCCP was run to search for charge clusters in seven eukaryotic proteomes: Arabidopsis thaliana, Caenorhabditis elegans, Danio rerio, Drosophila melanogaster, Homo sapiens, Mus musculus, and Saccharomyces cerevisiae. We found that negative charge clusters (NCCs) are three to four times more frequent than positive charge clusters (PCCs). The Drosophila proteome is on average the most charged, whereas the human proteome is the least charged. Only 3 to 8% of the studied protein sequences have NCCs, while 1.6 to 3% have PCCs and only 0.07 to 0.6% have both types of clusters. NCCs are localized predominantly in the N-terminal and C-terminal domains, while PCCs tend to be localized within the functional domains of the protein sequences. Furthermore, the gene ontology classification revealed that the protein sequences with NCCs and PCCs are mainly binding proteins.
Affiliation(s)
- Sabrine Belmabrouk
  - Laboratory of Molecular and Cellular Screening Processes, Centre De Biotechnologie De Sfax, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Najla Kharrat
  - Laboratory of Molecular and Cellular Screening Processes, Centre De Biotechnologie De Sfax, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Riadh Benmarzoug
  - Laboratory of Molecular and Cellular Screening Processes, Centre De Biotechnologie De Sfax, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
- Ahmed Rebai
  - Laboratory of Molecular and Cellular Screening Processes, Centre De Biotechnologie De Sfax, Bioinformatics Group, PO Box 1177, 3018 Sfax, Tunisia
10
Park J, Saitou K. ROTAS: a rotamer-dependent, atomic statistical potential for assessment and prediction of protein structures. BMC Bioinformatics 2014; 15:307. [PMID: 25236673 PMCID: PMC4262145 DOI: 10.1186/1471-2105-15-307] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.1] [Received: 03/02/2014] [Accepted: 09/09/2014] [Indexed: 12/31/2022] Open
Abstract
Background: Multibody potentials accounting for cooperative effects of molecular interactions have shown better accuracy than typical pairwise potentials. The main challenge in the development of such potentials is to find relevant structural features that characterize tightly folded proteins. Also, the side-chains of residues adopt several specific, staggered conformations, known as rotamers, within protein structures. Different molecular conformations result in different dipole moments and induce charge reorientations. However, until now modeling of the rotameric state of residues had not been incorporated into the development of multibody potentials for modeling non-bonded interactions in protein structures. Results: In this study, we develop a new multibody statistical potential which can account for the influence of rotameric states on the specificity of atomic interactions. In this potential, named "rotamer-dependent atomic statistical potential" (ROTAS), the interaction between two atoms is specified not only by the distance and relative orientation but also by two state parameters concerning the rotameric state of the residues to which the interacting atoms belong. The rotameric state was clearly found to be correlated with the specificity of atomic interactions. Such rotamer dependencies are not limited to a specific type or a certain range of interactions. The performance of ROTAS was tested using 13 sets of decoys and was compared to that of existing atomic-level statistical potentials which incorporate orientation-dependent energy terms. The results show that ROTAS performs better than other competing potentials not only in native structure recognition, but also in best model selection and correlation coefficients between energy and model quality. Conclusions: A new multibody statistical potential, ROTAS, accounting for the influence of rotameric states on the specificity of atomic interactions, was developed and tested on decoy sets. The results show that ROTAS has an improved ability to recognize the native structure among decoy models compared to other potentials. The effectiveness of ROTAS may provide insightful information for the development of many applications which require accurate side-chain modeling, such as protein design, mutation analysis, and docking simulation.
Affiliation(s)
- Kazuhiro Saitou
  - Department of Mechanical Engineering, University of Michigan, Ann Arbor, MI, USA
11
VISHVESHWARA SARASWATHI, BRINDA KV, KANNAN N. PROTEIN STRUCTURE: INSIGHTS FROM GRAPH THEORY. JOURNAL OF THEORETICAL & COMPUTATIONAL CHEMISTRY 2012. [DOI: 10.1142/s0219633602000117] [Citation(s) in RCA: 130] [Impact Index Per Article: 10.0] [Indexed: 11/18/2022]
Abstract
The sequences and structures of a large body of proteins are becoming increasingly available. It is desirable to explore mathematical tools for efficient extraction of information from such sources. The principles of graph theory, which were earlier applied in fields such as electrical engineering and computer networks, are now being adopted to investigate protein structure, folding, stability, function and dynamics. This review gives a brief account of relevant graphs and graph theoretic concepts. The concepts of protein graph construction are discussed. The manner in which graphs are analyzed and parameters relevant to protein structure are extracted is explained. The structural and biological information derived from protein structures using these methods is presented.
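Protein graph construction of the kind surveyed here can be sketched in a few lines. The example below is one common convention and not a method taken from the review: residues are nodes, and an edge joins two residues whose representative atoms (for instance C-alpha atoms) lie within a distance cutoff. The 7.0 Å cutoff, the function names, and the toy coordinates are assumptions.

```python
import math
from itertools import combinations


def contact_graph(coords, cutoff=7.0):
    """Build an undirected residue contact graph as an adjacency dict.
    coords: list of (x, y, z) tuples, one per residue (assumed C-alpha)."""
    graph = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) <= cutoff:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def connected_components(graph):
    """Return the connected components (clusters of residues) as sets."""
    seen, components = set(), []
    for node in graph:
        if node in seen:
            continue
        stack, component = [node], set()
        while stack:
            n = stack.pop()
            if n in component:
                continue
            component.add(n)
            stack.extend(graph[n] - component)
        seen |= component
        components.append(component)
    return components


# Toy coordinates: three residues spaced 3.8 Å apart, one far away.
coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (50.0, 0.0, 0.0)]
graph = contact_graph(coords)
degrees = {node: len(neighbours) for node, neighbours in graph.items()}
clusters = connected_components(graph)
```

Node degree and connected components are among the simplest graph parameters; high-degree residues are often examined as candidate hubs, and components as candidate interaction clusters.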
Affiliation(s)
- K. V. BRINDA
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- N. KANNAN
  - Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
12
Galat A. Functional drift of sequence attributes in the FK506-binding proteins (FKBPs). J Chem Inf Model 2008; 48:1118-30. [PMID: 18412331 DOI: 10.1021/ci700429n] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.0] [Indexed: 01/08/2023]
Abstract
Diverse members of the FK506-binding protein (FKBP) group and their complexes with different macrocyclic ligands of fungal origin such as FK506, rapamycin, ascomycin, and their immunosuppressive and nonimmunosuppressive derivatives display a variety of cellular and biological activities. The functional relatedness of the FKBPs was estimated from the following attributes of their aligned sequences: (1) conservation of the consensus sequence; (2) sequence similarity; (3) pI; (4) hydrophobicity; (5) amino acid hydrophobicity and bulkiness profiles. Analyses of the multiple sequence alignments and intramolecular interaction networks calculated from a series of structures of the FKBPs revealed some variations in the interaction clusters formed by the amino acid residues that are crucial for sustaining the peptidylprolyl cis/trans isomerase (PPIase) activity and binding capacity of the FKBPs. Fine diversification of the sequences of the multiple paralogues and orthologues of the FKBPs encoded in different genomes altered the intramolecular interaction patterns of their structures and allowed them to gain some selectivity in binding to diverse targets (functional drift).
Affiliation(s)
- Andrzej Galat
- Institute de Biologie et de Technologies de Saclay, DSV/CEA, CE-Saclay, F-91191 Gif-sur-Yvette Cedex, France.
13
14
Abstract
Metals play a variety of roles in biological processes, and hence their presence in a protein structure can yield vital functional information. Because the residues that coordinate a metal often undergo conformational changes upon binding, detection of binding sites based on simple geometric criteria in proteins without bound metal is difficult. However, aspects of the physicochemical environment around a metal binding site are often conserved even when this structural rearrangement occurs. We have developed a Bayesian classifier using known zinc binding sites as positive training examples and nonmetal binding regions that nonetheless contain residues frequently observed in zinc sites as negative training examples. In order to allow variation in the exact positions of atoms, we average a variety of biochemical and biophysical properties in six concentric spherical shells around the site of interest. At a specificity of 99.8%, this method achieves 75.5% sensitivity in unbound proteins at a positive predictive value of 73.6%. We also test its accuracy on predicted protein structures obtained by homology modeling using templates with 30%-50% sequence identity to the target sequences. At a specificity of 99.8%, we correctly identify at least one zinc binding site in 65.5% of modeled proteins. Thus, in many cases, our model is accurate enough to identify metal binding sites in proteins of unknown structure for which no high sequence identity homologs of known structure exist. Both the source code and a Web interface are available to the public at http://feature.stanford.edu/metals.
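The shell-averaging step described above, in which properties are averaged in six concentric spherical shells around a candidate site, can be sketched as follows. This shows only the feature extraction and not the Bayesian classifier; the shell width of 1.25 Å, the input layout, and the function name are assumptions made for illustration and are not taken from the published code.

```python
import math


def shell_features(site, atoms, n_shells=6, shell_width=1.25):
    """Average a per-atom property in concentric shells around a site.

    site:  (x, y, z) of the candidate binding site.
    atoms: list of ((x, y, z), value) pairs, where value is any numeric
           property (partial charge, hydrophobicity, ...).
    Returns one mean per shell, or None for shells containing no atoms.
    """
    sums = [0.0] * n_shells
    counts = [0] * n_shells
    for position, value in atoms:
        shell = int(math.dist(site, position) // shell_width)
        if shell < n_shells:  # atoms beyond the outermost shell are ignored
            sums[shell] += value
            counts[shell] += 1
    return [s / c if c else None for s, c in zip(sums, counts)]


# Toy environment: two atoms in the innermost shell, one in the third,
# and one far outside the sampled region.
atoms = [
    ((1.0, 0.0, 0.0), 2.0),
    ((1.2, 0.0, 0.0), 4.0),
    ((3.0, 0.0, 0.0), -1.0),
    ((100.0, 0.0, 0.0), 9.0),
]
features = shell_features((0.0, 0.0, 0.0), atoms)
```

Averaging within shells makes the features depend only on distance from the site, which is what lets the description tolerate small shifts in exact atom positions.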
Affiliation(s)
- Jessica C Ebert
- Department of Genetics, Stanford University, Stanford, California 94305, USA
|
15
|
Chakrabarti P, Bhattacharyya R. Geometry of nonbonded interactions involving planar groups in proteins. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2007; 95:83-137. [PMID: 17629549 DOI: 10.1016/j.pbiomolbio.2007.03.016] [Citation(s) in RCA: 154] [Impact Index Per Article: 8.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Received: 05/24/2006] [Accepted: 03/18/2007] [Indexed: 11/26/2022]
Abstract
Although hydrophobic interaction is the main contributing factor to the stability of the protein fold, the specificity of the folding process depends on many directional interactions. An analysis has been carried out on the geometry of interaction between planar moieties of ten side chains (Phe, Tyr, Trp, His, Arg, Pro, Asp, Glu, Asn and Gln), the aromatic residues and the sulfide planes (of Met and cystine), and the aromatic residues and the peptide planes within the protein tertiary structures available in the Protein Data Bank. The occurrence of hydrogen bonds and other nonconventional interactions such as C-H...pi, C-H...O, and electrophile-nucleophile interactions involving the planar moieties has been elucidated. The specific nature of the interactions constrains many of the residue pairs to occur with a fixed sequence difference, maintaining a sequential order, when located in secondary structural elements, such as alpha-helices and beta-turns. The importance of many of these interactions (for example, aromatic residues interacting with Pro or cystine sulfur atom) is revealed by the higher degree of conservation observed for them in protein structures and binding regions. The planar residues are well represented in the active sites, and the geometry of their interactions does not deviate from the general distribution. The geometrical relationship between interacting residues provides valuable insights into the process of protein folding and would be useful for the design of protein molecules and modulation of their binding properties.
Affiliation(s)
- Pinak Chakrabarti
- Department of Biochemistry and Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Kolkata 700054, India.
|
16
|
Mihalek I, Res I, Lichtarge O. Evolutionary and structural feedback on selection of sequences for comparative analysis of proteins. Proteins 2006; 63:87-99. [PMID: 16397893 DOI: 10.1002/prot.20866] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/11/2022]
Abstract
It has been noted that slowly evolving protein residues have two properties: (a) they tend to cluster in the native fold, and (b) they delineate functional surfaces-parts of the surface through which the protein interacts with other proteins or small ligands. Herein, we demonstrate that the two are coupled sufficiently strongly that one effect, when observed, statistically implies the other. Detection of both can be accomplished in multiple sequence alignment related methods by the careful selection of relevant sequences. For the demonstration, we use two sets of protein families: a small set of diverse proteins with diverse functional surfaces, and a large set of homodimerizing enzymes. A practical outcome of our considerations is a simple prescriptive rule for the selection of homologous sequences for the comparative analysis of proteins: in order to optimize the detection of (potentially unknown) functional surfaces, it is sufficient to select sequences in such a way that the residues observed at any level of evolutionary divergence, as implied by the alignment, cluster on the folded protein.
Affiliation(s)
- I Mihalek
- Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, Texas, USA.
|
17
|
Mayewski S. A multibody, whole-residue potential for protein structures, with testing by Monte Carlo simulated annealing. Proteins 2006; 59:152-69. [PMID: 15723360 DOI: 10.1002/prot.20397] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
A new multibody, whole-residue potential for protein tertiary structure is described. The potential is based on the local environment surrounding each main-chain alpha carbon (CA), defined as the set of all residues whose CA coordinates lie within a spherical volume of set radius in 3-dimensional (3D) space surrounding that position. It is shown that the relative positions of the CAs in these local environments belong to a set of preferred templates. The templates are derived by cluster analysis of the presently available database of over 3000 protein chains (750,000 residues) having not more than 30% sequence similarity. A set of residue propensities is also derived for each topological position in each template. Using lookup tables of these derived templates, it is then possible to calculate an energy for any conformation of a given protein sequence. The application of the potential to ab initio protein tertiary structure prediction is evaluated by performing Monte Carlo simulated annealing on test protein sequences.
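The local environment defined in this abstract (all residues whose CA lies within a sphere of set radius around a given CA) can be sketched as follows. The function name `local_environment`, the default radius of 7.0 Å, and the toy chain are hypothetical choices for illustration; the abstract does not state the radius, and the template clustering and energy lookup are not reproduced here.

```python
import math

def local_environment(ca_coords, center, radius=7.0):
    """Indices of residues whose CA lies within `radius` of the CA of residue `center`.

    The centre residue itself is included. The default radius is an assumed value.
    """
    origin = ca_coords[center]
    return [i for i, xyz in enumerate(ca_coords) if math.dist(origin, xyz) <= radius]

# Toy chain: four CA atoms spaced 3.8 Å apart along the x axis.
chain = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
neighbours = local_environment(chain, center=1, radius=4.0)
```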
Affiliation(s)
- Stefan Mayewski
- Max-Planck-Institut für Biochemie, 82152 Martinsried, Germany.
|
18
|
Kulkarni PP, She YM, Smith SD, Roberts EA, Sarkar B. Proteomics of Metal Transport and Metal-Associated Diseases. Chemistry 2006; 12:2410-22. [PMID: 16134204 DOI: 10.1002/chem.200500664] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.1] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/26/2022]
Abstract
Proteomics technology has the potential to identify groups of proteins that have similar biological function. However, few attempts have been made to identify and characterize metal-binding proteins by using proteomics strategies. Many transition metals are essential to sustain life. Copper, iron, and zinc are the most abundant transition metals relevant to biological systems. In addition to their important biological functions, metals can also catalyze the formation of damaging free radical species. Hence, their intracellular transport is tightly regulated. Despite recent insights into the intracellular transport of copper and other metals, our overall understanding of intracellular metal metabolism remains incomplete and it is likely that many metal-binding proteins remain undiscovered. Furthermore, the protein targets for metals during metal-associated disease states or during exposure to toxic levels of environmental metals are yet to be unravelled. A proteomics strategy for the analysis of metal-transporting or metal-binding proteins has the potential to uncover how a large number of proteins function in normal or metal-associated diseased states. Here we discuss the principal aspects of metal metabolism, and the recent developments in the area of the proteomics of metal transport.
Affiliation(s)
- Prasad P Kulkarni
- Department of Biochemistry, University of Toronto, Medical Sciences Building, Toronto, ON, M5S 1A8, Canada
|
19
|
Najmanovich RJ, Torrance JW, Thornton JM. Prediction of protein function from structure: insights from methods for the detection of local structural similarities. Biotechniques 2005; 38:847, 849, 851. [PMID: 16018542 DOI: 10.2144/05386te01] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.1] [Reference Citation Analysis] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022] Open
|
20
|
Summa CM, Levitt M, Degrado WF. An atomic environment potential for use in protein structure prediction. J Mol Biol 2005; 352:986-1001. [PMID: 16126228 DOI: 10.1016/j.jmb.2005.07.054] [Citation(s) in RCA: 52] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/27/2004] [Revised: 06/20/2005] [Accepted: 07/20/2005] [Indexed: 11/25/2022]
Abstract
We describe the derivation and testing of a knowledge-based atomic environment potential for the modeling of protein structural energetics. An analysis of the probabilities of atomic interactions in a dataset of high-resolution protein structures shows that the probabilities of non-bonded inter-atomic contacts are not statistically independent events, and that the multi-body contact frequencies are poorly predicted from pairwise contact potentials. A pseudo-energy function is defined that measures the preferences for protein atoms to be in a given microenvironment defined by the number of contacting atoms in the environment and its atomic composition. This functional form is tested for its ability to recognize native protein structures amongst an ensemble of decoy structures and a detailed relative performance comparison is made with a number of common functions used in protein structure prediction.
Affiliation(s)
- Christopher M Summa
- Department of Biochemistry and Biophysics, The University of Pennsylvania Medical School, Philadelphia, PA 19104-6059, USA
|
21
|
Abstract
The Arthur M. Sackler Colloquium of the National Academy of Sciences, "Frontiers in Bioinformatics: Unsolved Problems and Challenges," organized by David Eisenberg, Russ Altman, and myself, was held October 15-17, 2004, to provide a forum for discussing concepts and methods in bioinformatics serving the biological and medical sciences. The deluge of genomic and proteomic data in the last two decades has driven the creation of tools that search and analyze biomolecular sequences and structures. Bioinformatics is highly interdisciplinary, using knowledge from mathematics, statistics, computer science, biology, medicine, physics, chemistry, and engineering.
Affiliation(s)
- Samuel Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
|
22
|
Gromiha MM, Selvaraj S. Inter-residue interactions in protein folding and stability. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2004; 86:235-77. [PMID: 15288760 DOI: 10.1016/j.pbiomolbio.2003.09.003] [Citation(s) in RCA: 209] [Impact Index Per Article: 10.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
Abstract
During the process of protein folding, the amino acid residues along the polypeptide chain interact with each other in a cooperative manner to form the stable native structure. The knowledge about inter-residue interactions in protein structures is very helpful to understand the mechanism of protein folding and stability. In this review, we introduce the classification of inter-residue interactions into short, medium and long range based on a simple geometric approach. The features of these interactions in different structural classes of globular and membrane proteins, and in various folds have been delineated. The development of contact potentials and the application of inter-residue contacts for predicting the structural class and secondary structures of globular proteins, solvent accessibility, fold recognition and ab initio tertiary structure prediction have been evaluated. Further, the relationship between inter-residue contacts and protein-folding rates has been highlighted. Moreover, the importance of inter-residue interactions in protein-folding kinetics and for understanding the stability of proteins has been discussed. In essence, the information gained from the studies on inter-residue interactions provides valuable insights for understanding protein folding and de novo protein design.
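The short, medium and long range classification mentioned in this abstract can be sketched as a count of CA-CA contacts grouped by sequence separation. The abstract does not give the thresholds, so the values used here are assumptions: a contact cutoff of 8 Å, separations of 1-2 residues counted as short range, 3-4 as medium range, and more than 4 as long range. The function name `classify_contacts` and the toy hairpin are likewise hypothetical.

```python
import math

def classify_contacts(ca_coords, cutoff=8.0):
    """Count CA-CA contacts by sequence separation.

    Assumed thresholds (not stated in the abstract): separation 1-2 is short,
    3-4 is medium, and more than 4 is long range.
    """
    counts = {"short": 0, "medium": 0, "long": 0}
    n = len(ca_coords)
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(ca_coords[i], ca_coords[j]) > cutoff:
                continue  # not in contact
            separation = j - i
            if separation <= 2:
                counts["short"] += 1
            elif separation <= 4:
                counts["medium"] += 1
            else:
                counts["long"] += 1
    return counts

# Toy hairpin: the chain runs out along x, then folds back one row up.
hairpin = [
    (0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0),
    (7.6, 3.8, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 0.0),
]
contact_counts = classify_contacts(hairpin)
```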
Affiliation(s)
- M Michael Gromiha
- Computational Biology Research Center, National Institute of Advanced Industrial Science and Technology, Aomi Frontier Building 17F, 2-43 Aomi, Koto-ku, Tokyo 135-0064, Japan.
|
23
|
Wei L, Altman RB. Recognizing complex, asymmetric functional sites in protein structures using a Bayesian scoring function. J Bioinform Comput Biol 2004; 1:119-38. [PMID: 15290784 DOI: 10.1142/s0219720003000150] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/03/2002] [Revised: 03/03/2003] [Accepted: 03/03/2003] [Indexed: 11/18/2022]
Abstract
The increase in known three-dimensional protein structures enables us to build statistical profiles of important functional sites in protein molecules. These profiles can then be used to recognize sites in large-scale automated annotations of new protein structures. We report an improved FEATURE system which recognizes functional sites in protein structures. FEATURE defines multi-level physico-chemical properties and recognizes sites based on the spatial distribution of these properties in the sites' microenvironments. It uses a Bayesian scoring function to compare a query region with the statistical profile built from known examples of sites and control nonsites. We have previously shown that FEATURE can accurately recognize calcium-binding sites and have reported interesting results scanning for calcium-binding sites in the entire Protein Data Bank. Here we report the ability of the improved FEATURE to characterize and recognize geometrically complex and asymmetric sites such as ATP-binding sites and disulfide bond-forming sites. FEATURE does not rely on conserved residues or conserved residue geometry of the sites. We also demonstrate that, in the absence of a statistical profile of the sites, FEATURE can use an artificially constructed profile based on a priori knowledge to recognize the sites in new structures, using redoxin active sites as an example.
Affiliation(s)
- Liping Wei
- Nexus Genomics, Inc., 229 Polaris Ave., Suite 6, Mountain View, CA 94043, USA.
|
24
|
Mihalek I, Res I, Yao H, Lichtarge O. Combining inference from evolution and geometric probability in protein structure evaluation. J Mol Biol 2003; 331:263-79. [PMID: 12875851 DOI: 10.1016/s0022-2836(03)00663-6] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
Starting from the hypothesis that evolutionarily important residues form a spatially limited cluster in a protein's native fold, we discuss the possibility of detecting a non-native structure based on the absence of such clustering. The relevant residues are determined using the Evolutionary Trace method. We propose a quantity to measure clustering of the selected residues on the structure and show that the exact values for its average and variance over several ensembles of interest can be found. This enables us to study the behavior of the associated z-scores. Since our approach rests on an analytic result, it proves to be general, customizable, and computationally fast. We find that clustering is indeed detectable in a large representative protein set. Furthermore, we show that non-native structures tend to achieve lower residue-clustering z-scores than those attained by the native folds. The most important conclusion that we draw from this work is that consistency between structural and evolutionary information, manifested in clustering of key residues, imposes powerful constraints on the conformational space of a protein.
Affiliation(s)
- I Mihalek
- Department of Molecular and Human Genetics, Baylor College of Medicine, One Baylor Plaza, Houston, TX 77030, USA
|
25
|
Jambon M, Imberty A, Deléage G, Geourjon C. A new bioinformatic approach to detect common 3D sites in protein structures. Proteins 2003; 52:137-45. [PMID: 12833538 DOI: 10.1002/prot.10339] [Citation(s) in RCA: 110] [Impact Index Per Article: 5.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/09/2022]
Abstract
An innovative bioinformatic method has been designed and implemented to detect similar three-dimensional (3D) sites in proteins. This approach allows the comparison of protein structures or substructures and detects local spatial similarities: this method is completely independent from the amino acid sequence and from the backbone structure. In contrast to already existing tools, the basis for this method is a representation of the protein structure by a set of stereochemical groups that are defined independently from the notion of amino acid. An efficient heuristic for finding similarities that uses graphs of triangles of chemical groups to represent the protein structures has been developed. The implementation of this heuristic constitutes a software named SuMo (Surfing the Molecules), which allows the dynamic definition of chemical groups, the selection of sites in the proteins, and the management and screening of databases. To show the relevance of this approach, we focused on two extreme examples illustrating convergent and divergent evolution. In two unrelated serine proteases, SuMo detects one common site, which corresponds to the catalytic triad. In the legume lectins family composed of >100 structures that share similar sequences and folds but may have lost their ability to bind a carbohydrate molecule, SuMo discriminates between functional and non-functional lectins with a selectivity of 96%. The time needed for searching a given site in a protein structure is typically 0.1 s on a PIII 800MHz/Linux computer; thus, in further studies, SuMo will be used to screen the PDB.
Affiliation(s)
- Martin Jambon
- Institut de Biologie et Chimie des Protéines (IBCP), Lyon, France
|
26
|
Acuña-Cueva ER, Faure R, Illán-Cabeza NA, Jiménez-Pulido SB, Moreno-Carretero MN, Quirós-Olozábal M. Synthesis and characterization of several lumazine derivative complexes of Co(II), Ni(II), Cu(II), Cd(II), Pd(II) and Pt(II). X-ray structures of a mononuclear copper complex and a dinuclear cadmium complex. Inorganica Chim Acta 2003. [DOI: 10.1016/s0020-1693(03)00172-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/01/2022]
|
27
|
Bhattacharyya R, Saha RP, Samanta U, Chakrabarti P. Geometry of interaction of the histidine ring with other planar and basic residues. J Proteome Res 2003; 2:255-63. [PMID: 12814265 DOI: 10.1021/pr025584d] [Citation(s) in RCA: 55] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]
Abstract
Among the aromatic residues in protein structures, histidine (His) is unique, as it can exist in the neutral or positively charged form at the physiological pH. As such, it can interact with other aromatic residues as well as form hydrogen bonds with polar and charged (both negative and positive) residues. We have analyzed the geometry of interaction of His residues with nine other planar side chains containing aromatic (residues Phe, Tyr, Trp, and His), carboxylate (Asp and Glu), carboxamide (Asn and Gln) and guanidinium (Arg) groups in 432 polypeptide chains. With the exception of the aspartic (Asp) and glutamic (Glu) acid side-chains, all other residues prefer to interact in a face-to-face or offset-face-stacked orientation with the His ring. Such a geometry is different from the edge-to-face relative orientation normally associated with the aromatic-aromatic interaction. The His-His pair prefers to interact in a face-to-face orientation; however, when both the residues bind the same metal ion, the interplanar angle is close to 90 degrees. The occurrence of different interactions (including the nonconventional N-H...pi and C-H...pi hydrogen bonds) has been correlated with the relative orientations between the interacting residues. Several structural motifs, mostly involved in binding metal ions, have been identified by considering the cases where His residues are in contact with four other planar moieties. About 10% of His residues used here are also found in sequence patterns in the PROSITE database. There are examples of the amino end of the Lys side chain interacting with His residues in such a way that it is located on an arc around a ring nitrogen atom.
Affiliation(s)
- Rajasri Bhattacharyya
- Department of Biochemistry and Bioinformatics Centre, Bose Institute, P-1/12 CIT Scheme VIIM, Calcutta 700 054, India
|
28
|
Abstract
Achieving a thorough explanation of the behavior of metal sites in the formation of native metalloprotein structures is an exciting challenge in the biochemistry of metallobiomacromolecules. This study presents a personal insight into the subject. It is proposed that a metal center and its exogenous ligand compose a template. A template may impose a clear stereochemical preference on the loose peptide chains, and organize them into natural stereospecificity via the metal-ligand interaction, a long-range and strong interaction. Therefore, the stable peptide conformation induced by the template effect surrounding a template polyhedron could be called a template-mediated structural motif (TMSM).
Affiliation(s)
- Changlin Liu
- Department of Chemistry, Huazhong University of Science and Technology, Wuhan 430074, PR China.
|
29
|
Brocchieri L, Karlin S. Conservation among HSP60 sequences in relation to structure, function, and evolution. Protein Sci 2000; 9:476-86. [PMID: 10752609 PMCID: PMC2144576 DOI: 10.1110/ps.9.3.476] [Citation(s) in RCA: 143] [Impact Index Per Article: 5.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/19/2022]
Abstract
The chaperonin HSP60 (GroEL) proteins are essential in eubacterial genomes and in eukaryotic organelles. Functional regions inferred from mutation studies and the Escherichia coli GroEL 3D crystal complexes are evaluated in a multiple alignment across 43 diverse HSP60 sequences, centering on ATP/ADP and Mg2+ binding sites, on residues interacting with substrate, on GroES contact positions, on interface regions between monomers and domains, and on residues important in allosteric conformational changes. The most evolutionarily conserved residues relate to the ATP/ADP and Mg2+ binding sites. Hydrophobic residues that contribute to substrate binding are also significantly conserved. A large number of charged residues line the central cavity of the GroEL-GroES complex in the substrate-releasing conformation. These span statistically significant intra- and inter-monomer three-dimensional (3D) charge clusters that are highly conserved among sequences and presumably play an important role in interacting with the substrate. Unaligned short segments between blocks of alignment are generally exposed at the outside wall of the Anfinsen cage complex. The multiple alignment reveals regions of divergence common to specific evolutionary groups. For example, rickettsial sequences diverge in the ATP/ADP binding domain and gram-positive sequences diverge in the allosteric transition domain. The evolutionary information of the multiple alignment proffers attractive sites for mutational studies.
Affiliation(s)
- L Brocchieri
- Department of Mathematics, Stanford University, California 94305-2125, USA
|
30
|
Skolnick J, Fetrow JS. From genes to protein structure and function: novel applications of computational approaches in the genomic era. Trends Biotechnol 2000; 18:34-9. [PMID: 10631780 DOI: 10.1016/s0167-7799(99)01398-0] [Citation(s) in RCA: 92] [Impact Index Per Article: 3.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/20/2022]
Abstract
The genome-sequencing projects are providing a detailed 'parts list' of life. A key to comprehending this list is understanding the function of each gene and each protein at various levels. Sequence-based methods for function prediction are inadequate because of the multifunctional nature of proteins. However, just knowing the structure of the protein is also insufficient for prediction of multiple functional sites. Structural descriptors for protein functional sites are crucial for unlocking the secrets in both the sequence and structural-genomics projects.
Affiliation(s)
- J Skolnick
- Danforth Plant Science Center, Laboratory of Computational Genomics, St Louis, MO 63108, USA.
|
31
|
Gromiha MM, Selvaraj S. Influence of medium and long range interactions in protein folding. Prep Biochem Biotechnol 1999; 29:339-51. [PMID: 10548251 DOI: 10.1080/10826069908544933] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/22/2022]
Abstract
Protein structures are stabilized by both local and long-range interactions. In this work, we analyze the residue-residue contacts and the role of medium- and long-range interactions in globular proteins belonging to different structural classes. The results show that while medium-range interactions predominate in all-alpha class proteins, long-range interactions predominate in the all-beta class. Based on this, we analyze the performance of several structure prediction methods in different structural classes of globular proteins and find that all the methods predict the secondary structures of all-alpha proteins more accurately than those of other classes. We also observe that residues 21-30 residues apart contribute more towards long-range contacts and that about 85% of residues are involved in long-range contacts. Further, the preference of residue pairs to the folding and stability of globular proteins is discussed.
Affiliation(s)
- M M Gromiha
- RIKEN Life Science Center, The Institute of Physical and Chemical Research, Tsukuba, Ibaraki, Japan
|
32
|
Abstract
A hierarchy of residue density assessments and packing properties in protein structures are contrasted, including a regular density, a variety of charge densities, a hydrophobic density, a polar density, and an aromatic density. These densities are investigated by alternative distance measures and also at the interface of multiunit structures. Amino acids are divided into nine structural categories according to three secondary structure states and three solvent accessibility levels. To take account of amino acid abundance differences across protein structures, we normalize the observed density by the expected density defining a density index. Solvent accessibility levels exert the predominant influence in determinations of the regular residue density. Explicitly, the regular density values vary approximately linearly with respect to solvent accessibility levels, the linearity parameters depending on the amino acid. The charge index reveals pronounced inequalities between lysine and arginine in their interactions with acidic residues. The aromatic density calculations in all structural categories parallel the regular density calculations, indicating that the aromatic residues are distributed as a random sample of all residues. Moreover, aromatic residues are found to be over-represented in the neighborhood of all amino acids. This result might be attributed to nucleation sites and protein stability being substantially associated with aromatic residues.
Affiliation(s)
- F Baud
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA
|
33
|
Kannan N, Vishveshwara S. Identification of side-chain clusters in protein structures by a graph spectral method. J Mol Biol 1999; 292:441-64. [PMID: 10493887 DOI: 10.1006/jmbi.1999.3058] [Citation(s) in RCA: 220] [Impact Index Per Article: 8.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/22/2022]
Abstract
This paper presents a novel method to detect side-chain clusters in protein three-dimensional structures using a graph spectral approach. Protein side-chain interactions are represented by a labeled graph in which the nodes of the graph represent the Cbeta atoms and the edges represent the distance between the Cbeta atoms. The distance information and the non-bonded connectivity of the residues are represented in the form of a matrix called the Laplacian matrix. The constructed matrix is diagonalized and clustering information is obtained from the vector components associated with the second lowest eigenvalue and cluster centers are obtained from the vector components associated with the top eigenvalues. The method uses global information for clustering and a single numeric computation is required to detect clusters of interest. The approach has been adopted here to detect a variety of side-chain clusters and identify the residue which makes the largest number of interactions among the residues forming the cluster (cluster centers). Detecting such clusters and cluster centers is important from a protein structure and folding point of view. The crucial residues which are important in the folding pathway, as determined by PhiF values (a measure of the effect of a mutation on the stability of the transition state of folding) obtained from protein engineering methods, can be identified from the vector components corresponding to the top eigenvalues. Expanded clusters are detected near the active and binding site of the protein, supporting the nucleation condensation hypothesis for folding. The method is also shown to detect domains in protein structures and conserved side-chain clusters in topologically similar proteins.
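The clustering step described in this abstract (reading cluster membership from the eigenvector of the second lowest Laplacian eigenvalue) can be illustrated with a small, dependency-free sketch. Several choices here are assumptions rather than the paper's method: edges are unweighted and join points within a fixed cutoff, whereas the abstract describes distance-based edges; and the eigenvector is approximated by shifted power iteration instead of a full diagonalization. The function names and toy coordinates are hypothetical.

```python
import math

def laplacian(coords, cutoff):
    """Unweighted graph Laplacian L = D - A; edges join points within `cutoff`."""
    n = len(coords)
    lap = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(coords[i], coords[j]) <= cutoff:
                lap[i][j] = lap[j][i] = -1.0
                lap[i][i] += 1.0
                lap[j][j] += 1.0
    return lap

def fiedler_vector(lap, iterations=500):
    """Approximate the eigenvector of the second lowest eigenvalue of `lap`.

    Power iteration on (shift * I - lap), with the constant eigenvector
    projected out at every step. Assumes at least two nodes.
    """
    n = len(lap)
    # Twice the largest degree bounds the largest eigenvalue of a Laplacian.
    shift = 2.0 * max(lap[i][i] for i in range(n)) + 1.0
    v = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(v) / n
        v = [x - mean for x in v]  # remove the constant component
        w = [shift * v[i] - sum(lap[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

# Toy input: two triangles of points joined by a single bridging contact.
points = [
    (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.5, 0.9, 0.0),
    (2.2, 0.0, 0.0), (3.2, 0.0, 0.0), (2.7, 0.9, 0.0),
]
fiedler = fiedler_vector(laplacian(points, cutoff=1.5))
# Nodes whose components share a sign fall in the same cluster.
clusters = [0 if x * fiedler[0] > 0 else 1 for x in fiedler]
```

In this toy case the sign of each component separates the two triangles, which is the sense in which a single numeric computation yields the clusters.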
Affiliation(s)
- N Kannan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore, 560 012, India
|
34
|
|
35
|
Karlin S, Zhu ZY, Karlin KD. Extended metal environments of cytochrome c oxidase structures. Biochemistry 1998; 37:17726-34. [PMID: 9922138 DOI: 10.1021/bi981390t] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
Abstract
The metals of the cytochrome c oxidase structures of the bovine heart mitochondrion (PDB code 1occ) and of the soil bacterium Paracoccus denitrificans (1arl) include a dicopper center (CuA), magnesium, two proximal hemes, a copper (CuB) atom, and a calcium. The mitochondrial structure also possesses a bound distant zinc ion. The extended environments of the metal sites are analyzed emphasizing residues of the second shell in terms of polarity, hydrophobicity, secondary structure, solvent accessibility, and H-bonding networks. A significant difference in the CuA metal environments concerns D-51 I in 1occ, absent from 1arl. The D-51 I appears to play an important role in the proton pumping pathway. Our analysis uncovers several statistically significant residue clusters, including a cysteine-histidine-tyrosine cluster overlapping the CuA-Mg complex; a histidine-acidic cluster enveloping the environment of Mg, the two hemes, and CuB; and on the protein surface a mixed charge cluster, which may help stabilize the quaternary structure and/or mediate docking to cytochrome c. These clusters may constitute possible pathways for electron transfer, for O2 diffusion, and for H2O movement. Many hydrogen bonding relations along the interface of subunits I and II demarcate this surface as a potential participant in proton pumping.
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, California 94305-2125, USA.
|
36
|
Abstract
The two-dimensional contact map of interresidue distances is a visual analysis technique for protein structures. We present two standalone software tools designed to be used in combination to increase the versatility of this simple yet powerful technique. First, the program Structer calculates contact maps from three-dimensional molecular structural data. The contact map matrix can then be viewed in the graphical matrix-visualization program Dotter. Instead of using a predefined distance cutoff, we exploit Dotter's dynamic rendering control, allowing interactive exploration at varying distance cutoffs after calculating the matrix once. Structer can use a number of distance measures, can incorporate multiple chains in one contact map, and allows masking of user-defined residue sets. It works either directly with PDB files, or can use the MMDB network API for reading structures.
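The idea of calculating the distance matrix once and then exploring different cutoffs can be sketched in a few lines of Python. This is only an illustration of the principle and does not reproduce Structer or Dotter. The function names `distance_matrix` and `contacts_at` and the toy coordinates are hypothetical.

```python
import math


def distance_matrix(coords):
    """Compute all pairwise distances once (the expensive step)."""
    return [[math.dist(a, b) for b in coords] for a in coords]


def contacts_at(dmat, cutoff):
    """List index pairs (i, j), i < j, whose distance is within the cutoff.

    Re-thresholding the stored matrix is cheap, so the cutoff can be varied
    interactively without recomputing any distance.
    """
    n = len(dmat)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if dmat[i][j] <= cutoff
    ]


# Made-up coordinates for three points on a line (not real structure data).
toy_coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (8.0, 0.0, 0.0)]
toy_dmat = distance_matrix(toy_coords)
tight = contacts_at(toy_dmat, 4.0)   # only the closest pair
loose = contacts_at(toy_dmat, 10.0)  # every pair
```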
Affiliation(s)
- E L Sonnhammer
- Computational Biology Branch, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|
37
|
Karlin S, Zhu ZY. Classification of mononuclear zinc metal sites in protein structures. Proc Natl Acad Sci U S A 1997; 94:14231-6. [PMID: 9405595 PMCID: PMC24919 DOI: 10.1073/pnas.94.26.14231] [Citation(s) in RCA: 92] [Impact Index Per Article: 3.3] [Indexed: 02/05/2023] Open
Abstract
Our study of the extended metal environment, particularly of the second shell, focuses in this paper on zinc sites. Key findings include: (i) The second shell of mononuclear zinc centers is generally more polar than hydrophobic and prominently features charged residues engaged in an abundance of hydrogen bonding with histidine ligands. Histidine-acidic or histidine-tyrosine clusters commonly overlap the environment of zinc ions. (ii) Histidine tautomeric metal bonding patterns in ligating zinc ions are mixed. For example, carboxypeptidase A, thermolysin, and sonic hedgehog possess the same ligand group (two histidines, one unibidentate acidic ligand, and a bound water), but their histidine tautomeric geometries markedly differ such that carboxypeptidase A makes only Nδ1 contacts, thermolysin makes only Nε2 contacts, and sonic hedgehog uses one of each. Thus the presence of a similar ligand cohort does not necessarily imply the same topology or function at the active site. (iii) Two close histidine ligands HXmH, m ≤ 5, rarely both coordinate a single metal ion in the Nδ1 tautomeric conformation, presumably to avoid steric conflicts. Mononuclear zinc sites can be classified into six types depending on the ligand composition and geometry. Implications of the results are discussed in terms of divergent and convergent evolution.
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
|
38
|
Karlin S, Zhu ZY, Karlin KD. The extended environment of mononuclear metal centers in protein structures. Proc Natl Acad Sci U S A 1997; 94:14225-30. [PMID: 9405594 PMCID: PMC24917 DOI: 10.1073/pnas.94.26.14225] [Citation(s) in RCA: 95] [Impact Index Per Article: 3.4] [Accepted: 10/03/1997] [Indexed: 02/05/2023] Open
Abstract
The objectives of this and the following paper are to identify commonalities and disparities of the extended environment of mononuclear metal sites centering on Cu, Fe, Mn, and Zn. The extended environment of a metal site within a protein embodies at least three layers: the metal core, the ligand group, and the second shell, which is defined here to consist of all residues lying less than 3.5 Å from some ligand of the metal core. The ligands and second-shell residues can be characterized in terms of polarity, hydrophobicity, secondary structures, solvent accessibility, hydrogen-bonding interactions, and membership in statistically significant residue clusters of different kinds. Findings include the following: (i) Both histidine ligands of type I copper ions exclusively attach the Nδ1 nitrogen of the histidine imidazole ring to the metal, whereas histidine ligands for all mononuclear iron ions and nearly all type II copper ions are ligated via the Nε2 nitrogen. By contrast, multinuclear copper centers are coordinated predominantly by histidine Nε2, whereas diiron histidine contacts are predominantly Nδ1. Explanations in terms of steric differences between Nδ1 and Nε2 are considered. (ii) Except for blue copper (type I), the second-shell composition favors polar residues. (iii) For blue copper, the second shell generally contains multiple methionine residues, which are elements of a statistically significant histidine-cysteine-methionine cluster. Almost half of the second shell of blue copper consists of solvent-accessible residues, putatively facilitating electron transfer. (iv) Mononuclear copper atoms are never found with acidic carboxylate ligands, whereas single Mn2+ ion ligands are predominantly acidic and the second shell tends to be mostly buried. (v) The extended environment of mononuclear Fe sites often is associated with histidine-tyrosine or histidine-acidic clusters.
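The second-shell definition given in this abstract (residues less than 3.5 Å from some ligand) can be expressed as a short Python sketch. The abstract does not say how residue-to-ligand distance is measured, so the code assumes the minimum distance between any pair of atoms. The function name `second_shell`, the residue labels, and the coordinates are hypothetical.

```python
import math


def second_shell(residues, ligand_ids, cutoff=3.5):
    """Return non-ligand residues with any atom closer than `cutoff` to a ligand atom.

    residues: dict mapping a residue label to a list of (x, y, z) atom positions.
    ligand_ids: set of labels of the residues that ligate the metal.
    Assumption: residue distance is the minimum atom-atom distance.
    """
    ligand_atoms = [atom for rid in ligand_ids for atom in residues[rid]]
    shell = []
    for rid, atoms in residues.items():
        if rid in ligand_ids:
            continue  # ligands belong to the first shell, not the second
        if any(math.dist(a, b) < cutoff for a in atoms for b in ligand_atoms):
            shell.append(rid)
    return sorted(shell)


# Made-up residues with one atom each (not taken from any PDB entry).
toy_residues = {
    "lig1": [(0.0, 0.0, 0.0)],
    "lig2": [(2.0, 0.0, 0.0)],
    "res3": [(4.5, 0.0, 0.0)],   # 2.5 A from lig2
    "res4": [(10.0, 0.0, 0.0)],  # far from both ligands
    "res5": [(0.0, 3.0, 0.0)],   # 3.0 A from lig1
}
toy_shell = second_shell(toy_residues, {"lig1", "lig2"})
```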
Affiliation(s)
- S Karlin
- Department of Mathematics, Stanford University, Stanford, CA 94305-2125, USA.
|
39
|
Gromiha MM, Selvaraj S. Influence of Medium and Long Range Interactions in (α/β)(8) Barrel Proteins. J Biol Phys 1997; 23:209-17. [PMID: 23345662 PMCID: PMC3456500 DOI: 10.1023/a:1005071232497] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022] Open
Abstract
The residue-residue contacts and the role of medium and long range interactions in 36 (α/β)(8) barrel proteins have been analysed. The influence of long range contacts in the formation of physico-chemically similar clusters, and the preference of amino acid residues towards long range contacts have also been studied. The results reveal a nearly uniform level of medium and long range contacts in most of the proteins. The residues Gln and Ala have the highest medium range contacts and the residue Pro has the lowest medium range contacts. The residue Cys has the highest long range contact, followed by other hydrophobic residues, namely Val, Ile and Leu. In the physico-chemically similar clusters identified in these proteins, 25-40 percent of residues are influenced by long range contacts, and the residues Cys, Ile, Val and Met are the most preferred ones.
Affiliation(s)
- M M Gromiha
- Tsukuba Life Science Center, The Institute of Physical and Chemical Research (RIKEN), Tsukuba, Japan
|
40
|
Zhu ZY, Karlin S. Clusters of charged residues in protein three-dimensional structures. Proc Natl Acad Sci U S A 1996; 93:8350-5. [PMID: 8710874 PMCID: PMC38674 DOI: 10.1073/pnas.93.16.8350] [Citation(s) in RCA: 57] [Impact Index Per Article: 2.0] [Indexed: 02/01/2023] Open
Abstract
Statistically significant charge clusters (basic, acidic, or of mixed charge) in tertiary protein structures are identified by new methods from a large representative collection of protein structures. About 10% of protein structures show at least one charge cluster, mostly of mixed type involving roughly equal numbers of anionic and cationic residues. Positive charge clusters are very rare. Negative (or histidine-acidic) charge clusters often coordinate calcium, magnesium, or zinc ions [e.g., thermolysin (PDB code: 3tln), mannose-binding protein (2msb), aminopeptidase (1amp)]. Mixed-charge clusters are prominent at interchain contacts where they stabilize quaternary protein formation [e.g., glutathione S-transferase (2gst), catalase (8act), and fructose-1,6-bisphosphate aldolase (1fba)]. They are also involved in protein-protein interaction and in substrate binding. For example, the mixed-charge cluster of aspartate carbamoyl-transferase (8atc) envelops the aspartate carbonyl substrate in a flexible manner (alternating tense and relaxed states) where charge associations can vary from weak to strong. Other proteins with charge clusters include the P450 cytochrome family (BM-3, Terp, Cam), several flavocytochromes, neuraminidase, hemagglutinin, the photosynthetic reaction center, and annexin. In each case in Table 2 we discuss the possible role of the charge clusters with respect to protein structure and function.
Affiliation(s)
- Z Y Zhu
- Department of Mathematics, Stanford University, CA 94305-2125, USA
|