1
Venkataraman S, Selvarajan R, Subramanian SS, Handanahalli SS. Insights into the capsid structure of banana bunchy top virus. 3 Biotech 2022; 12:144. [PMID: 35694237 DOI: 10.1007/s13205-022-03204-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/30/2021] [Accepted: 03/05/2022] [Indexed: 11/01/2022]
Abstract
Banana is the major staple food crop for approximately 400 million people. Banana bunchy top disease, caused by banana bunchy top virus (BBTV), is one of the most devastating diseases of banana; it results in stunting of plants, a bunchy appearance of leaves and a significant loss of yield. While many isolates of BBTV from various regions of India have been characterized by different groups, no structural study exists for this important virus. To bridge this gap, the pET28a clone of the coat protein (CP) gene from a BBTV isolate of Hill banana grown in the lower Pulney Hills (Virupakshi) of Tamil Nadu was expressed in BL21 (DE3) pLysS. The CP was purified by Ni-NTA affinity chromatography. In vitro capsid assembly, studied using sucrose density gradient centrifugation, suggested that the CP did not assemble into virus-like particles (VLPs) but remained as smaller oligomers. Dynamic light scattering (DLS) indicated that the purified protein is polydisperse and present mainly as pentamers. Homology modeling provided useful insights into the probable fold of the CP, suggesting that it is a β-sandwich, similar to that seen in the majority of plant viruses. In silico capsid reconstruction aided the understanding of the quaternary organization of subunits in the capsid and their molecular interactions. The aphid-binding EAG motif was located on the surface loops close to the pentameric axis, indicating its role in vector-mediated transmission. Comparison with the CP and capsid structure of geminiviruses provided useful insights into the mode of nucleic acid binding and the role of the genome during capsid assembly.
Supplementary Information: The online version contains supplementary material available at 10.1007/s13205-022-03204-4.
Affiliation(s)
- Ramasamy Selvarajan
  - ICAR National Research Centre for Banana, Thayanur Post, Tiruchirapalli, 620102 India
2
Porta-Pardo E, Ruiz-Serra V, Valentini S, Valencia A. The structural coverage of the human proteome before and after AlphaFold. PLoS Comput Biol 2022; 18:e1009818. [PMID: 35073311 PMCID: PMC8812986 DOI: 10.1371/journal.pcbi.1009818] [Citation(s) in RCA: 50] [Impact Index Per Article: 25.0] [Received: 08/20/2021] [Revised: 02/03/2022] [Accepted: 01/07/2022] [Indexed: 12/12/2022]
Abstract
The protein structure field is experiencing a revolution, driven by the increased throughput of techniques to determine experimental structures, by developments such as cryo-EM that allow us to find the structures of large protein complexes and, more recently, by artificial intelligence tools, such as AlphaFold, that can predict with high accuracy the folding of proteins for which the availability of homology templates is limited. Here we quantify the effect of the recently released AlphaFold database of protein structural models on our knowledge of human proteins. Our results indicate that our current baseline for structural coverage of 48%, considering experimentally derived or template-based homology models, rises to 76% when AlphaFold predictions are included. At the same time, the fraction of dark proteome is reduced from 26% to just 10% when AlphaFold models are considered. Furthermore, although the coverage of disease-associated genes and mutations was near complete before the AlphaFold release (69% of ClinVar pathogenic mutations and 88% of oncogenic mutations), AlphaFold models still provide an additional coverage of 3% to 13% of these critically important sets of biomedical genes and mutations. Finally, we show how the contribution of AlphaFold models to the structural coverage of non-human organisms, including important pathogenic bacteria, is significantly larger than that of the human proteome. Overall, our results show that the sequence-structure gap of human proteins has almost disappeared, an outstanding success with direct consequences for our knowledge of the human genome and the derived medical applications.
Protein structures are key to understanding many biological phenomena at the molecular scale: from the effects of genetic variation to how different proteins interact with each other to create molecular pathways that, together, have a biological function. Obtaining experimental structures, however, is extremely costly in terms of both time and resources. For this and other reasons, scientists have long worked to develop computational approaches that predict the structure of a protein using only its sequence as input. Recently, a group of scientists at DeepMind developed AlphaFold2, a computational tool that is extremely accurate at this task. Moreover, they have used this tool to predict the structures of all human proteins. In this manuscript we provide an overview of the structural coverage of the human proteome before AlphaFold models were released and how much we have gained thanks to these models. We also show how the gain affects our understanding of human pathogenic variants, both germline and somatic. Finally, we provide evidence suggesting that the gain in non-human organisms is larger than for the human proteome, particularly in the case of bacteria.
Affiliation(s)
- Eduard Porta-Pardo
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
  - Corresponding author
- Victoria Ruiz-Serra
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Josep Carreras Leukaemia Research Institute (IJC), Badalona, Spain
- Samuel Valentini
  - Department of Cellular, Computational and Integrative Biology (CIBIO), University of Trento, Trento, Italy
- Alfonso Valencia
  - Barcelona Supercomputing Center (BSC), Barcelona, Spain
  - Institució Catalana de Recerca Avançada (ICREA), Barcelona, Spain
  - Corresponding author
3
Banerjee R, Sheet T, Banerjee S, Biondi B, Formaggio F, Toniolo C, Peggion C. Cα-Methyl-l-valine: A Preferential Choice over α-Aminoisobutyric Acid for Designing Right-Handed α-Helical Scaffolds. Biochemistry 2021; 60:2704-2714. [PMID: 34463474 DOI: 10.1021/acs.biochem.1c00340] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
In synthetic peptides containing Gly and coded α-amino acids, one of the most common practices to enhance their helical extent is to incorporate a large number of l-Ala residues along with noncoded, strongly foldameric α-aminoisobutyric acid (Aib) units. Earlier studies have established that Aib-based peptides, with propensity for both the 3₁₀- and α-helices, have a tendency to form ordered three-dimensional structure that is much stronger than that exhibited by their l-Ala rich counterparts. However, the achiral nature of Aib induces an inherent, equal preference for the right- and left-handed helical conformations as found in Aib homopeptide stretches. This property poses challenges in the analysis of a model peptide helical conformation based on chirospectroscopic techniques like electronic circular dichroism (ECD), a very important tool for assigning secondary structures. To overcome such ambiguity, we have synthesized and investigated a thermally stable 14-mer peptide in which each of the Aib residues of our previously designed and reported analogue ABGY (where B stands for Aib) is replaced by Cα-methyl-l-valine (L-AMV). Analysis of the results described here from complementary ECD and ¹H nuclear magnetic resonance spectroscopic techniques in a variety of environments firmly establishes that the L-AMV-containing peptide exhibits a significantly stronger preference compared to that of its Aib parent in terms of conferring α-helical character. Furthermore, being a chiral α-amino acid, L-AMV shows an intrinsic, extremely strong bias for a quite specific (right-handed) screw sense. These findings emphasize the relevance of L-AMV as a more appropriate unit for the design of right-handed α-helical peptide models that may be utilized as conformationally constrained scaffolds.
Affiliation(s)
- Barbara Biondi
  - Department of Chemical Sciences, University of Padova, 35131 Padova, Italy
  - Institute of Biomolecular Chemistry, Padova Unit, CNR, 35131 Padova, Italy
- Fernando Formaggio
  - Department of Chemical Sciences, University of Padova, 35131 Padova, Italy
  - Institute of Biomolecular Chemistry, Padova Unit, CNR, 35131 Padova, Italy
- Claudio Toniolo
  - Department of Chemical Sciences, University of Padova, 35131 Padova, Italy
  - Institute of Biomolecular Chemistry, Padova Unit, CNR, 35131 Padova, Italy
- Cristina Peggion
  - Department of Chemical Sciences, University of Padova, 35131 Padova, Italy
  - Institute of Biomolecular Chemistry, Padova Unit, CNR, 35131 Padova, Italy
4
Newaz K, Ghalehnovi M, Rahnama A, Antsaklis PJ, Milenković T. Network-based protein structural classification. Royal Society Open Science 2020; 7:191461. [PMID: 32742675 PMCID: PMC7353965 DOI: 10.1098/rsos.191461] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 08/23/2019] [Accepted: 05/05/2020] [Indexed: 06/11/2023]
Abstract
Experimental determination of protein function is resource-consuming. As an alternative, computational prediction of protein function has received attention. In this context, protein structural classification (PSC) can help, by allowing for determining structural classes of currently unclassified proteins based on their features, and then relying on the fact that proteins with similar structures have similar functions. Existing PSC approaches rely on sequence-based or direct three-dimensional (3D) structure-based protein features. By contrast, we first model 3D structures of proteins as protein structure networks (PSNs). Then, we use network-based features for PSC. We propose the use of graphlets, state-of-the-art features in many research areas of network science, in the task of PSC. Moreover, because graphlets can deal only with unweighted PSNs, and because accounting for edge weights when constructing PSNs could improve PSC accuracy, we also propose a deep learning framework that automatically learns network features from weighted PSNs. When evaluated on a large set of approximately 9400 CATH and approximately 12 800 SCOP protein domains (spanning 36 PSN sets), the best of our proposed approaches are superior to existing PSC approaches in terms of accuracy, with comparable running times. Our data and code are available at https://doi.org/10.5281/zenodo.3787922.
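As a rough illustration of the network view described in this abstract, the sketch below builds an unweighted protein structure network from residue coordinates and counts the two 3-node graphlets (open paths and triangles). The 8 Å cutoff, the use of one point per residue, and the function names are assumptions made for this example; the abstract does not state the authors' PSN construction or which graphlet sizes they use.

```python
import math
from itertools import combinations

def build_psn(coords, cutoff=8.0):
    """Unweighted PSN: residues i and j are linked if they lie closer than `cutoff`.

    `coords` holds one (x, y, z) point per residue; the cutoff value is an
    assumption for illustration, not a value taken from the paper.
    """
    adj = {i: set() for i in range(len(coords))}
    for i, j in combinations(range(len(coords)), 2):
        if math.dist(coords[i], coords[j]) < cutoff:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def count_3node_graphlets(adj):
    """Count induced 3-node graphlets: open paths and triangles."""
    wedges = 0      # pairs of neighbours around each node (open or closed)
    closed = 0      # wedges whose two end nodes are also linked
    for nbrs in adj.values():
        k = len(nbrs)
        wedges += k * (k - 1) // 2
        for a, b in combinations(nbrs, 2):
            if b in adj[a]:
                closed += 1
    triangles = closed // 3          # every triangle is seen from its 3 corners
    return {"path": wedges - 3 * triangles, "triangle": triangles}
```

The resulting counts form a small feature vector per structure; a classifier over such vectors is one plausible way to use them for structural classification.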
Affiliation(s)
- Khalique Newaz
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
- Mahboobeh Ghalehnovi
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Arash Rahnama
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Panos J. Antsaklis
  - Department of Electrical Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
- Tijana Milenković
  - Department of Computer Science and Engineering, University of Notre Dame, Notre Dame, IN 46556, USA
  - Center for Network and Data Science, University of Notre Dame, Notre Dame, IN 46556, USA
  - Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA
5
Amala A, Emerson IA. Understanding contact patterns of protein structures from protein contact map and investigation of unique patterns in the globin-like folded domains. J Cell Biochem 2018; 120:9877-9886. [PMID: 30525229 DOI: 10.1002/jcb.28270] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/11/2018] [Accepted: 10/24/2018] [Indexed: 11/06/2022]
Abstract
Proteins are biochemical compounds made up of one or more polypeptides in a specific order, typically folded into a functionally active form. Proteins are categorized into four structural classes according to the topology of α-helices and β-strands. In this study, we modeled these four structural classes as undirected networks, with amino acids as nodes and the interactions between them as edges. The results suggest that the basic protein classes can be easily recognized and distinguished using protein contact maps (PCMs). In the globin-like fold, contacts in the helix-loop-helix region showed a unique pattern that was retained in all the folds. Further, analysis of the averaged diagonal contacts showed that these contacts were more frequent in α/β proteins than in the other classes. Interestingly, anti-parallel β-sheets were dominant in the all-β and α + β classes, which leads to similar diagonal patterns. The networks of all four basic classes were found to possess the small-world property. The findings suggest that PCMs may help classify protein structural classes and evaluate predicted protein structures.
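To make the contact-map idea concrete, here is a minimal sketch that turns residue coordinates into a binary contact map and tallies contacts by sequence separation, that is, along each band parallel to the main diagonal. The 7 Å threshold and the helper names are assumptions for illustration, not the settings used in the study.

```python
import math

def contact_map(coords, cutoff=7.0):
    """Binary, symmetric contact map: 1 if two distinct residues lie within `cutoff`."""
    n = len(coords)
    return [[1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
             for j in range(n)]
            for i in range(n)]

def contacts_by_offset(cmap):
    """Count contacts on each band |i - j| = k of the upper triangle."""
    counts = {}
    n = len(cmap)
    for i in range(n):
        for j in range(i + 1, n):
            if cmap[i][j]:
                counts[j - i] = counts.get(j - i, 0) + 1
    return counts
```

In an α-helix, residues i and i+3 or i+4 sit close together in space, so helical regions tend to add counts at small offsets near the diagonal, which is one simple way to summarize the diagonal patterns the abstract refers to.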
Affiliation(s)
- Arumugam Amala
  - Bioinformatics Programming Laboratory, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Tamil Nadu, India
- Isaac Arnold Emerson
  - Bioinformatics Programming Laboratory, Department of Biotechnology, School of Bio Sciences and Technology, Vellore Institute of Technology, Tamil Nadu, India
6
Abstract
Computational protein design (CPD) has established itself as a leading field in basic and applied science with a strong coupling between the two. Proteins are computationally designed from the level of amino acids to the level of a functional protein complex. Design targets range from increased thermo- (or other) stability to specific requested reactions such as protein-protein binding, enzymatic reactions, or nanotechnology applications. The design scheme may encompass small regions of the proteins or the entire protein. In either case, the design may aim at the side-chains or at the full backbone conformation. Herein, the main framework for the process is outlined highlighting key elements in the CPD iterative cycle. These include the very definition of CPD, the diverse goals of CPD, components of the CPD protocol, methods for searching sequence and structure space, scoring functions, and augmenting the CPD with other optimization tools. Taken together, this chapter aims to introduce the framework of CPD.
Affiliation(s)
- Ilan Samish
  - Department of Plants and Environmental Sciences, Weizmann Institute of Science, Rehovot, Israel
  - Department of Biotechnology Engineering, Braude Academic College of Engineering, Karmiel, Israel
  - Amai Proteins Ltd., Ashdod, Israel
7
Oda H, Ota M, Toh H. Profile comparison revealed deviation from structural constraint at the positively selected sites. Biosystems 2016; 147:67-77. [PMID: 27443483 DOI: 10.1016/j.biosystems.2016.07.007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/11/2015] [Revised: 07/13/2016] [Accepted: 07/16/2016] [Indexed: 11/18/2022]
Abstract
The amino acid substitutions at a site are affected by a mixture of various constraints. It is also known that amino acid substitutions are accelerated at sites under positive selection. However, the relationship between the substitutions at positively selected sites and the constraints has not been thoroughly examined. Advances in computational biology have made it possible to divide the mixture of constraints into the structural constraint and the remaining constraints by using amino acid sequences and tertiary structures; the remainder is expressed as the deviation of the mixture of constraints from the structural constraint. Here, two types of profiles, or matrices of size 20 × (number of sites), are compared. One of the profiles represents the mixture of constraints and is generated from a multiple amino acid sequence alignment, whereas the other is designed to represent the structural constraints. We applied the profile comparison method to proteins under positive selection to examine the relationship between positive selection and constraints. The results suggested that the constraint at a site under positive selection tends to deviate from the structural constraint at that site.
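The profile comparison can be pictured with a small sketch: build a 20 × (number of sites) frequency profile from an alignment, then score each site by how far two profiles differ. The Jensen-Shannon divergence and the pseudocount used here are stand-ins chosen for illustration; the abstract does not say which comparison measure the authors used or how the structural profile is derived.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def profile_from_alignment(seqs, pseudocount=1.0):
    """Per-site amino acid frequencies (one dict of 20 values per alignment column)."""
    profile = []
    for pos in range(len(seqs[0])):
        counts = {a: pseudocount for a in AMINO_ACIDS}
        for s in seqs:
            if s[pos] in counts:          # gaps and unknown symbols are ignored
                counts[s[pos]] += 1
        total = sum(counts.values())
        profile.append({a: c / total for a, c in counts.items()})
    return profile

def js_divergence(p, q):
    """Jensen-Shannon divergence in bits between two site distributions (0 to 1)."""
    m = {a: 0.5 * (p[a] + q[a]) for a in AMINO_ACIDS}

    def kl(x):
        return sum(x[a] * math.log2(x[a] / m[a]) for a in AMINO_ACIDS if x[a] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)

def site_deviation(profile_a, profile_b):
    """Site-by-site deviation between two profiles of equal length."""
    return [js_divergence(p, q) for p, q in zip(profile_a, profile_b)]
```

Sites with a large deviation are those where the alignment-derived profile departs most from the reference profile, which mirrors the kind of per-site signal the abstract relates to positive selection.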
Affiliation(s)
- Hiroyuki Oda
  - Graduate School of Systems Life Sciences, Kyushu University, 744 Motooka Nishi-ku, Fukuoka 819-0395, Japan
- Motonori Ota
  - Graduate School of Information Science, Nagoya University, Furo-cho, Chikusa-ku, Nagoya City, Aichi 464-8601, Japan
- Hiroyuki Toh
  - Department of Biomedical Chemistry, School of Science and Technology, Kwansei Gakuin University, 2-1 Gakuen, Sanda, Hyogo 669-1337, Japan
8
Striegel DA, Wojtowicz D, Przytycka TM, Periwal V. Correlated rigid modes in protein families. Phys Biol 2016; 13:025003. [PMID: 27063781 DOI: 10.1088/1478-3975/13/2/025003] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 02/06/2023]
Abstract
A great deal of evolutionarily conserved information is contained in genomes and proteins. Enormous effort has been put into understanding protein structure and developing computational tools for protein folding, and many sophisticated approaches take structure and sequence homology into account. Several groups have applied statistical physics approaches to extracting information about proteins from sequences alone. Here, we develop a new method for sequence analysis based on first principles, in information theory, in statistical physics and in Bayesian analysis. We provide a complete derivation of our approach and we apply it to a variety of systems, to demonstrate its utility and its limitations. We show in some examples that phylogenetic alignments of amino-acid sequences of families of proteins imply the existence of a small number of modes that appear to be associated with correlated global variation. These modes are uncovered efficiently in our approach by computing a non-perturbative effective potential directly from the alignment. We show that this effective potential approaches a limiting form inversely with the logarithm of the number of sequences. Mapping symbol entropy flows along modes to underlying physical structures shows that these modes arise due to correlated compensatory adjustments. In the protein examples, these occur around functional binding pockets.
9
Kayano AM, Simões-Silva R, Medeiros PS, Maltarollo VG, Honorio KM, Oliveira E, Albericio F, da Silva SL, Aguiar ACC, Krettli AU, Fernandes CF, Zuliani JP, Calderon LA, Stábeli RG, Soares AM. BbMP-1, a new metalloproteinase isolated from Bothrops brazili snake venom with in vitro antiplasmodial properties. Toxicon 2015; 106:30-41. [DOI: 10.1016/j.toxicon.2015.09.005] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/11/2015] [Revised: 09/05/2015] [Accepted: 09/07/2015] [Indexed: 10/23/2022]
10
Meysman P, Zhou C, Cule B, Goethals B, Laukens K. Mining the entire Protein DataBank for frequent spatially cohesive amino acid patterns. BioData Min 2015; 8:4. [PMID: 25657820 PMCID: PMC4318390 DOI: 10.1186/s13040-015-0038-4] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 07/09/2014] [Accepted: 01/18/2015] [Indexed: 11/10/2022]
Abstract
Background: The three-dimensional structure of a protein is an essential aspect of its functionality. Despite the large diversity in protein structures and functionality, it is known that there are common patterns and preferences in the contacts between amino acid residues, or between residues and other biomolecules, such as DNA. The discovery and characterization of these patterns is an important research topic within structural biology as it can give fundamental insight into protein structures and can aid in the prediction of unknown structures.
Results: Here we apply an efficient spatial pattern miner to search for sets of amino acids that occur frequently in close spatial proximity in the protein structures of the Protein DataBank. This allowed us to mine for a new class of amino acid patterns, that we term FreSCOs (Frequent Spatially Cohesive Component sets), which feature synergetic combinations. To demonstrate the relevance of these FreSCOs, they were compared in relation to the thermostability of the protein structure and the interaction preferences of DNA-protein complexes. In both cases, the results matched well with prior investigations using more complex methods on smaller data sets.
Conclusions: The currently characterized protein structures feature a diverse set of frequent amino acid patterns that can be related to the stability of the protein molecular structure and that are independent from protein function or specific conserved domains.
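The notion of amino acids that frequently occur close together in space can be sketched with plain co-occurrence counting. This is not the FreSCO miner described in the abstract, only a simplified stand-in: for every residue it collects the residue types within a fixed radius, then reports the type pairs seen in at least `min_support` structures. The radius, the support threshold and the input layout are assumptions made for the example.

```python
import math
from collections import Counter
from itertools import combinations

def neighborhood_types(residues, radius=6.0):
    """For each residue, yield the set of residue types within `radius` (itself included).

    `residues` is a list of (type, (x, y, z)) tuples, one point per residue.
    """
    for _, center in residues:
        yield frozenset(t for t, xyz in residues if math.dist(center, xyz) <= radius)

def frequent_pairs(structures, radius=6.0, min_support=2):
    """Return residue-type pairs that are spatially close in >= min_support structures."""
    support = Counter()
    for residues in structures:
        seen = set()
        for types in neighborhood_types(residues, radius):
            seen.update(combinations(sorted(types), 2))
        support.update(seen)              # each pair counts once per structure
    return {pair: n for pair, n in support.items() if n >= min_support}
```

A real miner would handle larger item sets and a cohesion measure; the pairwise version above only conveys the support-counting step.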
Affiliation(s)
- Pieter Meysman
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
- Cheng Zhou
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Boris Cule
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Bart Goethals
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
- Kris Laukens
  - Advanced Database Research and Modelling (ADReM), Department of Mathematics and Computer Science, University of Antwerp, Antwerp, Belgium
  - Biomedical Informatics Research Center Antwerp (biomina), University of Antwerp/Antwerp University Hospital, Edegem, Belgium
11
Three-dimensional protein structure prediction: Methods and computational strategies. Comput Biol Chem 2014; 53PB:251-276. [DOI: 10.1016/j.compbiolchem.2014.10.001] [Citation(s) in RCA: 121] [Impact Index Per Article: 12.1] [Received: 06/12/2014] [Revised: 10/03/2014] [Accepted: 10/07/2014] [Indexed: 01/01/2023]
12
Ferrada E. The amino acid alphabet and the architecture of the protein sequence-structure map. I. Binary alphabets. PLoS Comput Biol 2014; 10:e1003946. [PMID: 25473967 PMCID: PMC4256021 DOI: 10.1371/journal.pcbi.1003946] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/07/2014] [Accepted: 09/26/2014] [Indexed: 11/19/2022]
Abstract
The correspondence between protein sequences and structures, or sequence-structure map, relates to fundamental aspects of structural, evolutionary and synthetic biology. The specifics of the mapping, such as the fraction of accessible sequences and structures, or the sequences' ability to fold fast, are dictated by the type of interactions between the monomers that compose the sequences. The set of possible interactions between monomers is encapsulated by the potential energy function. In this study, I explore the impact of the relative forces of the potential on the architecture of the sequence-structure map. My observations rely on simple exact models of proteins and random samples of the space of potential energy functions of binary alphabets. I adopt a graph perspective and study the distribution of viable sequences and the structures they produce, as networks of sequences connected by point mutations. I observe that the relative proportion of attractive, neutral and repulsive forces defines types of potentials, that induce sequence-structure maps of vastly different architectures. I characterize the properties underlying these differences and relate them to the structure of the potential. Among these properties are the expected number and relative distribution of sequences associated to specific structures and the diversity of structures as a function of sequence divergence. I study the types of binary potentials observed in natural amino acids and show that there is a strong bias towards only some types of potentials, a bias that seems to characterize the folding code of natural proteins. I discuss implications of these observations for the architecture of the sequence-structure map of natural proteins, the construction of random libraries of peptides, and the early evolution of the natural amino acid alphabet.
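A toy version of the ingredients named in this abstract (a binary alphabet, a pairwise potential, and sequences linked by point mutations) might look like the sketch below. The square-lattice conformation, the classic HP-style potential in the usage note and all function names are assumptions for illustration; the abstract does not specify the lattice model or the sampled potentials.

```python
def contact_energy(seq, conformation, potential):
    """Energy of `seq` threaded onto a 2D lattice walk.

    `conformation` is a list of (x, y) lattice sites, one per residue;
    `potential` maps frozenset({a, b}) to the pair energy. Only residues that
    are lattice neighbours but not chain neighbours contribute.
    """
    index = {xy: i for i, xy in enumerate(conformation)}
    energy = 0.0
    for i, (x, y) in enumerate(conformation):
        for nb in ((x + 1, y), (x, y + 1)):       # visit each lattice edge once
            j = index.get(nb)
            if j is not None and abs(i - j) > 1:
                energy += potential[frozenset((seq[i], seq[j]))]
    return energy

def point_mutants(seq, alphabet="HP"):
    """All sequences one point mutation away: the neighbours of `seq` in sequence space."""
    return [seq[:i] + a + seq[i + 1:]
            for i in range(len(seq)) for a in alphabet if a != seq[i]]
```

For example, with `potential = {frozenset("H"): -1.0, frozenset("P"): 0.0, frozenset("HP"): 0.0}` and the square walk `[(0, 0), (1, 0), (1, 1), (0, 1)]`, the sequence `"HPPH"` places its two H residues in contact. Varying the three entries of the potential corresponds to changing the balance of attractive, neutral and repulsive interactions.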
Affiliation(s)
- Evandro Ferrada
  - Santa Fe Institute, Santa Fe, New Mexico, United States of America
13
Joseph AP, de Brevern AG. From local structure to a global framework: recognition of protein folds. J R Soc Interface 2014; 11:20131147. [PMID: 24740960 DOI: 10.1098/rsif.2013.1147] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Indexed: 12/21/2022]
Abstract
Protein folding has been a major area of research for many years. Nonetheless, the mechanisms leading to the formation of an active biological fold are still not fully understood. The huge amount of available sequence and structural information provides hints to identify the putative fold for a given sequence. Indeed, protein structures prefer a limited number of local backbone conformations, some being characterized by preferences for certain amino acids. These preferences largely depend on the local structural environment. The prediction of local backbone conformations has become an important factor in correctly identifying the global protein fold. Here, we review the developments in the field of local structure prediction and especially their implications for protein fold recognition.
Affiliation(s)
- Agnel Praveen Joseph
  - Science and Technology Facilities Council, Rutherford Appleton Laboratory, Harwell Oxford, Didcot OX11 0QX, UK
14
Kukic P, Mirabello C, Tradigo G, Walsh I, Veltri P, Pollastri G. Toward an accurate prediction of inter-residue distances in proteins using 2D recursive neural networks. BMC Bioinformatics 2014; 15:6. [PMID: 24410833 PMCID: PMC3893389 DOI: 10.1186/1471-2105-15-6] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 01/28/2013] [Accepted: 12/20/2013] [Indexed: 11/21/2022]
Abstract
Background: Protein inter-residue contact maps provide a translation and rotation invariant topological representation of a protein. They can be used as an intermediary step in protein structure predictions. However, the prediction of contact maps represents an unbalanced problem as far fewer examples of contacts than non-contacts exist in a protein structure. In this study we explore the possibility of completely eliminating the unbalanced nature of the contact map prediction problem by predicting real-value distances between residues. Predicting full inter-residue distance maps and applying them in protein structure predictions has been relatively unexplored in the past.
Results: We initially demonstrate that the use of native-like distance maps is able to reproduce 3D structures almost identical to the targets, giving an average RMSD of 0.5 Å. In addition, the corrupted physical maps with an introduced random error of ±6 Å are able to reconstruct the targets within an average RMSD of 2 Å. After demonstrating the reconstruction potential of distance maps, we develop two classes of predictors using two-dimensional recursive neural networks: an ab initio predictor that relies only on the protein sequence and evolutionary information, and a template-based predictor in which additional structural homology information is provided. We find that the ab initio predictor is able to reproduce distances with an RMSD of 6 Å, regardless of the evolutionary content provided. Furthermore, we show that the template-based predictor exploits both sequence and structure information even in cases of dubious homology and outperforms the best template hit by a clear margin of up to 3.7 Å. Lastly, we demonstrate the ability of the two predictors to reconstruct the CASP9 targets shorter than 200 residues, producing results similar to the state-of-the-art machine learning approach implemented in the Distill server.
Conclusions: The methodology presented here, if complemented by more complex reconstruction protocols, can represent a possible path to improve machine learning algorithms for 3D protein structure prediction. Moreover, it can be used as an intermediary step in protein structure predictions either on its own or complemented by NMR restraints.
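The two quantities this abstract relies on, a real-valued inter-residue distance map and an RMSD between maps, can be sketched in a few lines. The function names and the choice to average over the upper triangle are assumptions for this example rather than the authors' exact evaluation protocol.

```python
import math

def distance_map(coords):
    """Full matrix of pairwise Euclidean distances between residue positions."""
    n = len(coords)
    return [[math.dist(coords[i], coords[j]) for j in range(n)] for i in range(n)]

def distance_map_rmsd(map_a, map_b):
    """Root-mean-square difference between two distance maps over all pairs i < j."""
    n = len(map_a)
    squared = [(map_a[i][j] - map_b[i][j]) ** 2
               for i in range(n) for j in range(i + 1, n)]
    return math.sqrt(sum(squared) / len(squared))
```

Because only inter-residue distances enter the map, translating or rotating the whole structure leaves it unchanged, which is the invariance the abstract mentions; thresholding the same matrix at a cutoff would give the binary contact map.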
Affiliation(s)
- Predrag Kukic
  - School of Computer Science and Informatics, Complex and Adaptive Systems Laboratory, University College Dublin, Belfield, Dublin 4, Ireland
15
Abstract
Motivation: Template-based modeling, including homology modeling and protein threading, is the most reliable method for protein 3D structure prediction. However, alignment errors and template selection are still the main bottlenecks for current template-based modeling methods, especially when the proteins under consideration are distantly related.
Results: We present a novel context-specific alignment potential for protein threading, including alignment and template selection. Our alignment potential measures the log-odds ratio of one alignment being generated from two related proteins to being generated from two unrelated proteins, by integrating both local and global context-specific information. The local alignment potential quantifies how well one sequence residue can be aligned to one template residue based on context-specific information of the residues. The global alignment potential quantifies how well two sequence residues can be placed into two template positions at a given distance, again based on context-specific information. By accounting for correlation among a variety of protein features and making use of context-specific information, our alignment potential is much more sensitive than the widely used context-independent or profile-based scoring functions. Experimental results confirm that our method generates significantly better alignments and threading results than the best profile-based methods on several large benchmarks. Our method works particularly well for distantly related proteins or proteins with sparse sequence profiles because of the effective integration of context-specific, structure and global information.
Availability: http://raptorx.uchicago.edu/download/.
Contact: jinboxu@gmail.com
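The log-odds idea in this abstract can be illustrated with the simplest possible case: a context-independent score that compares how likely an aligned residue pair is under a "related" model versus an "unrelated" (independent background) model. This is deliberately the kind of context-independent scoring the authors contrast their method with, not their context-specific potential, and the probability tables in the usage note are toy values invented for the example.

```python
import math

def log_odds(p_related, p_unrelated):
    """Log-odds of an observation under the related vs. the unrelated model."""
    return math.log(p_related / p_unrelated)

def alignment_log_odds(aligned_pairs, joint, background):
    """Sum of per-column log-odds for a gapless list of aligned residue pairs.

    `joint[(a, b)]` is the probability of seeing a aligned to b in related
    proteins (stored once per unordered pair); `background[a]` is the overall
    frequency of a, so unrelated pairs are modelled as independent draws.
    """
    total = 0.0
    for a, b in aligned_pairs:
        p_rel = joint[(a, b)] if (a, b) in joint else joint[(b, a)]
        total += log_odds(p_rel, background[a] * background[b])
    return total
```

With `background = {"A": 0.5, "L": 0.5}` and `joint = {("A", "A"): 0.4, ("L", "L"): 0.4, ("A", "L"): 0.1}`, identical pairs score above zero and the mismatched pair scores below zero. A context-specific potential would replace these fixed tables with probabilities conditioned on features of the surrounding residues.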
Affiliation(s)
- Jianzhu Ma
- Toyota Technological Institute at Chicago, IL 60637, USA
|
16
|
Mishra S, Saxena A, Sangwan RS. Fundamentals of Homology Modeling Steps and Comparison among Important Bioinformatics Tools: An Overview. Science International 2013. [DOI: 10.17311/sciintl.2013.237.252] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/03/2022]
|
17
|
|
18
|
Gront D, Blaszczyk M, Wojciechowski P, Kolinski A. BioShell Threader: protein homology detection based on sequence profiles and secondary structure profiles. Nucleic Acids Res 2012; 40:W257-62. [PMID: 22693216 PMCID: PMC3394251 DOI: 10.1093/nar/gks555] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/21/2022] Open
Abstract
The BioShell package has recently been extended with a web server for protein homology detection based on profile-to-profile alignment (known as 1D threading). Its aim is to assign structural templates to each domain of the query. The server uses sequence profiles that describe observed sequence variability and secondary structure profiles that provide the expected probability of a certain secondary structure type at a given position in a protein. Three independent predictors are used to increase the rate of successful predictions. Careful evaluation shows that there is a nearly 80% chance that the query sequence belongs to the same SCOP family as the top scoring template. The BioShell Threader server is freely available at: http://www.bioshell.pl/threader/.
Affiliation(s)
- Dominik Gront
- University of Warsaw, Faculty of Chemistry, Pasteura 1, 02-093 Warsaw, Poland.
|
19
|
Khashan R, Zheng W, Tropsha A. Scoring protein interaction decoys using exposed residues (SPIDER): a novel multibody interaction scoring function based on frequent geometric patterns of interfacial residues. Proteins 2012; 80:2207-17. [PMID: 22581643 DOI: 10.1002/prot.24110] [Citation(s) in RCA: 37] [Impact Index Per Article: 3.1] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/11/2012] [Revised: 04/05/2012] [Accepted: 04/20/2012] [Indexed: 01/14/2023]
Abstract
Accurate prediction of the structure of protein-protein complexes in computational docking experiments remains a formidable challenge. It has been recognized that identifying native or native-like poses among multiple decoys is the major bottleneck of the current scoring functions used in docking. We have developed a novel multibody pose-scoring function that has no theoretical limit on the number of residues contributing to the individual interaction terms. We use a coarse-grain representation of a protein-protein complex where each residue is represented by its side chain centroid. We apply a computational geometry approach called Almost-Delaunay tessellation that transforms protein-protein complexes into a residue contact network, or an undirected graph where vertex-residues are nodes connected by edges. This treatment forms a family of interfacial graphs representing a dataset of protein-protein complexes. We then employ a frequent subgraph mining approach to identify common interfacial residue patterns that appear in at least a subset of native protein-protein interfaces. The geometrical parameters and frequency of occurrence of each "native" pattern in the training set are used to develop the new SPIDER scoring function. SPIDER was validated using the standard "ZDOCK" benchmark dataset, which was not used in the development of SPIDER. We demonstrate that the SPIDER scoring function ranks native and native-like poses above geometrical decoys and that it outperforms the popular ZRANK scoring function. SPIDER was ranked among the top scoring functions in a recent round of the CAPRI (Critical Assessment of PRedicted Interactions) blind test of protein-protein docking methods.
Affiliation(s)
- Raed Khashan
- Division of Chemical Biology and Medicinal Chemistry, UNC Eshelman School of Pharmacy, University of North Carolina, Chapel Hill, North Carolina 27599, USA
|
20
|
Sundaramurthy P, Sreenivasan R, Shameer K, Gakkhar S, Sowdhamini R. HORIBALFRE program: Higher Order Residue Interactions Based ALgorithm for Fold REcognition. Bioinformation 2011; 7:352-9. [PMID: 22355236 PMCID: PMC3280490 DOI: 10.6026/97320630007352] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/21/2011] [Accepted: 11/24/2011] [Indexed: 11/23/2022] Open
Abstract
Understanding the functional and structural implications of a protein encoded by a novel gene using function association or fold recognition approaches remains a challenging task in the current era of genomes, metagenomes and personal genomes. In an attempt to enhance potential-based fold-recognition methods in recognizing remote homology between proteins, we propose a new approach, "Higher Order Residue Interaction Based ALgorithm for Fold REcognition (HORIBALFRE)". Higher order residue interactions refer to a class of interactions in protein structures mediated by C(α) or C(β) atoms within a pre-defined distance cut-off. Higher order residue interactions (pairwise, triplet and quadruplet interactions) play a vital role in attaining the stable conformation of a protein structure. In HORIBALFRE, we incorporated the potential contributions from two-body (pairwise), three-body (triplet) and four-body (quadruplet) interactions to implement a new fold recognition algorithm. The core of the HORIBALFRE algorithm includes the potentials generated from a library of protein structures derived from the manually curated CAMPASS database of structure-based sequence alignments. We used Fischer's dataset, with 68 templates and 56 target sequences, derived from the SCOP database, and performed one-against-all sequence alignment using TCoffee. Various potentials were derived using custom scripts, and these potentials were incorporated in the HORIBALFRE algorithm. In this manuscript, we report the outline of a novel fold recognition algorithm and initial results. Our results show that inclusion of the quadruplet class of higher order residue interactions improves fold recognition.
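The notion of pairwise, triplet and quadruplet interactions within a distance cut-off can be made concrete with a small sketch. It assumes, as a simplification not stated in the abstract, that a group of residues counts as interacting when every pair of representative atoms in the group lies within the cut-off. The function name, the coordinates and the 5.0 cut-off are hypothetical.

```python
import math
from itertools import combinations

def higher_order_contacts(coords, cutoff, order):
    """Return index tuples of `order` residues whose representative atoms
    (e.g. C-alpha positions) are all pairwise within `cutoff`."""
    def close(i, j):
        return math.dist(coords[i], coords[j]) <= cutoff

    return [
        group
        for group in combinations(range(len(coords)), order)
        if all(close(i, j) for i, j in combinations(group, 2))
    ]

# Toy coordinates: three mutually close residues and one distant residue.
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 3.0, 0.0), (20.0, 0.0, 0.0)]

pairs = higher_order_contacts(coords, cutoff=5.0, order=2)
triplets = higher_order_contacts(coords, cutoff=5.0, order=3)
quadruplets = higher_order_contacts(coords, cutoff=5.0, order=4)
print(pairs, triplets, quadruplets)
```

Enumerating all combinations is only practical for small inputs; a real implementation would restrict candidates with a spatial neighbour search first.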
Affiliation(s)
- Pandurangan Sundaramurthy
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee -247667, India
- Raashi Sreenivasan
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Centre for Biotechnology, Anna University, Chennai - 600025, India
- University of Wisconsin-Madison, Madison, WI 53706-1481, USA; Division of Cardiovascular Diseases, Mayo Clinic, Rochester, MN 55901 USA
- Khader Shameer
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
- Authors contributed equally to this work
- Sunita Gakkhar
- Department of Mathematics, Indian Institute of Technology Roorkee, Roorkee -247667, India
- Ramanathan Sowdhamini
- National Center for Biological Sciences, Tata Institute of Fundamental Research, GKVK Campus, Bellary Road, Bangalore - 560065, India
|
21
|
Vishnepolsky B, Pirtskhalava M. CONTSOR--a new knowledge-based fold recognition potential, based on side chain orientation and contacts between residue terminal groups. Protein Sci 2011; 21:134-41. [PMID: 22057923 DOI: 10.1002/pro.763] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/26/2011] [Revised: 10/18/2011] [Accepted: 10/31/2011] [Indexed: 11/09/2022]
Abstract
Recognizing structural similarity without significant sequence identity (fold recognition) is an effective method for protein structure prediction. Previously, we developed a fold recognition potential called SORDIS, which incorporated side chain orientation in relation to hydrophobic core centers, distance of the residues from the protein globule center and secondary structure terms. However, this potential does not include terms based on close contacts between residues. In this paper, a new fold recognition potential, CONTSOR, is presented, which is based on the SORDIS terms and a term based on contacts between amino acid terminal groups. The performance of this potential was evaluated on the SABmark benchmark for alignment accuracy and on the SABmark and Lindahl benchmarks for fold recognition. The results show that CONTSOR has the best performance among the compared potentials on the SABmark benchmark, both for alignment accuracy and fold recognition, and one of the best performances on the Lindahl benchmark. The CONTSOR software package is available for download at http://www.lifescience.org.ge/downloads/contsor.zip.
Affiliation(s)
- Boris Vishnepolsky
- Life Science Research Centre, Laboratory of Bioinformatics, 14 Gotua Street, Tbilisi, Georgia.
|
22
|
Mapping the distribution of packing topologies within protein interiors shows predominant preference for specific packing motifs. BMC Bioinformatics 2011; 12:195. [PMID: 21605466 PMCID: PMC3123238 DOI: 10.1186/1471-2105-12-195] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/20/2010] [Accepted: 05/24/2011] [Indexed: 11/24/2022] Open
Abstract
Background Mapping protein primary sequences to their three dimensional folds referred to as the 'second genetic code' remains an unsolved scientific problem. A crucial part of the problem concerns the geometrical specificity in side chain association leading to densely packed protein cores, a hallmark of correctly folded native structures. Thus, any model of packing within proteins should constitute an indispensable component of protein folding and design. Results In this study an attempt has been made to find, characterize and classify recurring patterns in the packing of side chain atoms within a protein which sustains its native fold. The interaction of side chain atoms within the protein core has been represented as a contact network based on the surface complementarity and overlap between associating side chain surfaces. Some network topologies definitely appear to be preferred and they have been termed 'packing motifs', analogous to super secondary structures in proteins. Study of the distribution of these motifs reveals the ubiquitous presence of typical smaller graphs, which appear to get linked or coalesce to give larger graphs, reminiscent of the nucleation-condensation model in protein folding. One such frequently occurring motif, also envisaged as the unit of clustering, the three residue clique was invariably found in regions of dense packing. Finally, topological measures based on surface contact networks appeared to be effective in discriminating sequences native to a specific fold amongst a set of decoys. Conclusions Out of innumerable topological possibilities, only a finite number of specific packing motifs are actually realized in proteins. This small number of motifs could serve as a basis set in the construction of larger networks. Of these, the triplet clique exhibits distinct preference both in terms of composition and geometry.
|
23
|
Sun W, He J. From isotropic to anisotropic side chain representations: comparison of three models for residue contact estimation. PLoS One 2011; 6:e19238. [PMID: 21552527 PMCID: PMC3084275 DOI: 10.1371/journal.pone.0019238] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/10/2010] [Accepted: 03/29/2011] [Indexed: 11/19/2022] Open
Abstract
The criterion used to determine residue contact is a fundamental problem in deriving knowledge-based mean-force potential energy calculations for protein structures. A frequently used criterion is to require the side chain center-to-center distance or an atom-to-atom distance to be within a pre-determined cutoff distance. However, the spatially anisotropic nature of the side chain makes it challenging to identify the contact pairs. This study compares three side chain contact models: the Atom Distance Criteria (ADC) model, the Isotropic Sphere Side chain (ISS) model and the Anisotropic Ellipsoid Side chain (AES) model, using 424 high resolution protein structures in the Protein Data Bank. The results indicate that the ADC model is the most accurate and the ISS model is the worst. The AES model eliminates about 95% of the incorrectly counted contact pairs in the ISS model. Algorithm analysis shows that the AES model is the most computationally intensive, while the ADC model has moderate computational cost. We derived a dataset of the contact pairs mis-estimated by the AES model. The most misjudged pairs are Arg-Glu, Arg-Asp and Arg-Tyr. Such a dataset can be useful for developing an improved AES model by incorporating pair-specific information for the cutoff distance.
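The difference between an atom-distance criterion and a center-to-center criterion can be shown with a simplified sketch. These two functions are stand-ins for the ADC and ISS ideas, not the models from the paper, and the ellipsoid (AES) model is not attempted. All names, coordinates and cutoff values are hypothetical.

```python
import math
from itertools import product

def centroid(atoms):
    """Geometric center of a list of (x, y, z) atom positions."""
    n = len(atoms)
    return tuple(sum(atom[k] for atom in atoms) / n for k in range(3))

def contact_by_atoms(res_a, res_b, cutoff):
    """Contact if any atom of res_a lies within `cutoff` of any atom of res_b."""
    return any(math.dist(p, q) <= cutoff for p, q in product(res_a, res_b))

def contact_by_centroids(res_a, res_b, cutoff):
    """Contact if the side chain centroids lie within `cutoff` of each other."""
    return math.dist(centroid(res_a), centroid(res_b)) <= cutoff

# Two elongated toy side chains, crossed at right angles and offset in z.
res_a = [(-5.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
res_b = [(0.0, -5.0, 3.0), (0.0, 5.0, 3.0)]

print(contact_by_centroids(res_a, res_b, cutoff=6.5))  # centroids are 3.0 apart
print(contact_by_atoms(res_a, res_b, cutoff=4.0))      # closest atoms are ~7.7 apart
```

In this toy case the centroid test reports a contact that no atom pair supports, which is the kind of overcounting by an isotropic model that the abstract describes.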
Affiliation(s)
- Weitao Sun
- Zhou Pei-Yuan Center for Applied Mathematics, Tsinghua University, Beijing, China.
|
24
|
Chen H, Kihara D. Effect of using suboptimal alignments in template-based protein structure prediction. Proteins 2011; 79:315-34. [PMID: 21058297 PMCID: PMC3058269 DOI: 10.1002/prot.22885] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]
Abstract
Computational protein structure prediction remains a challenging task in protein bioinformatics. In recent years, the importance of template-based structure prediction has been increasing because of the growing number of protein structures solved by the structural genomics projects. To capitalize on the significant efforts and investments made in the structural genomics projects, it is urgent to establish effective ways to use the solved structures as templates by developing methods for exploiting remotely related proteins that cannot be simply identified by homology. In this work, we examine the effect of using suboptimal alignments in template-based protein structure prediction. We show that suboptimal alignments are often more accurate than the optimal one, and that such accurate suboptimal alignments can occur even at a very low rank of the alignment score. Suboptimal alignments contain a significant number of correct amino acid residue contacts. Moreover, suboptimal alignments can improve template-based models when used as input to Modeller. Finally, we use suboptimal alignments for handling a contact potential in a probabilistic way in a threading program, SUPRB. The probabilistic contacts strategy outperforms the partly thawed approach, which only uses the optimal alignment in defining residue contacts, and also the re-ranking strategy, which uses the contact potential in re-ranking alignments. The comparison with existing methods in the template-recognition test shows that SUPRB is very competitive and outperforms existing methods.
Affiliation(s)
- Hao Chen
- Department of Biological Sciences College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Daisuke Kihara
- Department of Biological Sciences College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science College of Science, Purdue University, West Lafayette, IN, 47907, USA
- Markey Center for Structural Biology College of Science, Purdue University, West Lafayette, IN, 47907, USA
|
25
|
Di Lena P, Fariselli P, Margara L, Vassura M, Casadio R. Fast overlapping of protein contact maps by alignment of eigenvectors. Bioinformatics 2010; 26:2250-8. [PMID: 20610612 DOI: 10.1093/bioinformatics/btq402] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/13/2022]
Abstract
MOTIVATION Searching for structural similarity is a key issue in protein functional annotation. The maximum contact map overlap (CMO) is one of the possible measures of protein structure similarity. Exact and approximate methods known to optimize the CMO are computationally expensive, and this hampers their applicability to large-scale comparison of protein structures. RESULTS In this article, we describe a heuristic algorithm (Al-Eigen) for finding a solution to the CMO problem. Our approach relies on the approximation of contact maps by eigendecomposition. We obtain good overlaps of two contact maps by computing the optimal global alignment of a few principal eigenvectors. Our algorithm is simple and fast, and its running time is independent of the number of contacts in the map. Experimental testing indicates that the algorithm is comparable to exact CMO methods in terms of overlap quality and to structural alignment methods in terms of structure similarity detection, and that it is fast enough to be suited for large-scale comparison of protein structures. Furthermore, our preliminary tests indicate that it is quite robust to noise, which makes it suitable for structural similarity detection even for noisy and incomplete contact maps. AVAILABILITY Available at http://bioinformatics.cs.unibo.it/Al-Eigen.
Affiliation(s)
- Pietro Di Lena
- Department of Computer Science, University of Bologna, Bologna, Italy.
|
26
|
Solis AD, Rackovsky SR. Information-theoretic analysis of the reference state in contact potentials used for protein structure prediction. Proteins 2010; 78:1382-97. [PMID: 20034109 DOI: 10.1002/prot.22652] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
Abstract
Using information-theoretic concepts, we examine the role of the reference state, a crucial component of empirical potential functions, in protein fold recognition. We derive an information-based connection between the probability distribution functions of the reference state and those that characterize the decoy set used in threading. In examining commonly used contact reference states, we find that the quasi-chemical approximation is informatically superior to other variant models designed to include characteristics of real protein chains, such as finite length and variable amino acid composition from protein to protein. We observe that in these variant models, the total divergence, the operative function that quantifies discrimination, decreases along with threading performance. We find that encoding any amount of nativeness in the reference state model does not significantly improve threading performance. A promising avenue for the development of better potentials is suggested by our information-theoretic analysis of the action of contact potentials on individual protein sequences. Our results show that contact potentials perform better when the compositional properties of the data set used to derive the score function probabilities are similar to the properties of the sequence of interest. The results also suggest using only sequences of similar composition in deriving contact potentials, so as to tailor the contact potential specifically to a test sequence.
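The role of the reference state can be illustrated with a sketch of a contact potential under a quasi-chemical-style reference. It assumes the common textbook form in which the expected number of contacts between residue types a and b is proportional to the product of their mole fractions, and the potential is the negative log of the observed-to-expected ratio. This form, the function name and the toy counts are assumptions for illustration, not the authors' implementation.

```python
import math
from collections import Counter

def quasi_chemical_potential(contacts, composition):
    """Contact energies e(a, b) = -ln(N_obs / N_exp), with N_exp taken from
    the mole fractions of the residue types (quasi-chemical-style reference)."""
    total_residues = sum(composition.values())
    mole_fraction = {a: n / total_residues for a, n in composition.items()}

    # Treat contacts as unordered pairs.
    observed = Counter(tuple(sorted(pair)) for pair in contacts)
    total_contacts = sum(observed.values())

    potential = {}
    for (a, b), n_obs in observed.items():
        multiplicity = 1 if a == b else 2  # (a, b) and (b, a) are the same pair
        n_exp = total_contacts * multiplicity * mole_fraction[a] * mole_fraction[b]
        potential[(a, b)] = -math.log(n_obs / n_exp)
    return potential

# Hypothetical toy data: observed contacts and residue counts.
contacts = [("L", "L"), ("L", "L"), ("L", "K"), ("K", "K")]
composition = {"L": 50, "K": 50}

pot = quasi_chemical_potential(contacts, composition)
for pair, energy in sorted(pot.items()):
    print(pair, round(energy, 3))
```

Pairs seen more often than the reference predicts receive negative (favourable) energies. Deriving `composition` only from sequences similar to the target is one way to read the tailoring suggestion at the end of the abstract.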
Affiliation(s)
- Armando D Solis
- Department of Pharmacology and Systems Therapeutics, Mount Sinai School of Medicine, New York, New York 10029, USA.
|
27
|
Sun W, He J. Understanding on the residue contact network using the log-normal cluster model and the multilevel wheel diagram. Biopolymers 2010; 93:904-16. [DOI: 10.1002/bip.21494] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
|
28
|
McAllister SR, Floudas CA. An improved hybrid global optimization method for protein tertiary structure prediction. COMPUTATIONAL OPTIMIZATION AND APPLICATIONS 2010; 45:377-413. [PMID: 20357906 PMCID: PMC2847311 DOI: 10.1007/s10589-009-9277-y] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/29/2023]
Abstract
First principles approaches to the protein structure prediction problem must search through an enormous conformational space to identify low-energy, near-native structures. In this paper, we describe the formulation of the tertiary structure prediction problem as a nonlinear constrained minimization problem, where the goal is to minimize the energy of a protein conformation subject to constraints on torsion angles and interatomic distances. The core of the proposed algorithm is a hybrid global optimization method that combines the benefits of the αBB deterministic global optimization approach with conformational space annealing. These global optimization techniques employ a local minimization strategy that combines torsion angle dynamics and rotamer optimization to identify and improve the selection of initial conformations and then applies a sequential quadratic programming approach to further minimize the energy of the protein conformations subject to constraints. The proposed algorithm demonstrates the ability to identify both lower energy protein structures, as well as larger ensembles of low-energy conformations.
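The constrained minimization described in this abstract can be written schematically as follows. The notation is assumed here for illustration and is not taken from the paper: \(\phi\) denotes the vector of torsion angles, \(E\) the energy function, \(d_{ij}(\phi)\) the distance between atoms \(i\) and \(j\) in the conformation defined by \(\phi\), \(\mathcal{D}\) the set of restrained atom pairs, and the superscripts \(L\) and \(U\) lower and upper bounds.

```latex
\begin{aligned}
\min_{\phi} \quad & E(\phi) \\
\text{subject to} \quad & \phi_k^{L} \le \phi_k \le \phi_k^{U}, && k = 1, \dots, K, \\
& d_{ij}^{L} \le d_{ij}(\phi) \le d_{ij}^{U}, && (i, j) \in \mathcal{D}.
\end{aligned}
```

The energy and the distance constraints are in general nonconvex in \(\phi\), which is why the abstract turns to global optimization rather than a single local minimization.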
|
29
|
Gertsman I, Komives EA, Johnson JE. HK97 maturation studied by crystallography and H/2H exchange reveals the structural basis for exothermic particle transitions. J Mol Biol 2010; 397:560-74. [PMID: 20093122 DOI: 10.1016/j.jmb.2010.01.016] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/18/2009] [Revised: 12/29/2009] [Accepted: 01/07/2010] [Indexed: 12/27/2022]
Abstract
HK97 is an exceptionally amenable system for characterizing major conformational changes associated with capsid maturation in double-stranded DNA bacteriophage. HK97 undergoes a capsid expansion of approximately 20%, accompanied by major subunit rearrangements during genome packaging. A previous 3.44-Å-resolution crystal structure of the mature capsid Head II and cryo-electron microscopy studies of other intermediate expansion forms of HK97 suggested that, primarily, rigid-body movements facilitated the maturation process. We recently reported a 3.65-Å-resolution structure of the preexpanded particle form Prohead II (P-II) and found that the capsid subunits undergo significant refolding and twisting of the tertiary structure to accommodate expansion. The P-II study focused on major twisting motions in the P-domain and on refolding of the spine helix during the transition. Here we extend the crystallographic comparison between P-II and Head II, characterizing the refolding events occurring in each of the four major domains of the capsid subunit and their effect on quaternary structure stabilization. In addition, hydrogen/deuterium exchange, coupled to mass spectrometry, was used to characterize the structural dynamics of three distinct capsid intermediates: P-II, Expansion Intermediate, and the nearly mature Head I. Differences in the solvent accessibilities of the seven quasi-equivalent capsid subunits, attributed to differences in secondary and quaternary structures, were observed in P-II. Nearly all differences in solvent accessibility among subunits disappear after the first transition to Expansion Intermediate. We show that most of the refolding is coupled to this transformation, an event associated with the transition from asymmetric to symmetric hexamers.
Affiliation(s)
- Ilya Gertsman
- Department of Molecular Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
|
30
|
Sundaramurthy P, Shameer K, Sreenivasan R, Gakkhar S, Sowdhamini R. HORI: a web server to compute Higher Order Residue Interactions in protein structures. BMC Bioinformatics 2010; 11 Suppl 1:S24. [PMID: 20122196 PMCID: PMC3009495 DOI: 10.1186/1471-2105-11-s1-s24] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Folding of a protein into its three-dimensional structure is influenced by both local and global interactions within the protein. Higher order residue interactions, such as pairwise, triplet and quadruplet ones, play a vital role in attaining the stable conformation of the protein structure. It is generally agreed that higher order interactions make a significant contribution to the potential energy landscape of folded proteins, and it is therefore important to identify them in order to estimate their contributions to the overall stability of a protein structure. RESULTS We developed HORI [Higher order residue interactions in proteins], a web server for the calculation of global and local higher order interactions in protein structures. The basic algorithm of HORI is designed based on the classical concept of four-body nearest-neighbour propensities of amino-acid residues. It has been shown that higher order residue interactions up to the level of quadruplet interactions play a major role in the three-dimensional structure of proteins and are an important feature that can be used in protein structure analysis. CONCLUSION The HORI server will be a useful resource for the structural bioinformatics community to perform analysis on protein structures based on higher order residue interactions. It is a highly interactive web server designed in three modules that enables the user to analyse higher order residue interactions in protein structures. The HORI server is available from the URL: http://caps.ncbs.res.in/hori.
Affiliation(s)
- Pandurangan Sundaramurthy
- National Centre for Biological Sciences (TIFR), GKVK Campus, Bellary Road, Bangalore, 560065, India.
|
31
|
Abstract
This paper discusses recent optimization approaches to the protein side-chain prediction problem, protein structural alignment, and molecular structure determination from X-ray diffraction measurements. The machinery employed to solve these problems has included algorithms from linear programming, dynamic programming, combinatorial optimization, and mixed-integer nonlinear programming. Many of these problems are purely continuous in nature. Yet, to this date, they have been approached mostly via combinatorial optimization algorithms that are applied to discrete approximations. The main purpose of the paper is to offer an introduction and motivate further systems approaches to these problems.
Affiliation(s)
- Nikolaos V. Sahinidis
- Department of Chemical Engineering Carnegie Mellon University, Pittsburgh, PA 15213, USA
|
32
|
Structure and function of Plasmodium falciparum malate dehydrogenase: role of critical amino acids in co-substrate binding pocket. Biochimie 2009; 91:1509-17. [PMID: 19772885 DOI: 10.1016/j.biochi.2009.09.005] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2009] [Accepted: 09/11/2009] [Indexed: 11/24/2022]
Abstract
The malaria parasite thrives on anaerobic fermentation of glucose for energy. Earlier studies from our laboratory have demonstrated that a cytosolic malate dehydrogenase (PfMDH) with striking similarity to lactate dehydrogenase (PfLDH) might complement PfLDH function in Plasmodium falciparum. The N-terminal glycine motif, which forms a characteristic Rossmann dinucleotide-binding fold in the co-substrate binding pocket, differentiates PfMDH (GlyXGlyXXGly) from other eukaryotic and prokaryotic malate dehydrogenases (GlyXXGlyXXGly). The amino acids lining the co-substrate binding pocket are completely conserved in MDHs from different species of human, primate and rodent malaria parasites. Based on this knowledge and conserved domains among prokaryotic and eukaryotic MDH, the role of critical amino acids lining the co-substrate binding pocket was analyzed in catalytic functions of PfMDH using site-directed mutagenesis. Insertion of Ala at the 9th or 10th position, which converts the N-terminal GlyXGlyXXGly motif (characteristic of malarial MDH and LDH) to GlyXXGlyXXGly (as in bacterial and eukaryotic MDH), uncoupled regulation of the enzyme through substrate inhibition. The dinucleotide fold GlyXGlyXXGly motif seems not to be responsible for the distinct affinity of PfMDH to 3-acetylpyridine-adenine dinucleotide (APAD, a synthetic analog of NAD), since Ala9 and Ala10 insertion mutants still utilized APADH. The Gln11Met mutation, which converts the signature glycine motif in PfMDH to that of PfLDH, did not change the enzyme function. However, the Gln11Gly mutant showed approximately a 5-fold increase in catalytic activity, and higher susceptibility to inhibition with gossypol. Asn119 and His174 participate in binding of both co-substrate and substrate. The Asn119Gly mutant exhibited approximately a 3-fold decrease in catalytic efficiency, while mutation of His174 to Asn or Ala resulted in an inactive enzyme. These studies provide critical insights into the co-substrate binding pocket of PfMDH, which may be important in the design of selective PfMDH/PfLDH inhibitors as potential antimalarials.
|
33
|
Lobanov MY, Finkel’shtein AV. Analogy-based protein structure prediction: II. Testing of substitution matrices and pseudopotentials used to align protein sequences with spatial structures. Mol Biol 2009. [DOI: 10.1134/s0026893309040207] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
34
|
Lobanov MY, Bogatyreva NS, Ivankov DN, Finkel’shtein AV. Analogy-based protein structure prediction: I. A new database of spatially similar and dissimilar structures of protein domains for testing and optimizing prediction methods. Mol Biol 2009. [DOI: 10.1134/s0026893309040190] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/23/2022]
|
35
|
Mazereeuw-Hautier J, Aufenvenne K, Deraison C, Ahvazi B, Oji V, Traupe H, Hovnanian A. Acral self-healing collodion baby: report of a new clinical phenotype caused by a novel TGM1 mutation. Br J Dermatol 2009; 161:456-63. [DOI: 10.1111/j.1365-2133.2009.09277.x] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]
|
36
|
Vishnepolsky B, Pirtskhalava M. ALIGN_MTX--an optimal pairwise textual sequence alignment program, adapted for using in sequence-structure alignment. Comput Biol Chem 2009; 33:235-8. [PMID: 19477686 DOI: 10.1016/j.compbiolchem.2009.04.003] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/28/2008] [Revised: 03/26/2009] [Accepted: 04/23/2009] [Indexed: 11/19/2022]
Abstract
The presented program ALIGN_MTX makes alignment of two textual sequences with an opportunity to use any several characters for the designation of sequence elements and arbitrary user substitution matrices. It can be used not only for the alignment of amino acid and nucleotide sequences but also for sequence-structure alignment used in threading, amino acid sequence alignment, using preliminary known PSSM matrix, and in other cases when alignment of biological or non-biological textual sequences is required. This distinguishes it from the majority of similar alignment programs that make, as a rule, alignment only of amino acid or nucleotide sequences represented as a sequence of single alphabetic characters. ALIGN_MTX is presented as downloadable zip archive at http://www.imbbp.org/software/ALIGN_MTX/ and available for free use. As application of using the program, the results of comparison of different types of substitution matrix for alignment quality in distantly related protein pair sets were presented. Threading matrix SORDIS, based on side-chain orientation in relation to hydrophobic core centers with evolutionary change-based substitution matrix BLOSUM and using multiple sequence alignment information position-specific score matrices (PSSM) were taken for test alignment accuracy. The best performance shows PSSM matrix, but in the reduced set with lower sequence similarity threading matrix SORDIS shows the same performance and it was shown that combined potential with SORDIS and PSSM can improve alignment quality in evolutionary distantly related protein pairs.
Affiliation(s)
- Boris Vishnepolsky
- Institute of Molecular Biology and Biological Physics, 12 Gotua St., Tbilisi, 0160, Georgia.
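The abstract above describes aligning two textual sequences whose elements may be written with several characters and scored with an arbitrary user-supplied substitution matrix. A minimal sketch of that idea follows, assuming a standard Needleman-Wunsch global alignment with a linear gap penalty; the function name `global_align`, the `gap` and `default` parameters, and the token labels are illustrative choices, not ALIGN_MTX's actual interface or algorithm.

```python
from typing import Dict, List, Sequence, Tuple

Token = str
Matrix = Dict[Tuple[Token, Token], float]


def global_align(
    a: Sequence[Token],
    b: Sequence[Token],
    subst: Matrix,
    gap: float = -2.0,       # assumed linear gap penalty
    default: float = -1.0,   # assumed score for pairs missing from the matrix
) -> Tuple[float, List[Token], List[Token]]:
    """Globally align two token sequences (tokens may be multi-character)."""

    def s(x: Token, y: Token) -> float:
        # Look the pair up in either order so the matrix can be given half-filled.
        return subst.get((x, y), subst.get((y, x), default))

    n, m = len(a), len(b)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    ptr = [[""] * (m + 1) for _ in range(n + 1)]  # "D" diagonal, "U" up, "L" left

    for i in range(1, n + 1):
        score[i][0], ptr[i][0] = i * gap, "U"
    for j in range(1, m + 1):
        score[0][j], ptr[0][j] = j * gap, "L"

    for i in range(1, n + 1):
        for j in range(1, m + 1):
            candidates = [
                (score[i - 1][j - 1] + s(a[i - 1], b[j - 1]), "D"),
                (score[i - 1][j] + gap, "U"),
                (score[i][j - 1] + gap, "L"),
            ]
            # max() keeps the first best candidate, so ties prefer the diagonal.
            score[i][j], ptr[i][j] = max(candidates, key=lambda c: c[0])

    # Trace back from the bottom-right corner using the stored pointers.
    out_a: List[Token] = []
    out_b: List[Token] = []
    i, j = n, m
    while i > 0 or j > 0:
        move = ptr[i][j]
        if move == "D":
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif move == "U":
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    out_a.reverse()
    out_b.reverse()
    return score[n][m], out_a, out_b
```

Because sequences are passed as lists of tokens rather than strings of single letters, the same routine can in principle align structural-state labels or profile columns, provided a matching substitution matrix is supplied.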
|
37
|
da Silveira CH, Pires DEV, Minardi RC, Ribeiro C, Veloso CJM, Lopes JCD, Meira W, Neshich G, Ramos CHI, Habesch R, Santoro MM. Protein cutoff scanning: A comparative analysis of cutoff dependent and cutoff free methods for prospecting contacts in proteins. Proteins 2009; 74:727-43. [PMID: 18704933 DOI: 10.1002/prot.22187] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.5] [Indexed: 11/11/2022]
Affiliation(s)
- Carlos H da Silveira
- Department of Biochemistry and Immunology, Institute of Biological Sciences, Federal University of Minas Gerais, UFMG, Brazil.
|
38
|
Pradhan A, Mukherjee P, Tripathi AK, Avery MA, Walker LA, Tekwani BL. Analysis of quaternary structure of a [LDH-like] malate dehydrogenase of Plasmodium falciparum with oligomeric mutants. Mol Cell Biochem 2009; 325:141-8. [PMID: 19184366 DOI: 10.1007/s11010-009-0028-2] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/04/2008] [Accepted: 01/15/2009] [Indexed: 10/21/2022]
Abstract
L-Malate dehydrogenase (PfMDH) from Plasmodium falciparum, the causative agent for the most severe form of malaria, has shown remarkable similarities to L-lactate dehydrogenase (PfLDH). PfMDH is more closely related to [LDH-like] MDHs characterized in archaea and other prokaryotes. Initial sequence analysis and identification of critical amino acid residues involved in inter-subunit salt-bridge interactions predict tetrameric structure for PfMDH. The catalytically active recombinant PfMDH was characterized as a tetramer. The enzyme is localized primarily in the parasite's cytosol. To gain molecular insights into PfMDH/PfLDH relationships and to understand the quaternary structure of PfMDH, dimers were generated by mutation to the potential salt-bridge interacting sites. The R183A and R214G mutations, which snapped the salt bridges between the dimers and resulted in lower dimeric state, did not affect catalytic properties of the enzyme. The mutant dimers of PfMDH were as active as the wild-type PfMDH. The studies reveal structure of PfMDH as a dimer of dimers. The tetrameric state of PfMDH was not essential for catalytic functions of the enzyme but may be an evolutionary adaptation for cytosolic localization to support its role in NAD/NADH coupling, an important metabolic function for survival of the malaria parasite.
Affiliation(s)
- Anupam Pradhan
- National Center for Natural Products Research, School of Pharmacy, University of Mississippi, University, MS 38677, USA
|
39
|
Wu S, Zhang Y. MUSTER: Improving protein sequence profile-profile alignments by using multiple sources of structure information. Proteins 2008; 72:547-56. [PMID: 18247410 DOI: 10.1002/prot.21945] [Citation(s) in RCA: 310] [Impact Index Per Article: 19.4] [Indexed: 11/09/2022]
Abstract
We develop a new threading algorithm MUSTER by extending the previous sequence profile-profile alignment method, PPA. It combines various sequence and structure information into single-body terms which can be conveniently used in dynamic programming search: (1) sequence profiles; (2) secondary structures; (3) structure fragment profiles; (4) solvent accessibility; (5) dihedral torsion angles; (6) hydrophobic scoring matrix. The balance of the weighting parameters is optimized by a grading search based on the average TM-score of 111 training proteins which shows a better performance than using the conventional optimization methods based on the PROSUP database. The algorithm is tested on 500 nonhomologous proteins independent of the training sets. After removing the homologous templates with a sequence identity to the target >30%, in 224 cases, the first template alignment has the correct topology with a TM-score >0.5. Even with a more stringent cutoff by removing the templates with a sequence identity >20% or detectable by PSI-BLAST with an E-value <0.05, MUSTER is able to identify correct folds in 137 cases with the first model of TM-score >0.5. Dependent on the homology cutoffs, the average TM-score of the first threading alignments by MUSTER is 5.1-6.3% higher than that by PPA. This improvement is statistically significant by the Wilcoxon signed rank test with a P-value < 1.0 × 10^-13, which demonstrates the effect of additional structural information on the protein fold recognition. The MUSTER server is freely available to the academic community at http://zhang.bioinformatics.ku.edu/MUSTER.
Affiliation(s)
- Sitao Wu
- Center for Bioinformatics and Department of Molecular Bioscience, University of Kansas, 2030 Becker Dr, Lawrence, Kansas 66047, USA
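The abstract above notes that MUSTER folds several sources of information into single-body terms, which is what lets them be used in a dynamic programming search: each term depends only on one query position and one template position. The sketch below illustrates that property with three of the six listed terms. The class names, the functional form of each term, and the weights `w_ss`, `w_sa` and `shift` are assumptions made for illustration and are not MUSTER's published scoring function or parameters.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QueryPos:
    freq: Dict[str, float]  # amino-acid frequencies from the query profile
    ss: str                 # predicted secondary structure: "H", "E" or "C"
    sa: float               # predicted relative solvent accessibility, 0..1


@dataclass
class TemplatePos:
    logodds: Dict[str, float]  # log-odds profile of the template position
    ss: str                    # observed secondary structure
    sa: float                  # observed relative solvent accessibility, 0..1


def pair_score(q: QueryPos, t: TemplatePos,
               w_ss: float = 0.5, w_sa: float = 0.5, shift: float = 0.0) -> float:
    """Score one (query position, template position) pair.

    Every term uses only q and t, so the total is a single-body score.
    """
    profile = sum(f * t.logodds.get(aa, 0.0) for aa, f in q.freq.items())
    ss_term = 1.0 if q.ss == t.ss else -1.0   # hypothetical match/mismatch reward
    sa_term = 1.0 - 2.0 * abs(q.sa - t.sa)    # hypothetical: 1 when equal, -1 when opposite
    return profile + w_ss * ss_term + w_sa * sa_term + shift


def score_matrix(query: List[QueryPos], template: List[TemplatePos],
                 **weights: float) -> List[List[float]]:
    """Precompute the full position-by-position score table for a DP aligner."""
    return [[pair_score(q, t, **weights) for t in template] for q in query]
```

The resulting table can be handed to any standard global or local alignment recursion in place of a substitution matrix lookup; tuning the weights against a benchmark, as the abstract describes, is outside the scope of this sketch.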
|
40
|
Abstract
The long-standing problem of constructing protein structure alignments is of central importance in computational biology. The main goal is to provide an alignment of residue correspondences, in order to identify homologous residues across chains. A critical next step of this is the alignment of protein complexes and their interfaces. Here, we introduce the program CMAPi, a two-dimensional dynamic programming algorithm that, given a pair of protein complexes, optimally aligns the contact maps of their interfaces: it produces polynomial-time near-optimal alignments in the case of multiple complexes. We demonstrate the efficacy of our algorithm on complexes from PPI families listed in the SCOPPI database and from highly divergent cytokine families. In comparison to existing techniques, CMAPi generates more accurate alignments of interacting residues within families of interacting proteins, especially for sequences with low similarity. While previous methods that use an all-atom based representation of the interface have been successful, CMAPi's use of a contact map representation allows it to be more tolerant to conformational changes and thus to align more of the interaction surface. These improved interface alignments should enhance homology modeling and threading methods for predicting PPIs by providing a basis for generating template profiles for sequence-structure alignment.
Affiliation(s)
- Vinay Pulim
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, USA
|
41
|
Structural insights into the Plasmodium falciparum histone deacetylase 1 (PfHDAC-1): A novel target for the development of antimalarial therapy. Bioorg Med Chem 2008; 16:5254-65. [DOI: 10.1016/j.bmc.2008.03.005] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Received: 01/14/2008] [Revised: 02/27/2008] [Accepted: 03/03/2008] [Indexed: 11/20/2022]
|
42
|
Mukherjee P, Desai PV, Srivastava A, Tekwani BL, Avery MA. Probing the structures of leishmanial farnesyl pyrophosphate synthases: homology modeling and docking studies. J Chem Inf Model 2008; 48:1026-40. [PMID: 18419114 DOI: 10.1021/ci700355z] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Indexed: 11/29/2022]
Abstract
Leishmania donovani and Leishmania major farnesyl pyrophosphate synthase (LdFPPS and LmFPPS) are potential targets for the development of antileishmanial therapy. The protein sequence for LdFPPS was recently elucidated in our laboratory. Highly refined homology models were generated using the protein sequences of LdFPPS and the closely related LmFPPS enzyme. A ligand-refined model of LmFPPS with a bound bisphosphonate ligand was generated using restraint-guided molecular mechanics followed by quantum mechanics/molecular mechanics refinement. The ligand-refined model of LmFPPS was further validated through extensive pose validation, enrichment, and other docking studies involving known bisphosphonate inhibitors. The model was able to explain the critical binding site interactions and site-directed mutagenesis data obtained from experimental studies on related FPPS enzymes. The ligand-refined model in conjunction with the validated docking protocol could be utilized in the future for structure-based virtual screening and rational drug design studies against these targets.
Affiliation(s)
- Prasenjit Mukherjee
- Department of Medicinal Chemistry, School of Pharmacy, University of Mississippi, University, Mississippi 38677, USA
|
43
|
Vishnepolsky B, Managadze G, Pirtskhalava M. Comparison of the efficiency of evolutionary change-based and side chain orientation-based fold recognition potentials. Proteins 2008; 71:1863-78. [PMID: 18175309 DOI: 10.1002/prot.21871] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
The present article describes residue level knowledge based potential SORDIS. SORDIS incorporates the information on side-chain orientation in relation to hydrophobic core centres, distance of residue from the globule centre and secondary structure. SORDIS has been tested and compared with widespread evolutionary change-based substitution matrices (BLOSUM, PAM, GONNET, Johnson-Overington, BLAJ, HSDM, and STROMA) in fold recognition experiments within the zone of weak sequence similarity (<16%). The obtained results show that the lower is the amino acid similarity between homologous pairs the higher is the performance of SORDIS in comparison with the potentials, based on the information about the evolutionary changes. Therefore, we propose that the employment of SORDIS in fold recognition can be useful.
Affiliation(s)
- Boris Vishnepolsky
- Institute of Molecular Biology and Biological Physics, Tbilisi 0160, Georgia
|
44
|
Abstract
Most newly sequenced proteins are likely to adopt a similar structure to one which has already been experimentally determined. For this reason, the most successful approaches to protein structure prediction have been template-based methods. Such prediction methods attempt to identify and model the folds of unknown structures by aligning the target sequences to a set of representative template structures within a fold library. In this chapter, I discuss the development of template-based approaches to fold prediction, from the traditional techniques to the recent state-of-the-art methods. I also discuss the recent development of structural annotation databases, which contain models built by aligning the sequences from entire proteomes against known structures. Finally, I run through a practical step-by-step guide for aligning target sequences to known structures and contemplate the future direction of template-based structure prediction.
|
45
|
|
46
|
A historical perspective of template-based protein structure prediction. Methods Mol Biol 2008; 413:3-42. [PMID: 18075160 DOI: 10.1007/978-1-59745-574-9_1] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022]
Abstract
This chapter presents a broad and a historical overview of the problem of protein structure prediction. Different structure prediction methods, including homology modeling, fold recognition (FR)/protein threading, ab initio/de novo approaches, and hybrid techniques involving multiple types of approaches, are introduced in a historical context. The progress of the field as a whole, especially in the threading/FR area, as reflected by the CASP/CAFASP contests, is reviewed. At the end of the chapter, we discuss the challenging issues ahead in the field of protein structure prediction.
|
47
|
Xu J, Jiao F, Yu L. Protein structure prediction using threading. Methods Mol Biol 2008; 413:91-121. [PMID: 18075163 DOI: 10.1007/978-1-59745-574-9_4] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Indexed: 12/15/2022]
Abstract
This chapter discusses the protocol for computational protein structure prediction by protein threading. First, we present a general procedure and summarize some typical ideas for each step of protein threading. Then, we describe the design and implementation of RAPTOR, a protein structure prediction program based on threading. The major focuses are three key components of RAPTOR: a linear programming approach to protein threading, two machine learning approaches (SVM and Gradient Boosting) to fold recognition, and evaluation of the statistical significance of the prediction results. The first part of this chapter is a brief review of protein threading, and the second part contains original research results. Some key ideas and results have been previously published.
Affiliation(s)
- Jinbo Xu
- Toyota Technological Institute at Chicago, Chicago, IL, USA
|
48
|
Phan QT, Myers CL, Fu Y, Sheppard DC, Yeaman MR, Welch WH, Ibrahim AS, Edwards JE, Filler SG. Als3 is a Candida albicans invasin that binds to cadherins and induces endocytosis by host cells. PLoS Biol 2007; 5:e64. [PMID: 17311474 PMCID: PMC1802757 DOI: 10.1371/journal.pbio.0050064] [Citation(s) in RCA: 414] [Impact Index Per Article: 24.4] [Received: 10/09/2006] [Accepted: 12/28/2006] [Indexed: 11/19/2022] Open
Abstract
Candida albicans is the most common cause of hematogenously disseminated and oropharyngeal candidiasis. Both of these diseases are characterized by fungal invasion of host cells. Previously, we have found that C. albicans hyphae invade endothelial cells and oral epithelial cells in vitro by inducing their own endocytosis. Therefore, we set out to identify the fungal surface protein and host cell receptors that mediate this process. We found that the C. albicans Als3 is required for the organism to be endocytosed by human umbilical vein endothelial cells and two different human oral epithelial lines. Affinity purification experiments with wild-type and an als3Δ/als3Δ mutant strain of C. albicans demonstrated that Als3 was required for C. albicans to bind to multiple host cell surface proteins, including N-cadherin on endothelial cells and E-cadherin on oral epithelial cells. Furthermore, latex beads coated with the recombinant N-terminal portion of Als3 were endocytosed by Chinese hamster ovary cells expressing human N-cadherin or E-cadherin, whereas control beads coated with bovine serum albumin were not. Molecular modeling of the interactions of the N-terminal region of Als3 with the ectodomains of N-cadherin and E-cadherin indicated that the binding parameters of Als3 to either cadherin are similar to those of cadherin–cadherin binding. Therefore, Als3 is a fungal invasin that mimics host cell cadherins and induces endocytosis by binding to N-cadherin on endothelial cells and E-cadherin on oral epithelial cells. These results uncover the first known fungal invasin and provide evidence that C. albicans Als3 is a molecular mimic of human cadherins.
The fungus Candida albicans is usually a harmless colonizer of human mucosal surfaces. In the mouth, it can cause oropharyngeal candidiasis, also called thrush. In hospitalized and immunocompromised patients, C. albicans can enter the blood stream and be carried throughout the body to cause a disseminated infection, which is associated with a mortality rate of up to 40%. The organism invades the epithelial cell lining of the mouth during oropharyngeal candidiasis and invades the endothelial cell lining of the blood vessels during disseminated candidiasis. We discovered that Als3, a protein expressed on the surface of C. albicans, is required for this invasion process. Cadherins on the surface of human cells normally bind other cadherins for adhesion and signaling; however, we found that Als3 also binds to cadherins on endothelial cells and oral epithelial cells, and this binding induces these host cells to take up the fungus. The structure of Als3 is predicted to be quite similar to that of the two cadherins studied, and the parameters of the binding of Als3 to either cadherin are similar to those of cadherin–cadherin binding. These results suggest that Als3 is a functional and structural mimic of human cadherins, and provide new insights into how C. albicans invades host cells.
Als3 aids the invasion of the fungal pathogen Candida albicans into human host cells by mimicking human cadherins to induce endocytosis.
Affiliation(s)
- Quynh T Phan
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- Carter L Myers
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- Yue Fu
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Donald C Sheppard
- Department of Microbiology and Immunology, McGill University, Montreal, Quebec, Canada
- Michael R Yeaman
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- William H Welch
- Department of Biochemistry, University of Nevada Reno, Reno, Nevada, United States of America
- Ashraf S Ibrahim
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- John E Edwards
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- Scott G Filler
- Department of Medicine, Los Angeles Biomedical Research Institute, Harbor-UCLA Medical Center, Torrance, California, United States of America
- David Geffen School of Medicine, University of California Los Angeles, Los Angeles, California, United States of America
- * To whom correspondence should be addressed. E-mail:
|
49
|
Abstract
Large-scale genome sequencing and structural genomics projects generate numerous sequences and structures for 'hypothetical' proteins without functional characterizations. Detection of homology to experimentally characterized proteins can provide functional clues, but the accuracy of homology-based predictions is limited by the paucity of tools for quantitative comparison of diverging residues responsible for the functional divergence. SURF'S UP! is a web server for analysis of functional relationships in protein families, as inferred from protein surface maps comparison according to the algorithm. It assigns a numerical score to the similarity between patterns of physicochemical features (charge, hydrophobicity) on compared protein surfaces. It allows recognizing clusters of proteins that have similar surfaces, hence presumably similar functions. The server takes as an input a set of protein coordinates and returns files with "spherical coordinates" of proteins in a PDB format and their graphical presentation, a matrix with values of mutual similarities between the surfaces, and the unrooted tree that represents the clustering of similar surfaces, calculated by the neighbor-joining method. SURF'S UP! facilitates the comparative analysis of physicochemical features of the surface, which are the key determinants of the protein function. By concentrating on coarse surface features, SURF'S UP! can work with models obtained from comparative modelling. Although it is designed to analyse the conservation among homologs, it can also be used to compare surfaces of non-homologous proteins with different three-dimensional folds, as long as a functionally meaningful structural superposition is supplied by the user. Another valuable characteristic of our method is the lack of initial assumptions about the functional features to be compared. SURF'S UP! is freely available for academic researchers at http://asia.genesilico.pl/surfs_up/.
Affiliation(s)
- Joanna M Sasin
- Laboratory of Bioinformatics and Protein Engineering, International Institute of Molecular and Cell Biology, Warsaw, Poland.
|
50
|
Biswas P, Zou J, Saven JG. Statistical theory for protein ensembles with designed energy landscapes. J Chem Phys 2007; 123:154908. [PMID: 16252973 DOI: 10.1063/1.2062047] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.2] [Indexed: 11/14/2022] Open
Abstract
Combinatorial protein libraries provide a promising route to investigate the determinants and features of protein folding and to identify novel folding amino acid sequences. A library of sequences based on a pool of different monomer types are screened for folding molecules, consistent with a particular foldability criterion. The number of sequences grows exponentially with the length of the polymer, making both experimental and computational tabulations of sequences infeasible. Herein a statistical theory is extended to specify the properties of sequences having particular values of global energetic quantities that specify their energy landscape. The theory yields the site-specific monomer probabilities. A foldability criterion is derived that characterizes the properties of sequences by quantifying the energetic separation of the target state from low-energy states in the unfolded ensemble and the fluctuations of the energies in the unfolded state ensemble. For a simple lattice model of proteins, excellent agreement is observed between the theory and the results of exact enumeration. The theory may be used to provide a quantitative framework for the design and interpretation of combinatorial experiments.
Affiliation(s)
- Parbati Biswas
- Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
|