1
Chaturvedi S, Khan S, Thakur N, Jangra A, Tiwari S. Genome-wide identification and gene expression analysis of GHMP kinase gene family in banana cv. Rasthali. Mol Biol Rep 2023; 50:9061-9072. [PMID: 37731027 DOI: 10.1007/s11033-023-08743-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/28/2022] [Accepted: 08/08/2023] [Indexed: 09/22/2023]
Abstract
BACKGROUND The GHMP kinase gene family encompasses ATP-dependent kinases that are significantly involved in the biosynthesis of isoprenes and amino acids and in the metabolism of carbohydrates. Banana is a staple tropical crop that is globally consumed but known for high sensitivity to salt, cold, and drought stresses. The GHMP kinases are known to play a significant role during abiotic stresses in plants. The present study emphasizes the role of GHMP kinases in various abiotic stress conditions in banana. METHODS AND RESULTS We identified 12 GHMP kinase (MaGHMP kinase) genes in the banana genome database and observed the conserved Pro-X-X-X-Gly-Leu-X-Ser-Ser-Ala domain in their protein sequences. All genes were found to be involved in ATP-binding and carried kinase activity, consistent with their biological roles in isoprene (27%) and amino acid (20%) biosynthesis. The expression analysis of genes during cold, drought, and salt stress conditions in tissue-culture-grown banana cultivar Rasthali plants showed a significant involvement of MaGHMP kinase genes in these stress conditions. The highest expression of MaGHMP kinase3 (8.5-fold) was noted during cold stress, while MaGHMP kinase1 (25-fold and 40.01-fold) showed maximum expression during drought and salt stress conditions, respectively, in leaf tissue of Rasthali. CONCLUSION Our findings suggest that MaGHMP kinase1 (MaHSK) and MaGHMP kinase3 (MaGlcAK) could be considered promising candidates for mitigating abiotic stresses in banana.
Affiliation(s)
- Siddhant Chaturvedi
- Plant Tissue Culture and Genetic Engineering Lab, Department of Biotechnology, S.A.S. Nagar, Ministry of Science and Technology (Government of India), National Agri-Food Biotechnology Institute (NABI), Sector 81, Knowledge City, Mohali, Punjab, 140306, India
- Department of Biotechnology, Panjab University, Chandigarh, 160014, India
- Department of Botany, Goswami Tulsidas Government Post Graduate College (Bundelkhand University, Jhansi), Karwi, Chitrakoot, Uttar Pradesh, 210205, India
- Shahirina Khan
- Plant Tissue Culture and Genetic Engineering Lab, Department of Biotechnology, S.A.S. Nagar, Ministry of Science and Technology (Government of India), National Agri-Food Biotechnology Institute (NABI), Sector 81, Knowledge City, Mohali, Punjab, 140306, India
- Department of Botany, Central University of Punjab, Bathinda, Punjab, 151001, India
- Neha Thakur
- Plant Tissue Culture and Genetic Engineering Lab, Department of Biotechnology, S.A.S. Nagar, Ministry of Science and Technology (Government of India), National Agri-Food Biotechnology Institute (NABI), Sector 81, Knowledge City, Mohali, Punjab, 140306, India
- Alka Jangra
- Department of Bio and Nanotechnology, Guru Jambheshwar University of Science and Technology, Hisar, Haryana, 125001, India
- Siddharth Tiwari
- Plant Tissue Culture and Genetic Engineering Lab, Department of Biotechnology, S.A.S. Nagar, Ministry of Science and Technology (Government of India), National Agri-Food Biotechnology Institute (NABI), Sector 81, Knowledge City, Mohali, Punjab, 140306, India.
2
Bota PM, Oliva B, Fernandez-Fuentes N. Theoretical 3D Modeling of NLRP3 Inflammasome Complex. Methods Mol Biol 2023; 2696:269-280. [PMID: 37578729 DOI: 10.1007/978-1-0716-3350-2_18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 08/15/2023]
Abstract
The NOD-like receptor pyrin domain containing 3 (NLRP3) is a multidomain protein that plays a key role in the innate immune response. Structures of NLRP3 in different conformational states and bound to cognate partners are available. In this chapter we present an approach to model the oligomeric structure of NLRP3 by homology modeling using multiple templates, symmetry, and refinement. The overall process presented here is an advanced exercise in structural modeling that provides unique insights into the biological role and activation of the NLRP3 oligomer. Finally, the same approach can be easily adapted to the rest of the members of the NLRP family.
Affiliation(s)
- Patricia Mirela Bota
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain
- Baldo Oliva
- Structural Bioinformatics Lab (GRIB-IMIM), Department of Experimental and Health Science, Universitat Pompeu Fabra, Barcelona, Catalonia, Spain.
- Narcis Fernandez-Fuentes
- Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, UK
3
Santos-Martin C, Wang G, Subedi P, Hor L, Totsika M, Paxman JJ, Heras B. Structural bioinformatic analysis of DsbA proteins and their pathogenicity associated substrates. Comput Struct Biotechnol J 2021; 19:4725-4737. [PMID: 34504665 PMCID: PMC8405906 DOI: 10.1016/j.csbj.2021.08.018] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/17/2021] [Revised: 08/12/2021] [Accepted: 08/12/2021] [Indexed: 01/02/2023]
Abstract
The disulfide bond (DSB) forming system, and in particular DsbA, is a key bacterial oxidative folding catalyst. Due to its role in promoting the correct assembly of a wide range of virulence factors required at different stages of the infection process, DsbA is a master virulence rheostat, making it an attractive target for the development of new virulence blockers. Although DSB systems have been extensively studied across different bacterial species, to date, little is known about how DsbA oxidoreductases are able to recognize and interact with such a wide range of substrates. This review summarizes the current knowledge on the DsbA enzymes, with special attention to their interaction with the partner oxidase DsbB and substrates associated with bacterial virulence. The structurally and functionally diverse set of bacterial proteins that rely on DsbA-mediated disulfide bond formation is summarized. Local sequence and secondary structure elements of these substrates are analyzed to identify common elements recognized by DsbA enzymes. This not only provides information on protein folding systems in bacteria but also offers tools for identifying new DsbA substrates and informs current efforts aimed at developing DsbA-targeted antimicrobials.
Affiliation(s)
- Carlos Santos-Martin
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
- Geqing Wang
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
- Pramod Subedi
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
- Lilian Hor
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
- Makrina Totsika
- Centre for Immunology and Infection Control, School of Biomedical Sciences, Queensland University of Technology, Brisbane, Australia
- Jason John Paxman
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
- Begoña Heras
- Department of Biochemistry and Genetics, La Trobe Institute of Molecular Science, La Trobe University, Melbourne, Australia
4
YongE F, GaoShan K. Identify Beta-Hairpin Motifs with Quadratic Discriminant Algorithm Based on the Chemical Shifts. PLoS One 2015; 10:e0139280. [PMID: 26422468 PMCID: PMC4589334 DOI: 10.1371/journal.pone.0139280] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Received: 05/12/2015] [Accepted: 09/09/2015] [Indexed: 01/13/2023]
Abstract
Successful prediction of the beta-hairpin motif will be helpful for understanding fold recognition. Some algorithms have been proposed for the prediction of beta-hairpin motifs. However, the parameters used by these methods were primarily based on amino acid sequences. Here, we propose a novel model for predicting beta-hairpin structure based on chemical shifts. First, we analyzed the statistical distribution of the chemical shifts of six nuclei in non-beta-hairpin and beta-hairpin motifs. Second, we used these chemical shifts as features, combined with three algorithms, to predict beta-hairpin structure. Finally, we achieved the best prediction, namely a sensitivity of 92% and a specificity of 94% with a Matthews correlation coefficient of 0.85, using the quadratic discriminant analysis algorithm, which is clearly superior to the same method for the prediction of beta-hairpin structure from 20 amino acid compositions in three-fold cross-validation. Our findings show that the chemical shift is an effective parameter for beta-hairpin prediction and suggest that quadratic discriminant analysis is a powerful algorithm for the prediction of beta-hairpins.
Affiliation(s)
- Feng YongE
- College of Science, Inner Mongolia Agriculture University, Hohhot, PR China
- Kou GaoShan
- College of Science, Inner Mongolia Agriculture University, Hohhot, PR China
5
Kou G, Feng Y. Identify five kinds of simple super-secondary structures with quadratic discriminant algorithm based on the chemical shifts. J Theor Biol 2015; 380:392-8. [DOI: 10.1016/j.jtbi.2015.06.006] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 04/09/2015] [Revised: 06/02/2015] [Accepted: 06/04/2015] [Indexed: 10/23/2022]
6
Prediction of four kinds of simple supersecondary structures in protein by using chemical shifts. ScientificWorldJournal 2014; 2014:978503. [PMID: 25050407 PMCID: PMC4090465 DOI: 10.1155/2014/978503] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 05/07/2014] [Revised: 06/03/2014] [Accepted: 06/04/2014] [Indexed: 12/23/2022]
Abstract
Knowledge of supersecondary structures can provide important information about the spatial structure of a protein. Some approaches have been developed for the prediction of protein supersecondary structure. However, the features used by these approaches are primarily based on amino acid sequences. In this study, a novel model is presented to predict protein supersecondary structure using chemical shift (CS) information derived from nuclear magnetic resonance (NMR) spectroscopy. Using these CSs as inputs to quadratic discriminant analysis (QD), we achieve an overall prediction accuracy of 77.3%, which is competitive with the same method for predicting supersecondary structures from amino acid compositions in threefold cross-validation. Moreover, our findings suggest that the combined use of different chemical shifts influences the accuracy of prediction.
7
Bonet J, Segura J, Planas-Iglesias J, Oliva B, Fernandez-Fuentes N. Frag’r’Us: knowledge-based sampling of protein backbone conformations for de novo structure-based protein design. Bioinformatics 2014; 30:1935-6. [DOI: 10.1093/bioinformatics/btu129] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022]
8
Bonet J, Planas-Iglesias J, Garcia-Garcia J, Marín-López MA, Fernandez-Fuentes N, Oliva B. ArchDB 2014: structural classification of loops in proteins. Nucleic Acids Res 2013; 42:D315-9. [PMID: 24265221 PMCID: PMC3964960 DOI: 10.1093/nar/gkt1189] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 01/07/2023]
Abstract
The function of a protein is determined by its three-dimensional structure, which is formed by regular (i.e. β-strands and α-helices) and non-periodic structural units such as loops. Compared to regular structural elements, non-periodic, non-repetitive conformational units enclose a much higher degree of variability—raising difficulties in the identification of regularities, and yet represent an important part of the structure of a protein. Indeed, loops often play a pivotal role in the function of a protein and different aspects of protein folding and dynamics. Therefore, the structural classification of protein loops is an important subject with clear applications in homology modelling, protein structure prediction, protein design (e.g. enzyme design and catalytic loops) and function prediction. ArchDB, the database presented here (freely available at http://sbi.imim.es/archdb), represents such a resource and has been an important asset for the scientific community throughout the years. In this article, we present a completely reworked and updated version of ArchDB. The new version of ArchDB features a novel, fast and user-friendly web-based interface, and a novel graph-based, computationally efficient, clustering algorithm. The current version of ArchDB classifies 149,134 loops in 5739 classes and 9608 subclasses.
Affiliation(s)
- Jaume Bonet
- Structural Bioinformatics Lab (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Barcelona, Catalonia, 08950, Spain and Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, SY23 3DA Aberystwyth, Ceredigion, UK
9
Abstract
DNAs and proteins are major classes of biomolecules that differ in many aspects. However, a considerable number of their members also share a common architectural feature that enables the assembly of multi-protein complexes and thereby permits the effective processing of signals: loop structures of substantial sizes. Here we briefly review a few representative examples and suggest a functional classification of different types of loop structures. In proteins, these loops occur in protein regions classified as intrinsically disordered. Studying such loops, their binders and their interactions with other loops should reveal much about cellular information computation and signaling network architectures. It is also expected to provide critical information for synthetic biologists and bioengineers.
Affiliation(s)
- Stephan M Feller
- Biological Systems Architecture Group, Weatherall Institute of Molecular Medicine, University of Oxford, Oxford, OX3 9DS, UK.
10
Schumacher MA, Min J, Link TM, Guan Z, Xu W, Ahn YH, Soderblom EJ, Kurie JM, Evdokimov A, Moseley MA, Lewis K, Brennan RG. Role of unusual P loop ejection and autophosphorylation in HipA-mediated persistence and multidrug tolerance. Cell Rep 2012; 2:518-25. [PMID: 22999936 DOI: 10.1016/j.celrep.2012.08.013] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.2] [Received: 06/03/2012] [Revised: 07/17/2012] [Accepted: 08/15/2012] [Indexed: 12/29/2022]
Abstract
HipA is a bacterial serine/threonine protein kinase that phosphorylates targets, bringing about persistence and multidrug tolerance. Autophosphorylation of residue Ser150 is a critical regulatory mechanism of HipA function. Intriguingly, Ser150 is not located on the activation loop, as are other kinases; instead, it is in the protein core, where it forms part of the ATP-binding "P loop motif." How this buried residue is phosphorylated and regulates kinase activity is unclear. Here, we report multiple structures that reveal the P loop motif's exhibition of a remarkable "in-out" conformational equilibrium, which allows access to Ser150 and its intermolecular autophosphorylation. Phosphorylated Ser150 stabilizes the "out state," which inactivates the kinase by disrupting the ATP-binding pocket. Thus, our data reveal a mechanism of protein kinase regulation that is vital for multidrug tolerance and persistence, as kinase inactivation provides the critical first step in allowing dormant cells to revert to the growth phenotype and to reinfect the host.
Affiliation(s)
- Maria A Schumacher
- Department of Biochemistry, Duke University School of Medicine, Durham, NC 27710, USA.
11
Regad L, Martin J, Camproux AC. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs. BMC Bioinformatics 2011; 12:247. [PMID: 21689388 PMCID: PMC3158783 DOI: 10.1186/1471-2105-12-247] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 10/07/2010] [Accepted: 06/20/2011] [Indexed: 12/24/2022]
Abstract
Background One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Results Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
12
Mining protein loops using a structural alphabet and statistical exceptionality. BMC Bioinformatics 2010; 11:75. [PMID: 20132552 PMCID: PMC2833150 DOI: 10.1186/1471-2105-11-75] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 07/07/2009] [Accepted: 02/04/2010] [Indexed: 12/21/2022]
Abstract
Background Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, describing protein loops with conventional tools is not an easy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops.
Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
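The word-extraction step described in this abstract can be sketched as a sliding window over structural-letter strings. This is a minimal illustration and not the authors' implementation. It assumes that consecutive four-residue letters overlap by three residues, so that a seven-residue fragment corresponds to four letters. The function names, the toy loop strings and the threshold of 2 are hypothetical (the abstract uses "more than 30 times" on 93000 loops).

```python
from collections import Counter

# Assumption: a 7-residue fragment = 4 overlapping four-residue letters.
WORD_LEN = 4


def structural_words(letters, word_len=WORD_LEN):
    """Return every overlapping structural word in one loop's letter string."""
    return [letters[i:i + word_len] for i in range(len(letters) - word_len + 1)]


def recurrent_words(loops, min_count):
    """Count words over all loops and keep those seen more than min_count times."""
    counts = Counter(word for loop in loops for word in structural_words(loop))
    return {word: n for word, n in counts.items() if n > min_count}


# Toy letter strings standing in for HMM-SA encodings of three loops.
loops = ["ABCDAB", "ABCD", "XABCDY"]
counts = recurrent_words(loops, min_count=2)
print(counts)
```

Because words are counted rather than whole loops, short and long loops contribute to the same table, which is the property the abstract relies on for analysing loops of all sizes.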
13
Tyagi M, Bornot A, Offmann B, de Brevern AG. Analysis of loop boundaries using different local structure assignment methods. Protein Sci 2009; 18:1869-81. [PMID: 19606500 DOI: 10.1002/pro.198] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
Loops connect regular secondary structures. In many instances, they are known to play important biological roles. Analysis and prediction of loop conformations depend directly on the definition of repetitive structures. Nonetheless, the secondary structure assignment methods (SSAMs) often lead to divergent assignments. In this study, we analyzed, from both the structure and sequence points of view, how the divergence between different SSAMs affects the boundary definitions of loops connecting regular secondary structures. The analysis underlines that no clear consensus between the different SSAMs can be easily found. Because the SSAMs greatly influence the loop boundary definitions, important variations are indeed observed, that is, capping positions are shifted between different SSAMs. On the other hand, our results show that the sequence information in these capping regions is more stable than expected, and classical and equivalent sequence patterns were found for most of the SSAMs. This is, to our knowledge, the most exhaustive survey in this field, as (i) various databanks have been used, leading to similar results without implication of protein redundancy, and (ii) it is the first time various SSAMs have been used. This work hence gives new insights into the difficult question of assignment of repetitive structures and addresses the issue of loop boundary definition. Although SSAMs give very different local structure assignments, capping sequence patterns remain stable.
Affiliation(s)
- Manoj Tyagi
- Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, BP 7151, 15 avenue René Cassin, 97715 Saint Denis Messag Cedex 09, La Réunion, France
14
Regad L, Guyon F, Maupetit J, Tufféry P, Camproux A. A Hidden Markov Model applied to the protein 3D structure analysis. Comput Stat Data Anal 2008. [DOI: 10.1016/j.csda.2007.09.010] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 10/22/2022]
15
Hermoso A, Espadaler J, Enrique Querol E, Aviles FX, Sternberg MJ, Oliva B, Fernandez-Fuentes N. Including Functional Annotations and Extending the Collection of Structural Classifications of Protein Loops (ArchDB). Bioinform Biol Insights 2008. [DOI: 10.1177/117793220700100004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022]
Abstract
Loops represent an important part of protein structures. The study of loops is critical for two main reasons: First, loops are often involved in protein function, stability and folding. Second, despite improvements in experimental and computational structure prediction methods, modeling the conformation of loops remains problematic. Here, we present a structural classification of loops, ArchDB, a mine of information with application in both mentioned fields: loop structure prediction and function prediction. ArchDB ( http://sbi.imim.es/archdb ) is a database of classified protein loop motifs. The current database provides four different classification sets tailored for different purposes. ArchDB-40, a loop classification derived from SCOP40, is well suited for modeling common loop motifs. Since features relevant to loop structure or function can be more easily determined on well-populated clusters, we have developed ArchDB-95, a loop classification derived from SCOP95. This new classification set shows a ~40% increase in the number of subclasses, and a large 7-fold increase in the number of putative structure/function-related subclasses. We also present ArchDB-EC, a classification of loop motifs from enzymes, and ArchDB-KI, a manually annotated classification of loop motifs from kinases. Information about ligand contacts and PDB sites has been included in all classification sets. Improvements in our classification scheme are described, as well as several new database features, such as the ability to query by conserved annotations, sequence similarity, or uploading 3D coordinates of a protein. The lengths of classified loops range between 0 and 36 residues. ArchDB offers an exhaustive sampling of loop structures. Functional information about loops and links with related biological databases are also provided.
All this information, together with the possibility to browse and query the database through a web server, makes ArchDB a useful tool for the comparative study of loops, the analysis of loops involved in protein function, and obtaining templates for loop modeling.
Affiliation(s)
- Antoni Hermoso
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Jordi Espadaler
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- E Enrique Querol
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Francesc X. Aviles
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia. Spain
- Michael J.E. Sternberg
- Structural Bioinformatics Group, Department of Biological Sciences, Imperial College, London SW7 2AZ, U.K
- Baldomero Oliva
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, Section of Experimental Therapeutics, St. James University Hospital, Leeds LS7 9TF. U.K
16
ProCKSI: a decision support system for Protein (structure) Comparison, Knowledge, Similarity and Information. BMC Bioinformatics 2007; 8:416. [PMID: 17963510 PMCID: PMC2222653 DOI: 10.1186/1471-2105-8-416] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.4] [Received: 06/08/2007] [Accepted: 10/26/2007] [Indexed: 11/19/2022]
Abstract
Background We introduce the decision support system for Protein (Structure) Comparison, Knowledge, Similarity and Information (ProCKSI). ProCKSI integrates various protein similarity measures through an easy to use interface that allows the comparison of multiple proteins simultaneously. It employs the Universal Similarity Metric (USM), the Maximum Contact Map Overlap (MaxCMO) of protein structures and other external methods such as the DaliLite and the TM-align methods, the Combinatorial Extension (CE) of the optimal path, and the FAST Align and Search Tool (FAST). Additionally, ProCKSI allows the user to upload a user-defined similarity matrix supplementing the methods mentioned, and computes a similarity consensus in order to provide a rich, integrated, multicriteria view of large datasets of protein structures. Results We present ProCKSI's architecture and workflow describing its intuitive user interface, and show its potential on three distinct test-cases. In the first case, ProCKSI is used to evaluate the results of a previous CASP competition, assessing the similarity of proposed models for given targets where the structures could have a large deviation from one another. To perform this type of comparison reliably, we introduce a new consensus method. The second study deals with the verification of a classification scheme for protein kinases, originally derived by sequence comparison by Hanks and Hunter, but here we use a consensus similarity measure based on structures. In the third experiment using the Rost and Sander dataset (RS126), we investigate how a combination of different sets of similarity measures influences the quality and performance of ProCKSI's new consensus measure. ProCKSI performs well with all three datasets, showing its potential for complex, simultaneous multi-method assessment of structural similarity in large protein datasets. 
Furthermore, combining different similarity measures is usually more robust than relying on one single, unique measure. Conclusion Based on a diverse set of similarity measures, ProCKSI computes a consensus similarity profile for the entire protein set. All results can be clustered, visualised, analysed and easily compared with each other through a simple and intuitive interface. ProCKSI is publicly available at for academic and non-commercial use.
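The abstract above does not say how the consensus is computed. As a minimal sketch only, one simple way to combine several pairwise similarity matrices is to rescale each to [0, 1] and average them element-wise. The function names, the min-max normalisation, the plain averaging, and the toy matrices below are all assumptions for illustration and are not taken from ProCKSI.

```python
from typing import Dict, List

Matrix = List[List[float]]


def min_max_normalise(matrix: Matrix) -> Matrix:
    """Rescale all entries of a matrix to the range [0, 1]."""
    values = [v for row in matrix for v in row]
    lo, hi = min(values), max(values)
    if hi == lo:
        # A constant matrix carries no ranking information.
        return [[0.0 for _ in row] for row in matrix]
    return [[(v - lo) / (hi - lo) for v in row] for row in matrix]


def consensus_similarity(measures: Dict[str, Matrix]) -> Matrix:
    """Average several normalised similarity matrices element-wise.

    Assumption: every matrix is square, covers the same proteins in the
    same order, and larger values mean "more similar".
    """
    if not measures:
        raise ValueError("at least one similarity matrix is required")
    normalised = [min_max_normalise(m) for m in measures.values()]
    n = len(normalised[0])
    for m in normalised:
        if len(m) != n or any(len(row) != n for row in m):
            raise ValueError("all matrices must be square and the same size")
    k = len(normalised)
    return [
        [sum(m[i][j] for m in normalised) / k for j in range(n)]
        for i in range(n)
    ]


# Hypothetical scores for three proteins from two made-up measures.
measures = {
    "measure_a": [[10.0, 4.0, 0.0], [4.0, 10.0, 2.0], [0.0, 2.0, 10.0]],
    "measure_b": [[1.0, 0.5, 0.0], [0.5, 1.0, 0.5], [0.0, 0.5, 1.0]],
}
consensus = consensus_similarity(measures)
for row in consensus:
    print([round(v, 2) for v in row])
```

Normalising first keeps a measure with a large numeric range from dominating the average. A measure that reports distances rather than similarities would need to be inverted before being passed in.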
|
17
|
Yoon S, Ebert JC, Chung EY, De Micheli G, Altman RB. Clustering protein environments for function prediction: finding PROSITE motifs in 3D. BMC Bioinformatics 2007; 8 Suppl 4:S10. [PMID: 17570144 PMCID: PMC1892080 DOI: 10.1186/1471-2105-8-s4-s10] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Indexed: 11/24/2022]
Abstract
BACKGROUND Structural genomics initiatives are producing increasing numbers of three-dimensional (3D) structures for which there is little functional information. Structure-based annotation of molecular function is therefore becoming critical. We previously presented FEATURE, a method for describing microenvironments around functional sites in proteins. However, FEATURE uses supervised machine learning and so is limited to building models for sites of known importance and location. We hypothesized that there are a large number of sites in proteins that are associated with function that have not yet been recognized. Toward that end, we have developed a method for clustering protein microenvironments in order to evaluate the potential for discovering novel sites that have not been previously identified. RESULTS We have prototyped a computational method for rapid clustering of millions of microenvironments in order to discover residues whose surrounding environments are similar and which may therefore share a functional or structural role. We clustered nearly 2,000,000 environments from 9,600 protein chains and defined 4,550 clusters. As a preliminary validation, we asked whether known 3D environments associated with PROSITE motifs were "rediscovered". We found examples of clusters highly enriched for residues that share PROSITE sequence motifs. CONCLUSION Our results demonstrate that we can cluster protein environments successfully using a simplified representation and K-means clustering algorithm. The rediscovery of known 3D motifs allows us to calibrate the size and intercluster distances that characterize useful clusters. This information will then allow us to find new clusters with similar characteristics that represent novel structural or functional sites.
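The abstract above names K-means as the clustering algorithm but gives no implementation details. The following is a minimal sketch of standard K-means (Lloyd's iteration) on toy two-dimensional vectors. The feature vectors, the function names, the random initialisation, and the stopping rule are assumptions for illustration; the authors' actual microenvironment representation and settings are not described here.

```python
import random
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def squared_distance(a: Vector, b: Vector) -> float:
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean_vector(points: List[Vector]) -> List[float]:
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]


def k_means(
    points: List[Vector], k: int, iterations: int = 100, seed: int = 0
) -> Tuple[List[int], List[List[float]]]:
    """Cluster points into k groups; returns (labels, centroids)."""
    rng = random.Random(seed)
    # Assumption: initial centroids are k distinct input points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [-1] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments are stable, so the centroids are too
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = mean_vector(members)
    return labels, centroids


# Hypothetical "environment" vectors forming two well-separated groups.
toy_points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = k_means(toy_points, k=2)
print(labels)
```

At the scale quoted in the abstract (millions of environments), an optimised library implementation would be the practical choice; this sketch only shows the two alternating steps.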
Affiliation(s)
- Sungroh Yoon
  - Computer Systems Laboratory, Stanford University, Stanford, CA 94305, USA
  - Intel Corporation, 2200 Mission College Blvd., Santa Clara, CA 95054, USA
- Jessica C Ebert
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
- Eui-Young Chung
  - School of Electrical and Electronic Engineering, Yonsei University, Seoul 120-749, Republic of Korea
- Giovanni De Micheli
  - Integrated Systems Center, Swiss Federal Institute of Technology (EPFL), Lausanne, CH-1015, Switzerland
- Russ B Altman
  - Department of Genetics, Stanford University, Stanford, CA 94305, USA
|
18
|
Espadaler J, Querol E, Aviles FX, Oliva B. Identification of function-associated loop motifs and application to protein function prediction. Bioinformatics 2006; 22:2237-43. [PMID: 16870939 DOI: 10.1093/bioinformatics/btl382] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.5] [Indexed: 11/14/2022]
Abstract
MOTIVATION The detection of function-related local 3D-motifs in protein structures can provide insights towards protein function in absence of sequence or fold similarity. Protein loops are known to play important roles in protein function and several loop classifications have been described, but the automated identification of putative functional 3D-motifs in such classifications has not yet been addressed. This identification can be used on sequence annotations. RESULTS We evaluated three different scoring methods for their ability to identify known motifs from the PROSITE database in ArchDB. More than 500 new putative function-related motifs not reported in PROSITE were identified. Sequence patterns derived from these motifs were especially useful at predicting precise annotations. The number of reliable sequence annotations could be increased up to 100% with respect to standard BLAST. CONTACT boliva@imim.es SUPPLEMENTARY INFORMATION Supplementary Data are available at Bioinformatics online.
Affiliation(s)
- Jordi Espadaler
  - Group de Bioinformàtica Estructural (GRIB-IMIM), Departament de Ciències Experimentals i de la Salut, Universitat Pompeu Fabra, 08003 Barcelona, Catalonia, Spain
|
19
|
De Vivo M, Cavalli A, Bottegoni G, Carloni P, Recanatini M. Role of phosphorylated Thr160 for the activation of the CDK2/Cyclin A complex. Proteins 2005; 62:89-98. [PMID: 16292742 DOI: 10.1002/prot.20697] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/11/2022]
Abstract
The enzymatic activity of the CDK2/Cyclin A complex increases upon the specific phosphorylation of Thr160@CDK2. In the present study, we have performed a comparative molecular dynamics (MD) study of models of the complex CDK2/Cyclin A/Substrate, which differ for the presence or absence of the phosphate group bound to Thr160. The models are based on two X-ray structures available for CDK2/CyclinA and pCDK2/CyclinA/Substrate complexes. In this way, we analyze the influence of the phosphorylated Thr160 (pThr160) on both the flexibility of CDK2 activation loop (AL) and substrate binding in CDK2. Our calculations point to a decreased flexibility of the AL in the phosphorylated model, in fairly good agreement with experimental data, and to a key role of pThr160 for substrate recognition and stability. Multiple alignments of the CDKs sequences point to the very high conservation of the AL sequence among the CDKs, thus extending our results to all CDKs.
Affiliation(s)
- Marco De Vivo
  - Department of Pharmaceutical Sciences, University of Bologna, Bologna, Italy
|