1. Opaleny F, Ulbrich P, Planas-Iglesias J, Byska J, Stourac J, Bednar D, Furmanova K, Kozlikova B. Visual Support for the Loop Grafting Workflow on Proteins. IEEE Transactions on Visualization and Computer Graphics 2025; 31:580-590. [PMID: 39255099 DOI: 10.1109/tvcg.2024.3456401] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/12/2024]
Abstract
In understanding and redesigning the function of proteins in modern biochemistry, protein engineers are increasingly focusing on exploring regions in proteins called loops. Analyzing various characteristics of these regions helps the experts design the transfer of the desired function from one protein to another. This process is denoted as loop grafting. We designed a set of interactive visualizations that provide experts with visual support through all the loop grafting pipeline steps. The workflow is divided into several phases, reflecting the steps of the pipeline. Each phase is supported by a specific set of abstracted 2D visual representations of proteins and their loops that are interactively linked with the 3D View of proteins. By sequentially passing through the individual phases, the user shapes the list of loops that are potential candidates for loop grafting. Finally, the actual in-silico insertion of the loop candidates from one protein to the other is performed, and the results are visually presented to the user. In this way, the fully computational rational design of proteins and their loops results in newly designed protein structures that can be further assembled and tested through in-vitro experiments. We showcase the contribution of our visual support design on a real case scenario changing the enantiomer selectivity of the engineered enzyme. Moreover, we provide the readers with the experts' feedback.

2. Mughal F, Caetano-Anollés G. Evolution of Intrinsic Disorder in Protein Loops. Life (Basel) 2023; 13:2055. [PMID: 37895436 PMCID: PMC10608553 DOI: 10.3390/life13102055] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/13/2023] [Revised: 10/08/2023] [Accepted: 10/10/2023] [Indexed: 10/29/2023] [Open Access]
Abstract
Intrinsic disorder accounts for the flexibility of protein loops, molecular building blocks that are largely responsible for the processes and molecular functions of the living world. While loops likely represent early structural forms that served as intermediates in the emergence of protein structural domains, their origin and evolution remain poorly understood. Here, we conduct a phylogenomic survey of disorder in loop prototypes sourced from the ArchDB classification. Tracing prototypes associated with protein fold families along an evolutionary chronology revealed that ancient prototypes tended to be more disordered than their derived counterparts, with ordered prototypes developing later in evolution. This highlights the central evolutionary role of disorder and flexibility. While mean disorder increased with time, a minority of ordered prototypes exist that emerged early in evolutionary history, possibly driven by the need to preserve specific molecular functions. We also revealed the percolation of evolutionary constraints from higher to lower levels of organization. Percolation resulted in trade-offs between flexibility and rigidity that impacted prototype structure and geometry. Our findings provide a deep evolutionary view of the link between structure, disorder, flexibility, and function, as well as insights into the evolutionary role of intrinsic disorder in loops and their contribution to protein structure and function.
Affiliation(s)
- Fizza Mughal
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
- Gustavo Caetano-Anollés
  - Evolutionary Bioinformatics Laboratory, Department of Crop Sciences, University of Illinois, Urbana, IL 61801, USA
  - C.R. Woese Institute for Genomic Biology, University of Illinois, Urbana, IL 61801, USA

3. Corbella M, Pinto GP, Kamerlin SCL. Loop dynamics and the evolution of enzyme activity. Nat Rev Chem 2023; 7:536-547. [PMID: 37225920 DOI: 10.1038/s41570-023-00495-w] [Citation(s) in RCA: 52] [Impact Index Per Article: 26.0] [Accepted: 04/06/2023] [Indexed: 05/26/2023]
Abstract
In the early 2000s, Tawfik presented his 'New View' on enzyme evolution, highlighting the role of conformational plasticity in expanding the functional diversity of limited repertoires of sequences. This view is gaining increasing traction with increasing evidence of the importance of conformational dynamics in both natural and laboratory evolution of enzymes. The past years have seen several elegant examples of harnessing conformational (particularly loop) dynamics to successfully manipulate protein function. This Review revisits flexible loops as critical participants in regulating enzyme activity. We showcase several systems of particular interest: triosephosphate isomerase barrel proteins, protein tyrosine phosphatases and β-lactamases, while briefly discussing other systems in which loop dynamics are important for selectivity and turnover. We then discuss the implications for engineering, presenting examples of successful loop manipulation in either improving catalytic efficiency, or changing selectivity completely. Overall, it is becoming clearer that mimicking nature by manipulating the conformational dynamics of key protein loops is a powerful method of tailoring enzyme activity, without needing to target active-site residues.
Affiliation(s)
- Marina Corbella
  - Department of Chemistry, Uppsala University, Uppsala, Sweden
- Gaspar P Pinto
  - Department of Chemistry, Uppsala University, Uppsala, Sweden
  - Cortex Discovery GmbH, Regensburg, Germany
- Shina C L Kamerlin
  - Department of Chemistry, Uppsala University, Uppsala, Sweden
  - School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA

4. Williams MA, Bouchier JM, Mason AK, Brown PJB. Activation of ChvG-ChvI regulon by cell wall stress confers resistance to β-lactam antibiotics and initiates surface spreading in Agrobacterium tumefaciens. PLoS Genet 2022; 18:e1010274. [PMID: 36480495 PMCID: PMC9731437 DOI: 10.1371/journal.pgen.1010274] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/23/2022] [Accepted: 10/28/2022] [Indexed: 12/13/2022] [Open Access]
Abstract
A core component of nearly all bacteria, the cell wall is an ideal target for broad spectrum antibiotics. Many bacteria have evolved strategies to sense and respond to antibiotics targeting cell wall synthesis, especially in the soil where antibiotic-producing bacteria compete with one another. Here we show that cell wall stress caused by both chemical and genetic inhibition of the essential, bifunctional penicillin-binding protein PBP1a prevents microcolony formation and activates the canonical host-invasion two-component system ChvG-ChvI in Agrobacterium tumefaciens. Using RNA-seq, we show that depletion of PBP1a for 6 hours results in a downregulation in transcription of flagellum-dependent motility genes and an upregulation in transcription of type VI secretion and succinoglycan biosynthesis genes, a hallmark of the ChvG-ChvI regulon. Depletion of PBP1a for 16 hours results in differential expression of many additional genes and may promote a stress response resembling those of sigma factors in other bacteria. Remarkably, the overproduction of succinoglycan causes cell spreading, and deletion of the succinoglycan biosynthesis gene exoA restores microcolony formation. Treatment with cefsulodin phenocopies depletion of PBP1a, and we correspondingly find that chvG and chvI mutants are hypersensitive to cefsulodin. This hypersensitivity only occurs in response to treatment with β-lactam antibiotics, suggesting that the ChvG-ChvI pathway may play a key role in resistance to antibiotics targeting cell wall synthesis. Finally, we provide evidence that ChvG-ChvI likely has a conserved role in conferring resistance to cell wall stress within the Alphaproteobacteria that is independent of the ChvG-ChvI repressor ExoR.
Affiliation(s)
- Michelle A. Williams
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
- Jacob M. Bouchier
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
- Amara K. Mason
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America
- Pamela J. B. Brown
  - Division of Biological Sciences, University of Missouri-Columbia, Columbia, Missouri, United States of America

5. Love O, Pacheco Lima MC, Clark C, Cornillie S, Roalstad S, Cheatham TE. Evaluating the accuracy of the AMBER protein force fields in modeling dihydrofolate reductase structures: misbalance in the conformational arrangements of the flexible loop domains. J Biomol Struct Dyn 2022:1-15. [PMID: 35838167 PMCID: PMC9840716 DOI: 10.1080/07391102.2022.2098823] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Indexed: 01/17/2023]
Abstract
Protein flexible loop regions were once thought to be simple linkers between other more functional secondary structural elements. However, as it becomes clearer that these loop domains are critical players in a plethora of biological processes, accurate conformational sampling of 3D loop structures is vital to the advancement of drug design techniques and the overall growth of knowledge surrounding molecular systems. While experimental techniques provide a wealth of structural information, the resolution of flexible loop domains is sometimes low or entirely absent due to their complex and dynamic nature. This highlights an opportunity for de novo structure prediction using in silico methods with molecular dynamics (MD). This study evaluates the ability of several AMBER protein force fields (ffs) to accurately model dihydrofolate reductase (DHFR) conformations, a protein complex characterized by specific arrangements and interactions of multiple flexible loops whose conformations are determined by the presence or absence of bound ligands and cofactors. Although the AMBER ffs studied, including ff19SB, model most protein structures with rich secondary structure well, the results obtained here suggest an inability to significantly sample the expected DHFR loop-loop conformations: of the six distinct protein-ligand systems simulated, a majority lacked consistent stabilization of the experimentally derived metrics definitive of the three enzyme conformations. Although under-sampling and the chosen ff parameter combinations could be the cause, given past successes with these MD approaches for many protein systems, this suggests a potential misbalance in available ff parameters required to accurately predict the structure of multiple flexible loop regions present in proteins. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Olivia Love
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Casey Clark
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Sean Cornillie
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Shelly Roalstad
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA
- Thomas E. Cheatham
  - Department of Medicinal Chemistry, College of Pharmacy, University of Utah, Salt Lake City, UT, USA

6. Ma Q, Wang X, Luan F, Han P, Zheng X, Yin Y, Zhang X, Zhang Y, Gao X. Functional Studies on an Indel Loop between the Subtypes of meso-Diaminopimelate Dehydrogenase. ACS Catal 2022. [DOI: 10.1021/acscatal.2c01799] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Indexed: 12/18/2022]
Affiliation(s)
- Qinyuan Ma
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Xiaoxiao Wang
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Fang Luan
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Ping Han
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Xue Zheng
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Yanmiao Yin
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Xianghe Zhang
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Yàning Zhang
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China
- Xiuzhen Gao
  - School of Life Science and Medicine, Shandong University of Technology, Zibo 255000, China

7. Papadopoulos C, Callebaut I, Gelly JC, Hatin I, Namy O, Renard M, Lespinet O, Lopes A. Intergenic ORFs as elementary structural modules of de novo gene birth and protein evolution. Genome Res 2021; 31:2303-2315. [PMID: 34810219 PMCID: PMC8647833 DOI: 10.1101/gr.275638.121] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 04/13/2021] [Accepted: 09/23/2021] [Indexed: 01/08/2023]
Abstract
The noncoding genome plays an important role in de novo gene birth and in the emergence of genetic novelty. Nevertheless, how noncoding sequences' properties could promote the birth of novel genes and shape the evolution and the structural diversity of proteins remains unclear. Therefore, by combining different bioinformatic approaches, we characterized the fold potential diversity of the amino acid sequences encoded by all intergenic open reading frames (ORFs) of S. cerevisiae with the aim of (1) exploring whether the structural states' diversity of proteomes is already present in noncoding sequences, and (2) estimating the potential of the noncoding genome to produce novel protein bricks that could either give rise to novel genes or be integrated into pre-existing proteins, thus participating in protein structure diversity and evolution. We showed that amino acid sequences encoded by most yeast intergenic ORFs contain the elementary building blocks of protein structures. Moreover, they encompass the large structural state diversity of canonical proteins, with the majority predicted as foldable. Then, we investigated the early stages of de novo gene birth by reconstructing the ancestral sequences of 70 yeast de novo genes and characterized the sequence and structural properties of intergenic ORFs with a strong translation signal. This enabled us to highlight sequence and structural factors determining de novo gene emergence. Finally, we showed a strong correlation between the fold potential of de novo proteins and one of their ancestral amino acid sequences, reflecting the relationship between the noncoding genome and the protein structure universe.
Affiliation(s)
- Chris Papadopoulos
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Isabelle Callebaut
  - Sorbonne Université, Muséum National d'Histoire Naturelle, UMR CNRS 7590, Institut de Minéralogie, de Physique des Matériaux et de Cosmochimie, IMPMC, 75005 Paris, France
- Jean-Christophe Gelly
  - Université de Paris, Biologie Intégrée du Globule Rouge, UMR_S1134, BIGR, INSERM, F-75015 Paris, France
  - Laboratoire d'Excellence GR-Ex, 75015 Paris, France
  - Institut National de la Transfusion Sanguine, F-75015 Paris, France
- Isabelle Hatin
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Namy
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Maxime Renard
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Olivier Lespinet
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
- Anne Lopes
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France

8. Wong SWK, Liu Z. Conformational variability of loops in the SARS-CoV-2 spike protein. Proteins 2021; 90:691-703. [PMID: 34661307 PMCID: PMC8662175 DOI: 10.1002/prot.26266] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/03/2021] [Revised: 10/05/2021] [Accepted: 10/12/2021] [Indexed: 11/07/2022]
Abstract
The SARS-CoV-2 spike (S) protein facilitates viral infection and has been the focus of many structure determination efforts. Its flexible loop regions are known to be involved in protein binding and may adopt multiple conformations. This article identifies the S protein loops and studies their conformational variability based on the available Protein Data Bank structures. While most loops had essentially one stable conformation, 17 of 44 loop regions were observed to be structurally variable, with multiple substantively distinct conformations based on a cluster analysis. Loop modeling methods were then applied to the S protein loop targets, and the prediction accuracies are discussed in relation to the characteristics of the conformational clusters identified. Loops with multiple conformations were found to be challenging to model based on a single structural template.
Affiliation(s)
- Samuel W. K. Wong
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada
- Zongjun Liu
  - Department of Statistics and Actuarial Science, University of Waterloo, Waterloo, Canada

9. Patel A, McBride JAM, Mark BL. The endopeptidase of the maize-affecting Marafivirus type member maize rayado fino virus doubles as a deubiquitinase. J Biol Chem 2021; 297:100957. [PMID: 34265303 PMCID: PMC8348309 DOI: 10.1016/j.jbc.2021.100957] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 05/11/2021] [Revised: 07/05/2021] [Accepted: 07/09/2021] [Indexed: 10/28/2022] [Open Access]
Abstract
Marafiviruses are capable of persistent infection in a range of plants that have importance to the agriculture and biofuel industries. Although the genomes of a few of these viruses have been studied in-depth, the composition and processing of the polyproteins produced from their main ORFs have not. The Marafivirus polyprotein consists of essential proteins that form the viral replicase, as well as structural proteins for virus assembly. It has been proposed that Marafiviruses code for cysteine proteases within their polyproteins, which act as endopeptidases to autocatalytically cleave the polyprotein into functional domains. Furthermore, it has also been suggested that Marafivirus endopeptidases may have deubiquitinating activity, which has been shown to enhance viral replication by downregulating viral protein degradation by the ubiquitin (Ub) proteasomal pathway as well as tampering with cell signaling associated with innate antiviral responses in other positive-sense ssRNA viruses. Here, we provide the first evidence of cysteine proteases from six different Marafiviruses that harbor deubiquitinating activity and reveal intragenus differences toward Ub linkage types. We also examine the structural basis of the endopeptidase/deubiquitinase from the Marafivirus type member, maize rayado fino virus. Structures of the enzyme alone and bound to Ub reveal marked structural rearrangements that occur upon binding of Ub and provide insights into substrate specificity and differences that set it apart from other viral cysteine proteases.
Affiliation(s)
- Ankoor Patel
  - Department of Microbiology, University of Manitoba, Winnipeg, Canada
- Brian L Mark
  - Department of Microbiology, University of Manitoba, Winnipeg, Canada

10. Sridharan S, Nagarajan SK, Venugopal K, Venkatasubbu GD. Time-dependent conformational analysis of ALK5-lumican complex in presence of graphene and graphene oxide employing molecular dynamics and MMPBSA calculation. J Biomol Struct Dyn 2021; 40:5932-5955. [PMID: 33507126 DOI: 10.1080/07391102.2021.1876772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/22/2022]
Abstract
Lumican, an extracellular matrix protein, aids wound healing by binding to the ALK5 membrane receptor (TGF-beta receptor I). Their interaction enables epithelialization and supports the rejuvenation of injured tissue. To increase the persistence of the ALK5-lumican interaction, we employed graphene and graphene oxide as cofactors. This study examines the association of graphene and graphene oxide with ALK5-lumican. We used an in silico approach involving molecular modelling, molecular docking, molecular dynamics for 200 ns, DSSP analysis and MMPBSA calculations. The molecular dynamics results indicate that the cofactors are more influential in altering the bioactive site of lumican than that of ALK5. Similarly, MMPBSA calculations gave a binding energy of -108.09 kcal/mol for the apoenzyme, -79.20 kcal/mol for holoenzyme (G) and -114.33 kcal/mol for holoenzyme (GO). We conclude that graphene oxide enhances the binding energy of ALK5-lumican in holoenzyme (GO) via coil formation in the Lum C13 domain. In contrast, graphene reduced the binding energy of ALK5-lumican in holoenzyme (G) by modifying Lum C13 into beta sheets. MMPBSA residual contribution analysis of Lum C13 residues revealed binding energies of -13.9 kcal/mol for the apoenzyme, -6.8 kcal/mol for holoenzyme (G) and -19.5 kcal/mol for holoenzyme (GO). This supports the view that coil formation favors a better ALK5-Lum interaction. The highest SASA energy, -21.05 kcal/mol for holoenzyme (G), suggests that graphene improves ALK5-lumican hydrophobicity. In line with the aim of the study, graphene oxide increases the persistence of the ALK5-lumican interaction. This points to the plausible use of lumican and graphene oxide as a targeted/nano drug delivery system to treat acute wounds, chronic wounds, corneal wounds, hypertrophic scars and keloids in the near future. Communicated by Ramaswamy H. Sarma.
Affiliation(s)
- Sindhiya Sridharan
  - Department of Nanotechnology, SRM Institute of Science and Technology, Chengalpattu, Tamil Nadu, India
- Santhosh Kumar Nagarajan
  - Department of Genetic Engineering, SRM Institute of Science and Technology, Chengalpattu, Tamil Nadu, India
- Kathirvel Venugopal
  - Department of Physics and Nanotechnology, SRM Institute of Science and Technology, Chengalpattu, Tamil Nadu, India
- G Devanand Venkatasubbu
  - Department of Nanotechnology, SRM Institute of Science and Technology, Chengalpattu, Tamil Nadu, India

11. Mitusińska K, Skalski T, Góra A. Simple Selection Procedure to Distinguish between Static and Flexible Loops. Int J Mol Sci 2020; 21:2293. [PMID: 32225102 PMCID: PMC7177474 DOI: 10.3390/ijms21072293] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/10/2020] [Revised: 03/22/2020] [Accepted: 03/24/2020] [Indexed: 12/02/2022] [Open Access]
Abstract
Loops are the most variable and unorganized elements of the secondary structure of proteins. Their ability to shift their shape can play a role in the binding of small ligands, enzymatic catalysis, or protein–protein interactions. Due to loop flexibility, the positions of loop residues in solved structures show the largest B-factors or, in a worst-case scenario, can be unknown. Based on the timescale of their movements, loops can be divided into slow (static) and fast (flexible). Although most of the loops that are missing in experimental structures belong to the flexible group, the computational tools for loop reconstruction use a set of static loop conformations to predict the missing part of the structure and evaluate the model. We believe that these two loop types can adopt different conformations and that using scoring functions appropriate for static loops is not sufficient for flexible loops. We showed that common model evaluation methods are insufficient in the case of flexible solvent-exposed loops. Instead, we recommend using the potential energy to evaluate such loop models. We provide a novel model selection method based on a set of geometrical parameters to distinguish between flexible and static loops without the use of molecular dynamics simulations. We have also pointed out the importance of the water network and interactions with the solvent for flexible loop modeling.
Affiliation(s)
- Karolina Mitusińska
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, ul. Krzywoustego 8, 44-100 Gliwice, Poland
- Tomasz Skalski
  - Biotechnology Centre, Silesian University of Technology, ul. Krzywoustego 8, 44-100 Gliwice, Poland
- Artur Góra
  - Tunneling Group, Biotechnology Centre, Silesian University of Technology, ul. Krzywoustego 8, 44-100 Gliwice, Poland
  - Correspondence: Tel.: +48-322371659

12. Kalhor H, Poorebrahim M, Rahimi H, Shabani AA, Karimipoor M, Akbari Eidgahi MR, Teimoori-Toolabi L. Structural and dynamic characterization of human Wnt2-Fzd7 complex using computational approaches. J Mol Model 2018; 24:274. [PMID: 30191337 DOI: 10.1007/s00894-018-3788-3] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 10/13/2017] [Accepted: 08/09/2018] [Indexed: 12/20/2022]
Abstract
Wnt and Frizzled (Fzd) family members play crucial roles in the self-renewal of tumor-initiating cells. Until now, only a few studies have addressed the distinct mechanism of Wnt-Fzd interactions. In this study, we suggest a possible interaction mode of Wnt2 with the Fzd7 cysteine-rich domain (CRD), both of which are up-regulated in some types of cancer. A combination of homology modeling, molecular docking and molecular dynamics (MD) simulations was carried out to study this ligand-receptor complex in great detail. The results demonstrated the unique dynamic behavior of Wnt2 upon binding to Fzd7. Interestingly, the β-strand content of the C-terminal binding site of Wnt2 was clearly reduced when bound to the Fzd7 CRD. Moreover, the N-terminal and C-terminal binding sites of Wnt2 appeared to interact with the C-terminal and N-terminal binding sites of Fzd7, respectively. Calculation of the binding energies uncovered the pivotal role of electrostatic and hydrophobic interactions in the binding of Wnt2 to the Fzd7 CRD. In conclusion, this study provides valuable insights into the mechanism of the Wnt2-Fzd7 CRD interaction for application in colorectal cancer prevention programs. Graphical abstract: Flowchart representation of different steps used in this study.
Affiliation(s)
- Hourieh Kalhor
  - Department and Biotechnology Research Center, Semnan University of Medical Sciences, Semnan, Iran
  - Molecular Medicine Department, Pasteur Institute of Iran, Tehran, Iran
- Hamzeh Rahimi
  - Molecular Medicine Department, Pasteur Institute of Iran, Tehran, Iran
- Ali Akbar Shabani
  - Department and Biotechnology Research Center, Semnan University of Medical Sciences, Semnan, Iran

13. Deletion of loop fragment adjacent to active site diminishes the stability and activity of exo-inulinase. Int J Biol Macromol 2016; 92:1234-1241. [DOI: 10.1016/j.ijbiomac.2016.08.039] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 05/14/2016] [Revised: 08/04/2016] [Accepted: 08/11/2016] [Indexed: 11/20/2022]

14. Papaleo E, Saladino G, Lambrughi M, Lindorff-Larsen K, Gervasio FL, Nussinov R. The Role of Protein Loops and Linkers in Conformational Dynamics and Allostery. Chem Rev 2016; 116:6391-423. [DOI: 10.1021/acs.chemrev.5b00623] [Citation(s) in RCA: 239] [Impact Index Per Article: 26.6] [Indexed: 12/16/2022]
Affiliation(s)
- Elena Papaleo
  - Computational Biology Laboratory, Unit of Statistics, Bioinformatics and Registry, Danish Cancer Society Research Center, Strandboulevarden 49, 2100 Copenhagen, Denmark
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Giorgio Saladino
  - Department of Chemistry, University College London, London WC1E 6BT, United Kingdom
- Matteo Lambrughi
  - Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126 Milan, Italy
- Kresten Lindorff-Larsen
  - Structural Biology and NMR Laboratory, Department of Biology, University of Copenhagen, 2200 Copenhagen, Denmark
- Ruth Nussinov
  - Cancer and Inflammation Program, Leidos Biomedical Research, Inc., Frederick National Laboratory for Cancer Research, National Cancer Institute Frederick, Frederick, Maryland 21702, United States
  - Sackler Institute of Molecular Medicine, Department of Human Genetics and Molecular Medicine, Sackler School of Medicine, Tel Aviv University, Tel Aviv 69978, Israel

15. Han L, Singh S, Thorson JS, Phillips GN. Loop dynamics of thymidine diphosphate-rhamnose 3'-O-methyltransferase (CalS11), an enzyme in calicheamicin biosynthesis. Structural Dynamics (Melville, N.Y.) 2016; 3:012004. [PMID: 26958582 PMCID: PMC4760980 DOI: 10.1063/1.4941368] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/17/2015] [Accepted: 01/22/2016] [Indexed: 06/05/2023]
Abstract
Structure analysis and ensemble refinement of the apo-structure of thymidine diphosphate (TDP)-rhamnose 3'-O-methyltransferase reveal a gate for substrate entry and product release. TDP-rhamnose 3'-O-methyltransferase (CalS11) catalyses the 3'-O-methylation of TDP-rhamnose, an intermediate in the biosynthesis of the enediyne antitumor antibiotic calicheamicin. CalS11 operates at the sugar nucleotide stage, prior to the glycosylation step. Here, we present the crystal structure of the apo form of CalS11 at 1.89 Å resolution. We propose that the L2 loop functions as a gate facilitating and/or providing specificity for substrate entry or promoting product release. Ensemble refinement analysis slightly improves the crystallographic refinement statistics and furthermore provides a compelling way to visualize the dynamic model of loop L2, supporting the understanding of its proposed role in catalysis.
Collapse
Affiliation(s)
- Lu Han
- Biosciences at Rice, Rice University, Houston, Texas 77005, USA
- Shanteri Singh
- Center for Pharmaceutical Research and Innovation, Pharmaceutical Sciences, University of Kentucky College of Pharmacy, Lexington, Kentucky 40536-0596, USA
- Jon S Thorson
- Center for Pharmaceutical Research and Innovation, Pharmaceutical Sciences, University of Kentucky College of Pharmacy, Lexington, Kentucky 40536-0596, USA
|
16
|
Iacoangeli A, Marcatili P, Tramontano A. Exploiting Homology Information in Nontemplate Based Prediction of Protein Structures. J Chem Theory Comput 2015; 11:5045-51. [DOI: 10.1021/acs.jctc.5b00371] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/29/2022]
Affiliation(s)
- Alfredo Iacoangeli
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Paolo Marcatili
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Anna Tramontano
- Department of Physics, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
- Istituto Pasteur Fondazione Cenci Bolognetti, Sapienza University of Rome, P.le A. Moro 4, 00185 Rome, Italy
|
17
|
Bonet J, Segura J, Planas-Iglesias J, Oliva B, Fernandez-Fuentes N. Frag’r’Us: knowledge-based sampling of protein backbone conformations for de novo structure-based protein design. Bioinformatics 2014; 30:1935-6. [DOI: 10.1093/bioinformatics/btu129] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 11/14/2022] Open
|
18
|
Bonet J, Fiser A, Oliva B, Fernandez-Fuentes N. Smotifs as structural local descriptors of supersecondary elements: classification, completeness and applications. Bio-Algorithms and Med-Systems 2014. [DOI: 10.1515/bams-2014-0016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022]
Abstract
Protein structures are made up of periodic and aperiodic structural elements (i.e., α-helices, β-strands and loops). Despite the apparent lack of regular structure, loops have specific conformations and play a central role in the folding, dynamics, and function of proteins. In this article, we review our previous work on the study of protein loops as local supersecondary structural motifs, or Smotifs. We re-examine our work on the structural classification of loops (ArchDB) and its application to loop structure prediction (ArchPRED), including the assessment of the limits of knowledge-based loop structure prediction methods. We conclude by focusing on the modular nature of proteins, on how the concept of Smotifs provides a convenient and practical approach to decompose proteins into strings of concatenated Smotifs, and on how this can be used in computational protein design and protein structure prediction.
|
19
|
Bonet J, Planas-Iglesias J, Garcia-Garcia J, Marín-López MA, Fernandez-Fuentes N, Oliva B. ArchDB 2014: structural classification of loops in proteins. Nucleic Acids Res 2013; 42:D315-9. [PMID: 24265221 PMCID: PMC3964960 DOI: 10.1093/nar/gkt1189] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.5] [Indexed: 01/07/2023] Open
Abstract
The function of a protein is determined by its three-dimensional structure, which is formed by regular (i.e. β-strands and α-helices) and non-periodic structural units such as loops. Compared to regular structural elements, non-periodic, non-repetitive conformational units enclose a much higher degree of variability—raising difficulties in the identification of regularities, and yet represent an important part of the structure of a protein. Indeed, loops often play a pivotal role in the function of a protein and different aspects of protein folding and dynamics. Therefore, the structural classification of protein loops is an important subject with clear applications in homology modelling, protein structure prediction, protein design (e.g. enzyme design and catalytic loops) and function prediction. ArchDB, the database presented here (freely available at http://sbi.imim.es/archdb), represents such a resource and has been an important asset for the scientific community throughout the years. In this article, we present a completely reworked and updated version of ArchDB. The new version of ArchDB features a novel, fast and user-friendly web-based interface, and a novel graph-based, computationally efficient, clustering algorithm. The current version of ArchDB classifies 149,134 loops in 5739 classes and 9608 subclasses.
Affiliation(s)
- Jaume Bonet
- Structural Bioinformatics Lab (GRIB-IMIM), Universitat Pompeu Fabra, Barcelona Research Park of Biomedicine (PRBB), Barcelona, Catalonia, 08950, Spain and Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, SY23 3DA Aberystwyth, Ceredigion, UK
|
20
|
Li Y. Conformational sampling in template-free protein loop structure modeling: an overview. Comput Struct Biotechnol J 2013; 5:e201302003. [PMID: 24688696 PMCID: PMC3962101 DOI: 10.5936/csbj.201302003] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 10/19/2012] [Revised: 01/23/2013] [Accepted: 01/28/2013] [Indexed: 01/04/2023] Open
Abstract
Accurately modeling protein loops is an important step in predicting three-dimensional structures as well as in understanding the functions of many proteins. Because of their high flexibility, modeling the three-dimensional structures of loops is difficult and is usually treated as a "mini protein folding problem" under geometric constraints. In the past decade, there has been remarkable progress in template-free loop structure modeling due to advances in computational methods as well as the steadily increasing number of known structures available in the PDB. This mini review provides an overview of recent computational approaches for loop structure modeling. In particular, we focus on approaches for sampling loop conformation space, which is a critical step in obtaining high-resolution models in template-free methods. We review the potential energy functions for loop modeling, loop buildup mechanisms to satisfy geometric constraints, and loop conformation sampling algorithms. Recent loop modeling results are also summarized.
Affiliation(s)
- Yaohang Li
- Department of Computer Science, Old Dominion University, Norfolk, VA 23529, USA
|
21
|
Papaleo E, Casiraghi N, Arrigoni A, Vanoni M, Coccetti P, De Gioia L. Loop 7 of E2 enzymes: an ancestral conserved functional motif involved in the E2-mediated steps of the ubiquitination cascade. PLoS One 2012; 7:e40786. [PMID: 22815819 PMCID: PMC3399832 DOI: 10.1371/journal.pone.0040786] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.8] [Received: 01/17/2012] [Accepted: 06/12/2012] [Indexed: 12/31/2022] Open
Abstract
The ubiquitin (Ub) system controls almost every aspect of eukaryotic cell biology. Protein ubiquitination depends on the sequential action of three classes of enzymes (E1, E2 and E3). E2 Ub-conjugating enzymes have a central role in the ubiquitination pathway, interacting with both E1 and E3, and influencing the ultimate fate of the substrates. Several E2s are characterized by an extended acidic insertion in loop 7 (L7), which, if mutated, is known to impair the proper E2-related functions. In the present contribution, we show that the acidic loop is a conserved ancestral motif in E2s, relying on the presence of alternating hydrophobic and acidic residues. Moreover, the dynamic properties of a subset of family 3 E2s, as well as their binary and ternary complexes with Ub and the cognate E3, have been investigated. Here we provide a model of the role of L7 in the different steps of the ubiquitination cascade of family 3 E2s. The L7 hydrophobic residues turned out to be the main determinant for the stabilization of the E2 inactive conformations by a tight network of interactions in the catalytic cleft. Moreover, phosphorylation is known from previous studies to promote E2 competent conformations for Ub charging, inducing electrostatic repulsion and acting on the L7 acidic residues. Here we show that these active conformations are stabilized by a network of hydrophobic interactions between L7 and L4, the latter being a conserved interface for E3 recruitment in several E2s. In the successive steps, L7 conserved acidic residues also provide an interaction interface for both Ub and the Rbx1 RING subdomain of the cognate E3. Our data therefore suggest a crucial role for L7 of family 3 E2s in all the E2-mediated steps of the ubiquitination cascade. Its different functions are exploited thanks to its conserved hydrophobic and acidic residues in a finely orchestrated mechanism.
Affiliation(s)
- Elena Papaleo
- Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy.
|
22
|
Regad L, Martin J, Camproux AC. Dissecting protein loops with a statistical scalpel suggests a functional implication of some structural motifs. BMC Bioinformatics 2011; 12:247. [PMID: 21689388 PMCID: PMC3158783 DOI: 10.1186/1471-2105-12-247] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.4] [Received: 10/07/2010] [Accepted: 06/20/2011] [Indexed: 12/24/2022] Open
Abstract
Background: One of the strategies for protein function annotation is to search for particular structural motifs that are known to be shared by proteins with a given function. Results: Here, we present a systematic extraction of structural motifs of seven residues from protein loops and we explore their correspondence with functional sites. Our approach is based on the structural alphabet HMM-SA (Hidden Markov Model - Structural Alphabet), which allows simplification of protein structures into uni-dimensional sequences, and on advanced pattern statistics adapted to short sequences. Structural motifs of interest are selected by looking for structural motifs significantly over-represented in SCOP superfamilies in protein loops. We discovered two types of structural motifs significantly over-represented in SCOP superfamilies: (i) ubiquitous motifs, shared by several superfamilies and (ii) superfamily-specific motifs, over-represented in few superfamilies. A comparison of ubiquitous words with known small structural motifs shows that they contain well-described motifs such as turn, niche or nest motifs. A comparison between superfamily-specific motifs and biological annotations of Swiss-Prot reveals that some of them actually correspond to functional sites involved in the binding sites of small ligands, such as ATP/GTP, NAD(P) and SAH/SAM. Conclusions: Our findings show that statistical over-representation in SCOP superfamilies is linked to functional features. The detection of over-represented motifs within structures simplified by HMM-SA is therefore a promising approach for prediction of functional sites and annotation of uncharacterized proteins.
|
23
|
Cerdà-Costa N, Bonet J, Fernández MR, Avilés FX, Oliva B, Villegas S. Prediction of a new class of RNA recognition motif. J Mol Model 2010; 17:1863-75. [PMID: 21082207 DOI: 10.1007/s00894-010-0888-0] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 04/30/2010] [Accepted: 10/21/2010] [Indexed: 10/18/2022]
Abstract
The observation that activation domains (AD) of procarboxypeptidases are rather long compared to the pro-regions of other zymogens raises the possibility that they could play additional roles apart from precluding enzymatic activity within the proenzyme and helping in its folding process. In the present work, we compared the overall pro-domain tertiary structure with several proteins belonging to the same fold in the structural classification of proteins (SCOP) database by using structure and sequence comparisons. The best score obtained was between the activation domain of human procarboxypeptidase A4 (ADA4h) and the human U1A protein from the U1 snRNP. Structural alignment revealed the existence of RNP1- and RNP2-related sequences in ADA4h. After modeling ADA4h on U1A, the new structure was used to extract a new sequence pattern characteristic for important residues at key positions. The new sequence pattern allowed scanning protein sequences to predict the RNA-binding function for 32 sequences undetected by PFAM. Unspecific RNA electrophoretic mobility shift assays experimentally supported the prediction that ADA4h binds an RNA motif similar to the U1A binding-motif of stem-loop II of U1 small nuclear RNA. The experiments carried out with ADA4h in the present work suggest the sharing of a common ancestor with other RNA recognition motifs. However, the fact that key residues preventing activity within the proenzyme are also key residues for RNA binding might have induced the activation domains of procarboxypeptidases to evolve from the canonical RNP1 and RNP2 sequences.
Affiliation(s)
- Núria Cerdà-Costa
- Departament de Bioquímica i Biologia Molecular, Unitat de Biociències, Universitat Autònoma de Barcelona, 08193 Cerdanyola del Vallès, Spain
|
24
|
Wohlkönig A, Huet J, Looze Y, Wintjens R. Structural relationships in the lysozyme superfamily: significant evidence for glycoside hydrolase signature motifs. PLoS One 2010; 5:e15388. [PMID: 21085702 PMCID: PMC2976769 DOI: 10.1371/journal.pone.0015388] [Citation(s) in RCA: 80] [Impact Index Per Article: 5.3] [Received: 08/06/2010] [Accepted: 08/31/2010] [Indexed: 11/19/2022] Open
Abstract
BACKGROUND: Chitin is a polysaccharide that forms the hard, outer shell of arthropods and the cell walls of fungi and some algae. Peptidoglycan is a polymer of sugars and amino acids constituting the cell walls of most bacteria. Enzymes that are able to hydrolyze these cell membrane polymers generally play important roles in protecting plants and animals against infection by insects and pathogens. A particular group of such glycoside hydrolase enzymes share some common features in their three-dimensional structure and in their molecular mechanism, forming the lysozyme superfamily. RESULTS: Besides having a similar fold, all known catalytic domains of glycoside hydrolase proteins of the lysozyme superfamily (families and subfamilies GH19, GH22, GH23, GH24 and GH46) share two structural elements: the central helix of the all-α domain, which invariably contains the catalytic glutamate residue acting as general-acid catalyst, and a β-hairpin pointed towards the substrate binding cleft. The invariant β-hairpin structure is interestingly found to display the highest amino acid conservation in aligned sequences of a given family, thereby allowing signature motifs to be defined for each GH family. Most of these signature motifs are found to perform promisingly when searching sequence databases. Our structural analysis further indicates that the GH motifs participate in enzymatic catalysis essentially by containing the catalytic water positioning residue of the inverting mechanism. CONCLUSIONS: The seven families and subfamilies of the lysozyme superfamily all have in common a β-hairpin structure which displays a family-specific sequence motif. These GH β-hairpin motifs contain potentially important residues for the catalytic activity, thereby suggesting the participation of the GH motif in catalysis and also revealing a common catalytic scheme utilized by enzymes of the lysozyme superfamily.
Affiliation(s)
- Alexandre Wohlkönig
- Structural Biology Brussels and Molecular and Cellular Interactions, VIB, Brussels, Belgium
- Joëlle Huet
- Laboratoire de Chimie Générale, Institut de Pharmacie, Université Libre de Bruxelles, Brussels, Belgium
- Yvan Looze
- Laboratoire de Chimie Générale, Institut de Pharmacie, Université Libre de Bruxelles, Brussels, Belgium
- René Wintjens
- Laboratoire de Chimie Générale, Institut de Pharmacie, Université Libre de Bruxelles, Brussels, Belgium
- Interdisciplinary Research Institute, USR 3078 CNRS, Villeneuve d'Ascq, France
|
25
|
Marsico A, Henschel A, Winter C, Tuukkanen A, Vassilev B, Scheubert K, Schroeder M. Structural fragment clustering reveals novel structural and functional motifs in alpha-helical transmembrane proteins. BMC Bioinformatics 2010; 11:204. [PMID: 20420672 PMCID: PMC2876129 DOI: 10.1186/1471-2105-11-204] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.8] [Received: 08/18/2009] [Accepted: 04/26/2010] [Indexed: 11/23/2022] Open
Abstract
Background: A large proportion of an organism's genome encodes membrane proteins. Membrane proteins are important for many cellular processes, and several diseases can be linked to mutations in them. With the tremendous growth of sequence data, there is an increasing need to reliably identify membrane proteins from sequence, to functionally annotate them, and to correctly predict their topology. Results: We introduce a technique called structural fragment clustering, which learns sequential motifs from 3D structural fragments. From over 500,000 fragments, we obtain 213 statistically significant, non-redundant, and novel motifs that are highly specific to α-helical transmembrane proteins. Of these 213 motifs, 58 were assigned to function and checked in the scientific literature for a biological assessment. Seventy percent of the motifs are found in co-factor, ligand, and ion binding sites, 30% at protein interaction interfaces, and 12% bind specific lipids such as glycerol or cardiolipins. The vast majority of motifs (94%) appear across evolutionarily unrelated families, highlighting the modularity of functional design in membrane proteins. We describe three novel motifs in detail: (1) a dimer interface motif found in voltage-gated chloride channels, (2) a proton transfer motif found in heme-copper oxidases, and (3) a convergently evolved interface helix motif found in an aspartate symporter, a serine protease, and cytochrome b. Conclusions: Our findings suggest that functional modules exist in membrane proteins, and that they occur in completely different evolutionary contexts and cover different binding sites. Structural fragment clustering allows us to link sequence motifs to function through clusters of structural fragments. The sequence motifs can be applied to identify and characterize membrane proteins in novel genomes.
Affiliation(s)
- Annalisa Marsico
- Bioinformatics department, Biotechnology Center TU Dresden, Dresden, Germany
|
26
|
Mining protein loops using a structural alphabet and statistical exceptionality. BMC Bioinformatics 2010; 11:75. [PMID: 20132552 PMCID: PMC2833150 DOI: 10.1186/1471-2105-11-75] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.5] [Received: 07/07/2009] [Accepted: 02/04/2010] [Indexed: 12/21/2022] Open
Abstract
Background: Protein loops encompass 50% of protein residues in available three-dimensional structures. These regions are often involved in protein functions, e.g. binding sites and catalytic pockets. However, describing protein loops with conventional tools is not an easy task. Regular secondary structures, helices and strands, have been widely studied whereas loops, because they are highly variable in terms of sequence and structure, are difficult to analyze. Due to data sparsity, long loops have rarely been systematically studied. Results: We developed a simple and accurate method that allows the description and analysis of the structures of short and long loops using structural motifs without restriction on loop length. This method is based on the structural alphabet HMM-SA. HMM-SA allows the simplification of a three-dimensional protein structure into a one-dimensional string of states, where each state is a four-residue prototype fragment, called a structural letter. The difficult task of the structural grouping of huge data sets is thus easily accomplished by handling structural letter strings as in conventional protein sequence analysis. We systematically extracted all seven-residue fragments in a bank of 93000 protein loops and grouped them according to the structural-letter sequence, named a structural word. This approach permits a systematic analysis of loops of all sizes since we consider the structural motifs of seven residues rather than complete loops. We focused the analysis on highly recurrent words of loops (observed more than 30 times). Our study reveals that 73% of loop-lengths are covered by only 3310 highly recurrent structural words (out of 28274 observed words). These structural words have low structural variability (mean RMSd of 0.85 Å). As expected, half of these motifs display a flanking-region preference but, interestingly, two thirds are shared by short (less than 12 residues) and long loops. Moreover, half of the recurrent motifs exhibit a significant level of amino-acid conservation with at least four significant positions, and 87% of long loops contain at least one such word. We complement our analysis with the detection of statistically over-represented patterns of structural letters as in conventional DNA sequence analysis. About 30% (930) of structural words are over-represented, and cover about 40% of loop lengths. Interestingly, these words exhibit lower structural variability and higher sequential specificity, suggesting structural or functional constraints. Conclusions: We developed a method to systematically decompose and study protein loops using recurrent structural motifs. This method is based on the structural alphabet HMM-SA and not on structural alignment and geometrical parameters. We extracted meaningful structural motifs that are found in both short and long loops. To our knowledge, it is the first time that pattern mining helps to increase the signal-to-noise ratio in protein loops. This finding helps to better describe protein loops and might help reduce the complexity of long-loop analysis. Detailed results are available at http://www.mti.univ-paris-diderot.fr/publication/supplementary/2009/ACCLoop/.
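The word-extraction step described in this abstract can be illustrated with a minimal sketch. It assumes, beyond what the abstract states, that consecutive structural letters overlap by three residues, so that a seven-residue fragment maps to a word of four letters. The function names, the toy letter strings, and the single-character encoding of letters are hypothetical and are not taken from the HMM-SA software.

```python
from collections import Counter


def structural_words(letters: str, word_len: int = 4) -> list[str]:
    """Return every overlapping window of `word_len` structural letters.

    Assumption: one letter per four-residue fragment, shifted by one
    residue, so four letters span seven residues.
    """
    return [letters[i:i + word_len] for i in range(len(letters) - word_len + 1)]


def recurrent_words(loops: list[str], min_count: int = 30, word_len: int = 4) -> dict[str, int]:
    """Count words over all loops and keep those seen more than `min_count` times."""
    counts = Counter()
    for loop in loops:
        counts.update(structural_words(loop, word_len))
    return {word: n for word, n in counts.items() if n > min_count}


# Toy example with a hypothetical letter encoding and a low threshold.
toy_loops = ["ABCDE", "ABCDX"]
print(recurrent_words(toy_loops, min_count=1))
```

With the default threshold of 30 the filter mirrors the "observed more than 30 times" criterion quoted in the abstract; the statistical over-representation test is not covered by this sketch.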
|
27
|
Tyagi M, Bornot A, Offmann B, de Brevern AG. Analysis of loop boundaries using different local structure assignment methods. Protein Sci 2009; 18:1869-81. [PMID: 19606500 DOI: 10.1002/pro.198] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.3] [Indexed: 11/09/2022]
Abstract
Loops connect regular secondary structures. In many instances, they are known to play important biological roles. Analysis and prediction of loop conformations depend directly on the definition of repetitive structures. Nonetheless, secondary structure assignment methods (SSAMs) often lead to divergent assignments. In this study, we analyzed, from both structure and sequence points of view, how the divergence between different SSAMs affects the boundary definitions of loops connecting regular secondary structures. The analysis underlines that no clear consensus between the different SSAMs can easily be found. Because the SSAMs greatly influence the loop boundary definitions, important variations are indeed observed; that is, capping positions are shifted between different SSAMs. On the other hand, our results show that the sequence information in these capping regions is more stable than expected, and classical and equivalent sequence patterns were found for most of the SSAMs. This is, to our knowledge, the most exhaustive survey in this field, as (i) various databanks have been used, leading to similar results without implication of protein redundancy, and (ii) it is the first time that various SSAMs have been used. This work hence gives new insights into the difficult question of the assignment of repetitive structures and addresses the issue of loop boundary definition. Although SSAMs give very different local structure assignments, capping sequence patterns remain efficiently stable.
Affiliation(s)
- Manoj Tyagi
- Laboratoire de Biochimie et Génétique Moléculaire, Université de La Réunion, BP 7151, 15 avenue René Cassin, 97715 Saint Denis Messag Cedex 09, La Réunion, France
|
28
|
Bioinformatics annotation of the hypothetical proteins found by omics techniques can help to disclose additional virulence factors. Curr Microbiol 2009; 59:451-6. [PMID: 19636617 DOI: 10.1007/s00284-009-9459-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 05/29/2009] [Accepted: 07/07/2009] [Indexed: 01/17/2023]
Abstract
The advent of genomics should have facilitated the identification of microbial virulence factors, a key objective for vaccine design. When a bacterial pathogen infects the host, it expresses a set of genes, a number of them being virulence factors. Among the genes identified by techniques such as microarrays, in vivo expression technology, signature-tagged mutagenesis and differential fluorescence induction, there are many related to cellular stress, basal metabolism, etc., which cannot be directly involved in virulence, or at least cannot be considered useful candidates to be deleted for designing a live attenuated vaccine. Among the genes disclosed by these methodologies there are a number of hypothetical or unknown proteins. As they can hide some true virulence factors, we have reannotated all of these hypothetical proteins from several respiratory pathogens by a careful and in-depth analysis of each one. Although some of the re-annotations match functions that can be related to microbial virulence, the identification of virulence factors remains difficult.
|
29
|
Petrey D, Honig B. Is protein classification necessary? Toward alternative approaches to function annotation. Curr Opin Struct Biol 2009; 19:363-8. [PMID: 19269161 PMCID: PMC2745633 DOI: 10.1016/j.sbi.2009.02.001] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.6] [Received: 01/28/2009] [Accepted: 02/02/2009] [Indexed: 11/16/2022]
Abstract
The current nonredundant protein sequence database contains over seven million entries and the number of individual functional domains is significantly larger than this value. The vast quantity of data associated with these proteins poses enormous challenges to any attempt at function annotation. Classification of proteins into sequence and structural groups has been widely used as an approach to simplifying the problem. In this article we question such strategies. We describe how the multifunctionality and structural diversity of even closely related proteins confounds efforts to assign function on the basis of overall sequence or structural similarity. Rather, we suggest that strategies that avoid classification may offer a more robust approach to protein function annotation.
Affiliation(s)
- Donald Petrey
- Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Center for Computational Biology and Bioinformatics, Columbia University, New York, NY 10032, USA
|
30
|
Manikandan K, Pal D, Ramakumar S, Brener NE, Iyengar SS, Seetharaman G. Functionally important segments in proteins dissected using Gene Ontology and geometric clustering of peptide fragments. Genome Biol 2008; 9:R52. [PMID: 18331637 PMCID: PMC2397504 DOI: 10.1186/gb-2008-9-3-r52] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.9] [Received: 11/30/2007] [Revised: 02/24/2008] [Accepted: 03/10/2008] [Indexed: 11/25/2022] Open
Abstract
A geometric clustering algorithm has been developed to dissect protein fragments based on their relevance to function. We have developed a geometric clustering algorithm using backbone φ,ψ angles to group conformationally similar peptide fragments of any length. By labeling each fragment in the cluster with the level-specific Gene Ontology 'molecular function' term of its protein, we are able to compute statistics for molecular function-propensity and p-value of individual fragments in the cluster. Clustering-cum-statistical analysis for peptide fragments 8 residues in length and with only trans peptide bonds shows that molecular function propensities ≥20 and p-values ≤0.05 can dissect fragments within a protein linked to the molecular function.
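Grouping fragments by backbone φ,ψ angles requires a distance that respects angular periodicity. The abstract does not state which metric the authors used, so the following is only a sketch under an assumed choice: a root-mean-square angular deviation with wrap-around at ±180°. The function names and the degree-based input format are hypothetical.

```python
import math


def angle_diff(a: float, b: float) -> float:
    """Smallest absolute difference between two angles, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def fragment_distance(frag_a, frag_b) -> float:
    """RMS angular deviation between two fragments of equal length.

    Each fragment is a list of (phi, psi) tuples in degrees.
    Assumed metric; the paper's own distance may differ.
    """
    if len(frag_a) != len(frag_b):
        raise ValueError("fragments must have the same number of residues")
    total = 0.0
    for (phi_a, psi_a), (phi_b, psi_b) in zip(frag_a, frag_b):
        total += angle_diff(phi_a, phi_b) ** 2 + angle_diff(psi_a, psi_b) ** 2
    return math.sqrt(total / (2 * len(frag_a)))


# 179 and -179 degrees are 2 degrees apart once wrap-around is handled.
print(angle_diff(179.0, -179.0))
```

A clustering step would then group fragments whose pairwise distance falls below a chosen cutoff; neither the cutoff nor the propensity statistic is specified in the abstract, so both are left out here.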
|
31
|
Lin MS, Head-Gordon T. Improved Energy Selection of Nativelike Protein Loops from Loop Decoys. J Chem Theory Comput 2008; 4:515-21. [DOI: 10.1021/ct700292u] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Indexed: 11/28/2022]
Affiliation(s)
- Matthew S. Lin
- UCSF/UCB Joint Graduate Group in Bioengineering, Berkeley, California 94720, and Department of Bioengineering, University of California, Berkeley, California 94720
- Teresa Head-Gordon
- UCSF/UCB Joint Graduate Group in Bioengineering, Berkeley, California 94720, and Department of Bioengineering, University of California, Berkeley, California 94720
|
32
|
Hermoso A, Espadaler J, Enrique Querol E, Aviles FX, Sternberg MJ, Oliva B, Fernandez-Fuentes N. Including Functional Annotations and Extending the Collection of Structural Classifications of Protein Loops (ArchDB). Bioinform Biol Insights 2008. [DOI: 10.1177/117793220700100004] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 11/15/2022] Open
Abstract
Loops represent an important part of protein structures. The study of loops is critical for two main reasons: First, loops are often involved in protein function, stability and folding. Second, despite improvements in experimental and computational structure prediction methods, modeling the conformation of loops remains problematic. Here, we present a structural classification of loops, ArchDB, a mine of information with application in both mentioned fields: loop structure prediction and function prediction. ArchDB ( http://sbi.imim.es/archdb ) is a database of classified protein loop motifs. The current database provides four different classification sets tailored for different purposes. ArchDB-40, a loop classification derived from SCOP40, is well suited for modeling common loop motifs. Since features relevant to loop structure or function can be more easily determined on well-populated clusters, we have developed ArchDB-95, a loop classification derived from SCOP95. This new classification set shows a ~40% increase in the number of subclasses, and a large 7-fold increase in the number of putative structure/function-related subclasses. We also present ArchDB-EC, a classification of loop motifs from enzymes, and ArchDB-KI, a manually annotated classification of loop motifs from kinases. Information about ligand contacts and PDB sites has been included in all classification sets. Improvements in our classification scheme are described, as well as several new database features, such as the ability to query by conserved annotations, sequence similarity, or uploading 3D coordinates of a protein. The lengths of classified loops range between 0 and 36 residues. ArchDB offers an exhaustive sampling of loop structures. Functional information about loops and links with related biological databases are also provided. All this information, together with the possibility to browse and query the database through a web server, makes ArchDB a useful tool with application in the comparative study of loops, the analysis of loops involved in protein function, and obtaining templates for loop modeling.
Affiliation(s)
- Antoni Hermoso
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia, Spain
- Jordi Espadaler
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia, Spain
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- E Enrique Querol
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia, Spain
- Francesc X. Aviles
- Laboratori de Bioinformàtica, Institut de Biomedicina I Biotecnologia, Universitat Autònoma de Barcelona, Bellaterra 08193, Catalonia, Spain
- Michael J.E. Sternberg
- Structural Bioinformatics Group, Department of Biological Sciences, Imperial College, London SW7 2AZ, U.K.
- Baldomero Oliva
- Laboratori de Bioinformàtica Estructural (GRIB), Universitat Pompeu Fabra/IMIM, Parc de Recerca Biomèdica de Barcelona, Barcelona 08003, Catalonia, Spain
- Narcis Fernandez-Fuentes
- Leeds Institute of Molecular Medicine, Section of Experimental Therapeutics, St. James University Hospital, Leeds LS7 9TF, U.K.
|