1
|
New protein families with hendecad coiled coils in the proteome of life. J Struct Biol 2023; 215:108007. [PMID: 37524272 DOI: 10.1016/j.jsb.2023.108007] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/28/2023] [Revised: 06/30/2023] [Accepted: 07/28/2023] [Indexed: 08/02/2023]
Abstract
Coiled coils are a widespread and well understood protein fold. Their short and simple repeats underpin considerable structural and functional diversity. The vast majority of coiled coils consist of 7-residue (heptad) sequence repeats, but in essence most combinations of 3- and 4-residue segments, each starting with a residue of the hydrophobic core, are compatible with coiled-coil structure. The most frequent among these other repeat patterns are 11-residue (hendecad, 3 + 4 + 4) repeats. Hendecads are frequently found in low copy number, interspersed between heptads, but some proteins consist largely or entirely of hendecad repeats. Here we describe the first large-scale survey of these proteins in the proteome of life. For this, we scanned the protein sequence database for sequences with 11-residue periodicity that lacked β-strand prediction. We then clustered these by pairwise similarity to construct a map of potential hendecad coiled-coil families. Here we discuss these according to their structural properties, their potential cellular roles, and the evolutionary mechanisms shaping their diversity. We note in particular the continuous amplification of hendecads, both within existing proteins and de novo from previously non-coding sequence, as a powerful mechanism in the genesis of new coiled-coil forms.
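As an illustration of what scanning for 11-residue periodicity can look like, the following is a minimal sketch and not the authors' pipeline. It assumes a crude binary hydrophobicity profile and uses lag autocorrelation as the periodicity score; the residue set `HYDROPHOBIC` and the function names are hypothetical choices made for this example. The β-strand filter and the clustering step described in the abstract are not covered.

```python
# Hypothetical hydrophobic residue set, chosen for illustration only.
HYDROPHOBIC = set("AILMFVWY")


def hydrophobic_profile(seq):
    """Map a protein sequence to 1.0 (hydrophobic) or 0.0 (other) per residue."""
    return [1.0 if aa in HYDROPHOBIC else 0.0 for aa in seq.upper()]


def lag_autocorrelation(profile, lag):
    """Normalized autocorrelation of the profile at the given lag.

    The value approaches 1.0 when the hydrophobic pattern repeats
    exactly every `lag` residues.
    """
    n = len(profile)
    if lag <= 0 or lag >= n:
        raise ValueError("lag must be between 1 and len(profile) - 1")
    mean = sum(profile) / n
    var = sum((x - mean) ** 2 for x in profile)
    if var == 0:
        return 0.0  # constant profile carries no periodic signal
    cov = sum((profile[i] - mean) * (profile[i + lag] - mean)
              for i in range(n - lag))
    # Rescale so that shorter overlaps at larger lags are not penalized.
    return (cov / var) * (n / (n - lag))


# Toy sequence: five 11-residue repeats with hydrophobic residues
# at positions 0, 3 and 7 (a 3 + 4 + 4 spacing).
toy = "LEEIKKKLEEK" * 5
profile = hydrophobic_profile(toy)
print(lag_autocorrelation(profile, 11), lag_autocorrelation(profile, 7))
```

On the toy sequence the score at lag 11 is higher than at lag 7, which is the kind of signal that would separate hendecad-dominated sequences from heptad-dominated ones under these assumptions.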
|
2
|
The design of functional proteins using tensorized energy calculations. CELL REPORTS METHODS 2023; 3:100560. [PMID: 37671023 PMCID: PMC10475850 DOI: 10.1016/j.crmeth.2023.100560] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/22/2023] [Revised: 05/25/2023] [Accepted: 07/21/2023] [Indexed: 09/07/2023]
Abstract
In protein design, the energy associated with a huge number of sequence-conformer perturbations has to be routinely estimated. Hence, enhancing the throughput and accuracy of these energy calculations can profoundly improve design success rates and enable tackling more complex design problems. In this work, we explore the possibility of tensorizing the energy calculations and apply them in a protein design framework. We use this framework to design enhanced proteins with anti-cancer and radio-tracing functions. Particularly, we designed multispecific binders against ligands of the epidermal growth factor receptor (EGFR), where the tested design could inhibit EGFR activity in vitro and in vivo. We also used this method to design high-affinity Cu2+ binders that were stable in serum and could be readily loaded with copper-64 radionuclide. The resulting molecules show superior functional properties for their respective applications and demonstrate the generalizable potential of the described protein design approach.
|
3
|
A conserved motif suggests a common origin for a group of proteins involved in the cell division of Gram-positive bacteria. PLoS One 2023; 18:e0273136. [PMID: 36662698 PMCID: PMC9858780 DOI: 10.1371/journal.pone.0273136] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2022] [Accepted: 12/29/2022] [Indexed: 01/21/2023]
Abstract
DivIVA, GpsB, FilP, and Scy are all involved in bacterial cell division. They have been reported to interact with each other, and although they have been the subject of considerable research interest, not much is known about the molecular basis for their biological activity. Although they show great variability in taxonomic occurrence, phenotypic profile, and molecular properties, we find that they nevertheless share a conserved N-terminal sequence motif, which points to a common evolutionary origin. The motif always occurs N-terminally to a coiled-coil helix that mediates dimerization. We define the motif and coiled coil jointly as a new domain, which we name DivIVA-like. In a large-scale survey of this domain in the protein sequence database, we identify a new family of proteins potentially involved in cell division, whose members, unlike all other DivIVA-like proteins, have between 2 and 8 copies of the domain in tandem. AlphaFold models indicate that the domains in these proteins assemble within a single chain, therefore not mediating dimerization.
|
4
|
Identification of the adhesive domain of AtaA from Acinetobacter sp. Tol 5 and its application in immobilizing Escherichia coli. Front Bioeng Biotechnol 2023; 10:1095057. [PMID: 36698637 PMCID: PMC9868564 DOI: 10.3389/fbioe.2022.1095057] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2022] [Accepted: 12/21/2022] [Indexed: 01/11/2023]
Abstract
Cell immobilization is an important technique for efficiently utilizing whole-cell biocatalysts. We previously invented a method for bacterial cell immobilization using AtaA, a trimeric autotransporter adhesin from the highly sticky bacterium Acinetobacter sp. Tol 5. However, except for Acinetobacter species, only one bacterium has been successfully immobilized using AtaA. This is probably because heterologous expression of the large AtaA (1 MDa), a homotrimer of polypeptide chains composed of 3,630 amino acids, is difficult. In this study, we identified the adhesive domain of AtaA and constructed a miniaturized AtaA (mini-AtaA) to improve the heterologous expression of ataA. In-frame deletion mutants were used to perform functional mapping, revealing that the N-terminal head domain is essential for the adhesive feature of AtaA. The mini-AtaA, a homotrimer of polypeptide chains of 775 amino acids that lacks the parts unnecessary for adhesion, was properly expressed in E. coli, and more molecules were displayed on the cell surface than with full-length AtaA (FL-AtaA). The immobilization ratio of E. coli cells expressing mini-AtaA on a polyurethane foam support was significantly higher than that of cells with or without FL-AtaA expression. The expression of mini-AtaA in E. coli had little effect on cell growth or on the activity of another enzyme reflecting the production level, and the immobilized E. coli cells could be used for repetitive enzymatic reactions as a whole-cell catalyst.
|
5
|
New β-Propellers Are Continuously Amplified From Single Blades in all Major Lineages of the β-Propeller Superfamily. Front Mol Biosci 2022; 9:895496. [PMID: 35755816 PMCID: PMC9218822 DOI: 10.3389/fmolb.2022.895496] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/13/2022] [Accepted: 05/13/2022] [Indexed: 11/13/2022]
Abstract
β-Propellers are toroidal folds, in which consecutive supersecondary structure units of four anti-parallel β-strands, called blades, are arranged radially around a central axis. Uniquely among toroidal folds, blades span the full range of sequence symmetry, from near identity to complete divergence, indicating an ongoing process of amplification and differentiation. We have proposed that the major lineages of β-propellers arose through this mechanism and that therefore their last common ancestor was a single blade, not a fully formed β-propeller. Here we show that this process of amplification and differentiation is also widespread within individual lineages, yielding β-propellers with blades of more than 60% pairwise sequence identity in most major β-propeller families. In some cases, the blades are nearly identical, indicating a very recent amplification event, but even in cases where such recently amplified β-propellers have more than 80% overall sequence identity to each other, comparison of their DNA sequence shows that the amplification occurred independently.
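To make the notion of pairwise blade identity concrete, here is a minimal sketch under stated assumptions: the blades have already been extracted and aligned to equal length, gaps are written as "-", and columns containing a gap are skipped. The function names and the toy sequences are hypothetical; the abstract does not specify how identity was computed.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap ("-") are not counted.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def pairwise_blade_identities(blades):
    """Return {(i, j): percent identity} for every pair of aligned blades."""
    return {
        (i, j): percent_identity(blades[i], blades[j])
        for i, j in combinations(range(len(blades)), 2)
    }


# Toy aligned blades (invented sequences, for illustration only).
blades = ["GSVWDLAT", "GSVWDLST", "GTIWD-AT"]
for pair, identity in pairwise_blade_identities(blades).items():
    print(pair, round(identity, 1))
```

A propeller whose blade pairs all score above a chosen cutoff, such as the 60% mentioned above, would count as recently amplified in this simplified view.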
|
6
|
A topological refactoring design strategy yields highly stable granulopoietic proteins. Nat Commun 2022; 13:2948. [PMID: 35618709 PMCID: PMC9135769 DOI: 10.1038/s41467-022-30157-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/29/2020] [Accepted: 04/19/2022] [Indexed: 11/09/2022]
Abstract
Protein therapeutics frequently face major challenges, including complicated production, instability, poor solubility, and aggregation. De novo protein design can readily address these challenges. Here, we demonstrate the utility of a topological refactoring strategy to design novel granulopoietic proteins starting from the granulocyte-colony stimulating factor (G-CSF) structure. We change a protein fold by rearranging the sequence and optimising it towards the new fold. Testing four designs, we obtain two that possess nanomolar activity, the most active of which is highly thermostable and protease-resistant, and matches its designed structure to atomic accuracy. While the designs possess starkly different sequence and structure from the native G-CSF, they show specific activity in differentiating primary human haematopoietic stem cells into mature neutrophils. The designs also show significant and specific activity in vivo. Our topological refactoring approach is largely independent of sequence or structural context, and is therefore applicable to a wide range of protein targets. Skokowa et al. reconstruct the fold of a granulopoietic cytokine, resulting in de novo, hyperstable, highly active proteins with therapeutic potential for treating several neutropenia disorders.
|
7
|
De novo design of growth factor inhibiting proteins. KLINISCHE PADIATRIE 2022. [DOI: 10.1055/s-0042-1748729] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
8
|
De novo design of cytokines, antikines, and novokines. KLINISCHE PADIATRIE 2022. [DOI: 10.1055/s-0042-1748728] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/18/2022]
|
9
|
RpuS/R Is a Novel Two-Component Signal Transduction System That Regulates the Expression of the Pyruvate Symporter MctP in Sinorhizobium fredii NGR234. Front Microbiol 2022; 13:871077. [PMID: 35572670 PMCID: PMC9100948 DOI: 10.3389/fmicb.2022.871077] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2022] [Accepted: 04/11/2022] [Indexed: 11/16/2022]
Abstract
The SLC5/STAC histidine kinases comprise a recently identified family of sensor proteins in two-component signal transduction systems (TCSTS), in which the signaling domain is fused to an SLC5 solute symporter domain through a STAC domain. Only two members of this family have been characterized experimentally, the CrbS/R system that regulates acetate utilization in Vibrio and Pseudomonas, and the CbrA/B system that regulates the utilization of histidine in Pseudomonas and glucose in Azotobacter. In an attempt to expand the characterized members of this family beyond the Gammaproteobacteria, we identified two putative TCSTS in the Alphaproteobacterium Sinorhizobium fredii NGR234 whose sensor histidine kinases belong to the SLC5/STAC family. Using reverse genetics, we were able to identify the first TCSTS as a CrbS/R homolog that is also needed for growth on acetate, while the second TCSTS, RpuS/R, is a novel system required for optimal growth on pyruvate. Using RNAseq and transcriptional fusions, we determined that in S. fredii the RpuS/R system upregulates the expression of an operon coding for the pyruvate symporter MctP when pyruvate is the sole carbon source. In addition, we identified a conserved DNA sequence motif in the putative promoter region of the mctP operon that is essential for the RpuR-mediated transcriptional activation of genes under pyruvate-utilizing conditions. Finally, we show that S. fredii mutants lacking these TCSTS are affected in nodulation, producing fewer nodules than the parent strain and at a slower rate.
|
10
|
Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
|
11
|
Target highlights in CASP14: Analysis of models by structure providers. Proteins 2021; 89:1647-1672. [PMID: 34561912 PMCID: PMC8616854 DOI: 10.1002/prot.26247] [Citation(s) in RCA: 18] [Impact Index Per Article: 6.0] [Received: 05/08/2021] [Revised: 09/13/2021] [Accepted: 09/16/2021] [Indexed: 12/11/2022]
Abstract
The biological and functional significance of selected Critical Assessment of Techniques for Protein Structure Prediction 14 (CASP14) targets is described by the authors of the structures. The authors highlight the most relevant features of the target proteins and discuss how well these features were reproduced in the respective submitted predictions. The overall ability to predict three-dimensional structures of proteins has improved remarkably in CASP14, and many difficult targets were modeled with impressive accuracy. For the first time in the history of CASP, the experimentalists not only highlighted that computational models can accurately reproduce the most critical structural features observed in their targets, but also envisaged that models could serve as guidance for further studies of biologically relevant properties of proteins.
|
12
|
Computational models in the service of X-ray and cryo-electron microscopy structure determination. Proteins 2021; 89:1633-1646. [PMID: 34449113 PMCID: PMC8616789 DOI: 10.1002/prot.26223] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 05/07/2021] [Revised: 08/11/2021] [Accepted: 08/17/2021] [Indexed: 01/20/2023]
Abstract
Critical assessment of structure prediction (CASP) conducts community experiments to determine the state of the art in computing protein structure from amino acid sequence. The process relies on the experimental community providing information about not yet public or about to be solved structures, for use as targets. For some targets, the experimental structure is not solved in time for use in CASP. Calculated structure accuracy improved dramatically in this round, implying that models should now be much more useful for resolving many sorts of experimental difficulties. To test this, selected models for seven unsolved targets were provided to the experimental groups. These models were from the AlphaFold2 group, who overall submitted the most accurate predictions in CASP14. Four targets were solved with the aid of the models, and, additionally, the structure of an already solved target was improved. An a posteriori analysis showed that, in some cases, models from other groups would also be effective. This paper provides accounts of the successful application of models to structure determination, including molecular replacement for X-ray crystallography, backbone tracing and sequence positioning in a cryo-electron microscopy structure, and correction of local features. The results suggest that, in future, there will be greatly increased synergy between computational and experimental approaches to structure determination.
|
13
|
Assessing the utility of CASP14 models for molecular replacement. Proteins 2021; 89:1752-1769. [PMID: 34387010 PMCID: PMC8881082 DOI: 10.1002/prot.26214] [Citation(s) in RCA: 35] [Impact Index Per Article: 11.7] [Received: 06/18/2021] [Revised: 07/20/2021] [Accepted: 07/27/2021] [Indexed: 11/21/2022]
Abstract
The assessment of CASP models for utility in molecular replacement is a measure of their use in a valuable real‐world application. In CASP7, the metric for molecular replacement assessment involved full likelihood‐based molecular replacement searches; however, this restricted the assessable targets to crystal structures with only one copy of the target in the asymmetric unit, and to those where the search found the correct pose. In CASP10, full molecular replacement searches were replaced by likelihood‐based rigid‐body refinement of models superimposed on the target using the LGA algorithm, with the metric being the refined log‐likelihood‐gain (LLG) score. This enabled multi‐copy targets and very poor models to be evaluated, but a significant further issue remained: the requirement of diffraction data for assessment. We introduce here the relative‐expected‐LLG (reLLG), which is independent of diffraction data. This reLLG is also independent of any crystal form, and can be calculated regardless of the source of the target, be it X‐ray, NMR or cryo‐EM. We calibrate the reLLG against the LLG for targets in CASP14, showing that it is a robust measure of both model and group ranking. Like the LLG, the reLLG shows that accurate coordinate error estimates add substantial value to predicted models. We find that refinement by CASP groups can often convert an inadequate initial model into a successful MR search model. Consistent with findings from others, we show that the AlphaFold2 models are sufficiently good, and reliably so, to surpass other current model generation strategies for attempting molecular replacement phasing.
|
14
|
An astonishing wealth of new proteasome homologs. Bioinformatics 2021; 37:4694-4703. [PMID: 34323935 PMCID: PMC8665760 DOI: 10.1093/bioinformatics/btab558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2021] [Revised: 07/02/2021] [Accepted: 07/28/2021] [Indexed: 11/21/2022]
Abstract
Motivation: The proteasome is the main proteolytic machine for targeted protein degradation in archaea and eukaryotes. While some bacteria also possess the proteasome, most of them contain a simpler and more specialized homolog, the heat shock locus V protease. In recent years, three further homologs of the proteasome core subunits have been characterized in prokaryotes: Anbu, BPH and connectase. With the inclusion of these members, the family of proteasome-like proteins now exhibits a range of architectural and functional forms, from the canonical proteasome, a barrel-shaped protease without pronounced intrinsic substrate specificity, to the monomeric connectase, a highly specific protein ligase. Results: We employed systematic sequence searches to show that we have only seen the tip of the iceberg so far and that beyond the hitherto known proteasome homologs lies a wealth of distantly related, uncharacterized homologs. We describe a total of 22 novel proteasome homologs in bacteria and archaea. Using sequence and structure analysis, we analyze their evolutionary history and assess structural differences that may modulate their function. With this initial description, we aim to stimulate the experimental investigation of these novel proteasome-like family members. Availability and implementation: The protein sequences in this study are searchable in the MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) with ProtBLAST/PSI-BLAST and with HHpred (database ‘proteasome_homologs’). The following data are available at https://data.mendeley.com/datasets/t48yhff7hs/3: (i) sequence alignments for each proteasome-like homolog, (ii) the coordinates for their structural models and (iii) a cluster-map file, which can be navigated interactively in CLANS and gives direct access to all the sequences in this study. Supplementary information: Supplementary data are available at Bioinformatics online.
|
15
|
High-accuracy protein structure prediction in CASP14. Proteins 2021; 89:1687-1699. [PMID: 34218458 DOI: 10.1002/prot.26171] [Citation(s) in RCA: 161] [Impact Index Per Article: 53.7] [Received: 05/05/2021] [Revised: 06/16/2021] [Accepted: 06/23/2021] [Indexed: 12/25/2022]
Abstract
The application of state-of-the-art deep-learning approaches to the protein modeling problem has expanded the "high-accuracy" category in CASP14 to encompass all targets. Building on the metrics used for high-accuracy assessment in previous CASPs, we evaluated the performance of all groups that submitted models for at least 10 targets across all difficulty classes, and judged the usefulness of those produced by AlphaFold2 (AF2) as molecular replacement search models with AMPLE. Driven by the qualitative diversity of the targets submitted to CASP, we also introduce DipDiff as a new measure for the improvement in backbone geometry provided by a model versus available templates. Although a large leap in high-accuracy is seen due to AF2, the second-best method in CASP14 out-performed the best in CASP13, illustrating the role of community-based benchmarking in the development and evolution of the protein structure prediction field.
|
16
|
|
17
|
Archaeal Connectase is a specific and efficient protein ligase related to proteasome β subunits. Proc Natl Acad Sci U S A 2021; 118:e2017871118. [PMID: 33688044 PMCID: PMC7980362 DOI: 10.1073/pnas.2017871118] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Indexed: 12/04/2022]
Abstract
Sequence-specific protein ligations are widely used to produce customized proteins "on demand." Such chimeric, immobilized, fluorophore-conjugated or segmentally labeled proteins are generated using a range of chemical, (split) intein, split domain, or enzymatic methods. Where short ligation motifs and good chemoselectivity are required, ligase enzymes are often chosen, although they have a number of disadvantages, for example poor catalytic efficiency, low substrate specificity, and side reactions. Here, we describe a sequence-specific protein ligase with more favorable characteristics. This ligase, Connectase, is a monomeric homolog of 20S proteasome subunits in methanogenic archaea. In pulldown experiments with Methanosarcina mazei cell extract, we identify a physiological substrate in methyltransferase A (MtrA), a key enzyme of archaeal methanogenesis. Using microscale thermophoresis and X-ray crystallography, we show that only a short sequence of about 20 residues derived from MtrA and containing a highly conserved KDPGA motif is required for this high-affinity interaction. Finally, in quantitative activity assays, we demonstrate that this recognition tag can be repurposed to allow the ligation of two unrelated proteins. Connectase catalyzes such ligations at substantially higher rates, with higher yields, but without detectable side reactions when compared with a reference enzyme. It thus presents an attractive tool for the development of new methods, for example in the preparation of selectively labeled proteins for NMR, the covalent and geometrically defined attachment of proteins on surfaces for cryo-electron microscopy, or the generation of multispecific antibodies.
|
18
|
The VCBS superfamily forms a third supercluster of β-propellers that includes tachylectin and integrins. Bioinformatics 2021; 36:5618-5622. [PMID: 33416871 PMCID: PMC8023676 DOI: 10.1093/bioinformatics/btaa1085] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 10/14/2020] [Revised: 12/14/2020] [Accepted: 12/21/2020] [Indexed: 01/15/2023]
Abstract
MOTIVATION: β-Propellers are found in great variety across all kingdoms of life. They assume many cellular roles, primarily as scaffolds for macromolecular interactions and catalysis. Despite their diversity, most β-propeller families clearly originated by amplification from the same ancient peptide, the "blade". In cluster analyses, β-propellers of the WD40 superfamily always formed the largest group, to which some important families, such as the α-integrin, Asp-box, and glycoside hydrolase β-propellers, connected weakly. Motivated by the dramatic growth of sequence databases, we revisited these connections, with a special focus on VCBS-like β-propellers, which have not been analysed for their evolutionary relationships so far. RESULTS: We found that VCBS-like β-propellers form a supercluster with integrin-like β-propellers and tachylectins, clearly delimited from the superclusters formed by WD40 and Asp-box β-propellers. Connections between the three superclusters are made mainly through PQQ-like β-propellers. Our results present a new, greatly expanded view of the β-propeller classification landscape. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
19
|
Design of novel granulopoietic proteins by topological rescaffolding. PLoS Biol 2020; 18:e3000919. [PMID: 33351791 PMCID: PMC7755208 DOI: 10.1371/journal.pbio.3000919] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/02/2019] [Accepted: 11/24/2020] [Indexed: 11/18/2022]
Abstract
Computational protein design is rapidly becoming more powerful, and improving the accuracy of computational methods would greatly streamline protein engineering by eliminating the need for empirical optimization in the laboratory. In this work, we set out to design novel granulopoietic agents using a rescaffolding strategy with the goal of achieving simpler and more stable proteins. All of the 4 experimentally tested designs were folded, monomeric, and stable, while the 2 determined structures agreed with the design models within less than 2.5 Å. Despite the lack of significant topological or sequence similarity to their natural granulopoietic counterpart, 2 designs bound to the granulocyte colony-stimulating factor (G-CSF) receptor and exhibited potent, but delayed, in vitro proliferative activity in a G-CSF-dependent cell line. Interestingly, the designs also induced proliferation and differentiation of primary human hematopoietic stem cells into mature granulocytes, highlighting the utility of our approach to develop highly active therapeutic leads purely based on computational design. De novo designed cytokines that activate the G-CSF receptor show that the receptor-binding information can be encoded onto stable, miniaturised protein scaffolds that possess potent granulopoietic activity; such novel proteins provide for ideal candidates for protein-based therapeutics.
|
20
|
|
21
|
A secreted fungal histidine- and alanine-rich protein regulates metal ion homeostasis and oxidative stress. THE NEW PHYTOLOGIST 2020; 227:1174-1188. [PMID: 32285459 DOI: 10.1111/nph.16606] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 03/05/2020] [Accepted: 04/01/2020] [Indexed: 05/22/2023]
Abstract
Like pathogens, beneficial endophytic fungi secrete effector proteins to promote plant colonization, for example, through perturbation of host immunity. The genome of the root endophyte Serendipita indica encodes a novel family of highly similar, small alanine- and histidine-rich proteins, whose functions remain unknown. Members of this protein family carry an N-terminal signal peptide and a conserved C-terminal DELD motif. Here we report on the functional characterization of the plant-responsive DELD family protein Dld1 using a combination of structural, biochemical, biophysical and cytological analyses. The crystal structure of Dld1 shows an unusual, monomeric histidine zipper consisting of two antiparallel coiled-coil helices. Similar to other histidine-rich proteins, Dld1 displays varying affinity to different transition metal ions and undergoes metal ion- and pH-dependent unfolding. Transient expression of mCherry-tagged Dld1 in barley leaf and root tissue suggests that Dld1 localizes to the plant cell wall and accumulates at cell wall appositions during fungal penetration. Moreover, recombinant Dld1 enhances barley root colonization by S. indica, and inhibits H2O2-mediated radical polymerization of 3,3'-diaminobenzidine. Our data suggest that Dld1 has the potential to enhance micronutrient accessibility for the fungus and to interfere with oxidative stress and reactive oxygen species homeostasis to facilitate host colonization.
|
22
|
Histones predate the split between bacteria and archaea. Bioinformatics 2020; 35:2349-2353. [PMID: 30520969 DOI: 10.1093/bioinformatics/bty1000] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 07/27/2018] [Revised: 11/27/2018] [Accepted: 12/05/2018] [Indexed: 02/02/2023]
Abstract
MOTIVATION: Histones form octameric complexes called nucleosomes, which organize the genomic DNA of eukaryotes into chromatin. Each nucleosome comprises two copies each of the histones H2A, H2B, H3 and H4, which share a common ancestry. Although histones were initially thought to be a eukaryotic innovation, the subsequent identification of archaeal homologs led to the notion that histones emerged before the divergence of archaea and eukaryotes. RESULTS: Here, we report the detection and classification of two new groups of histone homologs, which are present in both archaea and bacteria. Proteins in one group consist of two histone subunits welded into single-chain pseudodimers, whereas in the other they resemble eukaryotic core histone subunits and show sequence patterns characteristic of DNA binding. The sequences come from a broad spectrum of deeply-branching lineages, excluding their genesis by horizontal gene transfer. Our results extend the origin of histones to the last universal common ancestor. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
|
23
|
Auto-regulation of Rab5 GEF activity in Rabex5 by allosteric structural changes, catalytic core dynamics and ubiquitin binding. eLife 2019; 8:46302. [PMID: 31718772 PMCID: PMC6855807 DOI: 10.7554/elife.46302] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.6] [Received: 02/21/2019] [Accepted: 10/22/2019] [Indexed: 02/06/2023]
Abstract
Intracellular trafficking depends on the function of Rab GTPases, whose activation is regulated by guanine nucleotide exchange factors (GEFs). The Rab5 GEF, Rabex5, was previously proposed to be auto-inhibited by its C-terminus. Here, we studied full-length Rabex5 and Rabaptin5 proteins as well as domain deletion Rabex5 mutants using hydrogen deuterium exchange mass spectrometry. We generated a structural model of Rabex5, using chemical cross-linking mass spectrometry and integrative modeling techniques. By correlating structural changes with nucleotide exchange activity for each construct, we uncovered new auto-regulatory roles for the ubiquitin binding domains and the Linker connecting those domains to the catalytic core of Rabex5. We further provide evidence that enhanced dynamics in the catalytic core are linked to catalysis. Our results suggest a more complex auto-regulation mechanism than previously thought and imply that ubiquitin binding serves not only to position Rabex5 but also to control its Rab5 GEF activity through allosteric structural alterations.
|
24
|
The ancestral KH peptide at the root of a domain family with three different folds. Bioinformatics 2019; 34:3961-3965. [PMID: 29912332 DOI: 10.1093/bioinformatics/bty480] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 02/04/2018] [Accepted: 06/12/2018] [Indexed: 11/13/2022]
Abstract
Motivation: The direct ancestor of the DNA-protein world of today is considered to have been an RNA-peptide world, in which peptides were co-factors of RNA-mediated catalysis and replication. Evidence for these ancestral peptides, from which folded proteins evolved, can be derived even today from regions of local sequence similarity within globally dissimilar folds. One of these is the 45-residue motif common to both folds of the hnRNP K homology (KH) domain. Results: In a survey of KH domains, we found a third fold that contains the KH motif at its core. This corresponds to the Small Domain of bacterial Ribonucleases G/E and, like type I and type II KH domains, it cannot be related to the others by a single genetic event, providing further support for the KH motif as an ancestral peptide predating folded proteins. Supplementary information: Supplementary data are available at Bioinformatics online.
|
25
|
Structural diversity of oligomeric β-propellers with different numbers of identical blades. eLife 2019; 8:49853. [PMID: 31613220 PMCID: PMC6805158 DOI: 10.7554/elife.49853] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Received: 07/02/2019] [Accepted: 09/25/2019] [Indexed: 12/29/2022]
Abstract
β-Propellers arise through the amplification of a supersecondary structure element called a blade. This process produces toroids of between four and twelve repeats, which are almost always arranged sequentially in a single polypeptide chain. We found that new propellers evolve continuously by amplification from single blades. We therefore investigated whether such nascent propellers can fold as homo-oligomers before they have been fully amplified within a single chain. One- to six-bladed building blocks derived from two seven-bladed WD40 propellers yielded stable homo-oligomers with six to nine blades, depending on the size of the building block. High-resolution structures for tetramers of two blades, trimers of three blades, and dimers of four and five blades, respectively, show structurally diverse propellers and include a novel fold, highlighting the inherent flexibility of the WD40 blade. Our data support the hypothesis that subdomain-sized fragments can provide structural versatility in the evolution of new proteins.
|
26
|
The Origin of Mitochondria-Specific Outer Membrane β-Barrels from an Ancestral Bacterial Fragment. Genome Biol Evol 2018; 10:2759-2765. [PMID: 30265295 PMCID: PMC6193526 DOI: 10.1093/gbe/evy216] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Accepted: 09/27/2018] [Indexed: 12/26/2022]
Abstract
Outer membrane β-barrels (OMBBs) are toroidal arrays of antiparallel β-strands that span the outer membrane of Gram-negative bacteria and eukaryotic organelles. Although homologous, most families of bacterial OMBBs evolved through the independent amplification of an ancestral ββ-hairpin. In mitochondria, one family (SAM50) has a clear bacterial ancestry; the origin of the other family, consisting of 19-stranded OMBBs found only in mitochondria (MOMBBs), remains largely unclear. In a large-scale comparison of mitochondrial and bacterial OMBBs, we find evidence that the common ancestor of all MOMBBs emerged by the amplification of a double ββ-hairpin of bacterial origin, probably at the time of the Last Eukaryotic Common Ancestor. Thus, MOMBBs are indeed descended from bacterial OMBBs, but their fold formed independently in the proto-mitochondria, possibly in response to the need for a general-purpose polypeptide importer. This occurred by a process of amplification, despite the final fold having a prime number of strands.
|
27
|
Chemical Ligand Space of Cereblon. ACS Omega 2018; 3:11163-11171. [PMID: 31459225 PMCID: PMC6644994 DOI: 10.1021/acsomega.8b00959] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 05/10/2018] [Accepted: 08/31/2018] [Indexed: 05/20/2023]
Abstract
The protein cereblon serves as a substrate receptor of a ubiquitin ligase complex that can be tuned toward different target proteins by cereblon-binding agents. This approach to targeted protein degradation is exploited in different clinical settings and has sparked the development of a growing number of thalidomide derivatives. Here, we probe the chemical space of cereblon binding beyond such derivatives and work out a simple set of chemical requirements, delineating the metaclass of cereblon effectors. We report co-crystal structures for a diverse set of compounds, including commonly used pharmaceuticals, but also find that already minimalistic cereblon-binding moieties might exert teratogenic effects in zebrafish. Our results may guide the design of a post-thalidomide generation of therapeutic cereblon effectors and provide a framework for the circumvention of unintended cereblon binding by negative design for future pharmaceuticals.
|
28
|
Abstract
Designing proteins with novel folds remains a major challenge, as the biophysical properties of the target fold are not known a priori and no sequence profile exists to describe its features. Therefore, most computational design efforts so far have been directed toward creating proteins that recapitulate existing folds. Here we present a strategy centered upon the design of novel intramolecular interfaces that enables the construction of a target fold from a set of starting fragments. This strategy effectively reduces the amount of computational sampling necessary to achieve an optimal sequence, without compromising the level of topological control. The solenoid architecture has been a target of extensive protein design efforts, as it provides a highly modular platform of low topological complexity. However, none of the previous efforts have attempted to depart from the natural form, which is characterized by a uniformly handed superhelical architecture. Here we aimed to design a more complex platform, abolishing the superhelicity by introducing internally alternating handedness, resulting in a novel, corrugated architecture. We employed our interface-driven strategy, designing three proteins and confirming the design by solving the structure of two examples.
|
29
|
Adenylate cyclases: Receivers, transducers, and generators of signals. Cell Signal 2018; 46:135-144. [PMID: 29563061 DOI: 10.1016/j.cellsig.2018.03.002] [Citation(s) in RCA: 46] [Impact Index Per Article: 7.7] [Received: 02/07/2018] [Revised: 03/09/2018] [Accepted: 03/12/2018] [Indexed: 11/18/2022]
Abstract
Class III adenylate cyclases (ACs) are widespread signaling proteins, which translate diverse intracellular and extracellular stimuli into a uniform intracellular signal. They are typically composed of an N-terminal array of input domains and transducers, followed C-terminally by a catalytic domain, which, as a dimer, generates the second messenger cAMP. The input domains, which receive stimuli, and the transducers, which propagate the signals, are often found in other signaling proteins. The nature of stimuli and the regulatory mechanisms of ACs have been studied experimentally in only a few cases, and even in these, important questions remain open, such as whether eukaryotic ACs regulated by G protein-coupled receptors can also receive stimuli through their own membrane domains. Here we survey the current knowledge on regulation and intramolecular signal propagation in ACs and draw comparisons to other signaling proteins. We highlight the pivotal role of a recently identified cyclase-specific transducer element located N-terminally of many AC catalytic domains, suggesting an intramolecular signaling capacity.
|
30
|
A Completely Reimplemented MPI Bioinformatics Toolkit with a New HHpred Server at its Core. J Mol Biol 2017; 430:2237-2243. [PMID: 29258817 DOI: 10.1016/j.jmb.2017.12.007] [Citation(s) in RCA: 1454] [Impact Index Per Article: 207.7] [Received: 11/04/2017] [Revised: 12/10/2017] [Accepted: 12/11/2017] [Indexed: 12/12/2022]
Abstract
The MPI Bioinformatics Toolkit (https://toolkit.tuebingen.mpg.de) is a free, one-stop web service for protein bioinformatic analysis. It currently offers 34 interconnected external and in-house tools, whose functionality covers sequence similarity searching, alignment construction, detection of sequence features, structure prediction, and sequence classification. This breadth has made the Toolkit an important resource for experimental biology and for teaching bioinformatic inquiry. Recently, we replaced the first version of the Toolkit, which was released in 2005 and had served around 2.5 million queries, with an entirely new version, focusing on improved features for the comprehensive analysis of proteins, as well as on promoting teaching. For instance, our popular remote homology detection server, HHpred, now allows pairwise comparison of two sequences or alignments and offers additional profile HMMs for several model organisms and domain databases. Here, we introduce the new version of our Toolkit and its application to the analysis of proteins.
|
31
|
From ancestral peptides to designed proteins. Curr Opin Struct Biol 2017; 48:103-109. [PMID: 29195087 DOI: 10.1016/j.sbi.2017.11.006] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.9] [Received: 08/19/2017] [Accepted: 11/20/2017] [Indexed: 11/16/2022]
Abstract
The diversity of modern proteins arose through the combinatorial shuffling and differentiation of a limited number of autonomously folding domain prototypes, but the origin of these prototypes themselves has long remained poorly understood. In recent years, the proposal that they originated by repetition, accretion, and recombination from an ancestral set of peptides, which evolved as cofactors of RNA-based replication and catalysis, has gained wide acceptance, supported by the systematic identification of such ancestral peptides and the experimental recapitulation of the mechanisms by which they could have yielded the first folded proteins. Inspired by this evolutionary process, protein engineers have seized on design from pre-optimized peptide components as a powerful approach to generating proteins with novel topology and functionality.
|
32
|
Characterization of the CrbS/R Two-Component System in Pseudomonas fluorescens Reveals a New Set of Genes under Its Control and a DNA Motif Required for CrbR-Mediated Transcriptional Activation. Front Microbiol 2017; 8:2287. [PMID: 29250042 PMCID: PMC5715377 DOI: 10.3389/fmicb.2017.02287] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 07/19/2017] [Accepted: 11/06/2017] [Indexed: 01/18/2023]
Abstract
The CrbS/R system is a two-component signal transduction system that regulates acetate utilization in Vibrio cholerae, P. aeruginosa, and P. entomophila. CrbS is a hybrid histidine kinase that belongs to a recently identified family, in which the signaling domain is fused to an SLC5 solute symporter domain through a STAC domain. Upon activation by CrbS, CrbR activates transcription of the acs gene, which encodes an acetyl-CoA synthase (ACS), and the actP gene, which encodes an acetate/solute symporter. In this work, we characterized the CrbS/R system in Pseudomonas fluorescens SBW25. Through the quantitative proteome analysis of different mutants, we were able to identify a new set of genes under its control, which play an important role during growth on acetate. These results led us to the identification of a conserved DNA motif in the putative promoter region of acetate-utilization genes in the Gammaproteobacteria that is essential for the CrbR-mediated transcriptional activation of genes under acetate-utilizing conditions. Finally, we took advantage of the existence of a second SLC5-containing two-component signal transduction system in P. fluorescens, CbrA/B, to demonstrate that the activation of the response regulator by the histidine kinase is not dependent on substrate transport through the SLC5 domain.
|
33
|
Ribosomal proteins as documents of the transition from unstructured (poly)peptides to folded proteins. J Struct Biol 2017; 198:74-81. [DOI: 10.1016/j.jsb.2017.04.007] [Citation(s) in RCA: 34] [Impact Index Per Article: 4.9] [Received: 02/02/2017] [Revised: 04/23/2017] [Accepted: 04/24/2017] [Indexed: 11/16/2022]
|
34
|
Characterization of a novel signal transducer element intrinsic to class IIIa/b adenylate cyclases and guanylate cyclases. FEBS J 2017; 284:1204-1217. [PMID: 28222489 DOI: 10.1111/febs.14047] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 10/27/2016] [Revised: 01/09/2017] [Accepted: 02/17/2017] [Indexed: 11/28/2022]
Abstract
Adenylate cyclases (ACs) are signaling proteins that produce the second messenger cAMP. Class III ACs comprise four groups (class IIIa-d) of which class IIIa and IIIb ACs have been identified in bacteria and eukaryotes. Many class IIIa ACs are anchored to membranes via hexahelical domains. In eukaryotic ACs, membrane anchors are well conserved, suggesting that this region possesses important functional characteristics that are as yet unknown. To address this question, we replaced the hexahelical membrane anchor of the mycobacterial AC Rv1625c with the hexahelical quorum-sensing receptor from Legionella, LqsS. Using this chimera, we identified a novel 19-amino-acid cyclase transducer element (CTE) located N-terminally to the catalytic domain that links receptor stimulation to effector activation. Coupling of the receptor to the AC was possible at several positions distal to the membrane exit, resulting in stimulatory or inhibitory responses to the ligand Legionella autoinducer-1. In contrast, on the AC effector side functional coupling was only successful when starting with the CTE. Bioinformatics approaches established that distinct CTEs are widely present in class IIIa and IIIb ACs and in vertebrate guanylate cyclases. The data suggest that membrane-delimited receiver domains transduce regulatory signals to the downstream catalytic domains in an engineered AC model system. This may suggest a previously unknown mechanism for cellular cAMP regulation.
|
35
|
N@a and N@d: Oligomer and Partner Specification by Asparagine in Coiled-Coil Interfaces. ACS Chem Biol 2017; 12:528-538. [PMID: 28026921 DOI: 10.1021/acschembio.6b00935] [Citation(s) in RCA: 26] [Impact Index Per Article: 3.7] [Indexed: 11/28/2022]
Abstract
The α-helical coiled coil is one of the best-studied protein-protein interaction motifs. As a result, sequence-to-structure relationships are available for the prediction of natural coiled-coil sequences and the de novo design of new ones. However, coiled coils adopt a wide range of oligomeric states and topologies, and our understanding of the specification of these and the discrimination between them remains incomplete. Gaps in our knowledge assume more importance as coiled coils are used increasingly to construct biomimetic systems of higher complexity; for this, coiled-coil components need to be robust, orthogonal, and transferable between contexts. Here, we explore how the polar side chain asparagine (Asn, N) is tolerated within otherwise hydrophobic helix-helix interfaces of coiled coils. The long-held view is that Asn placed at certain sites of the coiled-coil sequence repeat selects one oligomer state over others, which is rationalized by the ability of the side chain to make hydrogen bonds, or interactions with chelated ions within the coiled-coil interior of the favored state. We test this with experiments on de novo peptide sequences traditionally considered as directing parallel dimers and trimers, and more widely through bioinformatics analysis of natural coiled-coil sequences and structures. We find that when located centrally, rather than near the termini of such coiled-coil sequences, Asn does exert the anticipated oligomer-specifying influence. However, outside of these bounds, Asn is observed less frequently in the natural sequences, and the synthetic peptides are hyperthermostable and lose oligomer-state specificity. These findings highlight that not all regions of coiled-coil repeat sequences are equivalent, and that care is needed when designing coiled-coil interfaces.
|
36
|
Coiled Coils - A Model System for the 21st Century. Trends Biochem Sci 2016; 42:130-140. [PMID: 27884598 DOI: 10.1016/j.tibs.2016.10.007] [Citation(s) in RCA: 109] [Impact Index Per Article: 13.6] [Received: 10/04/2016] [Accepted: 10/25/2016] [Indexed: 01/01/2023]
Abstract
α-Helical coiled coils were described more than 60 years ago as simple, repetitive structures mediating oligomerization and mechanical stability. Over the past 20 years, however, they have emerged as one of the most diverse protein folds in nature, enabling many biological functions beyond mechanical rigidity, such as membrane fusion, signal transduction, and solute transport. Despite this great diversity, their structures can be described by parametric equations, making them uniquely suited for rational protein design. Far from having been exhausted as a source of structural insight and a basis for functional engineering, coiled coils are poised to become even more important for protein science in the coming decades.
|
37
|
Origin of a folded repeat protein from an intrinsically disordered ancestor. eLife 2016; 5. [PMID: 27623012 PMCID: PMC5074805 DOI: 10.7554/elife.16761] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 04/08/2016] [Accepted: 09/09/2016] [Indexed: 01/03/2023]
Abstract
Repetitive proteins are thought to have arisen through the amplification of subdomain-sized peptides. Many of these originated in a non-repetitive context as cofactors of RNA-based replication and catalysis, and required the RNA to assume their active conformation. In search of the origins of one of the most widespread repeat protein families, the tetratricopeptide repeat (TPR), we identified several potential homologs of its repeated helical hairpin in non-repetitive proteins, including the putatively ancient ribosomal protein S20 (RPS20), which only becomes structured in the context of the ribosome. We evaluated the ability of the RPS20 hairpin to form a TPR fold by amplification and obtained structures identical to natural TPRs for variants with 2-5 point mutations per repeat. The mutations were neutral in the parent organism, suggesting that they could have been sampled in the course of evolution. TPRs could thus have plausibly arisen by amplification from an ancestral helical hairpin.
|
38
|
The MPI bioinformatics Toolkit as an integrative platform for advanced protein sequence and structure analysis. Nucleic Acids Res 2016; 44:W410-5. [PMID: 27131380 PMCID: PMC4987908 DOI: 10.1093/nar/gkw348] [Citation(s) in RCA: 292] [Impact Index Per Article: 36.5] [Received: 02/14/2016] [Accepted: 04/19/2016] [Indexed: 12/21/2022]
Abstract
The MPI Bioinformatics Toolkit (http://toolkit.tuebingen.mpg.de) is an open, interactive web service for comprehensive and collaborative protein bioinformatic analysis. It offers a wide array of interconnected, state-of-the-art bioinformatics tools to experts and non-experts alike, developed both externally (e.g. BLAST+, HMMER3, MUSCLE) and internally (e.g. HHpred, HHblits, PCOILS). While a beta version of the Toolkit was released 10 years ago, the current production-level release has been available since 2008 and has serviced more than 1.6 million external user queries. The usage of the Toolkit has continued to increase linearly over the years, reaching more than 400 000 queries in 2015. In fact, through the breadth of its tools and their tight interconnection, the Toolkit has become an excellent platform for experimental scientists as well as a useful resource for teaching bioinformatic inquiry to students in the life sciences. In this article, we report on the evolution of the Toolkit over the last ten years, focusing on the expansion of the tool repertoire (e.g. CS-BLAST, HHblits) and on infrastructural work needed to remain operative in a changing web environment.
|
39
|
Abstract
Coiled coils are the best-understood protein fold, as their backbone structure can uniquely be described by parametric equations. This level of understanding has allowed their manipulation in unprecedented detail. They do not seem a likely source of surprises, yet we describe here the unexpected formation of a new type of fiber by the simple insertion of two or six residues into the underlying heptad repeat of a parallel, trimeric coiled coil. These insertions strain the supercoil to the breaking point, causing the local formation of short β-strands, which move the path of the chain by 120° around the trimer axis. The result is an α/β coiled coil, which retains only one backbone hydrogen bond per repeat unit from the parent coiled coil. Our results show that a substantially novel backbone structure is possible within the allowed regions of the Ramachandran space with only minor mutations to a known fold.
|
40
|
Structural Basis for Toughness and Flexibility in the C-terminal Passenger Domain of an Acinetobacter Trimeric Autotransporter Adhesin. J Biol Chem 2015; 291:3705-24. [PMID: 26698633 DOI: 10.1074/jbc.m115.701698] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 11/04/2015] [Indexed: 11/06/2022]
Abstract
Trimeric autotransporter adhesins (TAAs) on the cell surface of Gram-negative pathogens mediate bacterial adhesion to host cells and extracellular matrix proteins. However, AtaA, a TAA in the nonpathogenic Acinetobacter sp. strain Tol 5, shows nonspecific high adhesiveness to abiotic material surfaces as well as to biotic surfaces. It consists of a passenger domain secreted by the C-terminal transmembrane anchor domain (TM), and the passenger domain contains an N-terminal head, N-terminal stalk, C-terminal head (Chead), and C-terminal stalk (Cstalk). The Chead-Cstalk-TM fragment, which is conserved in many Acinetobacter TAAs, has by itself the head-stalk-anchor architecture of a complete TAA. Here, we show the crystal structure of the Chead-Cstalk fragment, AtaA_C-terminal passenger domain (CPSD), providing the first view of several conserved TAA domains. The YadA-like head (Ylhead) of the fragment is capped by a unique structure (headCap), composed of three β-hairpins and a connector motif; it also contains a head insert motif (HIM1) before its last inner β-strand. The headCap, Ylhead, and HIM1 integrally form a stable Chead structure. Some of the major domains of the CPSD fragment are inherently flexible and provide bending sites for the fiber between segments whose toughness is ensured by topological chain exchange and hydrophobic core formation inside the trimer. Thus, although adherence assays using in-frame deletion mutants revealed that the characteristic adhesive sites of AtaA reside in its N-terminal part, the flexibility and toughness of the CPSD part provide the resilience that enables the adhesive properties of the full-length fiber across a wide range of conditions.
|
41
|
A vocabulary of ancient peptides at the origin of folded proteins. eLife 2015; 4:e09410. [PMID: 26653858 PMCID: PMC4739770 DOI: 10.7554/elife.09410] [Citation(s) in RCA: 145] [Impact Index Per Article: 16.1] [Received: 06/13/2015] [Accepted: 12/13/2015] [Indexed: 01/01/2023]
Abstract
The seemingly limitless diversity of proteins in nature arose from only a few thousand domain prototypes, but the origin of these themselves has remained unclear. We are pursuing the hypothesis that they arose by fusion and accretion from an ancestral set of peptides active as co-factors in RNA-dependent replication and catalysis. Should this be true, contemporary domains may still contain vestiges of such peptides, which could be reconstructed by a comparative approach in the same way in which ancient vocabularies have been reconstructed by the comparative study of modern languages. To test this, we compared domains representative of known folds and identified 40 fragments whose similarity is indicative of common descent, yet which occur in domains currently not thought to be homologous. These fragments are widespread in the most ancient folds and enriched for iron-sulfur- and nucleic acid-binding. We propose that they represent the observable remnants of a primordial RNA-peptide world.
|
42
|
Some of the most interesting CASP11 targets through the eyes of their authors. Proteins 2015; 84 Suppl 1:34-50. [PMID: 26473983 PMCID: PMC4834066 DOI: 10.1002/prot.24942] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 06/23/2015] [Revised: 09/17/2015] [Accepted: 10/11/2015] [Indexed: 11/17/2022]
Abstract
The Critical Assessment of protein Structure Prediction (CASP) experiment would not have been possible without the prediction targets provided by the experimental structural biology community. In this article, selected crystallographers providing targets for the CASP11 experiment discuss the functional and biological significance of the target proteins, highlight their most interesting structural features, and assess whether these features were correctly reproduced in the predictions submitted to CASP11. Proteins 2016; 84(Suppl 1):34–50. © 2015 The Authors. Proteins: Structure, Function, and Bioinformatics Published by Wiley Periodicals, Inc.
|
43
|
STAC--A New Domain Associated with Transmembrane Solute Transport and Two-Component Signal Transduction Systems. J Mol Biol 2015; 427:3327-3339. [PMID: 26321252 DOI: 10.1016/j.jmb.2015.08.017] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.6] [Received: 05/23/2015] [Revised: 08/07/2015] [Accepted: 08/19/2015] [Indexed: 01/17/2023]
Abstract
Transmembrane receptors are integral components of sensory pathways in prokaryotes. These receptors share a common dimeric architecture, consisting in its basic form of an N-terminal extracellular sensor, transmembrane helices, and an intracellular effector. As an exception, we have identified an archaeal receptor family, exemplified by Af1503 from Archaeoglobus fulgidus, that is C-terminally shortened, lacking a recognizable effector module. Instead, a HAMP domain forms the sole extension for signal transduction in the cytosol. Here, we examine the gene environment of Af1503-like receptors and find a frequent association with transmembrane transport proteins. Furthermore, we identify and define a closely associated new protein domain family, which we characterize structurally using Af1502 from A. fulgidus. Members of this family are found both as stand-alone proteins and as domains within extant receptors. In general, the latter appear as connectors between the solute carrier 5 (SLC5)-like transmembrane domains and two-component signal transduction (TCST) domains. This is seen, for example, in the histidine kinase CbrA, which is a global regulator of metabolism, virulence, and antibiotic resistance in Pseudomonads. We propose that this newly identified domain family mediates signal transduction in systems regulating transport processes and name it STAC, for SLC and TCST-Associated Component.
|
44
|
Abstract
Cereblon, a primary target of thalidomide and its derivatives, has been characterized structurally from both bacteria and animals. Especially well studied is the thalidomide binding domain, CULT, which shows an invariable structure across different organisms and in complex with different ligands. Here, based on a series of crystal structures of a bacterial representative, we reveal the conformational flexibility and structural dynamics of this domain. In particular, we follow the unfolding of large fractions of the domain upon release of thalidomide in the crystalline state. Our results imply that a third of the domain, including the thalidomide binding pocket, only folds upon ligand binding. We further characterize the structural effect of the C-terminal truncation resulting from the mental retardation-linked R419X nonsense mutation in vitro and offer a mechanistic hypothesis for its unresponsiveness to thalidomide. At 1.2 Å resolution, our data provide a view of thalidomide binding at atomic resolution.
|
45
|
|
46
|
Structure and evolution of N-domains in AAA metalloproteases. J Mol Biol 2015; 427:910-923. [PMID: 25576874 DOI: 10.1016/j.jmb.2014.12.024] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Received: 10/01/2014] [Revised: 12/01/2014] [Accepted: 12/29/2014] [Indexed: 10/24/2022]
Abstract
Metalloproteases of the AAA (ATPases associated with various cellular activities) family play a crucial role in protein quality control within the cytoplasmic membrane of bacteria and the inner membrane of eukaryotic organelles. These membrane-anchored hexameric enzymes are composed of an N-terminal domain with one or two transmembrane helices, a central AAA ATPase module, and a C-terminal Zn²⁺-dependent protease. While the latter two domains have been well studied, so far, little is known about the N-terminal regions. Here, in an extensive bioinformatic and structural analysis, we identified three major, non-homologous groups of N-domains in AAA metalloproteases. By far the largest one is the FtsH-like group of bacteria and eukaryotic organelles. The other two groups are specific to Yme1: one found in plants, fungi, and basal metazoans and the other one found exclusively in animals. Using NMR and crystallography, we determined the subunit structure and hexameric assembly of Escherichia coli FtsH-N, exhibiting an unusual α+β fold, and the conserved part of fungal Yme1-N from Saccharomyces cerevisiae, revealing a tetratricopeptide repeat fold. Our bioinformatic analysis showed that, uniquely among these proteins, the N-domain of Yme1 from the cnidarian Hydra vulgaris contains both the tetratricopeptide repeat region seen in basal metazoans and a region of homology to the N-domains of animals. Thus, it is a modern-day representative of an intermediate in the evolution of animal Yme1 from basal eukaryotic precursors.
|
47
|
The thalidomide-binding domain of cereblon defines the CULT domain family and is a new member of the β-tent fold. PLoS Comput Biol 2015; 11:e1004023. [PMID: 25569776 PMCID: PMC4287342 DOI: 10.1371/journal.pcbi.1004023] [Citation(s) in RCA: 31] [Impact Index Per Article: 3.4] [Received: 05/16/2014] [Accepted: 11/04/2014] [Indexed: 11/18/2022]
Abstract
Despite having caused one of the greatest medical catastrophes of the last century through its teratogenic side-effects, thalidomide continues to be an important agent in the treatment of leprosy and cancer. The protein cereblon, which forms an E3 ubiquitin ligase complex together with damaged DNA-binding protein 1 (DDB1) and cullin 4A, has been recently identified as a primary target of thalidomide and its C-terminal part as responsible for binding thalidomide within a domain carrying several invariant cysteine and tryptophan residues. This domain, which we name CULT (cereblon domain of unknown activity, binding cellular ligands and thalidomide), is also found in a family of secreted proteins from animals and in a family of bacterial proteins occurring primarily in δ-proteobacteria. Its nearest relatives are yippee, a highly conserved eukaryotic protein of unknown function, and Mis18, a protein involved in the priming of centromeres for recruitment of CENP-A. Searches for distant homologs point to an evolutionary relationship of CULT, yippee, and Mis18 to proteins sharing a common fold, which consists of two four-stranded β-meanders packing at a roughly right angle and coordinating a zinc ion at their apex. A β-hairpin inserted into the first β-meander extends across the bottom of the structure towards the C-terminal edge of the second β-meander, with which it forms a cradle-shaped binding site that is topologically conserved in all members of this fold. We name this the β-tent fold for the striking arrangement of its constituent β-sheets. The fold has internal pseudosymmetry, raising the possibility that it arose by duplication of a subdomain-sized fragment.
|
48
|
Abstract
Trimeric autotransporter adhesins (TAAs) are modular, highly repetitive outer membrane proteins that mediate adhesion to external surfaces in many Gram-negative bacteria. In recent years, several TAAs have been investigated in considerable detail, including at the structural level. However, the vast majority of putative TAAs in prokaryotic genomes remain poorly annotated, due to their sequence diversity and changeable domain architecture. In order to achieve an automated annotation of these proteins that is both detailed and accurate, we have taken a domain dictionary approach, in which we identify recurrent domains by sequence comparisons, produce bioinformatic descriptors for each domain type, and connect these to structural information where available. We implemented this approach in a web-based platform, daTAA, in 2008 and demonstrated its applicability by reconstructing the complete fiber structure of a TAA conserved in enterobacteria. Here we review current knowledge on the domain structure of TAAs.
|
49
|
Thalidomide mimics uridine binding to an aromatic cage in cereblon. J Struct Biol 2014; 188:225-32. [PMID: 25448889 DOI: 10.1016/j.jsb.2014.10.010] [Citation(s) in RCA: 46] [Impact Index Per Article: 4.6] [Received: 07/29/2014] [Revised: 10/21/2014] [Accepted: 10/24/2014] [Indexed: 12/31/2022]
Abstract
Thalidomide and its derivatives lenalidomide and pomalidomide are important anticancer agents but can cause severe birth defects via an interaction with the protein cereblon. The ligand-binding domain of cereblon is found, with a high degree of conservation, in both bacteria and eukaryotes. Using a bacterial model system, we reveal the structural determinants of cereblon substrate recognition, based on a series of high-resolution crystal structures. For the first time, we identify a cellular ligand that is universally present: we show that thalidomide and its derivatives mimic and compete for the binding of uridine, and validate these findings in vivo. The nature of the binding pocket, an aromatic cage of three tryptophan residues, further suggests a role in the recognition of cationic ligands. Our results allow for general evaluation of pharmaceuticals for potential cereblon-dependent teratogenicity.
|
50
|
Axial helix rotation as a mechanism for signal regulation inferred from the crystallographic analysis of the E. coli serine chemoreceptor. J Struct Biol 2014; 186:349-56. [PMID: 24680785 DOI: 10.1016/j.jsb.2014.03.015] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Received: 12/12/2013] [Revised: 03/14/2014] [Accepted: 03/17/2014] [Indexed: 11/19/2022]
Abstract
Bacterial chemotaxis receptors are elongated homodimeric coiled-coil bundles, which transduce signals generated in an N-terminal sensor domain across 15-20 nm to a conserved C-terminal signaling subdomain. This signal transduction regulates the activity of associated kinases, altering the behavior of the flagellar motor and hence cell motility. Signaling is in turn modulated by selective methylation and demethylation of specific glutamate and glutamine residues in an adaptation subdomain. We have determined the structure of a chimeric protein, consisting of the HAMP domain from Archaeoglobus fulgidus Af1503 and the methyl-accepting domain of Escherichia coli Tsr. It shows a 21 nm coiled coil that alternates between two coiled-coil packing modes: canonical knobs-into-holes and complementary x-da, a variant form related to the canonical one by axial rotation of the helices. Comparison of the obtained structure to the Thermotoga maritima chemoreceptor TM1143 reveals that they adopt different axial rotation states in their adaptation subdomains. This conformational change is presumably induced by the upstream HAMP domain and may modulate the affinity of the chemoreceptor to the methylation-demethylation system. The presented findings extend the cogwheel model for signal transmission to chemoreceptors.
|