1
Kumar P, Petrenas R, Dawson WM, Schweke H, Levy ED, Woolfson DN. CC+: A searchable database of validated coiled coils in PDB structures and AlphaFold2 models. Protein Sci 2023; 32:e4789. [PMID: 37768271 PMCID: PMC10588367 DOI: 10.1002/pro.4789] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/16/2023] [Revised: 09/10/2023] [Accepted: 09/23/2023] [Indexed: 09/29/2023]
Abstract
α-Helical coiled coils are common tertiary and quaternary elements of protein structure. In coiled coils, two or more α helices wrap around each other to form bundles. This apparently simple structural motif can generate many architectures and topologies. Coiled coil-forming sequences can be predicted from heptad repeats of hydrophobic and polar residues, hpphppp, although this is not always reliable. Alternatively, coiled-coil structures can be identified using the program SOCKET, which finds knobs-into-holes (KIH) packing between side chains of neighboring helices. SOCKET also classifies coiled-coil architecture and topology, thus allowing sequence-to-structure relationships to be garnered. In 2009, we used SOCKET to create a relational database of coiled-coil structures, CC+, from the RCSB Protein Data Bank (PDB). Here, we report an update of CC+ following an update of SOCKET (to Socket2) and the recent explosion of structural data and the success of AlphaFold2 in predicting protein structures from genome sequences. With the most-stringent SOCKET parameters, CC+ contains ≈12,000 coiled-coil assemblies from experimentally determined structures, and ≈120,000 potential coiled-coil structures within single-chain models predicted by AlphaFold2 across 48 proteomes. CC+ allows these and other less-stringently defined coiled coils to be searched at various levels of structure, sequence, and side-chain interactions. The identified coiled coils can be viewed directly from CC+ using the Socket2 application, and their associated data can be downloaded for further analyses. CC+ is available freely at http://coiledcoils.chm.bris.ac.uk/CCPlus/Home.html. It will be updated automatically. We envisage that CC+ could be used to understand coiled-coil assemblies and their sequence-to-structure relationships, and to aid protein design and engineering.
Affiliation(s)
- Prasun Kumar
- School of Chemistry, University of Bristol, Bristol, UK
- Hugo Schweke
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Emmanuel D. Levy
- Department of Chemical and Structural Biology, Weizmann Institute of Science, Rehovot, Israel
- Derek N. Woolfson
- School of Chemistry, University of Bristol, Bristol, UK
- School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol, UK
- Bristol BioDesign Institute, School of Chemistry, University of Bristol, Bristol, UK
2
Woolfson DN. Understanding a protein fold: the physics, chemistry, and biology of α-helical coiled coils. J Biol Chem 2023; 299:104579. [PMID: 36871758 PMCID: PMC10124910 DOI: 10.1016/j.jbc.2023.104579] [Citation(s) in RCA: 10] [Impact Index Per Article: 10.0] [Received: 01/03/2023] [Revised: 02/25/2023] [Accepted: 02/27/2023] [Indexed: 03/07/2023] Open
Abstract
Protein science is being transformed by powerful computational methods for structure prediction and design: AlphaFold2 can predict many natural protein structures from sequence, and other AI methods are enabling the de novo design of new structures. This raises a question: how much do we understand the underlying sequence-to-structure/function relationships being captured by these methods? This perspective presents our current understanding of one class of protein assembly, the α-helical coiled coils. At first sight, these are straightforward: sequence repeats of hydrophobic (h) and polar (p) residues, (hpphppp)n, direct the folding and assembly of amphipathic α helices into bundles. However, many different bundles are possible: they can have two or more helices (different oligomers); the helices can have parallel, antiparallel or mixed arrangements (different topologies); and the helical sequences can be the same (homomers) or different (heteromers). Thus, sequence-to-structure relationships must be present within the hpphppp repeats to distinguish these states. I discuss the current understanding of this problem at three levels: First, physics gives a parametric framework to generate the many possible coiled-coil backbone structures. Second, chemistry provides a means to explore and deliver sequence-to-structure relationships. Third, biology shows how coiled coils are adapted and functionalized in nature, inspiring applications of coiled coils in synthetic biology. I argue that the chemistry is largely understood; the physics is partly solved, though the considerable challenge of predicting even relative stabilities of different coiled-coil states remains; but there is much more to explore in the biology and synthetic biology of coiled coils.
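As a rough illustration of the (hpphppp)n repeat described in this abstract, the sketch below scores how well a sequence matches the heptad pattern in each of the seven possible frames. The hydrophobic residue set, the function names, and the example peptide are illustrative assumptions and are not taken from the cited work; a simple pattern match like this does not distinguish oligomer state or topology, and it is not a substitute for dedicated predictors or structure-based tools.

```python
# Toy heptad-pattern scorer. The residue classification is an assumption
# made for illustration only.
HYDROPHOBIC = set("AILMFVWY")
HEPTAD = "hpphppp"  # h = hydrophobic, p = polar


def to_hp(sequence):
    """Map each residue to 'h' (hydrophobic) or 'p' (anything else)."""
    return "".join("h" if aa in HYDROPHOBIC else "p" for aa in sequence.upper())


def frame_scores(sequence):
    """Fraction of residues matching hpphppp for each of the 7 frame offsets."""
    hp = to_hp(sequence)
    if not hp:
        raise ValueError("sequence must not be empty")
    scores = []
    for offset in range(7):
        matches = sum(
            1 for i, c in enumerate(hp) if c == HEPTAD[(i + offset) % 7]
        )
        scores.append(matches / len(hp))
    return scores


def best_frame(sequence):
    """Return (offset, score) for the best-matching frame (first one on ties)."""
    scores = frame_scores(sequence)
    offset = max(range(7), key=scores.__getitem__)
    return offset, scores[offset]


if __name__ == "__main__":
    peptide = "LKKLEEE" * 4  # hypothetical sequence built to fit the pattern
    print(to_hp(peptide)[:7], best_frame(peptide))
```

A sequence that fits the repeat perfectly scores 1.0 in one frame and lower in the others; natural sequences would typically give intermediate values in every frame.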
Affiliation(s)
- Derek N Woolfson
- School of Chemistry, University of Bristol, Bristol, United Kingdom; School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol, United Kingdom; BrisEngBio, School of Chemistry, University of Bristol, Bristol, United Kingdom; Max Planck-Bristol Centre for Minimal Biology, University of Bristol, Bristol, United Kingdom.
3
Kardas S, Fossépré M, Lemaur V, Fernandes AE, Glinel K, Jonas AM, Surin M. Revealing the Organization of Catalytic Sequence-Defined Oligomers via Combined Molecular Dynamics Simulations and Network Analysis. J Chem Inf Model 2022; 62:2761-2770. [PMID: 35608867 DOI: 10.1021/acs.jcim.2c00101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Similar to biological macromolecules such as DNA and proteins, the precise control over the monomer position in sequence-defined polymers is of paramount importance for tuning their structures and properties toward achieving specific functions. Here, we apply molecular network analysis on three-dimensional structures issued from molecular dynamics simulations to decipher how the chain organization of trifunctional catalytic oligomers is influenced by the oligomer sequence and the length of oligo(ethylene oxide) spacers. Our findings demonstrate that the tuning of their primary structures is crucial for favoring cooperative interactions between the catalytic units and thus higher catalytic activities. This combined approach can assist in establishing structure-property relationships, leading to a more rational design of sequence-defined catalytic oligomers via computational chemistry.
Affiliation(s)
- Sinan Kardas
- Laboratory for Chemistry of Novel Materials, Center of Innovation and Research in Materials and Polymers, University of Mons-UMONS, Place du Parc 20, Mons B-7000, Belgium; Institute for Complex Molecular Systems, Eindhoven University of Technology-TU/e, P.O. Box 513, Eindhoven 5600 MB, The Netherlands
- Mathieu Fossépré
- Laboratory for Chemistry of Novel Materials, Center of Innovation and Research in Materials and Polymers, University of Mons-UMONS, Place du Parc 20, Mons B-7000, Belgium
- Vincent Lemaur
- Laboratory for Chemistry of Novel Materials, Center of Innovation and Research in Materials and Polymers, University of Mons-UMONS, Place du Parc 20, Mons B-7000, Belgium
- Antony E Fernandes
- Institute of Condensed Matter and Nanosciences, Bio- and Soft Matter, Université catholique de Louvain-UCLouvain, Louvain-la-Neuve B-1348, Belgium; Certech, Rue Jules Bordet 45, Zone Industrielle C, Seneffe B-7180, Belgium
- Karine Glinel
- Institute of Condensed Matter and Nanosciences, Bio- and Soft Matter, Université catholique de Louvain-UCLouvain, Louvain-la-Neuve B-1348, Belgium
- Alain M Jonas
- Institute of Condensed Matter and Nanosciences, Bio- and Soft Matter, Université catholique de Louvain-UCLouvain, Louvain-la-Neuve B-1348, Belgium
- Mathieu Surin
- Laboratory for Chemistry of Novel Materials, Center of Innovation and Research in Materials and Polymers, University of Mons-UMONS, Place du Parc 20, Mons B-7000, Belgium
4
Feng SH, Xia CQ, Shen HB. CoCoPRED: coiled-coil protein structural feature prediction from amino acid sequence using deep neural networks. Bioinformatics 2022; 38:720-729. [PMID: 34718416 DOI: 10.1093/bioinformatics/btab744] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 10/08/2021] [Accepted: 10/27/2021] [Indexed: 02/03/2023] Open
Abstract
MOTIVATION Coiled-coil is composed of two or more helices that are wound around each other. It widely exists in proteins and has been discovered to play a variety of critical roles in biology processes. Generally, there are three types of structural features in coiled-coil: coiled-coil domain (CCD), oligomeric state and register. However, most of the existing computational tools only focus on one of them. RESULTS Here, we describe a new deep learning model, CoCoPRED, which is based on convolutional layers, bidirectional long short-term memory, and attention mechanism. It has three networks, i.e. CCD network, oligomeric state network, and register network, corresponding to the three types of structural features in coiled-coil. This means CoCoPRED has the ability of fulfilling comprehensive prediction for coiled-coil proteins. Through the 5-fold cross-validation experiment, we demonstrate that CoCoPRED can achieve better performance than the state-of-the-art models on both CCD prediction and oligomeric state prediction. Further analysis suggests the CCD prediction may be a performance indicator of the oligomeric state prediction in CoCoPRED. The attention heads in CoCoPRED indicate that registers a, b and e are more crucial for the oligomeric state prediction. AVAILABILITY AND IMPLEMENTATION CoCoPRED is available at http://www.csbio.sjtu.edu.cn/bioinf/CoCoPRED. The datasets used in this research can also be downloaded from the website. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Shi-Hao Feng
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Chun-Qiu Xia
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China
- Hong-Bin Shen
- Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai 200240, China; Department of Computer Science, Shanghai Jiao Tong University, Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai 200240, China
5
Khalife S, Malliavin T, Liberti L. Secondary structure assignment of proteins in the absence of sequence information. Bioinformatics Advances 2021; 1:vbab038. [PMID: 36700087 PMCID: PMC9710659 DOI: 10.1093/bioadv/vbab038] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Revised: 10/29/2021] [Accepted: 11/23/2021] [Indexed: 01/28/2023]
Abstract
Motivation The structure of proteins is organized in a hierarchy among which the secondary structure elements, α-helix, β-strand and loop, are the basic bricks. The determination of secondary structure elements usually requires the knowledge of the whole structure. Nevertheless, in numerous experimental circumstances, the protein structure is partially known. The detection of secondary structures from these partial structures is hampered by the lack of information about connecting residues along the primary sequence. Results We introduce a new methodology to estimate the secondary structure elements from the values of local distances and angles between the protein atoms. Our method uses a message passing neural network, named Sequoia, which allows the automatic prediction of secondary structure elements from the values of local distances and angles between the protein atoms. This neural network takes as input the topology of the given protein graph, where the vertices are protein residues, and the edges are weighted by values of distances and pseudo-dihedral angles generalizing the backbone angles ϕ and ψ. Any pair of residues, independently of its covalent bonds along the primary sequence of the protein, is tagged with this distance and angle information. Sequoia permits the automatic detection of the secondary structure elements, with an F1-score larger than 80% for most of the cases, when α helices and β strands are predicted. In contrast to the approaches classically used in structural biology, such as DSSP, Sequoia is able to capture the variations of geometry at the interface of adjacent secondary structure element. Due to its general modeling frame, Sequoia is able to handle graphs containing only Cα atoms, which is particularly useful on low resolution structural input and in the frame of electron microscopy development. Availability and implementation Sequoia source code can be found at https://github.com/Khalife/Sequoia with additional documentation.
Supplementary information Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Leo Liberti
- LIX, CNRS, Ecole Polytechnique, Institut Polytechnique de Paris, Palaiseau 91128, France
6
Kumar P, Woolfson DN. Socket2: A Program for Locating, Visualising, and Analysing Coiled-coil Interfaces in Protein Structures. Bioinformatics 2021; 37:4575-4577. [PMID: 34498035 PMCID: PMC8652024 DOI: 10.1093/bioinformatics/btab631] [Citation(s) in RCA: 17] [Impact Index Per Article: 5.7] [Received: 06/14/2021] [Revised: 06/14/2021] [Accepted: 08/24/2021] [Indexed: 12/03/2022] Open
Abstract
Motivation Protein–protein interactions are central to all biological processes. One frequently observed mode of such interactions is the α-helical coiled coil (CC). Thus, an ability to extract, visualize and analyze CC interfaces quickly and without expert guidance would facilitate a wide range of biological research. In 2001, we reported Socket, which locates and characterizes CCs in protein structures based on the knobs-into-holes (KIH) packing between helices in CCs. Since then, studies of natural and de novo designed CCs have boomed, and the number of CCs in the RCSB PDB has increased rapidly. Therefore, we have updated Socket and made it accessible to expert and nonexpert users alike. Results The original Socket only classified CCs with up to six helices. Here, we report Socket2, which rectifies this oversight to identify CCs with any number of helices, and KIH interfaces with any of the 20 proteinogenic residues or incorporating nonnatural amino acids. In addition, we have developed a new and easy-to-use web server with additional features. These include the use of NGL Viewer for instantly visualizing CCs, and tabs for viewing the sequence repeats, helix-packing angles and core-packing geometries of CCs identified and calculated by Socket2. Availability and implementation Socket2 has been tested on all modern browsers. It can be accessed freely at http://coiledcoils.chm.bris.ac.uk/socket2/home.html. The source code is distributed using an MIT licence and available to download under the Downloads tab of the Socket2 home page.
Affiliation(s)
- Prasun Kumar
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom
- Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, United Kingdom; School of Biochemistry, University of Bristol, Medical Sciences Building, University Walk, Bristol BS8 1TD, United Kingdom; Bristol BioDesign Institute, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, United Kingdom
7
Simm D, Hatje K, Waack S, Kollmar M. Critical assessment of coiled-coil predictions based on protein structure data. Sci Rep 2021; 11:12439. [PMID: 34127723 PMCID: PMC8203680 DOI: 10.1038/s41598-021-91886-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/03/2021] [Accepted: 05/28/2021] [Indexed: 02/05/2023] Open
Abstract
Coiled-coil regions were among the first protein motifs described structurally and theoretically. The simplicity of the motif promises that coiled-coil regions can be detected with reasonable accuracy and precision in any protein sequence. Here, we re-evaluated the most commonly used coiled-coil prediction tools with respect to the most comprehensive reference data set available, the entire Protein Data Bank, down to each amino acid and its secondary structure. Apart from the 30-fold difference in minimum and maximum number of coiled coils predicted the tools strongly vary in where they predict coiled-coil regions. Accordingly, there is a high number of false predictions and missed, true coiled-coil regions. The evaluation of the binary classification metrics in comparison with naïve coin-flip models and the calculation of the Matthews correlation coefficient, the most reliable performance metric for imbalanced data sets, suggests that the tested tools' performance is close to random. This implicates that the tools' predictions have only limited informative value. Coiled-coil predictions are often used to interpret biochemical data and are part of in-silico functional genome annotation. Our results indicate that these predictions should be treated very cautiously and need to be supported and validated by experimental evidence.
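The Matthews correlation coefficient mentioned in this abstract can be computed directly from a binary confusion matrix. The sketch below is a minimal illustration of why it is preferred over accuracy for imbalanced data; the counts are made-up numbers chosen to mimic a data set in which few residues are coiled coil, and they are not results from the cited study.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient for a binary confusion matrix."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # conventional value when a row or column of the matrix is empty
    return (tp * tn - fp * fn) / denom


def accuracy(tp, fp, tn, fn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + tn + fn)


if __name__ == "__main__":
    # Hypothetical per-residue counts: 5% of residues are true coiled coil.
    tp, fn = 10, 40    # coiled-coil residues found / missed
    fp, tn = 90, 860   # other residues wrongly flagged / correctly ignored
    print(f"accuracy = {accuracy(tp, fp, tn, fn):.3f}")
    print(f"MCC      = {mcc(tp, fp, tn, fn):.3f}")
```

With these invented counts the accuracy is high because the negative class dominates, while the MCC stays close to zero, which is the behaviour expected of a near-random predictor.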
Affiliation(s)
- Dominic Simm
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany (grid.418140.8, 0000 0001 2104 4211); Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany (grid.7450.6, 0000 0001 2364 4210)
- Klas Hatje
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany (grid.418140.8, 0000 0001 2104 4211); Present Address: Roche Pharmaceutical Research and Early Development, Pharmaceutical Sciences, Roche Innovation Center Basel, F. Hoffmann-La Roche Ltd., Basel, Switzerland (grid.417570.0, 0000 0004 0374 1269)
- Stephan Waack
- Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany (grid.7450.6, 0000 0001 2364 4210)
- Martin Kollmar
- Group Systems Biology of Motor Proteins, Department of NMR-Based Structural Biology, Max-Planck-Institute for Biophysical Chemistry, Göttingen, Germany (grid.418140.8, 0000 0001 2104 4211); Theoretical Computer Science and Algorithmic Methods, Institute of Computer Science, Georg-August-University Göttingen, Göttingen, Germany (grid.7450.6, 0000 0001 2364 4210)
8
Foutch D, Pham B, Shen T. Protein conformational switch discerned via network centrality properties. Comput Struct Biotechnol J 2021; 19:3599-3608. [PMID: 34257839 PMCID: PMC8246261 DOI: 10.1016/j.csbj.2021.06.004] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 03/02/2021] [Revised: 06/01/2021] [Accepted: 06/02/2021] [Indexed: 11/17/2022] Open
Abstract
Network analysis has emerged as a powerful tool for examining structural biology systems. The spatial organization of the components of a biomolecular structure has been rendered as a graph representation and analyses have been performed to deduce the biophysical and mechanistic properties of these components. For proteins, the analysis of protein structure networks (PSNs), especially via network centrality measurements and cluster coefficients, has led to identifying amino acid residues that play key functional roles and classifying amino acid residues in general. Whether these network properties examined in various studies are sensitive to subtle (yet biologically significant) conformational changes remained to be addressed. Here, we focused on four types of network centrality properties (betweenness, closeness, degree, and eigenvector centralities) for conformational changes upon ligand binding of a sensor protein (constitutive androstane receptor) and an allosteric enzyme (ribonucleotide reductase). We found that eigenvector centrality is sensitive and can distinguish salient structural features between protein conformational states while other centrality measures, especially closeness centrality, are less sensitive and rather generic with respect to the structural specificity. We also demonstrated that an ensemble-informed, modified PSN with static edges removed (which we term PSN*) has enhanced sensitivity at discerning structural changes.
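To make the centrality measures named in this abstract concrete, the sketch below computes three of them (degree, closeness, and eigenvector centrality) on a tiny undirected contact graph using only the standard library; betweenness is omitted for brevity. The graph, node labels, and function names are hypothetical, and this is not the PSN or PSN* construction used by the authors.

```python
import math
from collections import deque


def degree_centrality(adj):
    """Number of neighbours divided by the maximum possible (n - 1)."""
    n = len(adj)
    return {v: len(nbrs) / (n - 1) for v, nbrs in adj.items()}


def closeness_centrality(adj):
    """(n - 1) / sum of shortest-path lengths; assumes a connected graph."""
    result = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source node
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        result[source] = (len(adj) - 1) / sum(dist.values())
    return result


def eigenvector_centrality(adj, iterations=500, tol=1e-12):
    """Power iteration on (A + I), normalised to unit Euclidean length.

    Adding the identity leaves the eigenvectors unchanged but avoids
    oscillation on bipartite graphs.
    """
    x = {v: 1.0 for v in adj}
    for _ in range(iterations):
        new = {v: x[v] + sum(x[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(val * val for val in new.values()))
        new = {v: val / norm for v, val in new.items()}
        if max(abs(new[v] - x[v]) for v in adj) < tol:
            return new
        x = new
    return x


# Hypothetical residue-contact graph: nodes are residues, edges are contacts.
CONTACTS = {
    "A": {"B"},
    "B": {"A", "C", "D"},
    "C": {"B", "D"},
    "D": {"B", "C", "E"},
    "E": {"D"},
}

if __name__ == "__main__":
    for name, fn in [("degree", degree_centrality),
                     ("closeness", closeness_centrality),
                     ("eigenvector", eigenvector_centrality)]:
        print(name, {v: round(s, 3) for v, s in sorted(fn(CONTACTS).items())})
```

Comparing the same measure across two conformational states would require building one such graph per state; how edges are defined and thresholded is a modelling choice that this sketch does not address.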
Affiliation(s)
- David Foutch
- Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN 37232, USA
- Bill Pham
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Tongye Shen
- Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA; UT-ORNL Center for Molecular Biophysics, Oak Ridge National Laboratory, Oak Ridge, TN 37830, USA
9
Dawson WM, Martin FJO, Rhys GG, Shelley KL, Brady RL, Woolfson DN. Coiled coils 9-to-5: rational de novo design of α-helical barrels with tunable oligomeric states. Chem Sci 2021; 12:6923-6928. [PMID: 34745518 PMCID: PMC8503928 DOI: 10.1039/d1sc00460c] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 01/25/2021] [Accepted: 04/13/2021] [Indexed: 01/10/2023] Open
Abstract
The rational design of linear peptides that assemble controllably and predictably in water is challenging. Short sequences must encode unique target structures and avoid alternative states. However, the non-covalent forces that stabilize and discriminate between states are weak. Nonetheless, for α-helical coiled-coil assemblies considerable progress has been made in rational de novo design. In these, sequence repeats of nominally hydrophobic (h) and polar (p) residues, hpphppp, direct the assembly of amphipathic helices into dimeric to tetrameric bundles. Expanding this pattern to hpphhph can produce larger α-helical barrels. Here, we show that pentameric to nonameric barrels are accessed by varying the residue at one of the h sites. In peptides with four L/I-K-E-I-A-x-Z repeats, decreasing the size of Z from threonine to serine to alanine to glycine gives progressively larger oligomers. X-ray crystal structures of the resulting α-helical barrels rationalize this: side chains at Z point directly into the helical interfaces, and smaller residues allow closer helix contacts and larger assemblies.
Affiliation(s)
- William M Dawson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
- Freddie J O Martin
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
- Guto G Rhys
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
- Department of Chemistry, University of Bayreuth, Universitätsstraße 30, 95447 Bayreuth, Germany
- Kathryn L Shelley
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, UK
- R Leo Brady
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, UK
- Derek N Woolfson
- School of Chemistry, University of Bristol, Cantock's Close, Bristol BS8 1TS, UK
- School of Biochemistry, University of Bristol, Biomedical Sciences Building, University Walk, Bristol BS8 1TD, UK
- Bristol BioDesign Institute, University of Bristol, Life Sciences Building, Tyndall Avenue, Bristol BS8 1TQ, UK
10
Szczepaniak K, Bukala A, da Silva Neto AM, Ludwiczak J, Dunin-Horkawicz S. A library of coiled-coil domains: from regular bundles to peculiar twists. Bioinformatics 2021; 36:5368-5376. [PMID: 33325494 PMCID: PMC8016460 DOI: 10.1093/bioinformatics/btaa1041] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/12/2020] [Revised: 10/30/2020] [Accepted: 12/07/2020] [Indexed: 11/30/2022] Open
Abstract
MOTIVATION Coiled coils are widespread protein domains involved in diverse processes ranging from providing structural rigidity to the transduction of conformational changes. They comprise two or more α-helices that are wound around each other to form a regular supercoiled bundle. Owing to this regularity, coiled-coil structures can be described with parametric equations, thus enabling the numerical representation of their properties, such as the degree and handedness of supercoiling, rotational state of the helices, and the offset between them. These descriptors are invaluable in understanding the function of coiled coils and designing new structures of this type. The existing tools for such calculations require manual preparation of input and are therefore not suitable for the high-throughput analyses. RESULTS To address this problem, we developed SamCC-Turbo, a software for fully automated, per-residue measurement of coiled coils. By surveying Protein Data Bank with SamCC-Turbo, we generated a comprehensive atlas of ∼50 000 coiled-coil regions. This machine learning-ready dataset features precise measurements as well as decomposes coiled-coil structures into fragments characterized by various degrees of supercoiling. The potential applications of SamCC-Turbo are exemplified by analyses in which we reveal general structural features of coiled coils involved in functions requiring conformational plasticity. Finally, we discuss further directions in the prediction and modeling of coiled coils. AVAILABILITY AND IMPLEMENTATION SamCC-Turbo is available as a web server (https://lbs.cent.uw.edu.pl/samcc_turbo) and as a Python library (https://github.com/labstructbioinf/samcc_turbo), whereas the results of the Protein Data Bank scan can be browsed and downloaded at https://lbs.cent.uw.edu.pl/ccdb. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Krzysztof Szczepaniak
- Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland
- Adriana Bukala
- Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland
- Antonio Marinho da Silva Neto
- Molecular Prospecting and Bioinformatics Group, Laboratory of Immunopathology Keizo Asami, Federal University of Pernambuco, 50670-901 Recife, Brazil
- Jan Ludwiczak
- Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland
- Laboratory of Bioinformatics, Nencki Institute of Experimental Biology, 02-093 Warsaw, Poland
- Stanislaw Dunin-Horkawicz
- Laboratory of Structural Bioinformatics, Centre of New Technologies, University of Warsaw, 02-097 Warsaw, Poland
11
Ng JF, Fraternali F. Understanding the structural details of APOBEC3-DNA interactions using graph-based representations. Curr Res Struct Biol 2020; 2:130-143. [PMID: 34235473 PMCID: PMC8244423 DOI: 10.1016/j.crstbi.2020.07.001] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/05/2020] [Revised: 07/17/2020] [Accepted: 07/21/2020] [Indexed: 12/22/2022] Open
Abstract
Human APOBEC3 (A3; apolipoprotein B mRNA editing catalytic polypeptide-like 3) is a family of seven enzymes involved in generating mutations in nascent reverse transcripts of many retroviruses, as well as the human genome in a range of cancer types. The structural details of the interaction between A3 proteins and DNA molecules are only available for a few family members. Here we use homology modelling techniques to address the difference in structural coverage of human A3 enzymes interacting with different DNA substrates. A3-DNA interfaces are represented as residue networks ("graphs"), based on which features at these interfaces are compared and quantified. We demonstrate that graph-based representations are effective in highlighting structural features of A3-DNA interfaces. By large-scale in silico mutagenesis of the bound DNA chain, we predicted the preference of substrate DNA sequence for multiple A3 domains. These data suggested that computational modelling approaches could contribute in the exploration of the structural basis for sequence specificity in A3 substrate selection, and demonstrated the utility of graph-based approaches in evaluating a large number of structural models generated in silico.
- APOBEC3(A3)-DNA structures have been resolved with modified deaminase domains.
- Structural modelling of interaction between wild-type A3 domains and DNA substrates.
- Graph-based representations reveal structural differences across A3-DNA interfaces.
- Using in silico mutagenesis we compared substrate preference of multiple A3 domains.
- Graph-based approaches can efficiently compare a large number of structural models.