1
Hafsa NE, Berjanskii MV, Arndt D, Wishart DS. Rapid and reliable protein structure determination via chemical shift threading. Journal of Biomolecular NMR 2018; 70:33-51. [PMID: 29196969 DOI: 10.1007/s10858-017-0154-1] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 08/15/2017] [Accepted: 11/14/2017] [Indexed: 06/07/2023]
Abstract
Protein structure determination using nuclear magnetic resonance (NMR) spectroscopy can be both time-consuming and labor-intensive. Here we demonstrate how chemical shift threading can permit rapid, robust, and accurate protein structure determination using only chemical shift data. Threading is a relatively old bioinformatics technique that uses a combination of sequence information and predicted (or experimentally acquired) low-resolution structural data to generate high-resolution 3D protein structures. The key motivations for using NMR chemical shifts in protein threading are that they are easy to measure, they are available prior to 3D structure determination, and they contain vital structural information. The method we have developed uses not only sequence and chemical shift similarity but also chemical shift-derived secondary structure, shift-derived super-secondary structure, and shift-derived accessible surface area to generate a high-quality protein structure regardless of the sequence similarity (or lack thereof) to a known structure already in the PDB. The method (called E-Thrifty) was found to be very fast (often < 10 min/structure) and to significantly outperform other shift-based or threading-based structure determination methods in terms of top template model accuracy, with an average TM-score of 0.68 (vs. 0.50-0.62 for other methods). Coupled with recent developments in chemical shift refinement, these results suggest that protein structure determination using only NMR chemical shifts is becoming increasingly practical and reliable. E-Thrifty is available as a web server at http://ethrifty.ca.
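The TM-score quoted above is a length-normalized measure of structural similarity between a model and a reference structure. As a rough illustration only (not code from E-Thrifty), the sketch below evaluates the standard TM-score sum for one fixed superposition, given per-residue distances in angstroms between aligned C-alpha atoms. The function names are hypothetical, the 0.5 Å floor on d0 for short chains is an assumption borrowed from common implementations, and the full TM-score additionally maximizes this quantity over superpositions, which is not done here.

```python
def tm_d0(target_length):
    """Length-dependent distance scale d0 (angstroms) used by the TM-score."""
    # Standard form: d0 = 1.24 * (L - 15)^(1/3) - 1.8.
    # Assumption: short chains are clamped to 0.5 A, as common tools do.
    if target_length <= 21:
        return 0.5
    return max(0.5, 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8)


def tm_score_for_superposition(distances, target_length):
    """TM-score sum for ONE given superposition.

    distances: distances (angstroms) between aligned residue pairs.
    target_length: number of residues in the reference (target) structure.
    Unaligned target residues contribute zero, so partial alignments
    are penalized by the normalization over target_length.
    """
    d0 = tm_d0(target_length)
    total = sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances)
    return total / target_length


if __name__ == "__main__":
    # Hypothetical example: 80 aligned pairs on a 100-residue target.
    example = [1.0] * 60 + [4.0] * 20
    print(round(tm_score_for_superposition(example, 100), 3))
```

Because the sum is divided by the target length rather than the number of aligned residues, a score near 1 requires both close agreement and near-complete coverage.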
Affiliation(s)
- Noor E Hafsa
- Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada
- Mark V Berjanskii
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- David Arndt
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada
- David S Wishart
- Department of Computing Science, University of Alberta, Edmonton, AB, T6G 2E8, Canada.
- Department of Biological Sciences, University of Alberta, Edmonton, AB, T6G 2E9, Canada.
2
Unraveling the meaning of chemical shifts in protein NMR. Biochimica et Biophysica Acta - Proteins and Proteomics 2017; 1865:1564-1576. [PMID: 28716441 DOI: 10.1016/j.bbapap.2017.07.005] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 03/22/2017] [Revised: 06/29/2017] [Accepted: 07/07/2017] [Indexed: 12/14/2022]
Abstract
Chemical shifts are among the most informative parameters in protein NMR. They provide a wealth of information about protein secondary and tertiary structure, protein flexibility, and protein-ligand binding. In this report, we review the progress in interpreting and utilizing protein chemical shifts that has occurred over the past 25 years, with a particular focus on the large body of work arising from our group and other Canadian NMR laboratories. More specifically, this review focuses on describing, assessing, and providing some historical context for various chemical shift-based methods to: (1) determine protein secondary and super-secondary structure; (2) derive protein torsion angles; (3) assess protein flexibility; (4) predict residue accessible surface area; (5) refine 3D protein structures; (6) determine 3D protein structures; and (7) characterize intrinsically disordered proteins. This review also briefly covers some of the methods that we previously developed to predict chemical shifts from 3D protein structures and/or protein sequence data. It is hoped that this review will help to increase awareness of the considerable utility of NMR chemical shifts in structural biology and facilitate more widespread adoption of chemical shift-based methods by NMR spectroscopists, structural biologists, protein biophysicists, and biochemists worldwide. This article is part of a Special Issue entitled: Biophysics in Canada, edited by Lewis Kay, John Baenziger, Albert Berghuis and Peter Tieleman.
3
4
Suhrer SJ, Gruber M, Wiederstein M, Sippl MJ. Effective techniques for protein structure mining. Methods Mol Biol 2012; 857:33-54. [PMID: 22323216 DOI: 10.1007/978-1-61779-588-6_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
Retrieval and characterization of protein structure relationships are instrumental in a wide range of tasks in structural biology. The classification of protein structures (COPS) is a web service that provides efficient access to structure and sequence similarities for all currently available protein structures. Here, we focus on the application of COPS to the problem of template selection in homology modeling.
Affiliation(s)
- Stefan J Suhrer
- Center of Applied Molecular Engineering, Division of Bioinformatics, University of Salzburg, Salzburg, Austria.
5
Varnay I, Truffault V, Djuranovic S, Ursinus A, Coles M, Kessler H. Optimized measurement temperature gives access to the solution structure of a 49 kDa homohexameric β-propeller. J Am Chem Soc 2011; 132:15692-8. [PMID: 20961124 DOI: 10.1021/ja1064608] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 11/30/2022]
Abstract
Ph1500 is a homohexameric, two-domain protein of unknown function from the hyperthermophilic archaeon Pyrococcus horikoshii. The C-terminal hexamerization domain (Ph1500C) is of particular interest, as it lacks sequence homology to proteins of known structure. However, it resisted crystallization for X-ray analysis, and proteins of this size (49 kDa) present a considerable challenge to NMR structure determination in solution. We solved the high-resolution structure of Ph1500C, exploiting the hyperthermophilic nature of the protein to minimize unfavorable relaxation properties by high-temperature measurement. Thus, the side chain assignment (97%) and structure determination became possible at full proton density. To our knowledge, Ph1500C is the largest protein for which this has been achieved. To minimize detrimental fast water exchange of amide protons at increased temperature, we employed a strategy where the temperature was optimized separately for backbone and side chain experiments.
Affiliation(s)
- Ilka Varnay
- Institute for Advanced Study and Center of Integrated Protein Science, Department Chemie, Technische Universität München, Lichtenbergstr. 4, 85747 Garching, Germany
6
Wishart DS. Interpreting protein chemical shift data. Progress in Nuclear Magnetic Resonance Spectroscopy 2011; 58:62-87. [PMID: 21241884 DOI: 10.1016/j.pnmrs.2010.07.004] [Citation(s) in RCA: 191] [Impact Index Per Article: 13.6] [Received: 06/14/2010] [Accepted: 07/29/2010] [Indexed: 05/12/2023]
Affiliation(s)
- David S Wishart
- Department of Biological Sciences, National Institute for Nanotechnology (NINT), Edmonton, AB, Canada T6G 2E8.
7
Ginzinger SW, Skocibusić M, Heun V. CheckShift improved: fast chemical shift reference correction with high accuracy. Journal of Biomolecular NMR 2009; 44:207-11. [PMID: 19575298 DOI: 10.1007/s10858-009-9330-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/22/2009] [Accepted: 05/27/2009] [Indexed: 05/20/2023]
Abstract
The construction of a consistent protein chemical shift database is an important step toward making more extensive use of this data in structural studies. Unfortunately, progress in this direction has been hampered by the quality of the available data, particularly with respect to chemical shift referencing, which is often either inaccurate or inconsistently annotated. Preprocessing of the data is therefore required to detect and correct referencing errors. In an earlier study we developed CheckShift, a program for performing this task automatically. We have now put substantial effort into improving the running time of the CheckShift algorithm, reducing it by 90% while achieving quality equivalent to that of the former version of CheckShift. The reason for the running time decrease is twofold. Firstly, we considerably improved the search for the optimal re-referencing offset. Secondly, as CheckShift is based on a secondary structure prediction from the amino acid sequence (formerly PsiPred was used), we evaluated a wide range of available secondary structure prediction programs, focusing on the special needs of the CheckShift algorithm. The results of this evaluation show empirically that we can use faster secondary structure prediction programs than PsiPred without sacrificing CheckShift's accuracy. Very recently, Wang and Markley (2009) gave a small list of extreme outliers of the former version of the CheckShift web server. Those were due to the empirical reduction of the search space implemented in the old version. The new version of CheckShift now gives very similar results to RefDB and LACS for all outliers mentioned in Table 1 of Wang and Markley (2009).
Affiliation(s)
- Simon W Ginzinger
- Department of Molecular Biology, Division of Bioinformatics, Center of Applied Molecular Engineering, University of Salzburg, Hellbrunnerstr. 34/3.OG, Salzburg 5020, Austria.
8
Ginzinger SW, Coles M. SimShiftDB; local conformational restraints derived from chemical shift similarity searches on a large synthetic database. Journal of Biomolecular NMR 2009; 43:179-85. [PMID: 19224375 PMCID: PMC2847166 DOI: 10.1007/s10858-009-9301-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 09/08/2008] [Accepted: 01/07/2009] [Indexed: 05/11/2023]
Abstract
We present SimShiftDB, a new program to extract conformational data from protein chemical shifts using structural alignments. The alignments are obtained in searches of a large database containing 13,000 structures and corresponding back-calculated chemical shifts. SimShiftDB makes use of chemical shift data to provide accurate results even in the case of low sequence similarity, and with even coverage of the conformational search space. We compare SimShiftDB to HHSearch, a state-of-the-art sequence-based search tool, and to TALOS, the current standard tool for the task. We show that for a significant fraction of the predicted similarities, SimShiftDB outperforms the other two methods. Particularly, the high coverage afforded by the larger database often allows predictions to be made for residues not involved in canonical secondary structure, where TALOS predictions are both less frequent and more error prone. Thus SimShiftDB can be seen as a complement to currently available methods.
Affiliation(s)
- Simon W. Ginzinger
- Department of Molecular Biology, Division of Bioinformatics, Center of Applied Molecular Engineering, University of Salzburg, Hellbrunnerstr. 34/3.OG, 5020 Salzburg, Austria
- Murray Coles
- Department of Protein Evolution, Max-Planck-Institute for Developmental Biology, Spemannstrasse 35, 72076 Tübingen, Germany
9
Ginzinger SW, Gerick F, Coles M, Heun V. CheckShift: automatic correction of inconsistent chemical shift referencing. Journal of Biomolecular NMR 2007; 39:223-7. [PMID: 17899394 DOI: 10.1007/s10858-007-9191-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Received: 04/24/2007] [Accepted: 08/07/2007] [Indexed: 05/17/2023]
Abstract
The construction of a consistent protein chemical shift database is an important step toward making more extensive use of this data in structural studies. Unfortunately, progress in this direction has been hampered by the quality of the available data, particularly with respect to chemical shift referencing, which is often either inaccurate or inconsistently annotated. Preprocessing of the data is therefore required to detect and correct referencing errors. We have developed a program for performing this task, based on the comparison of reported and expected chemical shift distributions. This program, named CheckShift, does not require additional data and is therefore applicable to data sets where structures are not available. Therefore CheckShift provides the possibility to re-reference chemical shifts prior to their use as structural constraints.
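To make the idea of comparing reported and expected chemical shift distributions concrete, here is a minimal sketch of one way such a re-referencing offset could be estimated. It is not the CheckShift algorithm: the grid search, the Kolmogorov-Smirnov-style distance, and the names `ks_distance` and `estimate_reference_offset` are all assumptions made for illustration, and the expected distribution is supplied here as a plain sample of reference shifts.

```python
from bisect import bisect_right


def ks_distance(sample_a, sample_b):
    """Largest gap between the empirical CDFs of two samples."""
    a = sorted(sample_a)
    b = sorted(sample_b)
    # The gap can only change at observed values, so check those points.
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def estimate_reference_offset(observed, expected, max_offset=5.0, step=0.05):
    """Grid-search the constant offset (ppm) that, when added to the
    observed shifts, best matches the expected distribution.

    Assumption: a referencing error acts as one constant offset for all
    shifts of a given nucleus type.
    """
    n_steps = int(round(max_offset / step))
    best_offset, best_dist = 0.0, float("inf")
    for k in range(-n_steps, n_steps + 1):
        offset = k * step
        shifted = [x + offset for x in observed]
        dist = ks_distance(shifted, expected)
        if dist < best_dist:
            best_offset, best_dist = offset, dist
    return best_offset


if __name__ == "__main__":
    # Hypothetical reference sample and a copy mis-referenced by -1.5 ppm.
    expected = [50.0 + 0.25 * i for i in range(40)]
    observed = [x - 1.5 for x in expected]
    print(estimate_reference_offset(observed, expected, max_offset=3.0, step=0.25))
```

A distribution-level comparison like this needs no 3D structure, which is the property the abstract highlights. A real tool would build the expected distribution from residue type and secondary structure rather than from a fixed sample.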
Affiliation(s)
- Simon W Ginzinger
- Institut für Informatik, Ludwig-Maximilians-Universität München, Amalienstrasse 17, Munich, Germany.
10
Latek D, Ekonomiuk D, Kolinski A. Protein structure prediction: combining de novo modeling with sparse experimental data. J Comput Chem 2007; 28:1668-76. [PMID: 17342709 DOI: 10.1002/jcc.20657] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Indexed: 11/09/2022]
Abstract
Routine structure prediction of new folds is still a challenging task for computational biology. The challenge lies not only in the proper determination of the overall fold but also in building models of acceptable resolution, useful for modeling drug interactions and protein-protein complexes. In this work we propose and test a comprehensive approach to protein structure modeling supported by sparse, and relatively easy to obtain, experimental data. We focus on chemical shift-based restraints from NMR, although other sparse restraints could be easily included. In particular, we demonstrate that combining the typical NMR software with artificial intelligence-based prediction of secondary structure significantly enhances the accuracy of the restraints for molecular modeling. The computational procedure is based on the reduced representation approach implemented in the CABS modeling software, which proved to be a versatile tool for protein structure prediction during the CASP (critical assessment of techniques for protein structure prediction) experiments (see http://predictioncenter/CASP6/org). The method is successfully tested on a small set of representative globular proteins of different size and topology, including the two CASP6 targets for which the required NMR data already exist. The method is implemented in a semi-automated pipeline applicable to large-scale structural annotation of genomic data. Here, we limit the computations to a relatively small set. This enabled, without loss of generality, a detailed discussion of the various factors determining the accuracy of the proposed approach to protein structure prediction.
Affiliation(s)
- Dorota Latek
- Faculty of Chemistry, Warsaw University, Pasteura 1, 02-093 Warsaw, Poland.