1
Poppe L, Hartman JJ, Romero A, Reagan JD. Structural and Thermodynamic Model for the Activation of Cardiac Troponin. Biochemistry 2022; 61:741-748. [PMID: 35349258 DOI: 10.1021/acs.biochem.2c00084]
Abstract
Cardiac troponin is a regulatory protein complex located on the sarcomere that regulates the engagement of myosin on actin filaments. Low-molecular-weight modulators of troponin that bind allosterically with the calcium ion have the potential to improve cardiac contractility in patients with reduced cardiac function. Here we propose an approach to the rational design of troponin modulators through the combined use of solution nuclear magnetic resonance and isothermal titration calorimetry methods. In contrast to traditional approaches limited to calcium- and activator-bound troponin structures, we analyzed the structural and thermodynamic impact of an activator in the context of the troponin functional cycle. This led us to propose a rationale for developing an efficacious troponin activator.
Affiliation(s)
- Leszek Poppe
  - Amgen, Inc., Thousand Oaks, California 91320, United States
- James J Hartman
  - Cytokinetics, Inc., South San Francisco, California 94080, United States
- Antonio Romero
  - Cytokinetics, Inc., South San Francisco, California 94080, United States
- Jeffrey D Reagan
  - Amgen, Inc., South San Francisco, California 94080, United States

2
Yeh L, Satterthwaite L, Patterson D. Automated, context-free assignment of asymmetric rotor microwave spectra. J Chem Phys 2019; 150:204122. [PMID: 31153211 DOI: 10.1063/1.5085794]
Abstract
We present a new algorithm, Robust Automated Assignment of Rigid Rotors (RAARR), for assigning rotational spectra of asymmetric tops. The RAARR algorithm can automatically assign experimental spectra under a broad range of conditions, including spectra comprised of multiple mixture components, in ≲100 s. The RAARR algorithm exploits constraints placed by the conservation of energy to find sets of connected lines in an unassigned spectrum. The highly constrained structure of these sets eliminates all but a handful of plausible assignments for a given set, greatly reducing the number of potential assignments that must be evaluated. We successfully apply our algorithm to automatically assign 15 experimental spectra, including 5 previously unassigned species, without prior estimation of molecular rotational constants. In 9 of the 15 cases, the RAARR algorithm successfully assigns two or more mixture components.
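One way to picture the energy-conservation constraint is the closed loop: when four lines connect four levels in a cycle, the two routes between the lowest and the highest level must add up to the same energy, so f1 + f2 = f3 + f4. The sketch below is a toy illustration of searching an unassigned line list for such loops. It is not the RAARR implementation; the function name `find_closed_loops`, the tolerance, and the example frequencies are assumptions made for illustration.

```python
from itertools import combinations


def find_closed_loops(freqs, tol=0.01):
    """Return quadruples (a, b, c, d) of distinct lines with a + b ~ c + d.

    Hypothetical helper: freqs is a list of line frequencies (same units),
    tol is the allowed mismatch between the two pair sums.
    """
    # Every pair of lines, keyed by the sum of its two frequencies.
    pairs = sorted((x + y, x, y) for x, y in combinations(sorted(freqs), 2))
    loops = []
    for i, (s1, a, b) in enumerate(pairs):
        j = i + 1
        # Pairs are sorted by sum, so matching sums sit next to each other.
        while j < len(pairs) and pairs[j][0] - s1 <= tol:
            _, c, d = pairs[j]
            if len({a, b, c, d}) == 4:  # a loop needs four different lines
                loops.append((a, b, c, d))
            j += 1
    return loops


# Four lines from levels at 0, 1000, 1700 and 3100 (arbitrary units),
# plus two unrelated lines that should not take part in any loop.
lines = [1000.0, 2100.0, 1700.0, 1400.0, 1234.5, 2890.2]
loops = find_closed_loops(lines)
```

Requiring loops to close discards most candidate line sets before any rotational constants are fitted, which is the sense in which such constraints shrink the search space.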
Affiliation(s)
- Lia Yeh
  - Department of Physics, University of California, Santa Barbara, California 93106, USA
- Lincoln Satterthwaite
  - Department of Physics, University of California, Santa Barbara, California 93106, USA
- David Patterson
  - Department of Physics, University of California, Santa Barbara, California 93106, USA

3
Smelter A, Rouchka EC, Moseley HNB. Detecting and accounting for multiple sources of positional variance in peak list registration analysis and spin system grouping. Journal of Biomolecular NMR 2017; 68:281-296. [PMID: 28815397 PMCID: PMC5587626 DOI: 10.1007/s10858-017-0126-5] [Received: 04/04/2017] [Accepted: 07/26/2017]
Abstract
Peak lists derived from nuclear magnetic resonance (NMR) spectra are commonly used as input data for a variety of computer assisted and automated analyses. These include automated protein resonance assignment and protein structure calculation software tools. Prior to these analyses, peak lists must be aligned to each other and sets of related peaks must be grouped based on common chemical shift dimensions. Even when programs can perform peak grouping, they require the user to provide uniform match tolerances or use default values. However, peak grouping is further complicated by multiple sources of variance in peak position limiting the effectiveness of grouping methods that utilize uniform match tolerances. In addition, no method currently exists for deriving peak positional variances from single peak lists for grouping peaks into spin systems, i.e. spin system grouping within a single peak list. Therefore, we developed a complementary pair of peak list registration analysis and spin system grouping algorithms designed to overcome these limitations. We have implemented these algorithms into an approach that can identify multiple dimension-specific positional variances that exist in a single peak list and group peaks from a single peak list into spin systems. The resulting software tools generate a variety of useful statistics on both a single peak list and pairwise peak list alignment, especially for quality assessment of peak list datasets. We used a range of low and high quality experimental solution NMR and solid-state NMR peak lists to assess performance of our registration analysis and grouping algorithms. Analyses show that an algorithm using a single iteration and uniform match tolerances approach is only able to recover from 50 to 80% of the spin systems due to the presence of multiple sources of variance. Our algorithm recovers additional spin systems by reevaluating match tolerances in multiple iterations. 
To facilitate evaluation of the algorithms, we developed a peak list simulator within our nmrstarlib package that generates user-defined assigned peak lists from a given BMRB entry or database of entries. In addition, over 100,000 simulated peak lists with one or two sources of variance were generated to evaluate the performance and robustness of these new registration analysis and peak grouping algorithms.
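The difference between uniform and dimension-specific match tolerances can be sketched in a few lines. The example below is an illustration under simplifying assumptions, not the authors' algorithm: it groups peaks greedily on two shared dimensions (here labelled H and N) and then re-estimates a tolerance per dimension from the spread inside the groups found so far. The function names, the 3-sigma rule, and the peak values are all hypothetical.

```python
from statistics import pstdev


def group_peaks(peaks, tol):
    """Greedy grouping: a peak joins the first group whose anchor (first)
    peak lies within the per-dimension tolerance in every dimension."""
    groups = []
    for peak in peaks:
        for group in groups:
            anchor = group[0]
            if all(abs(p - a) <= t for p, a, t in zip(peak, anchor, tol)):
                group.append(peak)
                break
        else:
            groups.append([peak])  # no group matched: start a new one
    return groups


def estimate_tolerances(groups, n_sigma=3.0, floor=1e-3):
    """Re-estimate one tolerance per dimension from within-group deviations."""
    ndim = len(groups[0][0])
    tol = []
    for d in range(ndim):
        devs = []
        for g in groups:
            if len(g) > 1:
                mean = sum(p[d] for p in g) / len(g)
                devs.extend(p[d] - mean for p in g)
        sigma = pstdev(devs) if len(devs) > 1 else 0.0
        tol.append(max(n_sigma * sigma, floor))
    return tuple(tol)


# (H, N) positions in ppm; the N dimension is much noisier than H.
peaks = [(8.20, 120.1), (8.21, 120.3), (8.19, 119.9), (7.50, 115.0), (7.51, 115.2)]
uniform = group_peaks(peaks, (0.02, 0.02))    # one tolerance for both dimensions
specific = group_peaks(peaks, (0.03, 0.3))    # dimension-specific tolerances
refined_tol = estimate_tolerances(specific)   # input for a further iteration
```

With the uniform tolerance every peak ends up alone, while the dimension-specific tolerances recover two spin systems; feeding `refined_tol` back into `group_peaks` gives a simple form of iterative re-evaluation.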
Affiliation(s)
- Andrey Smelter
  - School of Interdisciplinary and Graduate Studies, University of Louisville, Louisville, KY, 40202, USA
  - Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, 40202, USA
- Eric C Rouchka
  - Department of Computer Engineering and Computer Science, University of Louisville, Louisville, KY, 40202, USA
  - KBRIN Bioinformatics Core, University of Louisville, Louisville, KY, 40202, USA
- Hunter N B Moseley
  - Department of Molecular and Cellular Biochemistry, University of Kentucky, Lexington, KY, 40356, USA
  - Markey Cancer Center, University of Kentucky, Lexington, KY, 40356, USA
  - Center for Environmental and Systems Biochemistry, University of Kentucky, Lexington, KY, 40356, USA
  - Institute for Biomedical Informatics, University of Kentucky, Lexington, KY, 40356, USA

4
NMR-based automated protein structure determination. Arch Biochem Biophys 2017; 628:24-32. [PMID: 28263718 DOI: 10.1016/j.abb.2017.02.011] [Received: 01/11/2017] [Revised: 02/18/2017] [Accepted: 02/28/2017]
Abstract
NMR spectra analysis for protein structure determination can now in many cases be performed by automated computational methods. This overview of the computational methods for NMR protein structure analysis presents recent automated methods for signal identification in multidimensional NMR spectra, sequence-specific resonance assignment, collection of conformational restraints, and structure calculation, as implemented in the CYANA software package. These algorithms are sufficiently reliable and integrated into one software package to enable the fully automated structure determination of proteins starting from NMR spectra without manual interventions or corrections at intermediate steps, with an accuracy of 1-2 Å backbone RMSD in comparison with manually solved reference structures.

5
Didenko T, Proudfoot A, Dutta SK, Serrano P, Wüthrich K. Non-Uniform Sampling and J-UNIO Automation for Efficient Protein NMR Structure Determination. Chemistry 2015; 21:12363-9. [PMID: 26227870 PMCID: PMC4576834 DOI: 10.1002/chem.201502544] [Received: 06/30/2015]
Abstract
High-resolution structure determination of small proteins in solution is one of the big assets of NMR spectroscopy in structural biology. Improvements in the efficiency of NMR structure determination through advances in NMR experiments and automation of data handling therefore attract continued interest. Here, non-uniform sampling (NUS) of 3D heteronuclear-resolved [¹H,¹H]-NOESY data yielded two- to three-fold savings of instrument time for structure determinations of soluble proteins. With the 152-residue protein NP_372339.1 from Staphylococcus aureus and the 71-residue protein NP_346341.1 from Streptococcus pneumoniae, we show that high-quality structures can be obtained with NUS NMR data, which are as amenable to robust automated analysis as the corresponding uniformly sampled data.
Affiliation(s)
- Tatiana Didenko
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA) http://www.jcsg.org
  - Joint Center for Structural Genomics, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014
  - GPCR-Network, 3430 S. Vermont Ave., TRF 105, Los Angeles, CA 90089-3301 (USA), Fax: (+1) 858-784-8014 http://gpcr.usc.edu
- Andrew Proudfoot
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA) http://www.jcsg.org
  - Joint Center for Structural Genomics, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014
- Samit Kumar Dutta
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA) http://www.jcsg.org
  - Joint Center for Structural Genomics, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014
- Pedro Serrano
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA) http://www.jcsg.org
  - Joint Center for Structural Genomics, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014
- Kurt Wüthrich
  - Department of Integrative Structural and Computational Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA) http://www.jcsg.org
  - Joint Center for Structural Genomics, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014
  - GPCR-Network, 3430 S. Vermont Ave., TRF 105, Los Angeles, CA 90089-3301 (USA), Fax: (+1) 858-784-8014 http://gpcr.usc.edu
  - Skaggs Institute for Chemical Biology, The Scripps Research Institute, 10550 North Torrey Pines Road, La Jolla, CA 92037 (USA), Fax: (+1) 858-784-8014

6
Abstract
Three-dimensional structures of proteins in solution can be calculated on the basis of conformational restraints derived from NMR measurements. This chapter gives an overview of the computational methods for NMR protein structure analysis highlighting recent automated methods for the assignment of NMR spectra, the collection of conformational restraints, and the structure calculation.

7
Strickland M, Stephens T, Liu J, Tjandra N. Exploiting image registration for automated resonance assignment in NMR. Journal of Biomolecular NMR 2015; 62:143-156. [PMID: 25828257 PMCID: PMC4452424 DOI: 10.1007/s10858-015-9926-7] [Received: 02/02/2015] [Accepted: 03/24/2015]
Abstract
Analysis of protein NMR data involves the assignment of resonance peaks in a number of multidimensional data sets. To establish resonance assignment a three-dimensional search is used to match a pair of common variables, such as chemical shifts of the same spin system, in different NMR spectra. We show that by displaying the variables to be compared in two-dimensional plots the process can be simplified. Moreover, by utilizing a fast Fourier transform cross-correlation algorithm, more common to the field of image registration or pattern matching, we can automate this process. Here, we use sequential NMR backbone assignment as an example to show that the combination of correlation plots and segmented pattern matching establishes fast backbone assignment in fifteen proteins of varying sizes. For example, the 265-residue RalBP1 protein was 95.4% correctly assigned in 10 s. The same concept can be applied to any multidimensional NMR data set where analysis comprises the comparison of two variables. This modular and robust approach offers high efficiency with excellent computational scalability and could be easily incorporated into existing assignment software.
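The core of FFT-based registration is that the cross-correlation of two signals can be computed as an inverse transform of the product of one spectrum with the complex conjugate of the other, and the position of the maximum gives the offset between them. The sketch below shows this in one dimension with a small radix-2 FFT written out in plain Python. It is an assumption-laden illustration, not the authors' code: the paper works on two-dimensional correlation plots, and the function names, the power-of-two length requirement, and the example traces are introduced here.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two.
    The inverse transform is left unnormalized (no 1/n factor)."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def best_shift(a, b):
    """Circular shift s that maximizes sum_i a[i] * b[(i + s) % n]."""
    fa, fb = fft(a), fft(b)
    product = [x.conjugate() * y for x, y in zip(fa, fb)]
    corr = [v.real for v in fft(product, inverse=True)]
    return max(range(len(corr)), key=corr.__getitem__)


# b is the same pattern as a, moved three points to the right.
a = [0, 0, 1, 3, 1, 0, 0, 0]
b = [0, 0, 0, 0, 0, 1, 3, 1]
shift = best_shift(a, b)
```

Because the whole correlation is obtained from three transforms, the cost grows as n log n rather than n squared, which is what makes this kind of pattern matching fast on large data sets.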
Affiliation(s)
- Nico Tjandra
  - To whom correspondence should be addressed: Building 50, Room 3503, NHLBI, NIH, Bethesda, MD 20892, Phone: (301) 402-3029, Fax: (301) 402-3405

8
Buchner L, Güntert P. Systematic evaluation of combined automated NOE assignment and structure calculation with CYANA. Journal of Biomolecular NMR 2015; 62:81-95. [PMID: 25796507 DOI: 10.1007/s10858-015-9921-z] [Received: 01/23/2015] [Accepted: 03/16/2015]
Abstract
The automated assignment of NOESY cross peaks has become a fundamental technique for NMR protein structure analysis. A widely used algorithm for this purpose is implemented in the program CYANA. It has been used for a large number of structure determinations of proteins in solution but a systematic evaluation of its performance has not yet been reported. In this paper we systematically analyze the reliability of combined automated NOESY assignment and structure calculation with CYANA under a variety of conditions on the basis of the experimental NMR data sets of ten proteins. To evaluate the robustness of the algorithm, the original high-quality experimental data sets were modified in different ways to simulate the effect of data imperfections, i.e. incomplete or erroneous chemical shift assignments, missing NOESY cross peaks, inaccurate peak positions, inaccurate peak intensities, lower dimensionality NOESY spectra, and higher tolerances for the matching of chemical shifts and peak positions. The results show that the algorithm is remarkably robust with regard to imperfections of the NOESY peak lists and the chemical shift tolerances but susceptible to lacking or erroneous resonance assignments, in particular for nuclei that are involved in many NOESY cross peaks.
Affiliation(s)
- Lena Buchner
  - Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, and Frankfurt Institute of Advanced Studies, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany

9
Nielsen JT, Nielsen NC. VirtualSpectrum, a tool for simulating peak list for multi-dimensional NMR spectra. Journal of Biomolecular NMR 2014; 60:51-66. [PMID: 25119482 DOI: 10.1007/s10858-014-9851-1] [Received: 02/17/2014] [Accepted: 08/01/2014]
Abstract
NMR spectroscopy is a widely used technique for characterizing the structure and dynamics of macromolecules. Often large amounts of NMR data are required to characterize the structure of proteins. To save valuable time and resources on data acquisition, simulated data are useful in the developmental phase, for data analysis, and for comparison with experimental data. However, existing tools for this purpose can be difficult to use, are sometimes specialized for certain types of molecules or spectra, or produce overly idealized data. Here we present a fast, flexible and robust tool, VirtualSpectrum, for generating peak lists for most multi-dimensional NMR experiments for both liquid- and solid-state NMR. It is possible to tune the quality of the generated peak lists to include sources of artifacts from peak overlap, noise and missing signals. VirtualSpectrum uses an analytic expression to represent the spectrum and derive the peak positions, seamlessly handling overlap between signals. We demonstrate our tool by comparing simulated and experimental spectra for different multi-dimensional NMR spectra and by systematically analyzing three cases where overlap between peaks is particularly relevant: solid-state NMR data, liquid-state NMR homonuclear ¹H and ¹⁵N-edited spectra, and 2D/3D heteronuclear correlation spectra of unstructured proteins. We analyze the impact of protein size and secondary structure on peak overlap and on the accuracy of structure determination based on data of different qualities simulated by VirtualSpectrum.
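How an analytic line-shape model turns overlap into merged peaks can be illustrated in one dimension. The sketch below assumes a sum of equal-height Lorentzian lines and picks local maxima on a grid; the line shape, the function names, the grid step, and the shift values are assumptions for illustration and are not taken from VirtualSpectrum.

```python
def lorentzian_sum(x, centers, width):
    """Sum of unit-height Lorentzian lines; width is the full width at half maximum."""
    hw = width / 2.0
    return sum(hw * hw / ((x - c) ** 2 + hw * hw) for c in centers)


def pick_peaks(centers, width, lo, hi, step=0.001):
    """Return the grid positions (ppm) of local maxima of the summed spectrum."""
    n = int(round((hi - lo) / step))
    xs = [lo + i * step for i in range(n + 1)]
    ys = [lorentzian_sum(x, centers, width) for x in xs]
    return [round(xs[i], 3) for i in range(1, n) if ys[i - 1] < ys[i] >= ys[i + 1]]


# Three signals; the first two are only 0.02 ppm apart.
signals = [8.00, 8.02, 8.50]
broad = pick_peaks(signals, width=0.05, lo=7.5, hi=9.0)   # broad lines
sharp = pick_peaks(signals, width=0.01, lo=7.5, hi=9.0)   # narrow lines
```

With broad lines the two close signals collapse into a single picked peak, so the simulated peak list contains two entries; with narrow lines all three signals are resolved. For two equal Lorentzians the merge sets in once their separation drops below the line width divided by the square root of three.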
Affiliation(s)
- Jakob Toudahl Nielsen
  - Department of Chemistry, Center for Insoluble Protein Structures (inSPIN), Interdisciplinary Nanoscience Center (iNANO), University of Aarhus, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark

10
Zhang Z, Porter J, Tripsianes K, Lange OF. Robust and highly accurate automatic NOESY assignment and structure determination with Rosetta. Journal of Biomolecular NMR 2014; 59:135-45. [PMID: 24845473 DOI: 10.1007/s10858-014-9832-4] [Received: 02/15/2014] [Accepted: 04/19/2014]
Abstract
We have developed a novel and robust approach for automatic and unsupervised simultaneous nuclear Overhauser effect (NOE) assignment and structure determination within the CS-Rosetta framework. Starting from unassigned peak lists and chemical shift assignments, autoNOE-Rosetta determines NOE cross-peak assignments and generates structural models. The approach tolerates incomplete and raw NOE peak lists as well as incomplete or partially incorrect chemical shift assignments, and its performance has been tested on 50 protein targets ranging from 50 to 200 residues in size. We find a significantly improved performance compared to established programs, particularly for larger proteins and for NOE data obtained on perdeuterated protein samples. X-ray crystallographic structures allowed comparison of Rosetta and conventional, PDB-deposited, NMR models in 20 of 50 test cases. The unsupervised autoNOE-Rosetta models were often of significantly higher accuracy than the corresponding expert-supervised NMR models deposited in the PDB. We also tested the method with unrefined peak lists and found that performance was nearly as good as for refined peak lists. Finally, demonstrating our method's remarkable robustness against problematic input data, we provided correct models for an incorrect PDB-deposited NMR solution structure.
Affiliation(s)
- Zaiyong Zhang
  - Department Chemie, Biomolecular NMR and Munich Center for Integrated Protein Science, Technische Universität München, Lichtenbergstrasse 4, 85747, Garching, Germany

11
Nielsen JT, Kulminskaya N, Bjerring M, Nielsen NC. Automated robust and accurate assignment of protein resonances for solid state NMR. Journal of Biomolecular NMR 2014; 59:119-34. [PMID: 24817190 DOI: 10.1007/s10858-014-9835-1] [Received: 02/13/2014] [Accepted: 04/29/2014]
Abstract
The process of resonance assignment represents a time-consuming and potentially error-prone bottleneck in structural studies of proteins by solid-state NMR (ssNMR). Software for the automation of this process is therefore of high interest. Procedures developed over the last decades for solution-state NMR are not directly applicable to ssNMR because of the inherently lower data quality caused by lower sensitivity and broader lines, which lead to overlap between peaks. Recently, the first efforts towards procedures aimed specifically at ssNMR have been realized (Schmidt et al. in J Biomol NMR 56(3):243-254, 2013). Here we present a robust automatic method that can accurately assign protein resonances using peak lists from a small set of simple 2D and 3D ssNMR experiments and that is applicable in cases with low sensitivity. The method is demonstrated on three uniformly ¹³C, ¹⁵N labeled biomolecules that pose different assignment challenges. In particular, for the immunoglobulin binding domain B1 of streptococcal protein G, automatic assignment shows 100% accuracy for the backbone resonances and 91.8% when including all side chain carbons. By using a procedure for generating artificial spectra with increasing line widths, we demonstrate that our method, GAMES_ASSIGN, can handle a significant amount of overlapping peaks in the assignment. The impact of including different ssNMR experiments is evaluated as well.
Affiliation(s)
- Jakob Toudahl Nielsen
  - Center for Insoluble Protein Structures (inSPIN), Interdisciplinary Nanoscience Center (iNANO), Department of Chemistry, Aarhus University, Gustav Wieds Vej 14, 8000, Aarhus C, Denmark

13
Tejero R, Snyder D, Mao B, Aramini JM, Montelione GT. PDBStat: a universal restraint converter and restraint analysis software package for protein NMR. Journal of Biomolecular NMR 2013; 56:337-51. [PMID: 23897031 PMCID: PMC3932191 DOI: 10.1007/s10858-013-9753-7] [Received: 04/22/2013] [Accepted: 06/11/2013]
Abstract
The heterogeneous array of software tools used in the process of protein NMR structure determination presents organizational challenges in the structure determination and validation processes, and creates a learning curve that limits the broader use of protein NMR in biology. These challenges, including accurate use of data in different data formats required by software carrying out similar tasks, continue to confound the efforts of novices and experts alike. These important issues need to be addressed robustly in order to standardize protein NMR structure determination and validation. PDBStat is a C/C++ computer program originally developed as a universal coordinate and protein NMR restraint converter. Its primary function is to provide a user-friendly tool for interconverting between protein coordinate and protein NMR restraint data formats. It also provides an integrated set of computational methods for protein NMR restraint analysis and structure quality assessment, relabeling of prochiral atoms with correct IUPAC names, as well as multiple methods for analysis of the consistency of atomic positions indicated by their convergence across a protein NMR ensemble. In this paper we provide a detailed description of the PDBStat software, and highlight some of its valuable computational capabilities. As an example, we demonstrate the use of the PDBStat restraint converter for restrained CS-Rosetta structure generation calculations, and compare the resulting protein NMR structure models with those generated from the same NMR restraint data using more traditional structure determination methods. These results demonstrate the value of a universal restraint converter in allowing the use of multiple structure generation methods with the same restraint data for consensus analysis of protein NMR structures and the underlying restraint data.
Affiliation(s)
- Roberto Tejero
  - Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey and Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, and Northeast Structural Genomics Consortium, 679 Hoes Lane, Piscataway, New Jersey, 08854, USA
  - Departamento de Química Física, Universidad de Valencia, Avenida Dr. Moliner 50 46100 Burjassot, Valencia, Spain
- David Snyder
  - Department of Chemistry, William Paterson University, 300 Pompton Road Wayne, New Jersey 07470, USA
- Binchen Mao
  - Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey and Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, and Northeast Structural Genomics Consortium, 679 Hoes Lane, Piscataway, New Jersey, 08854, USA
- James M. Aramini
  - Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey and Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, and Northeast Structural Genomics Consortium, 679 Hoes Lane, Piscataway, New Jersey, 08854, USA
- Gaetano T Montelione
  - Center for Advanced Biotechnology and Medicine, Rutgers, The State University of New Jersey and Robert Wood Johnson Medical School, University of Medicine and Dentistry of New Jersey, and Northeast Structural Genomics Consortium, 679 Hoes Lane, Piscataway, New Jersey, 08854, USA
  - To whom correspondence should be addressed: Prof. Gaetano T. Montelione CABM, Rutgers University 679 Hoes Lane Piscataway, NJ 08854-5638 Phone: 732-235-5321

14
Schieborr U, Sreeramulu S, Elshorst B, Maurer M, Saxena K, Stehle T, Kudlinzki D, Gande SL, Schwalbe H. MOTOR: model assisted software for NMR structure determination. Proteins 2013; 81:2007-22. [PMID: 23852655 DOI: 10.1002/prot.24361] [Received: 11/26/2012] [Revised: 06/26/2013] [Accepted: 06/28/2013]
Abstract
Eukaryotic proteins with important biological function can be partially unstructured, conformationally flexible, or heterogeneous. Crystallization trials often fail for such proteins. In NMR spectroscopy, parts of the polypeptide chain undergoing dynamics in unfavorable time regimes cannot be observed. De novo NMR structure determination is seriously hampered when missing signals lead to an incomplete chemical shift assignment, so that the information content of the NOE data is insufficient to determine the structure ab initio. We developed a new protein structure determination strategy for such cases, based on a novel NOE assignment strategy that uses a number of model structures but no explicit reference structure of the kind used by bootstrapping-like algorithms. The software distinguishes in detail between consistent and mutually exclusive pairs of possible NOE assignments on the basis of different precision levels of measured chemical shifts, searching for the largest set of consistent NOE assignments that agrees with 3D space. Validation of the method using the structure of the low-molecular-weight protein tyrosine phosphatase A (MptpA) showed robust results with protein structures of 30-45% sequence identity and 70% of the chemical shift assignments. About 60% of the resonance assignments are sufficient to identify the structural models with the highest conformational similarity to the real structure. The software was benchmarked by de novo solution structures of fibroblast growth factor 21 (FGF21) and the extracellular fibroblast growth factor receptor domain FGFR4 D2, both of which failed in crystallization trials and in classical NMR structure determination.
Affiliation(s)
- Ulrich Schieborr
  - Johann Wolfgang Goethe-University Frankfurt, Institute for Organic Chemistry and Chemical Biology, Center for Biomolecular Magnetic Resonance, Max-von-Laue-Str. 7, 60438, Frankfurt am Main, Germany

16
Schmidt E, Güntert P. A new algorithm for reliable and general NMR resonance assignment. J Am Chem Soc 2012; 134:12817-29. [PMID: 22794163 DOI: 10.1021/ja305091n]
Abstract
The new FLYA automated resonance assignment algorithm determines NMR chemical shift assignments on the basis of peak lists from any combination of multidimensional through-bond or through-space NMR experiments for proteins. Backbone and side-chain assignments can be determined. All experimental data are used simultaneously, thereby exploiting optimally the redundancy present in the input peak lists and circumventing potential pitfalls of assignment strategies in which results obtained in a given step remain fixed input data for subsequent steps. Instead of prescribing a specific assignment strategy, the FLYA resonance assignment algorithm requires only experimental peak lists and the primary structure of the protein, from which the peaks expected in a given spectrum can be generated by applying a set of rules, defined in a straightforward way by specifying through-bond or through-space magnetization transfer pathways. The algorithm determines the resonance assignment by finding an optimal mapping between the set of expected peaks that are assigned by definition but have unknown positions and the set of measured peaks in the input peak lists that are initially unassigned but have a known position in the spectrum. Using peak lists obtained by purely automated peak picking from the experimental spectra of three proteins, FLYA assigned correctly 96-99% of the backbone and 90-91% of all resonances that could be assigned manually. Systematic studies quantified the impact of various factors on the assignment accuracy, namely the extent of missing real peaks and the amount of additional artifact peaks in the input peak lists, as well as the accuracy of the peak positions. Comparing the resonance assignments from FLYA with those obtained from two other existing algorithms showed that using identical experimental input data these other algorithms yielded significantly (40-142%) more erroneous assignments than FLYA. 
The FLYA resonance assignment algorithm thus has the reliability and flexibility to replace most manual and semi-automatic assignment procedures for NMR studies of proteins.
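The idea of mapping expected peaks (known atoms, unknown positions) onto measured peaks (known positions, unknown atoms) can be made concrete with a toy case. The sketch below is not the FLYA algorithm: it swaps in an exhaustive search over all mappings, which only works for a handful of peaks, and it scores a mapping purely by how consistent the shifts inferred for each atom are across spectra. The two spectrum types, the atom labels, the tolerances in `TOL`, and all peak positions are hypothetical, and real peak lists also contain missing and artifact peaks that this sketch ignores.

```python
from itertools import permutations, product

# Assumed per-nucleus tolerances (ppm) used to put all deviations on one scale.
TOL = {"H": 0.03, "N": 0.3, "C": 0.3}


def score(assignment):
    """assignment: list of (expected_atoms, measured_position) pairs.
    Lower is better: total tolerance-scaled spread of each atom's shift."""
    shifts = {}
    for atoms, position in assignment:
        for atom, value in zip(atoms, position):
            shifts.setdefault(atom, []).append(value)
    return sum((max(v) - min(v)) / TOL[a[0]] for a, v in shifts.items())


def best_mapping(spectra):
    """spectra: list of (expected_peaks, measured_peaks) of equal length.
    Tries every one-to-one mapping in every spectrum (toy sizes only)."""
    options = [
        [list(zip(expected, perm)) for perm in permutations(measured)]
        for expected, measured in spectra
    ]
    best = min(product(*options),
               key=lambda combo: score([pair for s in combo for pair in s]))
    return [dict(s) for s in best]


# Expected peaks for two residues: an H-N spectrum and an H-N-CA spectrum
# in which residue 2 also shows the CA of residue 1.
hn_expected = [("H1", "N1"), ("H2", "N2")]
hn_measured = [(7.50, 115.0), (8.20, 120.1)]
hnca_expected = [("H1", "N1", "CA1"), ("H2", "N2", "CA2"), ("H2", "N2", "CA1")]
hnca_measured = [(7.51, 115.1, 55.0), (8.21, 120.0, 60.2), (7.49, 114.9, 60.3)]

mapping = best_mapping([(hn_expected, hn_measured), (hnca_expected, hnca_measured)])
```

Because every spectrum is scored together, the shared CA1 shift is what decides which residue is which; fixing the H-N assignment first and treating it as input for the next step would lose that cross-check.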
Affiliation(s)
- Elena Schmidt
  - Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt am Main, Frankfurt am Main, Germany

17
Sborgi L, Verma A, Muñoz V, de Alba E. Revisiting the NMR structure of the ultrafast downhill folding protein gpW from bacteriophage λ. PLoS One 2011; 6:e26409. [PMID: 22087227 PMCID: PMC3208555 DOI: 10.1371/journal.pone.0026409] [Received: 09/20/2011] [Accepted: 09/26/2011]
Abstract
GpW is a 68-residue protein from bacteriophage λ that participates in virus head morphogenesis. Previous NMR studies revealed a novel α+β fold for this protein. Recent experiments have shown that gpW folds in microseconds by crossing a marginal free energy barrier (i.e., downhill folding). These features make gpW a highly desirable target for further experimental and computational folding studies. As a step in that direction, we have re-determined the high-resolution structure of gpW by multidimensional NMR on a construct that eliminates the purification tags and unstructured C-terminal tail present in the prior study. In contrast to the previous work, we have obtained a full manual assignment and calculated the structure using only unambiguous distance restraints. This new structure confirms the α+β topology, but reveals important differences in tertiary packing. Namely, the two α-helices are rotated along their main axis to form a leucine zipper. The β-hairpin is orthogonal to the helical interface rather than parallel, displaying most tertiary contacts through strand 1. There also are differences in secondary structure: longer and less curved helices and a hairpin that now shows the typical right-hand twist. Molecular dynamics simulations starting from both gpW structures, and calculations with CS-Rosetta, all converge to our gpW structure. This confirms that the original structure has strange tertiary packing and strained secondary structure. A comparison of NMR datasets suggests that the problems were mainly caused by incomplete chemical shift assignments, mistakes in NOE assignment and the inclusion of ambiguous distance restraints during the automated procedure used in the original study. The new gpW corrects these problems, providing the appropriate structural reference for future work. Furthermore, our results are a cautionary tale against the inclusion of ambiguous experimental information in the determination of protein structures.
Affiliation(s)
- Lorenzo Sborgi
- Centro de Investigaciones Biológicas, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Abhinav Verma
- Centro de Investigaciones Biológicas, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Victor Muñoz
- Centro de Investigaciones Biológicas, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- Department of Chemistry and Biochemistry, University of Maryland, College Park, Maryland, United States of America
- * E-mail: (EdA); (VM)
- Eva de Alba
- Centro de Investigaciones Biológicas, Consejo Superior de Investigaciones Científicas, Madrid, Spain
- * E-mail: (EdA); (VM)
|
18
|
Orekhov VY, Jaravine VA. Analysis of non-uniformly sampled spectra with multi-dimensional decomposition. PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY 2011; 59:271-92. [PMID: 21920222 DOI: 10.1016/j.pnmrs.2011.02.002] [Citation(s) in RCA: 246] [Impact Index Per Article: 18.9] [Received: 01/04/2011] [Accepted: 02/21/2011] [Indexed: 05/04/2023]
Affiliation(s)
- Vladislav Yu Orekhov
- Swedish NMR Centre, University of Gothenburg, Box 465, 40530 Gothenburg, Sweden.
|
19
|
Zeng J, Zhou P, Donald BR. Protein side-chain resonance assignment and NOE assignment using RDC-defined backbones without TOCSY data. JOURNAL OF BIOMOLECULAR NMR 2011; 50:371-95. [PMID: 21706248 PMCID: PMC3155202 DOI: 10.1007/s10858-011-9522-4] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 12/02/2010] [Accepted: 05/19/2011] [Indexed: 05/31/2023]
Abstract
One bottleneck in NMR structure determination lies in the laborious and time-consuming process of side-chain resonance and NOE assignments. Compared to the well-studied backbone resonance assignment problem, automated side-chain resonance and NOE assignments are relatively less explored. Most NOE assignment algorithms require nearly complete side-chain resonance assignments from a series of through-bond experiments such as HCCH-TOCSY or HCCCONH. Unfortunately, these TOCSY experiments perform poorly on large proteins. To overcome this deficiency, we present a novel algorithm, called NASCA (NOE Assignment and Side-Chain Assignment), to automate both side-chain resonance and NOE assignments and to perform high-resolution protein structure determination in the absence of any explicit through-bond experiment to facilitate side-chain resonance assignment, such as HCCH-TOCSY. After casting the assignment problem into a Markov Random Field (MRF), NASCA extends and applies combinatorial protein design algorithms to compute optimal assignments that best interpret the NMR data. The MRF captures the contact map information of the protein derived from NOESY spectra, exploits the backbone structural information determined by RDCs, and considers all possible side-chain rotamers. The complexity of the combinatorial search is reduced by using a dead-end elimination (DEE) algorithm, which prunes side-chain resonance assignments that are provably not part of the optimal solution. Then an A* search algorithm is employed to find a set of optimal side-chain resonance assignments that best fit the NMR data. These side-chain resonance assignments are then used to resolve the NOE assignment ambiguity and compute high-resolution protein structures. Tests on five proteins show that NASCA assigns resonances for more than 90% of side-chain protons, and achieves about 80% correct assignments. The final structures computed using the NOE distance restraints assigned by NASCA have backbone RMSD 0.8-1.5 Å from the reference structures determined by traditional NMR approaches.
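The pruning step mentioned in this abstract can be illustrated with a generic dead-end elimination pass over a small pairwise energy model. This is a minimal sketch using the Goldstein elimination criterion, which is one standard form of DEE; the abstract does not state which criterion the authors use, and the function name `dead_end_eliminate`, the data layout, and the energy values are all hypothetical.

```python
def dead_end_eliminate(self_energy, pair_energy):
    """Repeatedly remove candidates that cannot be part of the optimum.

    self_energy[i][r]           -> cost of choosing candidate r at position i
    pair_energy[(i, j)][(r, s)] -> cost of r at i together with s at j (i < j)
    Returns a dict position -> set of surviving candidates.
    """
    alive = {i: set(cands) for i, cands in self_energy.items()}

    def pair(i, r, j, s):
        # Pair energies are stored once per position pair, with i < j.
        if i < j:
            return pair_energy[(i, j)][(r, s)]
        return pair_energy[(j, i)][(s, r)]

    changed = True
    while changed:
        changed = False
        for i in alive:
            for r in list(alive[i]):
                for t in alive[i] - {r}:
                    # Goldstein criterion: r is dead-ending if switching to t
                    # lowers the energy for every choice at the other positions.
                    gap = self_energy[i][r] - self_energy[i][t]
                    gap += sum(
                        min(pair(i, r, j, s) - pair(i, t, j, s) for s in alive[j])
                        for j in alive if j != i
                    )
                    if gap > 0:
                        alive[i].discard(r)
                        changed = True
                        break
    return alive

# Two positions with two candidates each (illustrative numbers only).
self_energy = {0: {"a": 0.0, "b": 5.0}, 1: {"x": 1.0, "y": 0.0}}
pair_energy = {(0, 1): {("a", "x"): 0.0, ("a", "y"): 0.5,
                        ("b", "x"): 0.0, ("b", "y"): 0.0}}
survivors = dead_end_eliminate(self_energy, pair_energy)
print(survivors)
```

In this toy case DEE alone leaves a single candidate per position. In general several candidates survive, and a search such as A* then runs over the reduced space.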
Affiliation(s)
- Jianyang Zeng
- Department of Computer Science, Duke University, Durham NC 27708
- Pei Zhou
- Department of Biochemistry, Duke University Medical Center, Durham NC 27710
- Bruce Randall Donald
- Department of Computer Science, Duke University, Durham NC 27708
- Department of Biochemistry, Duke University Medical Center, Durham NC 27710
|
20
|
Abstract
Around half of all protein structures solved nowadays using solution-state nuclear magnetic resonance (NMR) spectroscopy rely on automated data analysis. The pervasiveness of computational approaches in general hides, however, a more nuanced view in which the full variety and richness of the field appears. This review is structured around a comparison of methods associated with three NMR observables: classical nuclear Overhauser effect (NOE) constraint gathering in contrast with more recent chemical shift and residual dipolar coupling (RDC) based protocols. In each case, the emphasis is placed on the latest research, covering mainly the past 5 years. By describing both general concepts and representative programs, the objective is to map out a field in which, through the very profusion of approaches, it is all too easy to lose one's bearings.
|
21
|
Moseley HNB, Sperling LJ, Rienstra CM. Automated protein resonance assignments of magic angle spinning solid-state NMR spectra of β1 immunoglobulin binding domain of protein G (GB1). JOURNAL OF BIOMOLECULAR NMR 2010; 48:123-8. [PMID: 20931264 PMCID: PMC2962796 DOI: 10.1007/s10858-010-9448-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Received: 06/14/2010] [Accepted: 08/18/2010] [Indexed: 05/11/2023]
Abstract
Magic-angle spinning solid-state NMR (MAS SSNMR) represents a fast developing experimental technique with great potential to provide structural and dynamics information for proteins not amenable to other methods. However, few automated analysis tools are currently available for MAS SSNMR. We present a methodology for automating protein resonance assignments of MAS SSNMR spectral data and its application to experimental peak lists of the β1 immunoglobulin binding domain of protein G (GB1) derived from a uniformly ¹³C- and ¹⁵N-labeled sample. This application to the 56 amino acid GB1 produced an overall 84.1% assignment of the N, CO, CA, and CB resonances with no errors using peak lists from NCACX 3D, CANcoCA 3D, and CANCOCX 4D experiments. This proof of concept demonstrates the tractability of this problem.
|
22
|
Rout AK, Barnwal RP, Agarwal G, Chary KVR. Root-mean-square-deviation-based rapid backbone resonance assignments in proteins. MAGNETIC RESONANCE IN CHEMISTRY : MRC 2010; 48:793-797. [PMID: 20803498 DOI: 10.1002/mrc.2664] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 05/29/2023]
Abstract
We have shown that a methodology based on estimating the root-mean-square deviation (RMSD) between two sets of chemical shifts is very useful for rapidly assigning the spectral signatures of the ¹HN, ¹³Cα, ¹³Cβ, ¹³C', ¹Hα and ¹⁵N spins of a given protein in one state from its known resonance assignments in a different state, without resorting to established routine procedures (manual or automated). We demonstrate the utility of this methodology to rapidly assign the 3D spectra of a metal-binding protein in its holo-state from the knowledge of its assignments in the apo-state, the spectra of a protein in its paramagnetic state from the knowledge of its assignments in the diamagnetic state and, finally, the spectra of a mutant protein from the knowledge of the chemical shifts of the corresponding wild-type protein. The underlying assumptions of this methodology are that it is impossible for any two amino acid residues in a given protein to have all six chemical shifts degenerate, and that the protein under consideration does not undergo large conformational changes in going from one conformational state to another. The methodology has been tested using experimental data on three proteins: M-crystallin (8.5 kDa, predominantly β-sheet, for apo- to holo-state), Calbindin (7.5 kDa, predominantly α-helical, for diamagnetic to paramagnetic state and apo to holo) and EhCaBP1 (14.3 kDa, α-helical, the wild-type protein with one of its mutants). In all cases, the extent of assignment is greater than 85%.
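The transfer of assignments by chemical shift RMSD can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the RMSD is unweighted across nuclei, the acceptance threshold `max_rmsd` is arbitrary, only three of the six nuclei are used, and the function names and shift values are hypothetical.

```python
import math

def shift_rmsd(a, b):
    """RMSD over the nuclei present in both chemical shift dictionaries."""
    common = sorted(set(a) & set(b))
    if not common:
        return float("inf")
    return math.sqrt(sum((a[n] - b[n]) ** 2 for n in common) / len(common))

def transfer_assignments(known, unassigned, max_rmsd=1.0):
    """Give each unassigned spin system the label of the closest known residue.

    known:      dict residue -> {nucleus: shift} for the assigned state
    unassigned: dict peak id -> {nucleus: shift} for the new state
    Spin systems whose best RMSD exceeds max_rmsd stay unassigned.
    """
    result = {}
    for peak, shifts in unassigned.items():
        residue, score = min(
            ((res, shift_rmsd(ref, shifts)) for res, ref in known.items()),
            key=lambda item: item[1],
        )
        if score <= max_rmsd:
            result[peak] = residue
    return result

# Hypothetical apo-state assignments and holo-state spin systems (ppm).
apo = {
    "G10": {"HN": 8.30, "N": 109.5, "CA": 45.2},
    "A11": {"HN": 8.05, "N": 123.8, "CA": 52.6},
    "L12": {"HN": 7.90, "N": 121.0, "CA": 55.3},
}
holo = {
    "peak1": {"HN": 8.10, "N": 124.1, "CA": 52.4},
    "peak2": {"HN": 8.42, "N": 109.1, "CA": 45.5},
    "peak3": {"HN": 9.50, "N": 131.0, "CA": 60.0},  # shifted too far to transfer
}
transferred = transfer_assignments(apo, holo)
print(transferred)
```

The sketch does not enforce a one-to-one mapping, so two spin systems could receive the same residue label; a fuller version would resolve such conflicts and would probably scale each nucleus by its typical shift range.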
Affiliation(s)
- Ashok K Rout
- Department of Chemical Sciences, Tata Institute of Fundamental Research, Homi Bhabha Road, Colaba, Mumbai-400005, India
|
23
|
Loquet A, Gardiennet C, Böckmann A. Protein 3D structure determination by high-resolution solid-state NMR. CR CHIM 2010. [DOI: 10.1016/j.crci.2010.03.007] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.7] [Indexed: 11/29/2022]
|
24
|
Stratmann D, Guittet E, van Heijenoort C. Robust structure-based resonance assignment for functional protein studies by NMR. JOURNAL OF BIOMOLECULAR NMR 2010; 46:157-73. [PMID: 20024602 PMCID: PMC2813526 DOI: 10.1007/s10858-009-9390-3] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/26/2009] [Accepted: 11/04/2009] [Indexed: 05/20/2023]
Abstract
High-throughput functional protein NMR studies, such as studies of protein interactions or dynamics, require an automated approach for the assignment of the protein backbone. With the availability of a growing number of protein 3D structures, a new class of automated approaches, called structure-based assignment, has been developed quite recently. Structure-based approaches primarily use NMR input data that are not based on J-coupling and for which connections between residues are not limited by through-bond magnetization transfer efficiency. We present here a robust structure-based assignment approach using mainly H(N)-H(N) NOE networks, as well as ¹H-¹⁵N residual dipolar couplings and chemical shifts. The NOEnet complete search algorithm is robust against assignment errors, even for sparse input data. Instead of a unique and partly erroneous assignment solution, an optimal assignment ensemble with an accuracy equal or close to 100% is given by NOEnet. We show that even low-precision assignment ensembles give enough information for functional studies, like modeling of protein complexes. Finally, the combination of NOEnet with a low number of ambiguous J-coupling sequential connectivities yields a high-precision assignment ensemble. NOEnet will be available under: http://www.icsn.cnrs-gif.fr/download/nmr.
Affiliation(s)
- Dirk Stratmann
- NMR, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Eric Guittet
- Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France
- Carine van Heijenoort
- Centre de Recherche de Gif, Laboratoire de Chimie et Biologie Structurales ICSN-CNRS, 1, av. de la terrasse, 91190 Gif-sur-Yvette, France
|
25
|
Zawadzka-Kazimierczuk A, Kazimierczuk K, Koźmiński W. A set of 4D NMR experiments of enhanced resolution for easy resonance assignment in proteins. JOURNAL OF MAGNETIC RESONANCE (SAN DIEGO, CALIF. : 1997) 2010; 202:109-16. [PMID: 19880336 DOI: 10.1016/j.jmr.2009.10.006] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.1] [Received: 09/15/2009] [Revised: 10/13/2009] [Accepted: 10/14/2009] [Indexed: 05/13/2023]
Abstract
This paper presents examples of techniques based on the principle of random sampling that allow acquisition of NMR spectra featuring extraordinary resolution. This is due to the increased dimensionality and the maximum evolution time reached. The acquired spectra of CsPin protein and maltose binding protein were analyzed statistically in order to evaluate each technique. The results presented include exemplary spectral cross-sections. The spectral data provided by the proposed techniques allow easy assignment of backbone and side-chain resonances.
|
26
|
Mielke SP, Krishnan V. Characterization of protein secondary structure from NMR chemical shifts. PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY 2009; 54:141-165. [PMID: 20160946 PMCID: PMC2766081 DOI: 10.1016/j.pnmrs.2008.06.002] [Citation(s) in RCA: 76] [Impact Index Per Article: 5.1] [Indexed: 05/10/2023]
Affiliation(s)
- Steven P. Mielke
- UC Davis Genome Center, University of California, Davis, California
- V.V. Krishnan
- Department of Applied Science and Center for Comparative Medicine, University of California, Davis, California
- Department of Chemistry, California State University, Fresno, California
|
27
|
Stratmann D, van Heijenoort C, Guittet E. NOEnet--use of NOE networks for NMR resonance assignment of proteins with known 3D structure. Bioinformatics 2008; 25:474-81. [PMID: 19074506 PMCID: PMC2642640 DOI: 10.1093/bioinformatics/btn638] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Indexed: 11/22/2022]
Abstract
Motivation: A prerequisite for any protein study by NMR is the assignment of the resonances from the 15N−1H HSQC spectrum to their corresponding atoms of the protein backbone. Usually, this assignment is obtained by analyzing triple resonance NMR experiments. An alternative assignment strategy exploits the information given by an already available 3D structure of the same or a homologous protein. Up to now, the algorithms that have been developed around the structure-based assignment strategy have the important drawback that they cannot guarantee an assignment accuracy close to 100%. Results: We propose here a new program, called NOEnet, implementing an efficient complete search algorithm that ensures the correctness of the assignment results. NOEnet exploits the network character of unambiguous NOE constraints to realize an exhaustive search of all matching possibilities of the NOE network onto the structural one. NOEnet has been successfully tested on EIN, a large protein of 28 kDa, using only NOE data. The complete search of NOEnet finds all possible assignments compatible with experimental data that can be defined as an assignment ensemble. We show that multiple assignment possibilities of large NOE networks are restricted to a small spatial assignment range (SAR), so that assignment ensembles, obtained from accessible experimental data, are precise enough to be used for functional protein studies, like protein–ligand interaction or protein dynamics studies. We believe that NOEnet can become a major tool for the structure-based backbone resonance assignment strategy in NMR. Availability: The NOEnet program will be available under: http://www.icsn.cnrs-gif.fr/download/nmr Contact: carine@icsn.cnrs-gif.fr; eric.guittet@icsn.cnrs-gif.fr Supplementary Information: Supplementary data are available at Bioinformatics online.
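The notion of a complete search that returns an assignment ensemble can be made concrete with a small backtracking enumerator. The sketch below is not the NOEnet program: it simply lists every one-to-one mapping of peaks onto residues in which each observed NOE lands on a structural contact. The function name `enumerate_assignments` and the four-residue toy structure are assumptions for illustration.

```python
def enumerate_assignments(noe_edges, contacts, peaks, residues):
    """Yield every injective peak -> residue mapping in which each NOE
    between two peaks corresponds to a contact between their residues."""
    contact_set = {frozenset(c) for c in contacts}
    neighbours = {p: set() for p in peaks}
    for a, b in noe_edges:
        neighbours[a].add(b)
        neighbours[b].add(a)

    def extend(mapping, remaining):
        if not remaining:
            yield dict(mapping)
            return
        peak = remaining[0]
        used = set(mapping.values())
        for res in residues:
            if res in used:
                continue
            # Every already-placed NOE partner must be in contact with res.
            if all(frozenset((res, mapping[q])) in contact_set
                   for q in neighbours[peak] if q in mapping):
                mapping[peak] = res
                yield from extend(mapping, remaining[1:])
                del mapping[peak]

    yield from extend({}, list(peaks))

# Toy structure: four residues in a chain, contacts between sequence neighbours.
contacts = [(1, 2), (2, 3), (3, 4)]
# Three unassigned peaks connected by two NOEs: a-b and b-c.
noe_edges = [("a", "b"), ("b", "c")]
ensemble = list(enumerate_assignments(noe_edges, contacts, ["a", "b", "c"], [1, 2, 3, 4]))
print(len(ensemble), sorted({m["b"] for m in ensemble}))
```

Here four mappings survive, and peak `b` can only sit on residue 2 or 3. This is a small-scale picture of an assignment ensemble confined to a narrow spatial range even though the assignment is not unique.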
Affiliation(s)
- Dirk Stratmann
- Laboratoire de Chimie et Biologie Structurales, ICSN-CNRS, Gif-sur-Yvette, France
|
28
|
Automated structure determination from NMR spectra. EUROPEAN BIOPHYSICS JOURNAL: EBJ 2008; 38:129-43. [PMID: 18807026 DOI: 10.1007/s00249-008-0367-z] [Citation(s) in RCA: 178] [Impact Index Per Article: 11.1] [Received: 07/06/2008] [Accepted: 08/28/2008] [Indexed: 10/21/2022]
Abstract
Automated methods for protein structure determination by NMR have increasingly gained acceptance and are now widely used for the automated assignment of distance restraints and the calculation of three-dimensional structures. This review gives an overview of the techniques for automated protein structure analysis by NMR, including both NOE-based approaches and methods relying on other experimental data such as residual dipolar couplings and chemical shifts, and presents the FLYA algorithm for the fully automated NMR structure determination of proteins that is suitable to substitute all manual spectra analysis and thus overcomes a major efficiency limitation of the NMR method for protein structure determination.
|
29
|
Verdegem D, Dijkstra K, Hanoulle X, Lippens G. Graphical interpretation of Boolean operators for protein NMR assignments. JOURNAL OF BIOMOLECULAR NMR 2008; 42:11-21. [PMID: 18762868 DOI: 10.1007/s10858-008-9262-2] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.0] [Received: 03/31/2008] [Revised: 06/06/2008] [Accepted: 06/09/2008] [Indexed: 05/26/2023]
Abstract
We have developed a graphics-based algorithm for semi-automated protein NMR assignments. Using the basic sequential triple resonance assignment strategy, the method is inspired by the Boolean operators as it applies "AND"-, "OR"- and "NOT"-like operations on planes pulled out of the classical three-dimensional spectra to obtain its functionality. The method's strength lies in the continuous graphical presentation of the spectra, allowing both a semi-automatic peak list construction and sequential assignment. We demonstrate here its general use for the case of a folded protein with a well-dispersed spectrum, but equally for a natively unfolded protein where spectral resolution is minimal.
Affiliation(s)
- Dries Verdegem
- Unité de Glycobiologie Structurale et Fonctionelle, UMR 8576 CNRS, IFR 147, Université des Sciences et Technologies de Lille, 59655, Villeneuve d'Ascq, France
|
30
|
Fiorito F, Herrmann T, Damberger FF, Wüthrich K. Automated amino acid side-chain NMR assignment of proteins using (13)C- and (15)N-resolved 3D [(1)H,(1)H]-NOESY. JOURNAL OF BIOMOLECULAR NMR 2008; 42:23-33. [PMID: 18709333 DOI: 10.1007/s10858-008-9259-x] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.5] [Received: 07/15/2008] [Accepted: 07/15/2008] [Indexed: 05/26/2023]
Abstract
ASCAN is a new algorithm for automatic sequence-specific NMR assignment of amino acid side-chains in proteins, which uses as input the primary structure of the protein, chemical shift lists of (1)H(N), (15)N, (13)C(alpha), (13)C(beta) and possibly (1)H(alpha) from the previous polypeptide backbone assignment, and one or several 3D (13)C- or (15)N-resolved [(1)H,(1)H]-NOESY spectra. ASCAN has also been laid out for the use of TOCSY-type data sets as supplementary input. The program assigns new resonances based on comparison of the NMR signals expected from the chemical structure with the experimentally observed NOESY peak patterns. The core parts of the algorithm are a procedure for generating expected peak positions, which is based on variable combinations of assigned and unassigned resonances that arise for the different amino acid types during the assignment procedure, and a corresponding set of acceptance criteria for assignments based on the NMR experiments used. Expected patterns of NOESY cross peaks involving unassigned resonances are generated using the list of previously assigned resonances, and tentative chemical shift values for the unassigned signals taken from the BMRB statistics for globular proteins. Use of this approach with the 101-amino acid residue protein FimD(25-125) resulted in 84% of the hydrogen atoms and their covalently bound heavy atoms being assigned with a correctness rate of 90%. Use of these side-chain assignments as input for automated NOE assignment and structure calculation with the ATNOS/CANDID/DYANA program suite yielded structure bundles of comparable quality, in terms of precision and accuracy of the atomic coordinates, as those of a reference structure determined with interactive assignment procedures. 
A rationale for the high quality of the ASCAN-based structure determination results from an analysis of the distribution of the assigned side chains, which revealed near-complete assignments in the core of the protein, with most of the incompletely assigned residues located at or near the protein surface.
Affiliation(s)
- Francesco Fiorito
- Institut für Molekularbiologie und Biophysik, ETH Zürich, CH-8093, Zurich, Switzerland
|
31
|
Abstract
MOTIVATION: Complementing its traditional role in structural studies of proteins, nuclear magnetic resonance (NMR) spectroscopy is playing an increasingly important role in functional studies. NMR dynamics experiments characterize motions involved in target recognition, ligand binding, etc., while NMR chemical shift perturbation experiments identify and localize protein-protein and protein-ligand interactions. The key bottleneck in these studies is to determine the backbone resonance assignment, which allows spectral peaks to be mapped to specific atoms. This article develops a novel approach to address that bottleneck, exploiting an available X-ray structure or homology model to assign the entire backbone from a set of relatively fast and cheap NMR experiments. RESULTS: We formulate contact replacement for resonance assignment as the problem of computing correspondences between a contact graph representing the structure and an NMR graph representing the data; the NMR graph is a significantly corrupted, ambiguous version of the contact graph. We first show that by combining connectivity and amino acid type information, and exploiting the random structure of the noise, one can provably determine unique correspondences in polynomial time with high probability, even in the presence of significant noise (a constant number of noisy edges per vertex). We then detail an efficient randomized algorithm and show that, over a variety of experimental and synthetic datasets, it is robust to typical levels of structural variation (1-2 AA), noise (250-600%) and missings (10-40%). Our algorithm achieves very good overall assignment accuracy, above 80% in alpha-helices, 70% in beta-sheets and 60% in loop regions. AVAILABILITY: Our contact replacement algorithm is implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.
Affiliation(s)
- Fei Xiong
- Department of Computer Science, Dartmouth College, Hanover, NH 03755, USA
|
32
|
Lemak A, Steren CA, Arrowsmith CH, Llinás M. Sequence specific resonance assignment via Multicanonical Monte Carlo search using an ABACUS approach. JOURNAL OF BIOMOLECULAR NMR 2008; 41:29-41. [PMID: 18458824 DOI: 10.1007/s10858-008-9238-2] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Received: 05/10/2007] [Accepted: 04/08/2008] [Indexed: 05/26/2023]
Abstract
ABACUS [Grishaev et al. (2005) Proteins 61:36-43] is a novel protocol for automated protein structure determination via NMR. ABACUS starts from molecular fragments defined by unassigned J-coupled spin-systems and involves a Monte Carlo stochastic search in assignment space, probabilistic sequence selection, and assembly of fragments into structures that are used to guide the stochastic search. Here, we report further development of the two main algorithms that increase the flexibility and robustness of the method. Performance of the BACUS [Grishaev and Llinás (2004) J Biomol NMR 28:1-101] algorithm was significantly improved through use of sequential connectivities available from through-bond correlated 3D-NMR experiments, and a new set of likelihood probabilities derived from a database of 56 ultra-high-resolution X-ray structures. A Multicanonical Monte Carlo procedure, Fragment Monte Carlo (FMC), was developed for sequence-specific assignment of spin-systems. It relies on an enhanced assignment sampling and provides the uncertainty of assignments in a quantitative manner. The efficiency of the protocol was validated on data from four proteins of 68-116 residues, yielding 100% accuracy in sequence-specific assignment of backbone and side-chain resonances.
Affiliation(s)
- Alexander Lemak
- The Ontario Cancer Institute and Department of Medical Biophysics, University of Toronto, Toronto, ON, Canada M5G 2M9.
|
33
|
|
34
|
Gong H, Shen Y, Rose GD. Building native protein conformation from NMR backbone chemical shifts using Monte Carlo fragment assembly. Protein Sci 2007; 16:1515-21. [PMID: 17656574 PMCID: PMC2203357 DOI: 10.1110/ps.072988407] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
We have been analyzing the extent to which protein secondary structure determines protein tertiary structure in simple protein folds. An earlier paper demonstrated that three-dimensional structure can be obtained successfully using only highly approximate backbone torsion angles for every residue. Here, the initial information is further diluted by introducing a realistic degree of experimental uncertainty into this process. In particular, we tackle the practical problem of determining three-dimensional structure solely from backbone chemical shifts, which can be measured directly by NMR and are known to be correlated with a protein's backbone torsion angles. Extending our previous algorithm to incorporate these experimentally determined data, clusters of structures compatible with the experimentally determined chemical shifts were generated by fragment assembly Monte Carlo. The cluster that corresponds to the native conformation was then identified based on four energy terms: steric clash, solvent-squeezing, hydrogen-bonding, and hydrophobic contact. Currently, the method has been applied successfully to five small proteins with simple topology. Although still under development, this approach offers promise for high-throughput NMR structure determination.
Affiliation(s)
- Haipeng Gong
- T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland 21218, USA
|
35
|
Matsuki Y, Akutsu H, Fujiwara T. Spectral fitting for signal assignment and structural analysis of uniformly 13C-labeled solid proteins by simulated annealing based on chemical shifts and spin dynamics. JOURNAL OF BIOMOLECULAR NMR 2007; 38:325-39. [PMID: 17612797 DOI: 10.1007/s10858-007-9170-x] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/22/2007] [Accepted: 05/24/2007] [Indexed: 05/16/2023]
Abstract
We describe an approach for signal assignment and structural analysis using a suite of two-dimensional ¹³C-¹³C magic-angle-spinning solid-state NMR spectra of uniformly ¹³C-labeled peptides and proteins. We directly fit the calculated spectra to experimental ones by simulated annealing in the restrained molecular dynamics program CNS as a function of atomic coordinates. The spectra are calculated from the conformation-dependent chemical shifts obtained with SHIFTX and the cross-peak intensities computed for recoupled dipolar interactions. This method was applied to a membrane-bound 14-residue peptide, mastoparan-X. The obtained C', Cα and Cβ chemical shifts agreed with those reported previously with precisions of 0.2, 0.7 and 0.4 ppm, respectively. This spectral fitting program also provides backbone dihedral angles with a precision of about 50 degrees from the spectra, even with resonance overlaps. The restraints on the angles were improved by applying the protein database program TALOS to the obtained chemical shifts. The peptide structure provided by these restraints was consistent with the reported structure, with a backbone RMSD of about 1 Å.
Affiliation(s)
- Yoh Matsuki
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita 565-0871, Japan
|
36
|
Wan X, Lin G. CISA: combined NMR resonance connectivity information determination and sequential assignment. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2007; 4:336-348. [PMID: 17666755 DOI: 10.1109/tcbb.2007.1047] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Indexed: 05/16/2023]
Abstract
A nearly complete sequential resonance assignment is a key factor leading to successful protein structure determination via NMR spectroscopy. Assuming the availability of a set of NMR spectral peak lists, most of the existing assignment algorithms first use the differences between chemical shift values for common nuclei across multiple spectra to provide evidence that some pairs of peaks should be assigned to sequentially adjacent amino acid residues in the target protein. They then use these connectivities as constraints to produce a sequential assignment. With varying levels of success, these algorithms typically generate a large number of potential connectivity constraints, and this number grows exponentially as the quality of the spectral data decreases. A key observation used in our sequential assignment program, CISA, is that chemical shift residual signature information can be used to improve the connectivity determination, and thus to dramatically decrease the number of predicted connectivity constraints. Fewer connectivity constraints lead to fewer ambiguities in the sequential assignment. Extensive simulation studies on several large test datasets demonstrated that CISA is efficient and effective, compared to the three most recently proposed sequential resonance assignment programs, RANDOM, PACES, and MARS.
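The generic connectivity step that this abstract attributes to most existing algorithms, matching shifts of common nuclei to propose sequentially adjacent spin systems, can be sketched as follows. This shows only that baseline step, not CISA's signature-based refinement; the function name `candidate_connectivities`, the `_prev` key convention, the tolerance, and the shift values are assumptions for illustration.

```python
def candidate_connectivities(spin_systems, tolerance=0.2):
    """Return (i, j) pairs where spin system j may directly follow i.

    Each spin system holds its own CA/CB shifts plus the CA/CB shifts of the
    preceding residue (keys 'CA_prev', 'CB_prev'). A pair is proposed when
    the intra-residue shifts of i match the preceding-residue shifts of j.
    """
    pairs = []
    for i, first in spin_systems.items():
        for j, second in spin_systems.items():
            if i == j:
                continue
            if all(abs(first[n] - second[n + "_prev"]) <= tolerance
                   for n in ("CA", "CB")):
                pairs.append((i, j))
    return pairs

# Hypothetical spin systems (ppm).
spin_systems = {
    "s1": {"CA": 52.5, "CB": 19.1, "CA_prev": 60.1, "CB_prev": 38.0},
    "s2": {"CA": 56.2, "CB": 30.4, "CA_prev": 52.6, "CB_prev": 19.0},
    "s3": {"CA": 61.0, "CB": 69.8, "CA_prev": 56.1, "CB_prev": 30.5},
}
links = candidate_connectivities(spin_systems)
print(links)
```

With clean data only the true chain s1, s2, s3 is proposed. Widening the tolerance or adding noisy spin systems quickly multiplies the candidate pairs, which is the ambiguity the abstract describes.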
|
37
|
Egawa A, Fujiwara T, Mizoguchi T, Kakitani Y, Koyama Y, Akutsu H. Structure of the light-harvesting bacteriochlorophyll c assembly in chlorosomes from Chlorobium limicola determined by solid-state NMR. Proc Natl Acad Sci U S A 2007; 104:790-5. [PMID: 17215361 PMCID: PMC1783392 DOI: 10.1073/pnas.0605911104] [Citation(s) in RCA: 95] [Impact Index Per Article: 5.6] [Indexed: 11/18/2022] Open
Abstract
We have determined the atomic structure of the bacteriochlorophyll c (BChl c) assembly in a huge light-harvesting organelle, the chlorosome of green photosynthetic bacteria, by solid-state NMR. Previous electron microscopic and spectroscopic studies indicated that chlorosomes have a cylindrical architecture with a diameter of approximately 10 nm consisting of layered BChl molecules. Assembly structures in huge noncrystalline chlorosomes have been proposed based mainly on structure-dependent chemical shifts and a few distances acquired by solid-state NMR, but those studies did not provide a definite structure. Our approach is based on (13)C dipolar spin-diffusion solid-state NMR of uniformly (13)C-labeled chlorosomes under magic-angle spinning. Approximately 90 intermolecular C-C distances were obtained by simultaneous assignment of distance correlations and structure optimization preceded by polarization-transfer matrix analysis. It was determined from the approximately 90 intermolecular distances that BChl c molecules form piggyback-dimer-based parallel layers. This finding rules out the well-known monomer-based structures. A molecular model of the cylinder in the chlorosome was built by using this structure. It provided insights into the mechanisms of efficient light harvesting and excitation transfer to the reaction centers. This work constitutes an important advance in the structure determination of huge intact systems that cannot be crystallized.
Affiliation(s)
- Ayako Egawa
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita 565-0871, Japan
- Toshimichi Fujiwara
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita 565-0871, Japan
- Tadashi Mizoguchi
- Faculty of Science and Technology, Kwansei Gakuin University, Gakuen, Sanda 669-1337, Japan
- Yoshinori Kakitani
- Faculty of Science and Technology, Kwansei Gakuin University, Gakuen, Sanda 669-1337, Japan
- Yasushi Koyama
- Faculty of Science and Technology, Kwansei Gakuin University, Gakuen, Sanda 669-1337, Japan
- Hideo Akutsu
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita 565-0871, Japan
|
38
|
Vitek O, Bailey-Kellogg C, Craig B, Vitek J. Inferential backbone assignment for sparse data. Journal of Biomolecular NMR 2006; 35:187-208. [PMID: 16855861 DOI: 10.1007/s10858-006-9027-8] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 12/16/2005] [Accepted: 05/08/2006] [Indexed: 05/10/2023]
Abstract
This paper develops an approach to protein backbone NMR assignment that effectively assigns large proteins while using limited sets of triple-resonance experiments. Our approach handles proteins with large fractions of missing data and many ambiguous pairs of pseudoresidues, and provides a statistical assessment of confidence in global and position-specific assignments. The approach is tested on an extensive set of experimental and synthetic data of up to 723 residues, with match tolerances of up to 0.5 ppm for Calpha and Cbeta resonance types. The tests show that the approach is particularly helpful when data contain experimental noise and require large match tolerances. The keys to the approach are an empirical Bayesian probability model that rigorously accounts for uncertainty in the data at all stages in the analysis, and a hybrid stochastic tree-based search algorithm that effectively explores the large space of possible assignments.
Affiliation(s)
- Olga Vitek
- Institute for Systems Biology, 1441 North 34th Street, Seattle, WA, 98103-8904, USA.
|
39
|
Masse JE, Keller R, Pervushin K. SideLink: automated side-chain assignment of biopolymers from NMR data by relative-hypothesis-prioritization-based simulated logic. Journal of Magnetic Resonance (San Diego, Calif. : 1997) 2006; 181:45-67. [PMID: 16632394 DOI: 10.1016/j.jmr.2006.03.012] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 12/15/2005] [Revised: 03/06/2006] [Accepted: 03/10/2006] [Indexed: 05/08/2023]
Abstract
Previously we published the development of AutoLink, a program to assign the backbone resonances of macromolecules. The primary limitation of this program has proven to be its inability to directly recognize spectral data, relying on the user to define peak positions in its input. Here, we introduce a new program for the assignment of side-chain resonances. Like AutoLink, this new program, called SideLink, uses Relative Hypothesis Prioritization (RHP) to emulate "human" logic. To address the higher complexity of side-chain assignment problems, the RHP algorithm has itself been advanced, making it capable of processing almost any combinatorial logic problem. Additionally, SideLink directly examines spectral data, overcoming the need for, and the limitations of, prior data interpretation by users.
Affiliation(s)
- James E Masse
- Laboratorium fur Physikalische Chemie, ETH Zurich, CH-8093, Zurich, Switzerland
|
40
|
Wu KP, Chang JM, Chen JB, Chang CF, Wu WJ, Huang TH, Sung TY, Hsu WL. RIBRA--an error-tolerant algorithm for the NMR backbone assignment problem. J Comput Biol 2006; 13:229-44. [PMID: 16597237 DOI: 10.1089/cmb.2006.13.229] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.9] [Indexed: 11/12/2022] Open
Abstract
We develop an iterative relaxation algorithm called RIBRA for NMR protein backbone assignment. RIBRA applies nearest neighbor and weighted maximum independent set algorithms to solve the problem. To deal with noisy NMR spectral data, RIBRA is executed in an iterative fashion based on the quality of spectral peaks. We first produce spin system pairs using the spectral data without missing peaks, then the data group with one missing peak, and finally the data group with two missing peaks. We test RIBRA on two real NMR datasets, hbSBD and hbLBD, on perfect BMRB data (902 proteins), and on four synthetic BMRB datasets that simulate four kinds of errors. The accuracies of RIBRA on hbSBD and hbLBD are 91.4% and 83.6%, respectively. The average accuracy of RIBRA on the perfect BMRB datasets is 98.28%, and its accuracies on the four kinds of synthetic datasets are 98.28%, 95.61%, 98.16%, and 96.28%, respectively.
Affiliation(s)
- Kun-Pin Wu
- Institute of Information Science, Nankang, Taipei, Taiwan
|
41
|
Baran MC, Moseley HNB, Aramini JM, Bayro MJ, Monleon D, Locke JY, Montelione GT. SPINS: a laboratory information management system for organizing and archiving intermediate and final results from NMR protein structure determinations. Proteins 2006; 62:843-51. [PMID: 16395675 DOI: 10.1002/prot.20840] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/08/2022]
Abstract
Recent technological advances and experimental techniques have contributed to an increasing number and size of NMR datasets. In order to scale up productivity, laboratory information management systems for handling these extensive data need to be designed and implemented. The SPINS (Standardized ProteIn Nmr Storage) Laboratory Information Management System (LIMS) addresses these needs by providing an interface for archival of complete protein NMR structure determinations, together with functionality for depositing these data to the public BioMagResBank (BMRB). The software tracks intermediate files during each step of an NMR structure-determination process, including: data collection, data processing, resonance assignments, resonance assignment validation, structure calculation, and structure validation. The underlying SPINS data dictionary allows for the integration of various third party NMR data processing and analysis software, enabling users to launch programs they are accustomed to using for each step of the structure determination process directly out of the SPINS user interface.
Affiliation(s)
- Michael C Baran
- Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, Rutgers University, Northeast Structural Genomics Consortium, Piscataway, New Jersey 08854, USA
|
42
|
Xu Y, Wang X, Yang J, Vaynberg J, Qin J. PASA--a program for automated protein NMR backbone signal assignment by pattern-filtering approach. Journal of Biomolecular NMR 2006; 34:41-56. [PMID: 16505963 DOI: 10.1007/s10858-005-5358-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Received: 07/26/2005] [Accepted: 11/09/2005] [Indexed: 05/05/2023]
Abstract
We present a new program, PASA (Program for Automated Sequential Assignment), for assigning protein backbone resonances based on multidimensional heteronuclear NMR data. Distinct from existing programs, PASA emphasizes a per-residue-based pattern-filtering approach during the initial stage of the automated 13Calpha and/or 13Cbeta chemical shift matching. The pattern filter employs one or multiple constraints, such as 13Calpha/Cbeta chemical shift ranges for different amino acid types and side-chain spin systems, which help to rule out, in a stepwise fashion, improbable assignments resulting from resonance degeneracy or missing signals. This stepwise filtering approach substantially reduces early false linkage problems, which often propagate, amplify, and ultimately cause complications or combinatorial explosion in the automation process. Our program (http://www.lerner.ccf.org/moleccard/qin/) was tested on four representative small- to large-sized proteins with various degrees of resonance degeneracy and missing signals, and we show that PASA efficiently and rapidly achieved assignments that are fully consistent with those obtained by laborious manual protocols. The results demonstrate that PASA may be a valuable tool for NMR-based structural analyses, genomics, and proteomics.
Affiliation(s)
- Yizhuang Xu
- Structural Biology Program, NB20, The Lerner Research Institute, The Cleveland Clinic Foundation, 9500 Euclid Ave., Cleveland, OH 44195, USA
|
43
|
Wang J, Wang T, Zuiderweg ERP, Crippen GM. CASA: an efficient automated assignment of protein mainchain NMR data using an ordered tree search algorithm. Journal of Biomolecular NMR 2005; 33:261-79. [PMID: 16341754 DOI: 10.1007/s10858-005-4079-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 09/05/2005] [Accepted: 10/05/2005] [Indexed: 05/05/2023]
Abstract
Rapid analysis of protein structure, interaction, and dynamics requires fast and automated assignments of 3D protein backbone triple-resonance NMR spectra. We introduce a new depth-first ordered tree search method of automated assignment, CASA, which uses hand-edited peak-pick lists of a flexible number of triple resonance experiments. The computer program was tested on 13 artificially simulated peak lists for proteins up to 723 residues, as well as on the experimental data for four proteins. Under reasonable tolerances, it generated assignments that correspond to the ones reported in the literature within a few minutes of CPU time. The program was also tested on the proteins analyzed by other methods, with both simulated and experimental peaklists, and it could generate good assignments in all relevant cases. The robustness was further tested under various situations.
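Illustrative note: a depth-first ordered tree search of the general kind named in this abstract can be sketched as below. This is a toy sketch under stated assumptions, not the CASA implementation: the candidate lists (already ordered best-first), the `compatible` predicate, and all names are hypothetical.

```python
from typing import Callable, Hashable, List, Optional, Sequence


def dfs_assign(
    candidates: Sequence[Sequence[Hashable]],
    compatible: Callable[[Hashable, Hashable], bool],
) -> Optional[List[Hashable]]:
    """Assign one spin system to each sequence position by depth-first search.

    candidates[p] lists the spin systems allowed at position p, best first,
    so the first complete assignment found follows the preferred ordering.
    compatible(a, b) says whether spin system a may directly precede b.
    Returns the assignment as a list, or None if no consistent one exists.
    """
    path: List[Hashable] = []
    used = set()

    def extend(pos: int) -> bool:
        if pos == len(candidates):
            return True  # every position has been assigned
        for s in candidates[pos]:
            if s in used:
                continue  # each spin system is used at most once
            if path and not compatible(path[-1], s):
                continue  # violates the sequential link to the previous residue
            path.append(s)
            used.add(s)
            if extend(pos + 1):
                return True
            path.pop()  # backtrack and try the next-best candidate
            used.remove(s)
        return False

    return list(path) if extend(0) else None


# Made-up example: "a" looks best at position 0 but leads to a dead end.
links = {("a", "b"), ("b", "a"), ("a", "c")}
result = dfs_assign(
    [["a", "b"], ["b", "a"], ["c"]],
    lambda x, y: (x, y) in links,
)
```

Here the search first tries `a, b`, finds that `b` cannot precede `c`, backtracks, and settles on `b, a, c`.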
Affiliation(s)
- Jianyong Wang
- Department of Physics, University of Michigan, Ann Arbor, MI 48109-1120, USA
|
44
|
Kamisetty H, Bailey-Kellogg C, Pandurangan G. An efficient randomized algorithm for contact-based NMR backbone resonance assignment. Bioinformatics 2005; 22:172-80. [PMID: 16287932 DOI: 10.1093/bioinformatics/bti786] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 11/14/2022] Open
Abstract
MOTIVATION: Backbone resonance assignment is a critical bottleneck in studies of protein structure, dynamics and interactions by nuclear magnetic resonance (NMR) spectroscopy. A minimalist approach to assignment, which we call 'contact-based', seeks to dramatically reduce experimental time and expense by replacing the standard suite of through-bond experiments with the through-space (nuclear Overhauser enhancement spectroscopy, NOESY) experiment. In the contact-based approach, spectral data are represented in a graph with vertices for putative residues (of unknown relation to the primary sequence) and edges for hypothesized NOESY interactions, such that observed spectral peaks could be explained if the residues were 'close enough'. Due to experimental ambiguity, several incorrect edges can be hypothesized for each spectral peak. An assignment is derived by identifying consistent patterns of edges (e.g. for alpha-helices and beta-sheets) within a graph and by mapping the vertices to the primary sequence. The key algorithmic challenge is to be able to uncover these patterns even when they are obscured by significant noise.
RESULTS: This paper develops, analyzes and applies a novel algorithm for the identification of polytopes representing consistent patterns of edges in a corrupted NOESY graph. Our randomized algorithm aggregates simplices into polytopes and fixes inconsistencies with simple local modifications, called rotations, that maintain most of the structure already uncovered. In characterizing the effects of experimental noise, we employ an NMR-specific random graph model in proving that our algorithm gives optimal performance in expected polynomial time, even when the input graph is significantly corrupted. We confirm this analysis in simulation studies with graphs corrupted by up to 500% noise. Finally, we demonstrate the practical application of the algorithm on several experimental beta-sheet datasets. Our approach is able to eliminate a large majority of noise edges and to uncover large consistent sets of interactions.
AVAILABILITY: Our algorithm has been implemented in platform-independent Python code. The software can be freely obtained for academic use by request from the authors.
|
45
|
Bailey-Kellogg C, Chainraj S, Pandurangan G. A random graph approach to NMR sequential assignment. J Comput Biol 2005; 12:569-83. [PMID: 16108704 DOI: 10.1089/cmb.2005.12.569] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Indexed: 11/13/2022] Open
Abstract
Nuclear magnetic resonance (NMR) spectroscopy allows scientists to study protein structure, dynamics and interactions in solution. A necessary first step for such applications is determining the resonance assignment, mapping spectral data to atoms and residues in the primary sequence. Automated resonance assignment algorithms rely on information regarding connectivity (e.g., through-bond atomic interactions) and amino acid type, typically using the former to determine strings of connected residues and the latter to map those strings to positions in the primary sequence. Significant ambiguity exists in both connectivity and amino acid type information. This paper focuses on the information content available in connectivity alone and develops a novel random-graph theoretic framework and algorithm for connectivity-driven NMR sequential assignment. Our random graph model captures the structure of chemical shift degeneracy, a key source of connectivity ambiguity. We then give a simple and natural randomized algorithm for finding optimal assignments as sets of connected fragments in NMR graphs. The algorithm naturally and efficiently reuses substrings while exploring connectivity choices; it overcomes local ambiguity by enforcing global consistency of all choices. By analyzing our algorithm under our random graph model, we show that it can provably tolerate relatively large ambiguity while still giving expected optimal performance in polynomial time. We present results from practical applications of the algorithm to experimental datasets from a variety of proteins and experimental set-ups. We demonstrate that our approach is able to overcome significant noise and local ambiguity in identifying significant fragments of sequential assignments.
Affiliation(s)
- Chris Bailey-Kellogg
- Department of Computer Science, Dartmouth College, 6211 Sudikoff Laboratory, Hanover, NH 03755, USA.
|
46
|
Fossi M, Castellani F, Nilges M, Oschkinat H, van Rossum BJ. SOLARIA: A Protocol for Automated Cross-Peak Assignment and Structure Calculation for Solid-State Magic-Angle Spinning NMR Spectroscopy. Angew Chem Int Ed Engl 2005; 44:6151-4. [PMID: 16175529 DOI: 10.1002/anie.200501884] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 11/11/2022]
Affiliation(s)
- Michele Fossi
- Forschungsinstitut für Molekulare Pharmakologie, Robert-Rössle-Strasse 10, Berlin, Germany
|
47
|
Fossi M, Castellani F, Nilges M, Oschkinat H, van Rossum BJ. SOLARIA: A Protocol for Automated Cross-Peak Assignment and Structure Calculation for Solid-State Magic-Angle Spinning NMR Spectroscopy. Angew Chem Int Ed Engl 2005. [DOI: 10.1002/ange.200501884] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/11/2022]
|
48
|
Lin HN, Wu KP, Chang JM, Sung TY, Hsu WL. GANA--a genetic algorithm for NMR backbone resonance assignment. Nucleic Acids Res 2005; 33:4593-601. [PMID: 16093550 PMCID: PMC1184223 DOI: 10.1093/nar/gki768] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.0] [Received: 04/12/2005] [Revised: 07/01/2005] [Accepted: 07/27/2005] [Indexed: 11/13/2022] Open
Abstract
NMR data from different experiments often contain errors; thus, automated backbone resonance assignment is a very challenging issue. In this paper, we present a method called GANA that uses a genetic algorithm to automatically perform backbone resonance assignment with a high degree of precision and recall. Precision is the number of correctly assigned residues divided by the number of assigned residues, and recall is the number of correctly assigned residues divided by the number of residues with known human curated answers. GANA takes spin systems as input data and uses two data structures, candidate lists and adjacency lists, to assign the spin systems to each amino acid of a target protein. Using GANA, almost all spin systems can be mapped correctly onto a target protein, even if the data are noisy. We use the BioMagResBank (BMRB) dataset (901 proteins) to test the performance of GANA. To evaluate the robustness of GANA, we generate four additional datasets from the BMRB dataset to simulate data errors of false positives, false negatives and linking errors. We also use a combination of these three error types to examine the fault tolerance of our method. The average precision rates of GANA on BMRB and the four simulated test cases are 99.61, 99.55, 99.34, 99.35 and 98.60%, respectively. The average recall rates of GANA on BMRB and the four simulated test cases are 99.26, 99.19, 98.85, 98.87 and 97.78%, respectively. We also test GANA on two real wet-lab datasets, hbSBD and hbLBD. The precision and recall rates of GANA on hbSBD are 95.12 and 92.86%, respectively, and those of hbLBD are 100 and 97.40%, respectively.
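Illustrative note: the precision and recall definitions given in this abstract translate directly into a short calculation. The sketch below assumes assignments are stored as dictionaries mapping residue positions to spin-system labels; that representation, the function name, and the toy data are assumptions, not part of GANA.

```python
from typing import Dict, Hashable, Tuple


def precision_recall(
    assigned: Dict[Hashable, Hashable],
    reference: Dict[Hashable, Hashable],
) -> Tuple[float, float]:
    """Score an automated assignment against human-curated answers.

    precision = correctly assigned residues / assigned residues
    recall    = correctly assigned residues / residues with known answers
    """
    correct = sum(
        1
        for residue, label in assigned.items()
        if residue in reference and reference[residue] == label
    )
    precision = correct / len(assigned) if assigned else 0.0
    recall = correct / len(reference) if reference else 0.0
    return precision, recall


# Toy data: 5 residues have curated answers; 4 were assigned, 3 correctly.
reference = {1: "s1", 2: "s2", 3: "s3", 4: "s4", 5: "s5"}
assigned = {1: "s1", 2: "s2", 3: "s4", 4: "s4"}

precision, recall = precision_recall(assigned, reference)
```

In this toy case precision is 3/4 and recall is 3/5: leaving residue 5 unassigned lowers recall but not precision.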
Affiliation(s)
- Hsin-Nan Lin
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Kun-Pin Wu
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Jia-Ming Chang
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Ting-Yi Sung
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
- Wen-Lian Hsu
- Institute of Information Science, Academia Sinica, Taipei, Taiwan
|
49
|
Eghbalnia HR, Bahrami A, Wang L, Assadi A, Markley JL. Probabilistic Identification of Spin Systems and their Assignments including Coil-Helix Inference as Output (PISTACHIO). Journal of Biomolecular NMR 2005; 32:219-33. [PMID: 16132822 DOI: 10.1007/s10858-005-7944-6] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.4] [Received: 03/09/2005] [Accepted: 05/12/2005] [Indexed: 05/04/2023]
Abstract
We present a novel automated strategy (PISTACHIO) for the probabilistic assignment of backbone and sidechain chemical shifts in proteins. The algorithm uses peak lists derived from various NMR experiments as input and provides as output ranked lists of assignments for all signals recognized in the input data as constituting spin systems. PISTACHIO was evaluated by comparing its performance with raw peak-picked data from 15 proteins ranging from 54 to 300 residues; the results were compared with those achieved by experts analyzing the same datasets by hand. As scored against the best available independent assignments for these proteins, the first-ranked PISTACHIO assignments were 80-100% correct for backbone signals and 75-95% correct for sidechain signals. The independent assignments benefited, in a number of cases, from structural data (e.g. from NOESY spectra) that were unavailable to PISTACHIO. Any number of datasets in any combination can serve as input. Thus PISTACHIO can be used as datasets are collected to ascertain the current extent of secure assignments, to identify residues with low assignment probability, and to suggest the types of additional data needed to remove ambiguities. The current implementation of PISTACHIO, which is available from a server on the Internet, supports input data from 15 standard double- and triple-resonance experiments. The software can readily accommodate additional types of experiments, including data from selectively labeled samples. The assignment probabilities can be carried forward and refined in subsequent steps leading to a structure. The performance of PISTACHIO showed no direct dependence on protein size, but correlated instead with data quality (completeness and signal-to-noise). PISTACHIO represents one component of a comprehensive probabilistic approach we are developing for the collection and analysis of protein NMR data.
Affiliation(s)
- Hamid R Eghbalnia
- Biochemistry Department, National Magnetic Resonance Facility at Madison, 433, Babcock Drive, Madison, WI 53706, USA.
|
50
|
Liu HL, Hsu JP. Recent developments in structural proteomics for protein structure determination. Proteomics 2005; 5:2056-68. [PMID: 15846841 DOI: 10.1002/pmic.200401104] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.7] [Indexed: 11/10/2022]
Abstract
The major challenges in structural proteomics include identifying all the proteins on the genome-wide scale, determining their structure-function relationships, and outlining the precise three-dimensional structures of the proteins. Protein structures are typically determined by experimental approaches such as X-ray crystallography or nuclear magnetic resonance (NMR) spectroscopy. However, the three-dimensional structural knowledge obtainable by these techniques is still limited. Thus, computational methods such as comparative and de novo approaches and molecular dynamics simulations are intensively used as alternative tools to predict the three-dimensional structures and dynamic behavior of proteins. This review summarizes recent developments in structural proteomics for protein structure determination, including instrumental methods such as X-ray crystallography and NMR spectroscopy, and computational methods such as comparative and de novo structure prediction and molecular dynamics simulations.
Affiliation(s)
- Hsuan-Liang Liu
- Department of Chemical Engineering, National Taipei University of Technology, Taiwan.
|