1
Jang R, Wang Y, Xue Z, Zhang Y. NMR data-driven structure determination using NMR-I-TASSER in the CASD-NMR experiment. J Biomol NMR 2015; 62:511-525. [PMID: 25737244 PMCID: PMC4560687 DOI: 10.1007/s10858-015-9914-y] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 12/28/2014] [Accepted: 02/21/2015] [Indexed: 05/30/2023]
Abstract
NMR-I-TASSER, an adaptation of the I-TASSER algorithm that incorporates NMR data for protein structure determination, recently joined the second round of the CASD-NMR experiment. Unlike many molecular dynamics-based methods, NMR-I-TASSER takes a molecular replacement-like approach to the problem by first threading the target through the PDB to identify structural templates, which are then used for iterative NOE assignments and fragment structure assembly refinements. The employment of multiple templates allows NMR-I-TASSER to sample different topologies, while convergence to a single structure is not required. Retroactive and blind tests of the CASD-NMR targets from Rounds 1 and 2 demonstrate that even without using NOE peak lists, I-TASSER can generate the correct structure topology, with 15 of 20 targets having a TM-score above 0.5. With the addition of NOE-based distance restraints, NMR-I-TASSER significantly improved the I-TASSER models, with all models having a TM-score above 0.5. The average RMSD was reduced from 5.29 to 2.14 Å in Round 1 and from 3.18 to 1.71 Å in Round 2. There is no obvious difference in the modeling results between raw and refined peak lists, indicating robustness of the pipeline to NOE assignment errors. Overall, despite the low-resolution modeling, the current NMR-I-TASSER pipeline provides a coarse-grained structure folding approach complementary to traditional molecular dynamics simulations, which can quickly produce near-native frameworks for atomic-level structural refinement.
Affiliation(s)
- Richard Jang
  - School of Software Engineering, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Yan Wang
  - School of Life Science and Technology, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Zhidong Xue
  - School of Software Engineering, Huazhong University of Science and Technology, Wuhan, 430074, Hubei, China
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
- Yang Zhang
  - Department of Computational Medicine and Bioinformatics, University of Michigan, 100 Washtenaw Avenue, Ann Arbor, MI, 48109-2218, USA
  - Department of Biological Chemistry, University of Michigan, Ann Arbor, MI, 48109, USA
2
Dashti H, Lee W, Tonelli M, Cornilescu CC, Cornilescu G, Assadi-Porter FM, Westler WM, Eghbalnia HR, Markley JL. NMRFAM-SDF: a protein structure determination framework. J Biomol NMR 2015; 62:481-495. [PMID: 25900069 PMCID: PMC4569665 DOI: 10.1007/s10858-015-9933-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/16/2015] [Accepted: 04/15/2015] [Indexed: 05/21/2023]
Abstract
The computationally demanding nature of automated NMR structure determination necessitates a delicate balancing of factors that include the time complexity of data collection, the computational complexity of chemical shift assignments, and the selection of proper optimization steps. During the past two decades, the computational and algorithmic aspects of several discrete steps of the process have been addressed. Although no single comprehensive solution has emerged, the incorporation of a validation protocol has gained recognition as a necessary step for a robust automated approach. The need for validation becomes even more pronounced in cases of proteins with higher structural complexity, where potentially larger errors generated at each step can propagate and accumulate in the process of structure calculation, thereby significantly degrading the efficacy of any software framework. This paper introduces a complete framework for protein structure determination with NMR, from data acquisition to structure determination. The aim is twofold: to simplify the structure determination process for non-NMR experts whenever feasible, while maintaining flexibility by providing a set of modules that validate each step, and to enable the assessment of error propagation. This framework, called NMRFAM-SDF (NMRFAM-Structure Determination Framework), and its various components are available for download from the NMRFAM website (http://nmrfam.wisc.edu/software.htm).
Affiliation(s)
- Hesam Dashti
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Woonghee Lee
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Marco Tonelli
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Claudia C Cornilescu
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Gabriel Cornilescu
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Fariba M Assadi-Porter
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- William M Westler
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- Hamid R Eghbalnia
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
- John L Markley
  - National Magnetic Resonance Facility at Madison, Biochemistry Department, University of Wisconsin-Madison, 433 Babcock Drive, Madison, WI, USA
3
Rosato A, Vranken W, Fogh RH, Ragan TJ, Tejero R, Pederson K, Lee HW, Prestegard JH, Yee A, Wu B, Lemak A, Houliston S, Arrowsmith CH, Kennedy M, Acton TB, Xiao R, Liu G, Montelione GT, Vuister GW. The second round of Critical Assessment of Automated Structure Determination of Proteins by NMR: CASD-NMR-2013. J Biomol NMR 2015; 62:413-424. [PMID: 26071966 PMCID: PMC4569658 DOI: 10.1007/s10858-015-9953-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/19/2015] [Accepted: 05/28/2015] [Indexed: 05/21/2023]
Abstract
The second round of the community-wide initiative Critical Assessment of automated Structure Determination of Proteins by NMR (CASD-NMR-2013) comprised ten blind target datasets, consisting of unprocessed spectral data, assigned chemical shift lists, and unassigned NOESY peak and RDC lists, that were made available in both curated (i.e. manually refined) and un-curated (i.e. automatically generated) form. Ten structure calculation programs, using fully automated protocols only, generated a total of 164 three-dimensional structures (entries) for the ten targets, sometimes using both curated and un-curated lists to generate multiple entries for a single target. The accuracy of the entries could be established by comparing them to the corresponding manually solved structure of each target, which was not available at the time the data were provided. Across the entire data set, 71% of all entries submitted achieved an accuracy relative to the reference NMR structure better than 1.5 Å. Methods based on NOESY peak lists achieved even better results, with up to 100% of the entries within the 1.5 Å threshold for some programs. However, some methods did not converge for some targets using un-curated NOESY peak lists. Over 90% of the entries achieved an accuracy better than the more relaxed threshold of 2.5 Å that was used in the previous CASD-NMR-2010 round. Comparisons between entries generated with un-curated versus curated peaks show only marginal improvements for the latter in those cases where both calculations converged.
Affiliation(s)
- Antonio Rosato
  - Department of Chemistry and Magnetic Resonance Center, University of Florence, 50019, Sesto Fiorentino, Italy
- Wim Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, 1050, Brussels, Belgium
  - (IB)2 Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Rasmus H Fogh
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK
- Timothy J Ragan
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK
- Roberto Tejero
  - Departamento de Química Física, Universidad de Valencia, Avda. Dr. Moliner 50, 46100, Burjassot (Valencia), Spain
- Kari Pederson
  - Complex Carbohydrate Research Center and Northeast Structural Genomics Consortium, University of Georgia, Athens, GA, 30602, USA
- Hsiau-Wei Lee
  - Complex Carbohydrate Research Center and Northeast Structural Genomics Consortium, University of Georgia, Athens, GA, 30602, USA
- James H Prestegard
  - Complex Carbohydrate Research Center and Northeast Structural Genomics Consortium, University of Georgia, Athens, GA, 30602, USA
- Adelinda Yee
  - Department of Medical Biophysics, Cancer Genomics and Proteomics, Ontario Cancer Institute, Northeast Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Bin Wu
  - Department of Medical Biophysics, Cancer Genomics and Proteomics, Ontario Cancer Institute, Northeast Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Alexander Lemak
  - Department of Medical Biophysics, Cancer Genomics and Proteomics, Ontario Cancer Institute, Northeast Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Scott Houliston
  - Department of Medical Biophysics, Cancer Genomics and Proteomics, Ontario Cancer Institute, Northeast Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Cheryl H Arrowsmith
  - Department of Medical Biophysics, Cancer Genomics and Proteomics, Ontario Cancer Institute, Northeast Structural Genomics Consortium, University of Toronto, Toronto, ON, M5G 1L7, Canada
- Michael Kennedy
  - Department of Chemistry and Biochemistry, Northeast Structural Genomics Consortium, Miami University, Oxford, OH, 45056, USA
- Thomas B Acton
  - Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Rong Xiao
  - Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Gaohua Liu
  - Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Gaetano T Montelione
  - Department of Molecular Biology and Biochemistry, Center for Advanced Biotechnology and Medicine, Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Department of Biochemistry and Molecular Biology, Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Geerten W Vuister
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK
4
Abstract
The automated assignment of NOESY cross peaks has become a fundamental technique for NMR protein structure analysis. A widely used algorithm for this purpose is implemented in the program CYANA. It has been used for a large number of structure determinations of proteins in solution but has so far not been described in full detail. In this paper we present a complete description of the CYANA implementation of automated NOESY assignment, which differs extensively from its predecessor CANDID by the use of a consistent probabilistic treatment, and we discuss its performance in the second round of the critical assessment of structure determination by NMR.
Affiliation(s)
- Peter Güntert
  - Center for Biomolecular Magnetic Resonance, Institute of Biophysical Chemistry, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
  - Laboratory of Physical Chemistry, ETH Zürich, Zurich, Switzerland
  - Graduate School of Science, Tokyo Metropolitan University, Hachioji, Tokyo, Japan
- Lena Buchner
  - Center for Biomolecular Magnetic Resonance, Institute of Biophysical Chemistry, Goethe University Frankfurt am Main, Max-von-Laue-Str. 9, 60438, Frankfurt am Main, Germany
5
Guerry P, Duong VD, Herrmann T. CASD-NMR 2: robust and accurate unsupervised analysis of raw NOESY spectra and protein structure determination with UNIO. J Biomol NMR 2015; 62:473-480. [PMID: 25917899 DOI: 10.1007/s10858-015-9934-7] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 02/17/2015] [Accepted: 04/18/2015] [Indexed: 06/04/2023]
Abstract
UNIO is a comprehensive software suite for protein NMR structure determination that enables full automation of all NMR data analysis steps involved, including signal identification in NMR spectra, sequence-specific backbone and side-chain resonance assignment, NOE assignment and structure calculation. Within the framework of the second round of the community-wide stringent blind NMR structure determination challenge (CASD-NMR 2), we participated in two categories of CASD-NMR 2, namely using either raw NMR spectra or unrefined NOE peak lists as input. A total of 15 resulting NMR structure bundles were submitted for 9 out of 10 blind protein targets. All submitted UNIO structures accurately coincided with the corresponding blind targets, as documented by an average backbone root mean-square deviation to the reference proteins of only 1.2 Å. Also, the precision of the UNIO structure bundles was virtually identical to that of the ensemble of reference structures. By assessing the quality of all UNIO structures submitted to the two categories, we find throughout that only the UNIO-ATNOS/CANDID approach using raw NMR spectra consistently yielded structure bundles of high quality for direct deposition in the Protein Data Bank. In conclusion, the results obtained in CASD-NMR 2 are another vital proof of robust, accurate and unsupervised NMR data analysis by UNIO for real-world applications.
Affiliation(s)
- Paul Guerry
  - Institut des Sciences Analytiques, Centre de RMN à très Hauts Champs, Université de Lyon (UMR 5280 CNRS, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1), 5 rue de la Doua, 69100, Villeurbanne, France
- Viet Dung Duong
  - Institut des Sciences Analytiques, Centre de RMN à très Hauts Champs, Université de Lyon (UMR 5280 CNRS, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1), 5 rue de la Doua, 69100, Villeurbanne, France
- Torsten Herrmann
  - Institut des Sciences Analytiques, Centre de RMN à très Hauts Champs, Université de Lyon (UMR 5280 CNRS, Ecole Normale Supérieure de Lyon, Université Claude Bernard Lyon 1), 5 rue de la Doua, 69100, Villeurbanne, France
6
Ragan TJ, Fogh RH, Tejero R, Vranken W, Montelione GT, Rosato A, Vuister GW. Analysis of the structural quality of the CASD-NMR 2013 entries. J Biomol NMR 2015; 62:527-540. [PMID: 26032236 PMCID: PMC4569653 DOI: 10.1007/s10858-015-9949-0] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 04/17/2015] [Accepted: 05/20/2015] [Indexed: 05/30/2023]
Abstract
We performed a comprehensive structure validation of both automated and manually generated structures of the 10 targets of the CASD-NMR-2013 effort. We established that automated structure determination protocols are capable of reliably producing structures of comparable accuracy and quality to those generated by a skilled researcher, at least for small, single domain proteins such as the ten targets tested. The most robust results appear to be obtained when NOESY peak lists are used either as the primary input data or to augment chemical shift data without the need to manually filter such lists. A detailed analysis of the long-range NOE restraints generated by the different programs from the same data showed a surprisingly low degree of overlap. Additionally, we found that there was no significant correlation between the extent of the NOE restraint overlap and the accuracy of the structure. This result was surprising given the importance of NOE data in producing good quality structures. We suggest that this could be explained by the information redundancy present in NOEs between atoms contained within a fixed covalent network.
Affiliation(s)
- Timothy J Ragan
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK
- Rasmus H Fogh
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK
- Roberto Tejero
  - Departamento de Química Física, Universidad de Valencia, Avda. Dr. Moliner 50, 46100, Burjassot (Valencia), Spain
- Wim Vranken
  - Structural Biology Brussels, Vrije Universiteit Brussel, Pleinlaan 2, Brussels, Belgium
  - (IB)2 Interuniversity Institute of Bioinformatics in Brussels, ULB-VUB, Triomflaan, 1050, Brussels, Belgium
- Gaetano T Montelione
  - Center for Advanced Biotechnology and Medicine, Department of Molecular Biology and Biochemistry, and Northeast Structural Genomics Consortium, Rutgers, The State University of New Jersey, Piscataway, NJ, 08854, USA
  - Robert Wood Johnson Medical School, Piscataway, NJ, 08854, USA
- Antonio Rosato
  - Magnetic Resonance Center, Department of Chemistry, University of Florence, 50019, Sesto Fiorentino, Italy
- Geerten W Vuister
  - Department of Biochemistry, School of Biological Sciences, University of Leicester, Henry Wellcome Building, Lancaster Road, Leicester, LE1 9HN, UK