1
|
Li EH, Spaman LE, Tejero R, Janet Huang Y, Ramelot TA, Fraga KJ, Prestegard JH, Kennedy MA, Montelione GT. Blind assessment of monomeric AlphaFold2 protein structure models with experimental NMR data. J Magn Reson 2023; 352:107481. [PMID: 37257257 PMCID: PMC10659763 DOI: 10.1016/j.jmr.2023.107481] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [What about the content of this article? (0)] [Affiliation(s)] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 01/22/2023] [Revised: 05/08/2023] [Accepted: 05/15/2023] [Indexed: 06/02/2023]
Abstract
Recent advances in molecular modeling of protein structures are changing the field of structural biology. AlphaFold-2 (AF2), an AI system developed by DeepMind, Inc., utilizes attention-based deep learning to predict models of protein structures with high accuracy relative to structures determined by X-ray crystallography and cryo-electron microscopy (cryoEM). Comparing AF2 models to structures determined using solution NMR data, both high similarities and distinct differences have been observed. Since AF2 was trained on X-ray crystal and cryoEM structures, we assessed how accurately AF2 can model small, monomeric, solution protein NMR structures which (i) were not used in the AF2 training data set, and (ii) did not have homologous structures in the Protein Data Bank at the time of AF2 training. We identified nine open-source protein NMR data sets for such "blind" targets, including chemical shift, raw NMR FID data, NOESY peak lists, and (for 1 case) 15N-1H residual dipolar coupling data. For these nine small (70-108 residues) monomeric proteins, we generated AF2 prediction models and assessed how well these models fit to these experimental NMR data, using several well-established NMR structure validation tools. In most of these cases, the AF2 models fit the NMR data nearly as well, or sometimes better than, the corresponding NMR structure models previously deposited in the Protein Data Bank. These results provide benchmark NMR data for assessing new NMR data analysis and protein structure prediction methods. They also document the potential for using AF2 as a guiding tool in protein NMR data analysis, and more generally for hypothesis generation in structural biology research.
Collapse
Affiliation(s)
- Ethan H Li
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA
| | - Laura E Spaman
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Roberto Tejero
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Yuanpeng Janet Huang
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Theresa A Ramelot
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - Keith J Fraga
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| | - James H Prestegard
- Complex Carbohydrate Research Center, University of Georgia, Athens, GA 30602, USA.
| | - Michael A Kennedy
- Department of Chemistry and Biochemistry, Miami University, Oxford, OH 45056, USA.
| | - Gaetano T Montelione
- Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180, USA.
| |
Collapse
|