1
Luo Y, Zheng X, Qiu M, Gou Y, Yang Z, Qu X, Chen Z, Lin Y. Deep learning and its applications in nuclear magnetic resonance spectroscopy. PROGRESS IN NUCLEAR MAGNETIC RESONANCE SPECTROSCOPY 2025; 146-147:101556. [PMID: 40306798 DOI: 10.1016/j.pnmrs.2024.101556] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/19/2024] [Revised: 12/26/2024] [Accepted: 12/30/2024] [Indexed: 05/02/2025]
Abstract
Nuclear Magnetic Resonance (NMR), as an advanced technology, has widespread applications in various fields like chemistry, biology, and medicine. However, issues such as long acquisition times for multidimensional spectra and low sensitivity limit the broader application of NMR. Traditional algorithms aim to address these issues but have limitations in speed and accuracy. Deep Learning (DL), a branch of Artificial Intelligence (AI) technology, has shown remarkable success in many fields including NMR. This paper presents an overview of the basics of DL and current applications of DL in NMR, highlights existing challenges, and suggests potential directions for improvement.
Affiliation(s)
- Yao Luo
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Xiaoxu Zheng
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Mengjie Qiu
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Yaoping Gou
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Zhengxian Yang
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Xiaobo Qu
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Zhong Chen
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China
- Yanqin Lin
- Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, Department of Electronic Science, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University, Xiamen 361005, China.
2
Zhozhikov L, Vasilev F, Maksimova N. Protein-Variant-Phenotype Study of NBAS Using AlphaFold in the Aspect of SOPH Syndrome. Proteins 2025; 93:871-884. [PMID: 39641476 DOI: 10.1002/prot.26764] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/13/2024] [Revised: 10/04/2024] [Accepted: 11/01/2024] [Indexed: 12/07/2024]
Abstract
NBAS gene variants cause phenotypically distinct and nonoverlapping conditions, SOPH syndrome and ILFS2. NBAS is a so-called "moonlighting" protein responsible for retrograde membrane trafficking and nonsense-mediated decay. However, its three-dimensional model and the nature of its possible interactions with other proteins have remained elusive. Here, we used AlphaFold to predict protein-protein interaction (PPI) sites and mapped them to NBAS pathogenic variants. We repeated in silico milestone studies of the NBAS protein to explain the multisystem phenotype of its variants, with particular emphasis on the SOPH variant (p.R1914H). We revealed the putative binding sites for the main interaction partners of NBAS and assessed the implications of these binding sites for the subdomain architecture of the NBAS protein. Using AlphaFold, we disclosed the far-reaching impact of NBAS variants on the development of each phenotypic trait in patients with NBAS-related pathologies.
Affiliation(s)
- Leonid Zhozhikov
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
- Filipp Vasilev
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
- Nadezhda Maksimova
- Research Laboratory of "Molecular Medicine and Human Genetics", Institute of Medicine, Ammosov North-Eastern Federal University, Yakutsk, Republic of Sakha (Yakutia), Russia
3
Szczepski K, Jaremko Ł. AlphaFold and what is next: bridging functional, systems and structural biology. Expert Rev Proteomics 2025; 22:45-58. [PMID: 39824781 DOI: 10.1080/14789450.2025.2456046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/22/2024] [Revised: 01/13/2025] [Accepted: 01/16/2025] [Indexed: 01/20/2025]
Abstract
INTRODUCTION: DeepMind's AlphaFold (AF) has revolutionized biomedical and bioscience research by providing both experts and non-experts with an invaluable tool for predicting protein structures. However, while AF is highly effective for predicting structures of rigid and globular proteins, it is not able to fully capture the dynamics, conformational variability, and interactions of proteins with ligands and other biomacromolecules. AREAS COVERED: In this review, we present a comprehensive overview of the latest advancements in 3D model predictions for biomacromolecules using AF. We also provide a detailed analysis of its strengths and limitations, and explore more recent iterations, modifications, and practical applications of this strategy. Moreover, we map the path forward for expanding the landscape of AF toward predicting structures of every protein and peptide, and their interactions in the proteome in the most physiologically relevant form. This discussion is based on an extensive literature search performed using PubMed and Google Scholar. EXPERT OPINION: While significant progress has been made to enhance AF's modeling capabilities, we argue that a combined approach integrating various in silico and in vitro methods will be most beneficial for the future of structural biology, bridging the gaps between static and dynamic features of proteins and their functions.
Affiliation(s)
- Kacper Szczepski
- Biological and Environmental Science & Engineering (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
- Łukasz Jaremko
- Biological and Environmental Science & Engineering (BESE), King Abdullah University of Science and Technology (KAUST), Thuwal, Saudi Arabia
4
Yang Z, Cai W, Zhu W, Zheng X, Shi X, Qiu M, Chen Z, Liu M, Lin Y. Deep learning enabled ultra-high quality NMR chemical shift resolved spectra. Chem Sci 2024; 15:20039-20044. [PMID: 39568866 PMCID: PMC11575604 DOI: 10.1039/d4sc04742g] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/17/2024] [Accepted: 11/09/2024] [Indexed: 11/22/2024] [Open Access]
Abstract
High quality chemical shift resolved spectra have long been pursued in nuclear magnetic resonance (NMR). In order to obtain chemical shift information with high resolution and sensitivity, a neural network named spin echo to obtain chemical shifts network (SE2CSNet) is developed to process the NMR data acquired by the spin echo pulse sequence. Through detecting the change of phase in the spin echo spectra, SE2CSNet can accurately detect the chemical shift position of spectral signals. The results show that the network can discern the chemical shift even when spectral signals overlap, but without strong coupling and chunking artifacts. In addition, this method can process the sample with low S/N (signal to noise ratio), and recover weak signals even hidden in noise, leading to ultra-high quality chemical shift resolved spectra. It is envisioned that the proposed methodology will find wide applications in many fields.
Affiliation(s)
- Zhengxian Yang
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Weigang Cai
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Wen Zhu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Xiaoxu Zheng
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Xiaoqi Shi
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Mengjie Qiu
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Zhong Chen
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
- Maili Liu
- State Key Laboratory of Magnetic Resonance and Atomic and Molecular Physics, National Center for Magnetic Resonance in Wuhan, Wuhan Institute of Physics and Mathematics, Innovation Academy for Precision Measurement Science and Technology, Chinese Academy of Sciences Wuhan 430071 China
- University of Chinese Academy of Sciences Beijing 100049 China
- Yanqin Lin
- Department of Electronic Science, Fujian Provincial Key Laboratory of Plasma and Magnetic Resonance, State Key Laboratory of Physical Chemistry of Solid Surfaces, Xiamen University Xiamen Fujian 361005 China
5
Ptaszek AL, Li J, Konrat R, Platzer G, Head-Gordon T. UCBShift 2.0: Bridging the Gap from Backbone to Side Chain Protein Chemical Shift Prediction for Protein Structures. J Am Chem Soc 2024; 146:31733-31745. [PMID: 39531038 PMCID: PMC11784523 DOI: 10.1021/jacs.4c10474] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/16/2024]
Abstract
Chemical shifts are a readily obtainable NMR observable that can be measured with high accuracy, and because they are sensitive to conformational averages and the local molecular environment, they yield detailed information about protein structure in solution. To predict chemical shifts of protein structures, we introduced the UCBShift method that uniquely fuses a transfer prediction module, which employs sequence and structure alignments to select reference chemical shifts from an experimental database, with a machine learning model that uses carefully curated and physics-inspired features derived from X-ray crystal structures to predict backbone chemical shifts for proteins. In this work, we extend the UCBShift 1.0 method to side chain chemical shift prediction to perform whole protein analysis, which, when validated against well-defined test data shows higher accuracy and better reliability compared to the popular SHIFTX2 method. With the greater abundance of cleaned protein shift-structure data and the modularity of the general UCBShift algorithms, users can gain insight into different features important for residue-specific stabilizing interactions for protein backbone and side chain chemical shift prediction. We suggest several backward and forward applications of UCBShift 2.0 that can help validate AlphaFold structures and probe protein dynamics.
Affiliation(s)
- Aleksandra L. Ptaszek
- Christian Doppler Laboratory for High-Content Structural Biology and Biotechnology, Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, 1030-Vienna, Austria
- A.L.P. and J.L. contributed equally to this paper
- Jie Li
- A.L.P. and J.L. contributed equally to this paper
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley CA 94720, USA
- Robert Konrat
- Christian Doppler Laboratory for High-Content Structural Biology and Biotechnology, Department of Structural and Computational Biology, Max Perutz Labs, University of Vienna, Campus Vienna Biocenter 5, 1030-Vienna, Austria
- Gerald Platzer
- MAG-LAB GmbH, Karl-Farkas-Gasse 22, 1030-Vienna, Austria
- Teresa Head-Gordon
- Pitzer Center for Theoretical Chemistry, Department of Chemistry, University of California, Berkeley CA 94720, USA
- Departments of Bioengineering and Chemical and Biomolecular Engineering, University of California, Berkeley, CA 94720, USA
6
Abriata LA. The Nobel Prize in Chemistry: past, present, and future of AI in biology. Commun Biol 2024; 7:1409. [PMID: 39472680 PMCID: PMC11522274 DOI: 10.1038/s42003-024-07113-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/16/2024] [Accepted: 10/21/2024] [Indexed: 11/02/2024] [Open Access]
Abstract
A Comment on the transformative progress of artificial intelligence for structural and protein biology, referencing the 2024 Nobel Prize in Chemistry.
Affiliation(s)
- Luciano A Abriata
- School of Life Sciences, Ecole Polytechnique Fédérale de Lausanne, CH-1015, Lausanne, Switzerland.
7
Gampp O, Wenchel L, Güntert P, Riek R. Homonuclear Super-Resolution NMR Spectroscopy. Angew Chem Int Ed Engl 2024:e202414324. [PMID: 39344424 DOI: 10.1002/anie.202414324] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2024] [Revised: 09/08/2024] [Accepted: 09/09/2024] [Indexed: 10/01/2024]
Abstract
In homonuclear 1H NMR (nuclear magnetic resonance) spectra such as [1H,1H]-NOESY (Nuclear Overhauser Enhancement spectroscopy), which is a historic cornerstone spectrum for biomolecular NMR structural biology, hundreds to thousands of cross peaks are present within a square of approximately 100 ppm² leading to a lot of signal overlap. Spectral resolution is thus a limiting factor for unambiguous chemical shift assignment and data interpretation for dynamics and structure elucidation. Acquiring the spectra at higher magnetic fields such as at a 1.2 GHz 1H frequency helps to reduce spectral crowding, since resolution scales proportionally to the magnetic field strength. Here, we show that the linewidths of cross peaks in [1H,1H]-NOESY and [1H,1H]-TOCSY spectra can be further reduced by a factor of 2-3 in each dimension by super-resolution spectroscopy. In the indirect dimension a composite exponential-cosine weighted number of scans along the time increments are recorded and digitally smoothened by a window function, while in the direct dimension an exponential-cosine window function is applied. Furthermore, measurement time saving by reduced-acquisition super-resolution (RASR) is introduced. Application to the 20 kDa protein KRAS shows that highly resolved NMR spectra suitable for automated analysis can be acquired within less than 3 hours. The method opens an avenue towards automated chemical shift assignment, dynamics and structure determination of unlabeled small and medium size proteins within 24 hours.
Affiliation(s)
- Olivia Gampp
- Institute of Molecular Physical Science, ETH Zürich, Vladimir-Prelog-Weg 2, CH-8093, Zürich, Switzerland
- Luca Wenchel
- Institute of Molecular Physical Science, ETH Zürich, Vladimir-Prelog-Weg 2, CH-8093, Zürich, Switzerland
- Peter Güntert
- Institute of Molecular Physical Science, ETH Zürich, Vladimir-Prelog-Weg 2, CH-8093, Zürich, Switzerland
- Institute of Biophysical Chemistry, Center for Biomolecular Magnetic Resonance, Goethe University Frankfurt am Main, 60438, Frankfurt am Main, Germany
- Department of Chemistry, Tokyo Metropolitan University, 192-0397, Hachioji, Tokyo, Japan
- Roland Riek
- Institute of Molecular Physical Science, ETH Zürich, Vladimir-Prelog-Weg 2, CH-8093, Zürich, Switzerland
8
Rasulov U, Wang HK, Viennet T, Droemer MA, Matosin S, Schindler S, Sun ZYJ, Mureddu L, Vuister GW, Robson SA, Arthanari H, Kuprov I. Protein NMR assignment by isotope pattern recognition. SCIENCE ADVANCES 2024; 10:eado0403. [PMID: 39231223 PMCID: PMC11373586 DOI: 10.1126/sciadv.ado0403] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/13/2024] [Accepted: 07/29/2024] [Indexed: 09/06/2024]
Abstract
The current standard method for amino acid signal identification in protein NMR spectra is sequential assignment using triple-resonance experiments. Good software and elaborate heuristics exist, but the process remains laboriously manual. Machine learning does help, but its training databases need millions of samples that cover all relevant physics and every kind of instrumental artifact. In this communication, we offer a solution to this problem. We propose polyadic decompositions to store millions of simulated three-dimensional NMR spectra, on-the-fly generation of artifacts during training, a probabilistic way to incorporate prior and posterior information, and integration with the industry standard CcpNmr software framework. The resulting neural nets take [1H,13C] slices of mixed pyruvate-labeled HNCA spectra (different CA signal shapes for different residue types) and return an amino acid probability table. In combination with primary sequence information, backbones of common proteins (GB1, MBP, and INMT) are rapidly assigned from just the HNCA spectrum.
Affiliation(s)
- Uluk Rasulov
- School of Chemistry, University of Southampton, University Road, Southampton SO17 1BJ, UK
- Harrison K Wang
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Thibault Viennet
- Department of Chemistry and iNANO, Aarhus University, Langelandsgade 140, 8000 Aarhus C, Denmark
- Maxim A Droemer
- Faculty for Chemistry and Pharmacy, Ludwig-Maximilians-Universität München, Munich, Germany
- Srđan Matosin
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Sebastian Schindler
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Zhen-Yu J Sun
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Luca Mureddu
- Department of Molecular and Cell Biology, Institute for Structural and Chemical Biology, University of Leicester, Lancaster Road, Leicester LE1 7HB, UK
- Geerten W Vuister
- Department of Molecular and Cell Biology, Institute for Structural and Chemical Biology, University of Leicester, Lancaster Road, Leicester LE1 7HB, UK
- Scott A Robson
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Haribabu Arthanari
- Department of Biochemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115, USA
- Dana-Farber Cancer Institute, 450 Brookline Ave, Boston, MA 02215, USA
- Ilya Kuprov
- School of Chemistry, University of Southampton, University Road, Southampton SO17 1BJ, UK
9
Agarwal V, McShan AC. The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins. Nat Chem Biol 2024; 20:950-959. [PMID: 38907110 PMCID: PMC11956457 DOI: 10.1038/s41589-024-01638-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 04/29/2024] [Indexed: 06/23/2024]
Abstract
Artificial intelligence-driven advances in protein structure prediction in recent years have raised the question: has the protein structure-prediction problem been solved? Here, with a focus on nonglobular proteins, we highlight the many strengths and potential weaknesses of DeepMind's AlphaFold2 in the context of its biological and therapeutic applications. We summarize the subtleties associated with evaluation of AlphaFold2 model quality and reliability using the predicted local distance difference test (pLDDT) and predicted aligned error (PAE) values. We highlight various classes of proteins that AlphaFold2 can be applied to and the caveats involved. Concrete examples of how AlphaFold2 models can be integrated with experimental data in the form of small-angle X-ray scattering (SAXS), solution NMR, cryo-electron microscopy (cryo-EM) and X-ray diffraction are discussed. Finally, we highlight the need to move beyond structure prediction of rigid, static structural snapshots toward conformational ensembles and alternate biologically relevant states. The overarching theme is that careful consideration is due when using AlphaFold2-generated models to generate testable hypotheses and structural models, rather than treating predicted models as de facto ground truth structures.
Affiliation(s)
- Vinayak Agarwal
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
- Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
10
Wu F, Huang Y, Yang G, Ye S, Mukamel S, Jiang J. Unraveling dynamic protein structures by two-dimensional infrared spectra with a pretrained machine learning model. Proc Natl Acad Sci U S A 2024; 121:e2409257121. [PMID: 38917009 PMCID: PMC11228460 DOI: 10.1073/pnas.2409257121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2024] [Accepted: 05/28/2024] [Indexed: 06/27/2024] [Open Access]
Abstract
Dynamic protein structures are crucial for deciphering their diverse biological functions. Two-dimensional infrared (2DIR) spectroscopy stands as an ideal tool for tracing rapid conformational evolutions in proteins. However, linking spectral characteristics to dynamic structures poses a formidable challenge. Here, we present a pretrained machine learning model based on 2DIR spectra analysis. This model has learned signal features from approximately 204,300 spectra to establish a "spectrum-structure" correlation, thereby tracing the dynamic conformations of proteins. It excels in accurately predicting the dynamic content changes of various secondary structures and demonstrates universal transferability on real folding trajectories spanning timescales from microseconds to milliseconds. Beyond exceptional predictive performance, the model offers attention-based spectral explanations of dynamic conformational changes. Our 2DIR-based pretrained model is anticipated to provide unique insights into the dynamic structural information of proteins in their native environments.
Affiliation(s)
- Fan Wu
- Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
- Yan Huang
- Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
- Guokun Yang
- Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
- Sheng Ye
- Anhui Provincial Engineering Research Center for Unmanned System and Intelligent Technology, School of Artificial Intelligence, Anhui University, Hefei 230601, Anhui, China
- Shaul Mukamel
- Department of Chemistry and of Physics & Astronomy, University of California, Irvine, CA 92697
- Jun Jiang
- Key Laboratory of Precision and Intelligent Chemistry, Hefei National Research Center for Physical Sciences at the Microscale, School of Chemistry and Materials Science, University of Science and Technology of China, Hefei 230026, Anhui, China
11
Klukowski P, Damberger FF, Allain FHT, Iwai H, Kadavath H, Ramelot TA, Montelione GT, Riek R, Güntert P. The 100-protein NMR spectra dataset: A resource for biomolecular NMR data analysis. Sci Data 2024; 11:30. [PMID: 38177162 PMCID: PMC10767026 DOI: 10.1038/s41597-023-02879-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/31/2023] [Accepted: 12/22/2023] [Indexed: 01/06/2024] [Open Access]
Abstract
Multidimensional NMR spectra are the basis for studying proteins by NMR spectroscopy and crucial for the development and evaluation of methods for biomolecular NMR data analysis. Nevertheless, in contrast to derived data such as chemical shift assignments in the BMRB and protein structures in the PDB databases, this primary data is in general not publicly archived. To change this unsatisfactory situation, we present a standardized set of solution NMR data comprising 1329 2-4-dimensional NMR spectra and associated reference (chemical shift assignments, structures) and derived (peak lists, restraints for structure calculation, etc.) annotations. With the 100-protein NMR spectra dataset that was originally compiled for the development of the ARTINA deep learning-based spectra analysis method, 100 protein structures can be reproduced from their original experimental data. The 100-protein NMR spectra dataset is expected to help the development of computational methods for NMR spectroscopy, in particular machine learning approaches, and enable consistent and objective comparisons of these methods.
Affiliation(s)
- Piotr Klukowski
- Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland.
- Fred F Damberger
- Institute of Biochemistry, ETH Zurich, 8093, Zurich, Switzerland
- Hideo Iwai
- Institute of Biotechnology, University of Helsinki, 00100, Helsinki, Finland
- Theresa A Ramelot
- Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
- Gaetano T Montelione
- Department of Chemistry and Chemical Biology, and Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY, 12180, USA
- Roland Riek
- Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland.
- Peter Güntert
- Institute of Molecular Physical Science, ETH Zurich, 8093, Zurich, Switzerland.
- Institute of Biophysical Chemistry, Goethe University, 60438, Frankfurt am Main, Germany.
- Department of Chemistry, Tokyo Metropolitan University, Hachioji, 192-0397, Tokyo, Japan.