1
Cersonsky RK, Cheng B, De Vivo M, Tiwary P. Machine Learning and Statistical Mechanics: Shared Synergies for Next Generation of Chemical Theory and Computation. J Chem Theory Comput 2025. [PMID: 40343763 DOI: 10.1021/acs.jctc.5c00650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/11/2025]
Affiliation(s)
- Rose K Cersonsky
- Department of Chemical and Biological Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Department of Materials Science and Engineering, University of Wisconsin, Madison, Wisconsin 53706, United States
- Bingqing Cheng
- Department of Chemistry, University of California, Berkeley, California 94720, United States
- Marco De Vivo
- Laboratory of Molecular Modeling and Drug Discovery, Istituto Italiano di Tecnologia, Via Morego 30, 16163 Genova, Italy
- Pratyush Tiwary
- Department of Chemistry and Biochemistry and Institute for Physical Science and Technology, University of Maryland, College Park 20742, United States
- University of Maryland Institute for Health Computing, Bethesda, Maryland 20852, United States
2
Tessmer MH, Stoll S. Protein Modeling with DEER Spectroscopy. Annu Rev Biophys 2025; 54:35-57. [PMID: 39689263 DOI: 10.1146/annurev-biophys-030524-013431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2024]
Abstract
Double electron-electron resonance (DEER) combined with site-directed spin labeling can provide distance distributions between selected protein residues to investigate protein structure and conformational heterogeneity. The utilization of the full quantitative information contained in DEER data requires effective protein and spin label modeling methods. Here, we review the application of DEER data to protein modeling. First, we discuss the significance of spin label modeling for accurate extraction of protein structural information and review the most popular label modeling methods. Next, we review several important aspects of protein modeling with DEER, including site selection, how DEER restraints are applied, common artifacts, and the unique potential of DEER data for modeling structural ensembles and conformational landscapes. Finally, we discuss common applications of protein modeling with DEER data and provide an outlook.
Affiliation(s)
- Maxx H Tessmer
- Department of Chemistry, University of Washington, Seattle, Washington, USA
- Stefan Stoll
- Department of Chemistry, University of Washington, Seattle, Washington, USA
3
Sil S, Datta I, Basu S. Use of AI-methods over MD simulations in the sampling of conformational ensembles in IDPs. Front Mol Biosci 2025; 12:1542267. [PMID: 40264953 PMCID: PMC12011600 DOI: 10.3389/fmolb.2025.1542267] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2024] [Accepted: 03/17/2025] [Indexed: 04/24/2025] Open
Abstract
Intrinsically Disordered Proteins (IDPs) challenge traditional structure-function paradigms by existing as dynamic ensembles rather than stable tertiary structures. Capturing these ensembles is critical to understanding their biological roles, yet Molecular Dynamics (MD) simulations, though accurate and widely used, are computationally expensive and struggle to sample rare, transient states. Artificial intelligence (AI) offers a transformative alternative, with deep learning (DL) enabling efficient and scalable conformational sampling. They leverage large-scale datasets to learn complex, non-linear, sequence-to-structure relationships, allowing for the modeling of conformational ensembles in IDPs without the constraints of traditional physics-based approaches. Such DL approaches have been shown to outperform MD in generating diverse ensembles with comparable accuracy. Most models rely primarily on simulated data for training and experimental data serves a critical role in validation, aligning the generated conformational ensembles with observable physical and biochemical properties. However, challenges remain, including dependence on data quality, limited interpretability, and scalability for larger proteins. Hybrid approaches combining AI and MD can bridge the gaps by integrating statistical learning with thermodynamic feasibility. Future directions include incorporating physics-based constraints and learning experimental observables into DL frameworks to refine predictions and enhance applicability. AI-driven methods hold significant promise in IDP research, offering novel insights into protein dynamics and therapeutic targeting while overcoming the limitations of traditional MD simulations.
Affiliation(s)
- Souradeep Sil
- Department of Genetics, Osmania University, Hyderabad, India
- Ishita Datta
- Department of Genetics and Plant Breeding, Banaras Hindu University, Varanasi, India
- Sankar Basu
- Department of Microbiology, Asutosh College (Affiliated with University of Calcutta), Kolkata, India
4
Jurich C, Shao Q, Ran X, Yang ZJ. Physics-based modeling in the new era of enzyme engineering. Nat Comput Sci 2025; 5:279-291. [PMID: 40275092 DOI: 10.1038/s43588-025-00788-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/03/2024] [Accepted: 03/04/2025] [Indexed: 04/26/2025]
Abstract
Enzyme engineering is entering a new era characterized by the integration of computational strategies. While bioinformatics and artificial intelligence methods have been extensively applied to accelerate the screening of function-enhancing mutants, physics-based modeling methods, such as molecular mechanics and quantum mechanics, are essential complements in many objectives. In this Perspective, we highlight how physics-based modeling will help the field of computational enzyme engineering reach its full potential by exploring current developments, unmet challenges and emerging opportunities for tool development.
Affiliation(s)
- Qianzhen Shao
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Xinchun Ran
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Zhongyue J Yang
- Department of Chemistry, Vanderbilt University, Nashville, TN, USA
- Center for Structural Biology, Vanderbilt University, Nashville, TN, USA
- The Vanderbilt Institute of Chemical Biology, Vanderbilt University, Nashville, TN, USA
- Data Science Institute, Vanderbilt University, Nashville, TN, USA
- Department of Chemical and Biomolecular Engineering, Vanderbilt University, Nashville, TN, USA
5
Kalakoti Y, Sanjeev A, Wallner B. Prediction of structural variation. Curr Opin Struct Biol 2025; 91:103003. [PMID: 39983409 DOI: 10.1016/j.sbi.2025.103003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Revised: 01/15/2025] [Accepted: 01/26/2025] [Indexed: 02/23/2025]
Abstract
Proteins are dynamic molecules that transition between conformational states to perform their functions, and characterizing the protein ensemble is important for understanding biology and therapeutic applications. While recent breakthroughs in machine learning have enabled the prediction of high-quality static models of individual proteins, generating reliable estimates of their conformational ensembles remains a challenge. Several recent methods have tried to utilize the evolutionary and structural features captured by effective sequence-to-structure models to enhance conformational diversity in generated models. Most of these approaches involve adapting existing inference pipelines, such as AlphaFold 2, combined with sampling techniques to induce the generation of diverse conformational states. Here, we describe the general problem of predicting structural variations in protein systems, explain the methods designed to address this challenge, explore why they are effective, discuss their limitations, and suggest potential future directions.
Affiliation(s)
- Yogesh Kalakoti
- Linköping University, Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping, 58183, Sweden
- Airy Sanjeev
- Linköping University, Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping, 58183, Sweden
- Björn Wallner
- Linköping University, Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping, 58183, Sweden
6
Howard MK, Hoppe N, Huang XP, Mitrovic D, Billesbølle CB, Macdonald CB, Mehrotra E, Rockefeller Grimes P, Trinidad DD, Delemotte L, English JG, Coyote-Maestas W, Manglik A. Molecular basis of proton sensing by G protein-coupled receptors. Cell 2025; 188:671-687.e20. [PMID: 39753132 PMCID: PMC11849372 DOI: 10.1016/j.cell.2024.11.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/19/2024] [Revised: 09/23/2024] [Accepted: 11/21/2024] [Indexed: 02/09/2025]
Abstract
Three proton-sensing G protein-coupled receptors (GPCRs)-GPR4, GPR65, and GPR68-respond to extracellular pH to regulate diverse physiology. How protons activate these receptors is poorly understood. We determined cryogenic-electron microscopy (cryo-EM) structures of each receptor to understand the spatial arrangement of proton-sensing residues. Using deep mutational scanning (DMS), we determined the functional importance of every residue in GPR68 activation by generating ∼9,500 mutants and measuring their effects on signaling and surface expression. Constant-pH molecular dynamics simulations provided insights into the conformational landscape and protonation patterns of key residues. This unbiased approach revealed that, unlike other proton-sensitive channels and receptors, no single site is critical for proton recognition. Instead, a network of titratable residues extends from the extracellular surface to the transmembrane region, converging on canonical motifs to activate proton-sensing GPCRs. Our approach integrating structure, simulations, and unbiased functional interrogation provides a framework for understanding GPCR signaling complexity.
Affiliation(s)
- Matthew K Howard
- Tetrad graduate program, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Nicholas Hoppe
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Biophysics graduate program, University of California, San Francisco, San Francisco, CA 94143, USA
- Xi-Ping Huang
- Department of Pharmacology and the National Institute of Mental Health Psychoactive Drug Screening Program (NIMH PDSP), The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Darko Mitrovic
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Christian B Billesbølle
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA
- Christian B Macdonald
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Eshan Mehrotra
- Tetrad graduate program, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94143, USA
- Patrick Rockefeller Grimes
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Donovan D Trinidad
- Department of Medicine, Division of Infectious Disease, University of California, San Francisco, San Francisco, CA 94143, USA
- Lucie Delemotte
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Justin G English
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
- Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA; Chan Zuckerberg Biohub, San Francisco, CA 94148, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA
- Aashish Manglik
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Chan Zuckerberg Biohub, San Francisco, CA 94148, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, CA 94115, USA
7
Raisinghani N, Parikh V, Foley B, Verkhivker G. AlphaFold2-Based Characterization of Apo and Holo Protein Structures and Conformational Ensembles Using Randomized Alanine Sequence Scanning Adaptation: Capturing Shared Signature Dynamics and Ligand-Induced Conformational Changes. Int J Mol Sci 2024; 25:12968. [PMID: 39684679 DOI: 10.3390/ijms252312968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 11/24/2024] [Accepted: 11/29/2024] [Indexed: 12/18/2024] Open
Abstract
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use AlphaFold2, which combines randomized alanine sequence masking with shallow multiple sequence alignment subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while also displaying notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins are defined by the structural topology of the fold and favor conserved conformational motions driven by soft modes. Our findings provide evidence that AlphaFold2 combined with randomized alanine sequence masking can yield accurate and consistent results in predicting moderate conformational adjustments between apo and holo states, especially for proteins with localized changes upon ligand binding. For large hinge-like domain movements, the proposed approach can predict functional conformations characteristic of both apo and ligand-bound holo ensembles in the absence of ligand information. These results are relevant for using this AlphaFold adaptation for probing conformational selection mechanisms according to which proteins can adopt multiple conformations, including those that are competent for ligand binding. The results of this study indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and the detection of high-energy conformations. 
By incorporating a wider variety of protein structures in training datasets, including both apo and holo forms, the model can learn to recognize and predict the structural changes that occur upon ligand binding.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Vedant Parikh
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Brandon Foley
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
8
Raisinghani N, Alshahrani M, Gupta G, Tian H, Xiao S, Tao P, Verkhivker G. Probing Functional Allosteric States and Conformational Ensembles of the Allosteric Protein Kinase States and Mutants: Atomistic Modeling and Comparative Analysis of AlphaFold2, OmegaFold, and AlphaFlow Approaches and Adaptations. J Phys Chem B 2024; 128:11088-11107. [PMID: 39485490 DOI: 10.1021/acs.jpcb.4c04985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
This study reports a comprehensive analysis and comparison of several AlphaFold2 adaptations and OmegaFold and AlphaFlow approaches in predicting distinct allosteric states, conformational ensembles, and mutation-induced structural effects for a panel of state-switching allosteric ABL mutants. The results revealed that the proposed AlphaFold2 adaptation with randomized alanine sequence scanning can generate functionally relevant allosteric states and conformational ensembles of the ABL kinase that qualitatively capture a unique pattern of population shifts between the active and inactive states in the allosteric ABL mutants. Consistent with the NMR experiments, the proposed AlphaFold2 adaptation predicted that G269E/M309L/T408Y mutant could induce population changes and sample a significant fraction of the fully inactive I2 form which is a low-populated, high-energy state for the wild-type ABL protein. We also demonstrated that other ABL mutants G269E/M309L/T334I and M309L/L320I/T334I that introduce a single activating T334I mutation can reverse equilibrium and populate exclusively the active ABL form. While the precise quantitative predictions of the relative populations of the active and various hidden inactive states in the ABL mutants remain challenging, our results provide evidence that AlphaFold2 adaptation with randomized alanine sequence scanning can adequately detect a spectrum of the allosteric ABL states and capture the equilibrium redistributions between structurally distinct functional ABL conformations. We further validated the robustness of the proposed AlphaFold2 adaptation for predicting the unique inactive architecture of the BSK8 kinase and structural differences between ligand-unbound apo and ATP-bound forms of BSK8. 
The results of this comparative study suggested that AlphaFold2, OmegaFold, and AlphaFlow approaches may be driven by structural memorization of existing protein folds and are strongly biased toward predictions of the thermodynamically stable ground states of the protein kinases, highlighting limitations and challenges of AI-based methodologies in detecting alternative functional conformations, accurate characterization of physically significant conformational ensembles, and prediction of mutation-induced allosteric structural changes.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Mohammed Alshahrani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Grace Gupta
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Hao Tian
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Sian Xiao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
- Gennady Verkhivker
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States
- Department of Pharmacology, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
9
Riccabona JR, Spoendlin FC, Fischer ALM, Loeffler JR, Quoika PK, Jenkins TP, Ferguson JA, Smorodina E, Laustsen AH, Greiff V, Forli S, Ward AB, Deane CM, Fernández-Quintero ML. Assessing AF2's ability to predict structural ensembles of proteins. Structure 2024; 32:2147-2159.e2. [PMID: 39332396 DOI: 10.1016/j.str.2024.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 08/07/2024] [Accepted: 09/02/2024] [Indexed: 09/29/2024]
Abstract
Recent breakthroughs in protein structure prediction have enhanced the precision and speed at which protein configurations can be determined. Additionally, molecular dynamics (MD) simulations serve as a crucial tool for capturing the conformational space of proteins, providing valuable insights into their structural fluctuations. However, the scope of MD simulations is often limited by the accessible timescales and the computational resources available, posing challenges to comprehensively exploring protein behaviors. Recently emerging approaches have focused on expanding the capability of AlphaFold2 (AF2) to predict conformational substates of protein. Here, we benchmark the performance of various workflows that have adapted AF2 for ensemble prediction and compare the obtained structures with ensembles obtained from MD simulations and NMR. We provide an overview of the levels of performance and accessible timescales that can currently be achieved with machine learning (ML) based ensemble generation. Significant minima of the free energy surfaces remain undetected.
Affiliation(s)
- Jakob R Riccabona
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
- Fabian C Spoendlin
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Anna-Lena M Fischer
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
- Johannes R Loeffler
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Patrick K Quoika
- Center for Functional Protein Assemblies, Technical University of Munich, Ernst-Otto-Fischer-Str. 8, 85748 Garching, Germany
- Timothy P Jenkins
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- James A Ferguson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Eva Smorodina
- Department of Immunology, University of Oslo, Oslo, Norway
- Andreas H Laustsen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
- Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway
- Stefano Forli
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Andrew B Ward
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
- Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
- Monica L Fernández-Quintero
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
10
Sánchez Rodríguez F, Simpkin AJ, Chojnowski G, Keegan RM, Rigden DJ. Using deep-learning predictions reveals a large number of register errors in PDB depositions. IUCrJ 2024; 11:938-950. [PMID: 39387575 PMCID: PMC11533997 DOI: 10.1107/s2052252524009114] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/08/2024] [Accepted: 09/17/2024] [Indexed: 10/15/2024]
Abstract
The accuracy of the information in the Protein Data Bank (PDB) is of great importance for the myriad downstream applications that make use of protein structural information. Despite best efforts, the occasional introduction of errors is inevitable, especially where the experimental data are of limited resolution. A novel protein structure validation approach based on spotting inconsistencies between the residue contacts and distances observed in a structural model and those computationally predicted by methods such as AlphaFold2 has previously been established. It is particularly well suited to the detection of register errors. Importantly, this new approach is orthogonal to traditional methods based on stereochemistry or map-model agreement, and is resolution independent. Here, thousands of likely register errors are identified by scanning 3-5 Å resolution structures in the PDB. Unlike most methods, the application of this approach yields suggested corrections to the register of affected regions, which it is shown, even by limited implementation, lead to improved refinement statistics in the vast majority of cases. A few limitations and confounding factors such as fold-switching proteins are characterized, but this approach is expected to have broad application in spotting potential issues in current accessions and, through its implementation and distribution in CCP4, helping to ensure the accuracy of future depositions.
Affiliation(s)
- Filomeno Sánchez Rodríguez
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Life Science, Diamond Light Source, Harwell Science and Innovation Campus, Didcot OX11 0DE, United Kingdom
- Department of Chemistry, York Structural Biology Laboratory, University of York, York, United Kingdom
- Adam J. Simpkin
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
- Grzegorz Chojnowski
- European Molecular Biology Laboratory, Hamburg Unit, Notkestrasse 85, 22607 Hamburg, Germany
- Ronan M. Keegan
- UKRI–STFC, Rutherford Appleton Laboratory, Research Complex at Harwell, Didcot OX11 0FA, United Kingdom
- Daniel J. Rigden
- Institute of Systems, Molecular and Integrative Biology, University of Liverpool, Liverpool L69 7ZB, United Kingdom
11
Ngo K, Yang PC, Yarov-Yarovoy V, Clancy CE, Vorobyov I. Harnessing AlphaFold to reveal hERG channel conformational state secrets. bioRxiv 2024:2024.01.27.577468. [PMID: 38352360 PMCID: PMC10862728 DOI: 10.1101/2024.01.27.577468] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/20/2024]
Abstract
To design safe, selective, and effective new therapies, there must be a deep understanding of the structure and function of the drug target. One of the most difficult problems to solve has been resolution of discrete conformational states of transmembrane ion channel proteins. An example is KV11.1 (hERG), comprising the primary cardiac repolarizing current, IKr. hERG is a notorious drug anti-target against which all promising drugs are screened to determine potential for arrhythmia. Drug interactions with the hERG inactivated state are linked to elevated arrhythmia risk, and drugs may become trapped during channel closure. However, the structural details of multiple conformational states have remained elusive. Here, we guided AlphaFold2 to predict plausible hERG inactivated and closed conformations, obtaining results consistent with multiple available experimental data. Drug docking simulations demonstrated hERG state-specific drug interactions in good agreement with experimental results, revealing that most drugs bind more effectively in the inactivated state and are trapped in the closed state. Molecular dynamics simulations demonstrated ion conduction for an open but not AlphaFold2 predicted inactivated state that aligned with earlier studies. Finally, we identified key molecular determinants of state transitions by analyzing interaction networks across closed, open, and inactivated states in agreement with earlier mutagenesis studies. Here, we demonstrate a readily generalizable application of AlphaFold2 as an effective and robust method to predict discrete protein conformations, reconcile seemingly disparate data and identify novel linkages from structure to function.
Collapse
Affiliation(s)
- Khoa Ngo
- Center for Precision Medicine and Data Science, University of California, Davis, California
- Department of Physiology and Membrane Biology, University of California, Davis, California
| | - Pei-Chi Yang
- Center for Precision Medicine and Data Science, University of California, Davis, California
- Department of Physiology and Membrane Biology, University of California, Davis, California
| | - Vladimir Yarov-Yarovoy
- Center for Precision Medicine and Data Science, University of California, Davis, California
- Department of Physiology and Membrane Biology, University of California, Davis, California
- Department of Anesthesiology and Pain Medicine, University of California, Davis, California
| | - Colleen E. Clancy
- Center for Precision Medicine and Data Science, University of California, Davis, California
- Department of Physiology and Membrane Biology, University of California, Davis, California
- Department of Pharmacology, University of California, Davis, California
| | - Igor Vorobyov
- Department of Physiology and Membrane Biology, University of California, Davis, California
- Department of Pharmacology, University of California, Davis, California
| |
|
12
|
Raisinghani N, Alshahrani M, Gupta G, Verkhivker G. Predicting Mutation-Induced Allosteric Changes in Structures and Conformational Ensembles of the ABL Kinase Using AlphaFold2 Adaptations with Alanine Sequence Scanning. Int J Mol Sci 2024; 25:10082. [PMID: 39337567 PMCID: PMC11432724 DOI: 10.3390/ijms251810082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 09/18/2024] [Accepted: 09/18/2024] [Indexed: 09/30/2024]
Abstract
Despite the success of AlphaFold2 approaches in predicting single protein structures, these methods showed intrinsic limitations in predicting multiple functional conformations of allosteric proteins and have been challenged to accurately capture the effects of single point mutations that induced significant structural changes. We examined several implementations of AlphaFold2 methods to predict conformational ensembles for state-switching mutants of the ABL kinase. The results revealed that a combination of randomized alanine sequence masking with shallow multiple sequence alignment subsampling can significantly expand the conformational diversity of the predicted structural ensembles and capture shifts in populations of the active and inactive ABL states. Consistent with the NMR experiments, the predicted conformational ensembles for M309L/L320I and M309L/H415P ABL mutants that perturb the regulatory spine networks featured the increased population of the fully closed inactive state. The proposed adaptation of AlphaFold can reproduce the experimentally observed mutation-induced redistributions in the relative populations of the active and inactive ABL states and capture the effects of regulatory mutations on allosteric structural rearrangements of the kinase domain. The ensemble-based network analysis complemented AlphaFold predictions by revealing allosteric hotspots that correspond to state-switching mutational sites which may explain the global effect of regulatory mutations on structural changes between the ABL states. This study suggested that attention-based learning of long-range dependencies between sequence positions in homologous folds and deciphering patterns of allosteric interactions may further augment the predictive abilities of AlphaFold methods for modeling of alternative protein states, conformational ensembles and mutation-induced structural transformations.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
| | - Mohammed Alshahrani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
| | - Grace Gupta
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
| | - Gennady Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
| |
|
13
|
Frasnetti E, Magni A, Castelli M, Serapian SA, Moroni E, Colombo G. Structures, dynamics, complexes, and functions: From classic computation to artificial intelligence. Curr Opin Struct Biol 2024; 87:102835. [PMID: 38744148 DOI: 10.1016/j.sbi.2024.102835] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/23/2024] [Revised: 04/14/2024] [Accepted: 04/22/2024] [Indexed: 05/16/2024]
Abstract
Computational approaches can provide highly detailed insight into the molecular recognition processes that underlie drug binding, the assembly of protein complexes, and the regulation of biological functional processes. Classical simulation methods can bridge a wide range of length- and time-scales typically involved in such processes. Lately, automated learning and artificial intelligence methods have shown the potential to expand the reach of physics-based approaches, ushering in the possibility to model and even design complex protein architectures. The synergy between atomistic simulations and AI methods is an emerging frontier with a huge potential for advances in structural biology. Herein, we explore various examples and frameworks for these approaches, providing select instances and applications that illustrate their impact on fundamental biomolecular problems.
Affiliation(s)
- Elena Frasnetti
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Andrea Magni
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Matteo Castelli
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | - Stefano A Serapian
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| | | | - Giorgio Colombo
- Department of Chemistry, University of Pavia, via Taramelli 12, 27100 Pavia, Italy
| |
|
14
|
Raisinghani N, Alshahrani M, Gupta G, Tian H, Xiao S, Tao P, Verkhivker G. Prediction of Conformational Ensembles and Structural Effects of State-Switching Allosteric Mutants in the Protein Kinases Using Comparative Analysis of AlphaFold2 Adaptations with Sequence Masking and Shallow Subsampling. bioRxiv 2024:2024.05.17.594786. [PMID: 38798650 PMCID: PMC11118581 DOI: 10.1101/2024.05.17.594786] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]
Abstract
Despite the success of AlphaFold2 approaches in predicting single protein structures, these methods showed intrinsic limitations in predicting multiple functional conformations of allosteric proteins and have been challenged to accurately capture the effects of single point mutations that induced significant structural changes. We systematically examined several implementations of AlphaFold2 methods to predict conformational ensembles for state-switching mutants of the ABL kinase. The results revealed that a combination of randomized alanine sequence masking with shallow multiple sequence alignment subsampling can significantly expand the conformational diversity of the predicted structural ensembles and capture shifts in populations of the active and inactive ABL states. Consistent with the NMR experiments, the predicted conformational ensembles for M309L/L320I and M309L/H415P ABL mutants that perturb the regulatory spine networks featured the increased population of the fully closed inactive state. On the other hand, the predicted conformational ensembles for the G269E/M309L/T334I and M309L/L320I/T334I triple ABL mutants that share activating T334I gate-keeper substitution are dominated by the active ABL form. The proposed adaptation of AlphaFold can reproduce the experimentally observed mutation-induced redistributions in the relative populations of the active and inactive ABL states and capture the effects of regulatory mutations on allosteric structural rearrangements of the kinase domain. The ensemble-based network analysis complemented AlphaFold predictions by revealing allosteric mediating centers that often directly correspond to state-switching mutational sites or reside in their immediate local structural proximity, which may explain the global effect of regulatory mutations on structural changes between the ABL states. This study suggested that attention-based learning of long-range dependencies between sequence positions in homologous folds and deciphering patterns of allosteric interactions may further augment the predictive abilities of AlphaFold methods for modeling of alternative protein states, conformational ensembles and mutation-induced structural transformations.
|