1
Najar Najafi N, Karbassian R, Hajihassani H, Azimzadeh Irani M. Unveiling the influence of fastest Nobel Prize winner discovery: AlphaFold's algorithmic intelligence in medical sciences. J Mol Model 2025; 31:163. [PMID: 40387957 DOI: 10.1007/s00894-025-06392-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/01/2024] [Accepted: 05/06/2025] [Indexed: 05/20/2025]
Abstract
CONTEXT: AlphaFold's advanced AI technology has transformed protein structure interpretation. By predicting three-dimensional protein structures from amino acid sequences, AlphaFold has solved the complex protein-folding problem, previously challenging for experimental methods due to numerous possible conformations. Since its inception, AlphaFold has introduced several versions, including AlphaFold2, AlphaFold DB, AlphaFold-Multimer, AlphaMissense, and AlphaFold3, each further enhancing protein structure prediction. Remarkably, AlphaFold is recognized as the fastest Nobel Prize winner in science history. This technology has extensive applications, potentially transforming treatment and diagnosis in medical sciences by reducing drug design costs and time, while elucidating structural pathways of human body systems. Numerous studies have demonstrated how AlphaFold aids in understanding health conditions by providing critical information about protein mutations, abnormal protein-protein interactions, and changes in protein dynamics. Researchers have also developed new technologies and pipelines using different versions of AlphaFold to amplify its potential. However, addressing existing limitations is crucial to maximizing AlphaFold's capacity to redefine medical research. This article reviews AlphaFold's impact on five key aspects of medical sciences: protein mutation, protein-protein interaction, molecular dynamics, drug design, and immunotherapy.
METHODS: This review examines the contributions of various AlphaFold versions (AlphaFold2, AlphaFold DB, AlphaFold-Multimer, AlphaMissense, and AlphaFold3) to protein structure prediction. The methods include an extensive analysis of computational techniques and software used in interpreting and predicting protein structures, emphasizing advances in AI technology and its applications in medical research.
Affiliation(s)
- Niki Najar Najafi
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Reyhaneh Karbassian
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
- Helia Hajihassani
  - Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
2
Hilser VJ, Wrabl JO, Millard CEF, Schmitz A, Brantley SJ, Pearce M, Rehfus J, Russo MM, Voortman-Sheetz K. Statistical Thermodynamics of the Protein Ensemble: Mediating Function and Evolution. Annu Rev Biophys 2025; 54:227-247. [PMID: 39929551 DOI: 10.1146/annurev-biophys-061824-104900] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/07/2025]
Abstract
The growing appreciation of native state conformational fluctuations mediating protein function calls for critical reevaluation of protein evolution and adaptation. If proteins are ensembles, does nature select solely for ground state structure, or are conformational equilibria between functional states also conserved? If so, what is the mechanism and how can it be measured? Addressing these fundamental questions, we review our investigation into the role of local unfolding fluctuations in the native state ensembles of proteins. We describe the functional importance of these ubiquitous fluctuations, as revealed through studies of adenylate kinase. We then summarize elucidation of thermodynamic organizing principles, which culminate in a quantitative probe for evolutionary conservation of protein energetics. Finally, we show that these principles are predictive of sequence compatibility for multiple folds, providing a unique thermodynamic perspective on metamorphic proteins. These research areas demonstrate that the locally unfolded ensemble is an emerging, important mechanism of protein evolution.
Affiliation(s)
- Vincent J Hilser
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- James O Wrabl
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA
- Charles E F Millard
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Anna Schmitz
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Sarah J Brantley
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Marie Pearce
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Joe Rehfus
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA
- Miranda M Russo
  - T.C. Jenkins Department of Biophysics, Johns Hopkins University, Baltimore, Maryland, USA
- Keila Voortman-Sheetz
  - Department of Biology, Johns Hopkins University, Baltimore, Maryland, USA
  - Chemistry/Biology Interface Program, Department of Chemistry, Johns Hopkins University, Baltimore, Maryland, USA
3
Tessmer MH, Stoll S. Protein Modeling with DEER Spectroscopy. Annu Rev Biophys 2025; 54:35-57. [PMID: 39689263 DOI: 10.1146/annurev-biophys-030524-013431] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/19/2024]
Abstract
Double electron-electron resonance (DEER) combined with site-directed spin labeling can provide distance distributions between selected protein residues to investigate protein structure and conformational heterogeneity. The utilization of the full quantitative information contained in DEER data requires effective protein and spin label modeling methods. Here, we review the application of DEER data to protein modeling. First, we discuss the significance of spin label modeling for accurate extraction of protein structural information and review the most popular label modeling methods. Next, we review several important aspects of protein modeling with DEER, including site selection, how DEER restraints are applied, common artifacts, and the unique potential of DEER data for modeling structural ensembles and conformational landscapes. Finally, we discuss common applications of protein modeling with DEER data and provide an outlook.
Affiliation(s)
- Maxx H Tessmer
  - Department of Chemistry, University of Washington, Seattle, Washington, USA
- Stefan Stoll
  - Department of Chemistry, University of Washington, Seattle, Washington, USA
4
Vargas-Rosales PA, Caflisch A. The physics-AI dialogue in drug design. RSC Med Chem 2025; 16:1499-1515. [PMID: 39906313 PMCID: PMC11788922 DOI: 10.1039/d4md00869c] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/07/2024] [Accepted: 01/16/2025] [Indexed: 02/06/2025] Open
Abstract
A long path has led from the determination of the first protein structure in 1960 to the recent breakthroughs in protein science. Protein structure prediction and design methodologies based on machine learning (ML) have been recognized with the 2024 Nobel Prize in Chemistry, but they would not have been possible without previous work and the input of many domain scientists. Challenges remain in the application of ML tools for the prediction of structural ensembles and their usage within the software pipelines for structure determination by crystallography or cryogenic electron microscopy. In the drug discovery workflow, ML techniques are being used in diverse areas such as the scoring of docked poses and the generation of molecular descriptors. As ML techniques become more widespread, novel applications emerge that can profit from the large amounts of data available. Nevertheless, it is essential to balance the potential advantages against the environmental costs of ML deployment to decide if and when it is best to apply it. For hit-to-lead optimization, ML tools can efficiently interpolate between compounds in large chemical series, but free energy calculations by molecular dynamics simulations seem to be superior for designing novel derivatives. Importantly, the potential complementarity and/or synergism of physics-based methods (e.g., force field-based simulation models) and data-hungry ML techniques is growing strongly. Current ML methods have evolved from decades of research. It is now necessary for biologists, physicists, and computer scientists to fully understand the advantages and limitations of ML techniques to ensure that the complementarity of physics-based methods and ML tools can be fully exploited for drug design.
Affiliation(s)
- Amedeo Caflisch
  - Department of Biochemistry, University of Zurich Winterthurerstrasse 190 8057 Zürich Switzerland
5
Jeschke G. Characterization of conformationally heterogeneous proteins by electron paramagnetic resonance spectroscopy. Curr Opin Struct Biol 2025; 92:103046. [PMID: 40220482 DOI: 10.1016/j.sbi.2025.103046] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/10/2025] [Revised: 03/07/2025] [Accepted: 03/20/2025] [Indexed: 04/14/2025]
Abstract
The Anfinsen paradigm of representing a protein by a single conformer is challenged by the uncertainty predictions that come with AlphaFold models, which suggest a greater extent of disorder. Characterization of such conformational heterogeneity requires experimental approaches that do not depend on long-range order. Site-directed spin labeling (SDSL) coupled with electron paramagnetic resonance (EPR) spectroscopy is such an approach. The double electron-electron resonance (DEER) technique can access site-pair distance distributions in the 15-100 Å range, directly informing on ensemble width. SDSL-EPR can be applied in cellular environments, and recent work indicates that protein disorder is even more pervasive than predicted by AlphaFold. This suggests that the Anfinsen paradigm should be replaced by an ensemble paradigm.
Affiliation(s)
- Gunnar Jeschke
  - Department of Chemistry and Applied Biosciences, ETH Zürich, Vladimir-Prelog-Weg 2, Zürich, 8093, Switzerland.
6
Aranganathan A, Gu X, Wang D, Vani BP, Tiwary P. Modeling Boltzmann-weighted structural ensembles of proteins using artificial intelligence-based methods. Curr Opin Struct Biol 2025; 91:103000. [PMID: 39923288 PMCID: PMC12011212 DOI: 10.1016/j.sbi.2025.103000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/23/2024] [Revised: 01/09/2025] [Accepted: 01/20/2025] [Indexed: 02/11/2025]
Abstract
This review highlights recent advances in AI-driven methods for generating Boltzmann-weighted structural ensembles, which are crucial for understanding biomolecular dynamics and drug discovery. With the rise of deep learning models such as AlphaFold2, there has been a shift toward more accurate and efficient sampling of structural ensembles. The review discusses the integration of AI with traditional molecular dynamics techniques as well as experiments, the challenges of conformational sampling, and future directions for AI-driven research in structural biology, particularly in drug discovery and protein dynamics.
Affiliation(s)
- Akashnathan Aranganathan
  - Biophysics Program, University of Maryland, College Park, 20742, MD, USA
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
- Xinyu Gu
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
  - University of Maryland Institute for Health Computing, Bethesda, 20852, MD, USA.
- Dedi Wang
  - Genentech, 1 DNA Way, South San Francisco, 94080, CA, USA
- Bodhi P Vani
  - Genentech, 1 DNA Way, South San Francisco, 94080, CA, USA
- Pratyush Tiwary
  - Institute of Physical Science and Technology, University of Maryland, College Park, 20742, MD, USA
  - University of Maryland Institute for Health Computing, Bethesda, 20852, MD, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, 20742, MD, USA.
7
Tanner JD, Richards SN, Corry B. Molecular basis of the functional conflict between chloroquine and peptide transport in the Malaria parasite chloroquine resistance transporter PfCRT. Nat Commun 2025; 16:2987. [PMID: 40140375 PMCID: PMC11947230 DOI: 10.1038/s41467-025-58244-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 03/16/2025] [Indexed: 03/28/2025] Open
Abstract
The Plasmodium falciparum chloroquine resistance transporter (PfCRT) is a key protein contributing to resistance against the antimalarial chloroquine (CQ). Mutations such as K76T enable PfCRT to transport CQ away from its target in the parasite's digestive vacuole, but this comes at a cost to its natural peptide transport function. This creates fitness costs which can drive changes to drug susceptibility in parasite populations, but the molecular basis of this is not well understood. To investigate, here we run 130 μs of molecular dynamics simulations of CQ-sensitive and CQ-resistant PfCRT isoforms with CQ and peptide substrates. We identify the CQ binding site and characterize diverse peptide binding modes. The K76T mutation allows CQ to access the binding site but disrupts peptide binding, highlighting the importance of cavity charge in determining substrate specificity. This study provides insight into PfCRT polyspecific peptide transport and will aid in rational, structure-based inhibitor design.
Affiliation(s)
- John D Tanner
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Sashika N Richards
  - Research School of Biology, Australian National University, Canberra, ACT, Australia
- Ben Corry
  - Research School of Biology, Australian National University, Canberra, ACT, Australia.
8
Kulkarni P, Porter L, Chou TF, Chong S, Chiti F, Schafer JW, Mohanty A, Ramisetty S, Onuchic JN, Tuite M, Uversky VN, Weninger KR, Koonin EV, Orban J, Salgia R. Evolving concepts of the protein universe. iScience 2025; 28:112012. [PMID: 40124498 PMCID: PMC11926713 DOI: 10.1016/j.isci.2025.112012] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2025] Open
Abstract
The protein universe is the collection of all proteins on earth from all organisms both extant and extinct. Classical studies on protein folding suggested that proteins exist as a unique three-dimensional conformation that is dictated by the genetic code and is critical for function. In this perspective, we discuss ideas and developments that emerged over the past three decades regarding the protein structure-function paradigm. It is now clear that ordered (active/functional) and disordered/denatured (and hence inactive/non-functional) represent a continuum of states rather than binary states. Some proteins can switch folds without sequence change. Others exist as conformational ensembles lacking defined structure yet play critical roles in many biological processes, including forming membrane-less organelles driven by liquid-liquid phase separation. Numerous diverse proteins harbor segments with the potential to form amyloid fibrils, many of which are functional, and some possess prion-like properties enabling conformation-based transfer of heritable information. Taken together, these developments reveal the remarkable complexity of the protein universe.
Affiliation(s)
- Prakash Kulkarni
  - Department of Medical Oncology, City of Hope Medical Center, Duarte, CA, USA
  - Department of Systems Biology, City of Hope Medical Center, Duarte, CA, USA
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
- Lauren Porter
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Tsui-Fen Chou
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA, USA
  - Proteome Exploration Laboratory, Beckman Institute, California Institute of Technology, Pasadena, CA, USA
- Shasha Chong
  - Division of Chemistry and Chemical Engineering, California Institute of Technology, Pasadena, CA, USA
- Fabrizio Chiti
  - Department of Experimental and Clinical Biomedical Sciences “Mario Serio”, University of Florence, Florence, Italy
- Joseph W. Schafer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- Atish Mohanty
  - Department of Medical Oncology, City of Hope Medical Center, Duarte, CA, USA
- Sravani Ramisetty
  - Department of Medical Oncology, City of Hope Medical Center, Duarte, CA, USA
- Jose N. Onuchic
  - Center for Theoretical Biological Physics, Rice University, Houston, TX, USA
  - Department of Physics and Astronomy, Rice University, Houston, TX, USA
- Mick Tuite
  - Kent Fungal Group, School of Biosciences, Division of Natural Sciences, University of Kent, CT2 7NJ Canterbury, UK
- Vladimir N. Uversky
  - Department of Molecular Medicine, Morsani College of Medicine, University of South Florida, Tampa, FL, USA
- Keith R. Weninger
  - Department of Physics, North Carolina State University, Raleigh, NC, USA
- Eugene V. Koonin
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA
- John Orban
  - W. M. Keck Laboratory for Structural Biology, University of Maryland Institute for Bioscience and Biotechnology Research, Rockville, MD, USA
  - Department of Chemistry and Biochemistry, University of Maryland, College Park, MD, USA
- Ravi Salgia
  - Department of Medical Oncology, City of Hope Medical Center, Duarte, CA, USA
9
van der Weg K, Merdivan E, Piraud M, Gohlke H. TopEC: prediction of Enzyme Commission classes by 3D graph neural networks and localized 3D protein descriptor. Nat Commun 2025; 16:2737. [PMID: 40108108 PMCID: PMC11923149 DOI: 10.1038/s41467-025-57324-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/09/2024] [Accepted: 02/11/2025] [Indexed: 03/22/2025] Open
Abstract
Tools available for inferring enzyme function from general sequence, fold, or evolutionary information are generally successful. However, they can lead to misclassification if a deviation in local structural features influences the function. Here, we present TopEC, a 3D graph neural network based on a localized 3D descriptor to learn chemical reactions of enzymes from enzyme structures and predict Enzyme Commission (EC) classes. Using message-passing frameworks, we include distance and angle information to significantly improve the predictive performance for EC classification (F-score: 0.72) compared to regular 2D graph neural networks. We trained networks without fold bias that can classify enzyme structures for a vast functional space (>800 ECs). Our model is robust to uncertainties in binding site locations and similar functions in distinct binding sites. We observe that TopEC networks learn from an interplay between biochemical features and local shape-dependent features. TopEC is available as a repository on GitHub: https://github.com/IBG4-CBCLab/TopEC and https://doi.org/10.25838/d5p-66.
Affiliation(s)
- Karel van der Weg
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany
- Erinc Merdivan
  - Helmholtz AI Central Unit, Ingolstädter Landstraße 1, 85764, Oberschleißheim, Germany
- Marie Piraud
  - Helmholtz AI Central Unit, Ingolstädter Landstraße 1, 85764, Oberschleißheim, Germany
- Holger Gohlke
  - Institute of Bio- and Geosciences (IBG-4: Bioinformatics), Forschungszentrum Jülich GmbH, 52425, Jülich, Germany.
  - Institute for Pharmaceutical and Medicinal Chemistry, Heinrich Heine University Düsseldorf, 40225, Düsseldorf, Germany.
10
Yang J, Cheng WX, Zhang P, Wu G, Sheng ST, Yang J, Zhao S, Hu Q, Ji W, Shi Q. Conformational ensembles for protein structure prediction. Sci Rep 2025; 15:8513. [PMID: 40074747 PMCID: PMC11904239 DOI: 10.1038/s41598-024-84066-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/24/2024] [Accepted: 12/19/2024] [Indexed: 03/14/2025] Open
Abstract
Acquiring conformational ensembles for a protein is a challenging task that bears on both the protein folding problem and the study of intrinsically disordered proteins. Although AlphaFold achieves unprecedented accuracy in structure prediction, its result is limited to a single conformational state and cannot provide the multiple conformations needed to represent intrinsic protein disorder. To overcome this barrier, the FiveFold approach was developed as a single-sequence method. It applies the protein folding shape code (PFSC) uniformly to expose the local folds of five amino acid residues, forms the protein folding variation matrix (PFVM) to reveal local folding variations along the sequence, obtains a massive number of folding conformations as PFSC strings, and then constructs an ensemble of multiple conformational protein structures. P53_HUMAN, a well-known protein, and LEF1_HUMAN and Q8GT36_SPIOL, typical disordered proteins, are taken as benchmarks to evaluate the predicted outcomes. The results demonstrate an effective algorithm and a biologically meaningful process for predicting multiple conformational structures of proteins.
Affiliation(s)
- Jiaan Yang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China.
  - Micro Biotech, Ltd., Shanghai, 200123, China.
- Wen Xiang Cheng
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
- Peng Zhang
  - Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, 518055, Guangdong, China
  - Biomedical Engineering, Shenzhen University of Advanced Technology, Shenzhen, 518060, China
- Gang Wu
  - School of Basic Medicine, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, 430030, Hubei, China
- Si Tong Sheng
  - HYK High-Throughput Biotechnology Institute, Shenzhen, 518057, Guangdong, China
- Junjie Yang
  - Wuhan International Biohub Cooperation, Wuhan, 430075, Hubei, China
- Suwen Zhao
  - iHuman Institute, ShanghaiTech University, Shanghai, 201210, China
- Qiyue Hu
  - Beyang Therapeutics Co. Ltd, Shanghai, 201210, China
- Wenxin Ji
  - National Facility for Protein Science in Shanghai, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai, 201210, China
- Qiong Shi
  - Laboratory of Aquatic Genomics, College of Life Sciences and Oceanography, Shenzhen University, Shenzhen, 518057, China
11
Jiang T, Thielges MC, Feng C. Emerging approaches to investigating functional protein dynamics in modular redox enzymes: Nitric oxide synthase as a model system. J Biol Chem 2025; 301:108282. [PMID: 39929300 PMCID: PMC11929083 DOI: 10.1016/j.jbc.2025.108282] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/04/2024] [Revised: 02/03/2025] [Accepted: 02/04/2025] [Indexed: 02/13/2025] Open
Abstract
Approximately 80% of eukaryotic and 65% of prokaryotic proteins are composed of multiple folding units (i.e., domains) connected by flexible linkers. These dynamic protein architectures enable diverse, essential functions such as electron transfer, respiration, and biosynthesis. This review critically assesses recent advancements in methods for studying protein dynamics, with a particular focus on modular, multidomain nitric oxide synthase (NOS) enzymes. Moving beyond traditional static "snapshots" of protein structures, current research emphasizes the dynamic nature of proteins, viewing them as flexible architectures modulated by conformational changes and interactions. In this context, the review discusses key developments in the integration of quantitative crosslinking mass spectrometry (qXL MS) with AlphaFold 2 predictions, which provides a powerful approach to disentangling NOS structural dynamics and understanding their modulation by external regulatory cues. Additionally, advances in site-specific infrared (IR) spectroscopy offer exciting potential in providing rich details about the conformational dynamics of NOSs in docked states. Moreover, optimization of genetic code expansion machinery enables the generation of genuine phosphorylated NOS enzymes, paving the way for detailed biophysical and functional analyses of phosphorylation's role in shaping NOS activity and structural flexibility; notably, this approach also empowers site-specific IR probe labeling with cyano groups. By embracing and leveraging AI-driven tools like AlphaFold 2 for structural and conformational modeling, alongside solution-based biophysical methods such as qXL MS and site-specific IR spectroscopy, researchers will gain integrative insights into functional protein dynamics. Collectively, these breakthroughs highlight the transformative potential of modern approaches in driving fundamental biological chemistry research.
Affiliation(s)
- Ting Jiang
  - College of Pharmacy, University of New Mexico, Albuquerque, New Mexico, USA
- Megan C Thielges
  - Department of Chemistry, Indiana University, Bloomington, Indiana, USA
- Changjian Feng
  - College of Pharmacy, University of New Mexico, Albuquerque, New Mexico, USA.
12
Huang YJ, Ramelot TA, Spaman LE, Kobayashi N, Montelione GT. Hidden Structural States of Proteins Revealed by Conformer Selection with AlphaFold-NMR. bioRxiv 2025:2024.06.26.600902. [PMID: 38979209 PMCID: PMC11230435 DOI: 10.1101/2024.06.26.600902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/10/2024]
Abstract
We introduce AlphaFold-NMR, a novel approach to NMR structure determination that reveals previously undetected protein conformational states. Unlike conventional NMR methods that rely on NOE-derived spatial restraints, AlphaFold-NMR combines AI-driven conformational sampling with Bayesian scoring of realistic protein models against NOESY and chemical shift data. This method uncovers alternative conformational states of the enzyme Gaussia luciferase, involving large-scale changes in the lid, binding pockets, and other surface cavities. It also identifies similar yet distinct conformational states of the human tumor suppressor Cyclin-Dependent Kinase 2-Associated Protein 1. These studies demonstrate the potential of AI-based modeling with enhanced sampling to generate diverse structural models followed by conformer selection and validation with experimental data as an alternative to traditional restraint-satisfaction protocols for protein NMR structure determination. The AlphaFold-NMR framework enables discovery of conformational heterogeneity and cryptic pockets that conventional NMR analysis methods do not distinguish, providing new insights into protein structure-function relationships.
Affiliation(s)
- Yuanpeng J. Huang
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Theresa A. Ramelot
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Laura E. Spaman
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
- Naohiro Kobayashi
  - NMR Science and Development Division. RSC, RIKEN, 1-7-22 Suehiro-cho, Tsurumi-ku, Yokohama City, Kanagawa 230-0045, JAPAN
- Gaetano T. Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, New York, 12180 USA
13
Brotzakis ZF, Zhang S, Murtada MH, Vendruscolo M. AlphaFold prediction of structural ensembles of disordered proteins. Nat Commun 2025; 16:1632. [PMID: 39952928 PMCID: PMC11829000 DOI: 10.1038/s41467-025-56572-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/06/2023] [Accepted: 01/23/2025] [Indexed: 02/17/2025] Open
Abstract
Deep learning methods of predicting protein structures have reached an accuracy comparable to that of high-resolution experimental methods. It is thus possible to generate accurate models of the native states of hundreds of millions of proteins. An open question, however, concerns whether these advances can be translated to disordered proteins, which should be represented as structural ensembles because of their heterogeneous and dynamical nature. To address this problem, we introduce the AlphaFold-Metainference method to use AlphaFold-derived distances as structural restraints in molecular dynamics simulations to construct structural ensembles of ordered and disordered proteins. The results obtained using AlphaFold-Metainference illustrate the possibility of making predictions of the conformational properties of disordered proteins using deep learning methods trained on the large structural databases available for folded proteins.
Collapse
Affiliation(s)
- Z Faidon Brotzakis
- Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Institute for Bioinnovation, Biomedical Sciences Research Center "Alexander Fleming", 16672, Vari, Greece
| | - Shengyu Zhang
- Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
| | - Mhd Hussein Murtada
- Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK
- Michele Vendruscolo
- Centre for Misfolding Diseases, Yusuf Hamied Department of Chemistry, University of Cambridge, Cambridge, UK.
14
Howard MK, Hoppe N, Huang XP, Mitrovic D, Billesbølle CB, Macdonald CB, Mehrotra E, Rockefeller Grimes P, Trinidad DD, Delemotte L, English JG, Coyote-Maestas W, Manglik A. Molecular basis of proton sensing by G protein-coupled receptors. Cell 2025; 188:671-687.e20. [PMID: 39753132 PMCID: PMC11849372 DOI: 10.1016/j.cell.2024.11.036] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 05/19/2024] [Revised: 09/23/2024] [Accepted: 11/21/2024] [Indexed: 02/09/2025]
Abstract
Three proton-sensing G protein-coupled receptors (GPCRs), GPR4, GPR65, and GPR68, respond to extracellular pH to regulate diverse physiology. How protons activate these receptors is poorly understood. We determined cryogenic-electron microscopy (cryo-EM) structures of each receptor to understand the spatial arrangement of proton-sensing residues. Using deep mutational scanning (DMS), we determined the functional importance of every residue in GPR68 activation by generating ∼9,500 mutants and measuring their effects on signaling and surface expression. Constant-pH molecular dynamics simulations provided insights into the conformational landscape and protonation patterns of key residues. This unbiased approach revealed that, unlike other proton-sensitive channels and receptors, no single site is critical for proton recognition. Instead, a network of titratable residues extends from the extracellular surface to the transmembrane region, converging on canonical motifs to activate proton-sensing GPCRs. Our approach integrating structure, simulations, and unbiased functional interrogation provides a framework for understanding GPCR signaling complexity.
Affiliation(s)
- Matthew K Howard
- Tetrad graduate program, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Nicholas Hoppe
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Biophysics graduate program, University of California, San Francisco, San Francisco, CA 94143, USA
- Xi-Ping Huang
- Department of Pharmacology and the National Institute of Mental Health Psychoactive Drug Screening Program (NIMH PDSP), The University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, USA
- Darko Mitrovic
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Christian B Billesbølle
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA
- Christian B Macdonald
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Eshan Mehrotra
- Tetrad graduate program, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Medical Scientist Training Program, University of California, San Francisco, San Francisco, CA 94143, USA
- Patrick Rockefeller Grimes
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA
- Donovan D Trinidad
- Department of Medicine, Division of Infectious Disease, University of California, San Francisco, San Francisco, CA 94143, USA
- Lucie Delemotte
- Science for Life Laboratory, Department of Applied Physics, KTH Royal Institute of Technology, 12121 Solna, Stockholm, Stockholm County 114 28, Sweden
- Justin G English
- Department of Biochemistry, University of Utah School of Medicine, Salt Lake City, UT 84112, USA
- Willow Coyote-Maestas
- Department of Bioengineering and Therapeutic Science, University of California, San Francisco, San Francisco, CA 94143, USA; Chan Zuckerberg Biohub, San Francisco, CA 94148, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA.
- Aashish Manglik
- Department of Pharmaceutical Chemistry, University of California, San Francisco, San Francisco, CA 94143, USA; Chan Zuckerberg Biohub, San Francisco, CA 94148, USA; Quantitative Biosciences Institute, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Anesthesia and Perioperative Care, University of California, San Francisco, San Francisco, CA 94115, USA.
15
Rosignoli S, Pacelli M, Manganiello F, Paiardini A. An outlook on structural biology after AlphaFold: tools, limits and perspectives. FEBS Open Bio 2025; 15:202-222. [PMID: 39313455 PMCID: PMC11788754 DOI: 10.1002/2211-5463.13902] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Revised: 08/19/2024] [Accepted: 09/13/2024] [Indexed: 09/25/2024]
Abstract
AlphaFold and similar groundbreaking AI-based tools have revolutionized the field of structural bioinformatics with their remarkable accuracy in ab initio protein structure prediction. This success has catalyzed the development of new software and pipelines aimed at incorporating AlphaFold's predictions, often focusing on addressing the algorithm's remaining challenges. Here, we present the current landscape of structural bioinformatics shaped by AlphaFold, and discuss how the field is dynamically responding to this revolution, with new software, methods, and pipelines. While the excitement around AI-based tools led to their widespread application, it is essential to acknowledge that their practical success hinges on their integration into established protocols within structural bioinformatics, often neglected in the context of AI-driven advancements. Indeed, user-driven intervention is still as pivotal in the structure prediction process as in complementing state-of-the-art algorithms with functional and biological knowledge.
Affiliation(s)
- Serena Rosignoli
- Department of Biochemical sciences “A. Rossi Fanelli”, Sapienza Università di Roma, Italy
- Maddalena Pacelli
- Department of Biochemical sciences “A. Rossi Fanelli”, Sapienza Università di Roma, Italy
- Francesca Manganiello
- Department of Biochemical sciences “A. Rossi Fanelli”, Sapienza Università di Roma, Italy
- Alessandro Paiardini
- Department of Biochemical sciences “A. Rossi Fanelli”, Sapienza Università di Roma, Italy
16
Chakravarty D, Lee M, Porter LL. Proteins with alternative folds reveal blind spots in AlphaFold-based protein structure prediction. Curr Opin Struct Biol 2025; 90:102973. [PMID: 39756261 PMCID: PMC11791787 DOI: 10.1016/j.sbi.2024.102973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/18/2024] [Revised: 11/25/2024] [Accepted: 12/06/2024] [Indexed: 01/07/2025]
Abstract
In recent years, advances in artificial intelligence (AI) have transformed structural biology, particularly protein structure prediction. Though AI-based methods, such as AlphaFold (AF), often predict single conformations of proteins with high accuracy and confidence, predictions of alternative folds are often inaccurate, low-confidence, or simply not predicted at all. Here, we review three blind spots that alternative conformations reveal about AF-based protein structure prediction. First, proteins that assume conformations distinct from their training-set homologs can be mispredicted. Second, AF overrelies on its training set to predict alternative conformations. Third, degeneracies in pairwise representations can lead to high-confidence predictions inconsistent with experiment. These weaknesses suggest approaches to predict alternative folds more reliably.
Affiliation(s)
- Devlina Chakravarty
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Myeongsang Lee
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Lauren L Porter
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA; Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA.
17
Yudenko A, Bukhdruker S, Shishkin P, Rodin S, Burtseva A, Petrov A, Pigareva N, Sokolov A, Zinovev E, Eliseev I, Remeeva A, Marin E, Mishin A, Gordeliy V, Gushchin I, Ischenko A, Borshchevskiy V. Structural basis of signaling complex inhibition by IL-6 domain-swapped dimers. Structure 2025; 33:171-180.e5. [PMID: 39566503 DOI: 10.1016/j.str.2024.10.028] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2024] [Revised: 09/16/2024] [Accepted: 10/24/2024] [Indexed: 11/22/2024]
Abstract
Interleukin-6 (IL-6) is a multifaceted cytokine essential in many immune system processes and their regulation. It also plays a key role in hematopoiesis, and in triggering the acute phase reaction. IL-6 overproduction is critical in chronic inflammation associated with autoimmune diseases like rheumatoid arthritis and contributes to cytokine storms in COVID-19 patients. Over 20 years ago, researchers proposed that IL-6, which is typically monomeric, can also form dimers via a domain-swap mechanism, with indirect evidence supporting their existence. The physiological significance of IL-6 dimers was shown in B-cell chronic lymphocytic leukemia. However, no structures have been reported so far. Here, we present the crystal structure of an IL-6 domain-swapped dimer that computational approaches could not predict. The structure explains why the IL-6 dimer is antagonistic to the IL-6 monomer in signaling complex formation and provides insights for IL-6 targeted therapies.
Affiliation(s)
- Anna Yudenko
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Sergey Bukhdruker
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Pavel Shishkin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Sergey Rodin
- Institute of Experimental Medicine, St. Petersburg 197022, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Anastasia Burtseva
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Aleksandr Petrov
- Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia; Medicinal Chemistry Center, Togliatti State University, Togliatti, Samara Region 445020, Russia
- Natalia Pigareva
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia
- Alexey Sokolov
- Institute of Experimental Medicine, St. Petersburg 197022, Russia
- Egor Zinovev
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Igor Eliseev
- Alferov University, St. Petersburg 194021, Russia; St. Petersburg School of Physics, Mathematics, and Computer Science, HSE University, St. Petersburg 194100, Russia
- Alina Remeeva
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Egor Marin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Alexey Mishin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Valentin Gordeliy
- Institut de Biologie Structurale J.-P. Ebel, Université Grenoble Alpes-CEA-CNRS, 38000 Grenoble, France
- Ivan Gushchin
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia
- Aleksandr Ischenko
- St. Petersburg Pasteur Institute, St. Petersburg 197101, Russia; Research Institute of Highly Pure Biopreparations, St. Petersburg 197110, Russia.
- Valentin Borshchevskiy
- Research Center for Molecular Mechanisms of Aging and Age-Related Diseases, Moscow Institute of Physics and Technology, Dolgoprudny, Moscow Region 141701, Russia; Joint Institute for Nuclear Research, Dubna, Moscow Region 141980, Russia.
18
Dubianok Y, Kumar A, Rak A. Structural Biology for Target Identification and Validation. Methods Mol Biol 2025; 2905:17-49. [PMID: 40163296 DOI: 10.1007/978-1-0716-4418-8_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/02/2025]
Abstract
Structural biology is catalyzing a paradigm shift in drug discovery towards rational approaches in target identification and validation. Leveraging structural insights obtained through cryo-EM or X-ray crystallography not only enhances the efficiency of drug discovery projects in terms of time and cost, but also significantly improves the likelihood of achieving market approval. Initiating a successful project necessitates more than just a robust package for target credentialing; it demands a comprehensive strategy for the identification and optimization of potential drugs. The critical evaluation of target druggability is markedly enhanced when supported by experimentally derived structural information. This nuanced approach ensures a more thorough understanding of the technical feasibility of drug development from the project's inception.
Affiliation(s)
- Yuliya Dubianok
- Sanofi R&D, Bio Structure and Biophysics at Integrated Drug Discovery, Vitry-sur-Seine, France
- Anand Kumar
- Sanofi R&D, Bio Structure and Biophysics at Integrated Drug Discovery, Vitry-sur-Seine, France
- Alexey Rak
- Sanofi R&D, Bio Structure and Biophysics at Integrated Drug Discovery, Vitry-sur-Seine, France.
19
van Aalst EJ, Wylie BJ. An in silico framework to visualize how cancer-associated mutations influence structural plasticity of the chemokine receptor CCR3. Protein Sci 2025; 34:e70013. [PMID: 39723881 DOI: 10.1002/pro.70013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/28/2024] [Revised: 11/06/2024] [Accepted: 12/12/2024] [Indexed: 12/28/2024]
Abstract
G protein Coupled Receptors (GPCRs) are the largest family of cell surface receptors in humans. Somatic mutations in GPCRs are implicated in cancer progression and metastasis, but mechanisms are poorly understood. Emerging evidence implicates perturbation of intra-receptor activation pathway motifs whereby extracellular signals are transmitted intracellularly. Recently, sufficiently sensitive methodology was described to calculate structural strain as a function of missense mutations in AlphaFold-predicted model structures, which was extensively validated on experimental and predicted structural datasets. When paired with Molecular Dynamics (MD) simulations, these tools provide a facile approach to screen mutations in silico. We applied this framework to calculate the structural and dynamic effects of cancer-associated mutations in the chemokine receptor CCR3, a Class A GPCR involved in cancer and autoimmune disorders. Residue-residue contact scoring refined effective strain results, highlighting significant remodeling of inter- and intra-motif contacts along the highly conserved GPCR activation pathway network. We then integrated AlphaFold-derived predicted Local Distance Difference Test scores with per-residue Root Mean Square Fluctuations and activation pathway Contact Analysis (CONAN) from coarse grain MD simulations to identify statistically significant changes in receptor dynamics upon mutation. Finally, analysis of negative control mutants suggests false positive results in AlphaFold pipelines should be considered but can be mitigated with stricter control of statistical analysis. Our results indicate selected mutants influence structural plasticity of CCR3 related to ligand interaction, activation, and G protein coupling, using a framework that could be applicable to a wide range of biochemically relevant protein targets following further validation.
Affiliation(s)
- Evan J van Aalst
- Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, Texas, USA
- Benjamin J Wylie
- Department of Chemistry and Biochemistry, Texas Tech University, Lubbock, Texas, USA
20
Chu HY, Peng J, Mou Y, Wong ASL. Quantifying Protein-Nucleic Acid Interactions for Engineering Useful CRISPR-Cas9 Genome-Editing Variants. Methods Mol Biol 2025; 2870:227-243. [PMID: 39543038 DOI: 10.1007/978-1-0716-4213-9_12] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/17/2024]
Abstract
Numerous high-specificity Cas9 variants have been engineered for precision genome editing. These variants typically harbor multiple mutations designed to alter the Cas9-single guide RNA (sgRNA)-DNA complex interactions for reduced off-target cleavage. By dissecting the contributions of individual mutations, we attempt to derive principles for designing high-specificity Cas9 variants. Here, we computationally modeled the specificity harnessing mutations of the widely used Cas9 isolated from Streptococcus pyogenes (SpCas9) and investigated their individual mutational effects. We quantified the mutational effects in terms of energy and contact changes by comparing the wild-type and mutant structures. We found that these mutations disrupt the protein-protein or protein-DNA contacts within the Cas9-sgRNA-DNA complex. We also identified additional impacted amino acid sites via energy changes that constitute the structural microenvironment encompassing the focal mutation, giving insights into how the mutations contribute to the high-specificity phenotype of SpCas9. Our method outlines a strategy to evaluate mutational effects that can facilitate rational design for Cas9 optimization.
Affiliation(s)
- Hoi Yee Chu
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Jiaxing Peng
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Yuanbiao Mou
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
- Alan S L Wong
- Laboratory of Combinatorial Genetics and Synthetic Biology, School of Biomedical Sciences, The University of Hong Kong, Pokfulam, Hong Kong SAR, China.
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China.
21
Xu J, Wang Y. Generating Multistate Conformations of P-type ATPases with a Conditional Diffusion Model. J Chem Inf Model 2024; 64:9227-9239. [PMID: 39480276 DOI: 10.1021/acs.jcim.4c01519] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/11/2024]
Abstract
Understanding and predicting the diverse conformational states of membrane proteins is essential for elucidating their biological functions. Despite advancements in computational methods, accurately capturing these complex structural changes remains a significant challenge. Here, we introduce a computational approach to generate diverse and biologically relevant conformations of membrane proteins using a conditional diffusion model. Our approach integrates forward and backward diffusion processes, incorporating state classifiers and additional conditioners to control the generation gradient of conformational states. We specifically targeted the P-type ATPases, a critical family of membrane transporters, and constructed a comprehensive data set through a combination of experimental structures and molecular dynamics simulations. Our model, incorporating a graph neural network with specialized membrane constraints, demonstrates exceptional accuracy in generating a wide range of P-type ATPase conformations associated with different functional states. This approach represents a meaningful step forward in the computational generation of membrane protein conformations using AI and holds promise for studying the dynamics of other membrane proteins.
Affiliation(s)
- Jingtian Xu
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
- Yong Wang
- College of Life Sciences, Zhejiang University, Hangzhou 310027, China
22
Raisinghani N, Parikh V, Foley B, Verkhivker G. AlphaFold2-Based Characterization of Apo and Holo Protein Structures and Conformational Ensembles Using Randomized Alanine Sequence Scanning Adaptation: Capturing Shared Signature Dynamics and Ligand-Induced Conformational Changes. Int J Mol Sci 2024; 25:12968. [PMID: 39684679 DOI: 10.3390/ijms252312968] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/04/2024] [Revised: 11/24/2024] [Accepted: 11/29/2024] [Indexed: 12/18/2024]
Abstract
Proteins often exist in multiple conformational states, influenced by the binding of ligands or substrates. The study of these states, particularly the apo (unbound) and holo (ligand-bound) forms, is crucial for understanding protein function, dynamics, and interactions. In the current study, we use AlphaFold2, which combines randomized alanine sequence masking with shallow multiple sequence alignment subsampling to expand the conformational diversity of the predicted structural ensembles and capture conformational changes between apo and holo protein forms. Using several well-established datasets of structurally diverse apo-holo protein pairs, the proposed approach enables robust predictions of apo and holo structures and conformational ensembles, while also displaying notably similar dynamics distributions. These observations are consistent with the view that the intrinsic dynamics of allosteric proteins are defined by the structural topology of the fold and favor conserved conformational motions driven by soft modes. Our findings provide evidence that AlphaFold2 combined with randomized alanine sequence masking can yield accurate and consistent results in predicting moderate conformational adjustments between apo and holo states, especially for proteins with localized changes upon ligand binding. For large hinge-like domain movements, the proposed approach can predict functional conformations characteristic of both apo and ligand-bound holo ensembles in the absence of ligand information. These results are relevant for using this AlphaFold adaptation for probing conformational selection mechanisms according to which proteins can adopt multiple conformations, including those that are competent for ligand binding. The results of this study indicate that robust modeling of functional protein states may require more accurate characterization of flexible regions in functional conformations and the detection of high-energy conformations. 
By incorporating a wider variety of protein structures in training datasets, including both apo and holo forms, the model can learn to recognize and predict the structural changes that occur upon ligand binding.
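The randomized alanine sequence masking mentioned in this abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' pipeline: the function names `mask_with_alanine` and `masked_variants`, the masking fraction, and the example sequence are all hypothetical, and the abstract does not say how positions are chosen or how many are masked. Each masked variant would then be passed to a structure predictor, a step not shown here.

```python
import random


def mask_with_alanine(sequence, fraction, seed=None):
    """Return a copy of `sequence` with a random subset of positions set to 'A'.

    `fraction` is the share of positions to mask (hypothetical parameter).
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    rng = random.Random(seed)
    n_mask = int(len(sequence) * fraction)
    # Sample distinct positions without replacement.
    positions = set(rng.sample(range(len(sequence)), n_mask))
    return "".join("A" if i in positions else aa for i, aa in enumerate(sequence))


def masked_variants(sequence, fraction, n_variants, seed=0):
    """Generate several independently masked copies of one sequence."""
    rng = random.Random(seed)
    return [
        mask_with_alanine(sequence, fraction, seed=rng.randrange(2**32))
        for _ in range(n_variants)
    ]


if __name__ == "__main__":
    # Made-up ten-residue sequence, used only to show the call pattern.
    for variant in masked_variants("MKTWYQLLGG", fraction=0.5, n_variants=3):
        print(variant)
```

Seeding each variant separately keeps the sketch reproducible while still giving a different mask per variant.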
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Vedant Parikh
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Brandon Foley
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
- Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
23
Torres J, Pervushin K, Surya W. Prediction of conformational states in a coronavirus channel using Alphafold-2 and DeepMSA2: Strengths and limitations. Comput Struct Biotechnol J 2024; 23:3730-3740. [PMID: 39525089 PMCID: PMC11543627 DOI: 10.1016/j.csbj.2024.10.021] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2024] [Revised: 10/01/2024] [Accepted: 10/15/2024] [Indexed: 11/16/2024]
Abstract
The envelope (E) protein is present in all coronavirus genera. This protein can form pentameric oligomers with ion channel activity which have been proposed as a possible therapeutic target. However, high resolution structures of E channels are limited to those of the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2), responsible for the recent COVID-19 pandemic. In the present work, we used Alphafold-2 (AF2), in ColabFold without templates, to predict the transmembrane domain (TMD) structure of six E-channels representative of genera alpha-, beta- and gamma-coronaviruses in the Coronaviridae family. High-confidence models were produced in all cases when combining multiple sequence alignments (MSAs) obtained from DeepMSA2. Overall, AF2 predicted at least two possible orientations of the α-helices in E-TMD channels: one where a conserved polar residue (Asn-15 in the SARS sequence) is oriented towards the center of the channel, 'polar-in', and one where this residue is in an interhelical orientation 'polar-inter'. For the SARS models, the comparison with the two experimental models 'closed' (PDB: 7K3G) and 'open' (PDB: 8SUZ) is described, and suggests a ∼60° α-helix rotation mechanism involving either the full TMD or only its N-terminal half, to allow the passage of ions. While the results obtained are not identical to the two high resolution models available, they suggest various conformational states with striking similarities to those models. We believe these results can be further optimized by means of MSA subsampling, and guide future high resolution structural studies in these and other viral channels.
Affiliation(s)
- Jaume Torres
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Konstantin Pervushin
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
- Wahyu Surya
- School of Biological Sciences, Nanyang Technological University, 60 Nanyang Drive, Singapore 637551, Singapore
24
Harding-Larsen D, Funk J, Madsen NG, Gharabli H, Acevedo-Rocha CG, Mazurenko S, Welner DH. Protein representations: Encoding biological information for machine learning in biocatalysis. Biotechnol Adv 2024; 77:108459. [PMID: 39366493 DOI: 10.1016/j.biotechadv.2024.108459] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/18/2024] [Revised: 09/19/2024] [Accepted: 09/29/2024] [Indexed: 10/06/2024]
Abstract
Enzymes offer a more environmentally friendly and low-impact solution to conventional chemistry, but they often require additional engineering for their application in industrial settings, an endeavour that is challenging and laborious. To address this issue, the power of machine learning can be harnessed to produce predictive models that enable the in silico study and engineering of improved enzymatic properties. Such machine learning models, however, require the conversion of the complex biological information to a numerical input, also called protein representations. These inputs demand special attention to ensure the training of accurate and precise models, and, in this review, we therefore examine the critical step of encoding protein information to numeric representations for use in machine learning. We selected the most important approaches for encoding the three distinct biological protein representations - primary sequence, 3D structure, and dynamics - to explore their requirements for employment and inductive biases. Combined representations of proteins and substrates are also introduced as emergent tools in biocatalysis. We propose the division of fixed representations, a collection of rule-based encoding strategies, and learned representations extracted from the latent spaces of large neural networks. To select the most suitable protein representation, we propose two main factors to consider. The first one is the model setup, which is influenced by the size of the training dataset and the choice of architecture. The second factor is the model objectives such as consideration about the assayed property, the difference between wild-type models and mutant predictors, and requirements for explainability. This review is aimed at serving as a source of information and guidance for properly representing enzymes in future machine learning models for biocatalysis.
Affiliation(s)
- David Harding-Larsen
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Jonathan Funk
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Niklas Gesmar Madsen
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
- Hani Gharabli
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Carlos G Acevedo-Rocha
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark
| | - Stanislav Mazurenko
- Loschmidt Laboratories, Department of Experimental Biology and RECETOX, Faculty of Science, Masaryk University, Kamenice 5, 625 00 Brno, Czech Republic; International Clinical Research Center, St. Anne's University Hospital Brno, Pekarska 53, 656 91 Brno, Czech Republic
| | - Ditte Hededam Welner
- The Novo Nordisk Center for Biosustainability, Technical University of Denmark, Søltofts Plads, Bygning 220, 2800 Kgs. Lyngby, Denmark.
|
25
|
Lazou M, Khan O, Nguyen T, Padhorny D, Kozakov D, Joseph-McCarthy D, Vajda S. Predicting multiple conformations of ligand binding sites in proteins suggests that AlphaFold2 may remember too much. Proc Natl Acad Sci U S A 2024; 121:e2412719121. [PMID: 39565312 PMCID: PMC11621821 DOI: 10.1073/pnas.2412719121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2024] [Accepted: 10/21/2024] [Indexed: 11/21/2024] Open
Abstract
The goal of this paper is to predict the conformational distributions of ligand binding sites using the AlphaFold2 (AF2) protein structure prediction program with stochastic subsampling of the multiple sequence alignment (MSA). We explored the opening of cryptic ligand binding sites in 16 proteins, where the closed and open conformations define the expected extreme points of the conformational variation. Due to the many structures of these proteins in the Protein Data Bank (PDB), we were able to study whether the distribution of X-ray structures affects the distribution of AF2 models. We have found that AF2 generates both a cluster of open and a cluster of closed models for proteins that have comparable numbers of open and closed structures in the PDB and not too many other conformations. This was observed even with default MSA parameters, thus without further subsampling. In contrast, with the exception of a single protein, AF2 did not yield multiple clusters of conformations for proteins that had imbalanced numbers of open and closed structures in the PDB, or had substantial numbers of other structures. Subsampling improved the results only for a single protein, and very shallow MSAs led to incorrect structures. The ability to generate both open and closed conformations for six out of the 16 proteins agrees with the success rates of similar studies reported in the literature. However, we showed that this partial success is due to AF2 "remembering" the conformational distributions in the PDB and that the approach fails to predict rarely seen conformations.
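The idea of stochastic MSA subsampling can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' protocol or AlphaFold2's internal implementation: the function name `subsample_msa`, its parameters, and the convention that the query is the first row of the alignment are all hypothetical choices for this example.

```python
import random


def subsample_msa(msa, max_seqs, seed=None):
    """Return a shallower MSA: the query plus a random subset of the other rows.

    Assumption: msa is a list of aligned sequences with the query first.
    """
    if not msa:
        return []
    rng = random.Random(seed)  # a seeded generator keeps runs reproducible
    query, others = msa[0], msa[1:]
    # Never request more rows than exist, and never a negative count.
    k = min(max(max_seqs - 1, 0), len(others))
    return [query] + rng.sample(others, k)
```

Repeating the call with different seeds yields different shallow alignments, each of which could be passed to a structure predictor to obtain a spread of models; choosing `max_seqs` too small corresponds to the very shallow regime that the abstract reports as producing incorrect structures.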
Affiliation(s)
- Maria Lazou
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
| | - Omeir Khan
- Department of Chemistry, Boston University, Boston, MA 02215
| | - Thu Nguyen
- Department of Computer Science, Stony Brook University, Stony Brook, NY 11794
| | - Dzmitry Padhorny
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
| | - Dima Kozakov
- Department of Applied Mathematics and Statistics, Stony Brook University, Stony Brook, NY 11794
- Laufer Center for Physical and Quantitative Biology, Stony Brook University, Stony Brook, NY 11794
| | - Diane Joseph-McCarthy
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215
| | - Sandor Vajda
- Department of Biomedical Engineering, Boston University, Boston, MA 02215
- Department of Chemistry, Boston University, Boston, MA 02215
|
26
|
Olanders G, Testa G, Tibo A, Nittinger E, Tyrchan C. Challenge for Deep Learning: Protein Structure Prediction of Ligand-Induced Conformational Changes at Allosteric and Orthosteric Sites. J Chem Inf Model 2024; 64:8481-8494. [PMID: 39484820 DOI: 10.1021/acs.jcim.4c01475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
In the realm of biomedical research, understanding the intricate structure of proteins is crucial, as these structures determine how proteins function within our bodies and interact with potential drugs. Traditionally, methods like X-ray crystallography and cryo-electron microscopy have been used to unravel these structures, but they are often challenging, time-consuming and costly. Recently, a breakthrough in computational biology has emerged with the development of deep learning algorithms capable of predicting protein structures based on their amino acid sequences (Jumper, J., et al. Nature 2021, 596, 583. Lane, T. J. Nature Methods 2023, 20, 170. Kryshtafovych, A., et al. Proteins: Structure, Function and Bioinformatics 2021, 89, 1607). This study focuses on predicting the dynamic changes that proteins undergo upon ligand binding, specifically when they bind to allosteric sites, i.e. a pocket different from the active site. Allosteric modulators are particularly important for drug discovery, as they open new avenues for designing drugs that can target proteins more effectively and with fewer side effects (Nussinov, R.; Tsai, C. J. Cell 2013, 153, 293). To study this, we curated a data set of 578 X-ray structures comprised of proteins displaying orthosteric and allosteric binding as well as a general framework to evaluate deep learning-based structure prediction methods. Our findings demonstrate the potential and current limitations of deep learning methods, such as AlphaFold2 (Jumper, J., et al. Nature 2021, 596, 583), NeuralPLexer (Qiao, Z., et al. Nat Mach Intell 2024, 6, 195), and RoseTTAFold All-Atom (Krishna, R., et al. Science 2024, 384, eadl2528) to predict not just static protein structures but also the dynamic conformational changes. 
Herein we show that predicting the allosteric induced-fit conformation still poses a challenge to deep learning methods, as they predict the orthosteric bound conformation more accurately than the allosteric induced-fit conformation. For AlphaFold2, we observed that conformational diversity and sampling between the apo and holo states could be increased by modifying the MSA depth, but this did not enhance the ability to generate conformations close to the allosteric induced-fit conformation. To further support advancements in the protein structure prediction field, the curated data set and evaluation framework are made publicly available.
Affiliation(s)
- Gustav Olanders
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Giulia Testa
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Alessandro Tibo
- Molecular AI, Discovery Sciences, R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Eva Nittinger
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
| | - Christian Tyrchan
- Medicinal Chemistry, Research and Early Development, Respiratory and Immunology (R&I), BioPharmaceuticals R&D, AstraZeneca, 43183 Gothenburg, Sweden
|
27
|
Raisinghani N, Alshahrani M, Gupta G, Tian H, Xiao S, Tao P, Verkhivker G. Probing Functional Allosteric States and Conformational Ensembles of the Allosteric Protein Kinase States and Mutants: Atomistic Modeling and Comparative Analysis of AlphaFold2, OmegaFold, and AlphaFlow Approaches and Adaptations. J Phys Chem B 2024; 128:11088-11107. [PMID: 39485490 PMCID: PMC12103074 DOI: 10.1021/acs.jpcb.4c04985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/03/2024]
Abstract
This study reports a comprehensive analysis and comparison of several AlphaFold2 adaptations and OmegaFold and AlphaFlow approaches in predicting distinct allosteric states, conformational ensembles, and mutation-induced structural effects for a panel of state-switching allosteric ABL mutants. The results revealed that the proposed AlphaFold2 adaptation with randomized alanine sequence scanning can generate functionally relevant allosteric states and conformational ensembles of the ABL kinase that qualitatively capture a unique pattern of population shifts between the active and inactive states in the allosteric ABL mutants. Consistent with the NMR experiments, the proposed AlphaFold2 adaptation predicted that the G269E/M309L/T408Y mutant could induce population changes and sample a significant fraction of the fully inactive I2 form, which is a low-populated, high-energy state for the wild-type ABL protein. We also demonstrated that other ABL mutants, G269E/M309L/T334I and M309L/L320I/T334I, that introduce a single activating T334I mutation can reverse equilibrium and populate exclusively the active ABL form. While the precise quantitative predictions of the relative populations of the active and various hidden inactive states in the ABL mutants remain challenging, our results provide evidence that the AlphaFold2 adaptation with randomized alanine sequence scanning can adequately detect a spectrum of the allosteric ABL states and capture the equilibrium redistributions between structurally distinct functional ABL conformations. We further validated the robustness of the proposed AlphaFold2 adaptation for predicting the unique inactive architecture of the BSK8 kinase and structural differences between ligand-unbound apo and ATP-bound forms of BSK8.
The results of this comparative study suggested that the AlphaFold2, OmegaFold, and AlphaFlow approaches may be driven by structural memorization of existing protein folds and are strongly biased toward predictions of the thermodynamically stable ground states of the protein kinases, highlighting limitations and challenges of AI-based methodologies in detecting alternative functional conformations, accurate characterization of physically significant conformational ensembles, and prediction of mutation-induced allosteric structural changes.
Affiliation(s)
- Nishank Raisinghani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Mohammed Alshahrani
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Grace Gupta
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
| | - Hao Tian
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Sian Xiao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Peng Tao
- Department of Chemistry, Center for Research Computing, Center for Drug Discovery, Design, and Delivery (CD4), Southern Methodist University, Dallas, Texas 75275, United States
| | - Gennady Verkhivker
- Keck Center for Science and Engineering, Graduate Program in Computational and Data Sciences, Schmid College of Science and Technology, Chapman University, Orange, California 92866, United States
- Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, California 92618, United States
- Department of Pharmacology, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, 9500 Gilman Drive, La Jolla, California 92093, United States
|
28
|
Riccabona JR, Spoendlin FC, Fischer ALM, Loeffler JR, Quoika PK, Jenkins TP, Ferguson JA, Smorodina E, Laustsen AH, Greiff V, Forli S, Ward AB, Deane CM, Fernández-Quintero ML. Assessing AF2's ability to predict structural ensembles of proteins. Structure 2024; 32:2147-2159.e2. [PMID: 39332396 DOI: 10.1016/j.str.2024.09.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/04/2024] [Revised: 08/07/2024] [Accepted: 09/02/2024] [Indexed: 09/29/2024]
Abstract
Recent breakthroughs in protein structure prediction have enhanced the precision and speed at which protein configurations can be determined. Additionally, molecular dynamics (MD) simulations serve as a crucial tool for capturing the conformational space of proteins, providing valuable insights into their structural fluctuations. However, the scope of MD simulations is often limited by the accessible timescales and the computational resources available, posing challenges to comprehensively exploring protein behaviors. Recently emerging approaches have focused on expanding the capability of AlphaFold2 (AF2) to predict conformational substates of proteins. Here, we benchmark the performance of various workflows that have adapted AF2 for ensemble prediction and compare the obtained structures with ensembles obtained from MD simulations and NMR. We provide an overview of the levels of performance and accessible timescales that can currently be achieved with machine learning (ML) based ensemble generation. Significant minima of the free energy surfaces remain undetected.
Affiliation(s)
- Jakob R Riccabona
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Fabian C Spoendlin
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK
| | - Anna-Lena M Fischer
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria
| | - Johannes R Loeffler
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Patrick K Quoika
- Center for Functional Protein Assemblies, Technical University of Munich, Ernst-Otto-Fischer-Str. 8, 85748 Garching, Germany
| | - Timothy P Jenkins
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - James A Ferguson
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Eva Smorodina
- Department of Immunology, University of Oslo, Oslo, Norway
| | - Andreas H Laustsen
- Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark
| | - Victor Greiff
- Department of Immunology, University of Oslo, Oslo, Norway
| | - Stefano Forli
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA
| | - Andrew B Ward
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA.
| | - Charlotte M Deane
- Oxford Protein Informatics Group, Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
| | - Monica L Fernández-Quintero
- Center for Molecular Biosciences Innsbruck, Department of General, Inorganic and Theoretical Chemistry, University of Innsbruck, Innsbruck, Austria; Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA 92037, USA; Department of Biotechnology and Biomedicine, Technical University of Denmark, DK-2800 Kongens Lyngby, Denmark.
|
29
|
Raouraoua N, Mirabello C, Véry T, Blanchet C, Wallner B, Lensink MF, Brysbaert G. MassiveFold: unveiling AlphaFold's hidden potential with optimized and parallelized massive sampling. Nat Comput Sci 2024; 4:824-828. [PMID: 39528570 PMCID: PMC11578886 DOI: 10.1038/s43588-024-00714-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2024] [Accepted: 10/03/2024] [Indexed: 11/16/2024]
Abstract
Massive sampling in AlphaFold enables access to increased structural diversity. In combination with its efficient confidence ranking, this unlocks elevated modeling capabilities for monomeric structures and foremost for protein assemblies. However, the approach struggles with GPU cost and data storage. Here we introduce MassiveFold, an optimized and customizable version of AlphaFold that runs predictions in parallel, reducing the computing time from several months to hours. MassiveFold is scalable and able to run on anything from a single computer to a large GPU infrastructure, where it can fully benefit from all the computing nodes.
Affiliation(s)
- Nessim Raouraoua
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
| | - Claudio Mirabello
- Science for Life Laboratory, Department of Physics, Chemistry and Biology, National Bioinformatics Infrastructure Sweden, Linköping University, Linköping, Sweden
| | - Thibaut Véry
- Institut du Développement et des Ressources en Informatique Scientifique (IDRIS), CNRS, Université Paris-Saclay, Orsay, France
| | - Christophe Blanchet
- IFB-core, Institut Français de Bioinformatique (IFB), CNRS, INSERM, INRAE, CEA, Evry, France
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping, Sweden
| | - Marc F Lensink
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France
| | - Guillaume Brysbaert
- Université de Lille, CNRS, UMR 8576 - UGSF - Unité de Glycobiologie Structurale et Fonctionnelle, Université de Lille, CNRS, Lille, France.
|
30
|
Li X, Zuo Y, Lin X, Guo B, Jiang H, Guan N, Zheng H, Huang Y, Gu X, Yu B, Wang X. Develop Targeted Protein Drug Carriers through a High-Throughput Screening Platform and Rational Design. Adv Healthc Mater 2024; 13:e2401793. [PMID: 38804201 DOI: 10.1002/adhm.202401793] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2024] [Revised: 05/24/2024] [Indexed: 05/29/2024]
Abstract
Protein-based drugs offer advantages, such as high specificity, low toxicity, and minimal side effects, compared to small molecule drugs. However, delivery of proteins to target tissues or cells remains challenging due to the instability, diverse structures, charges, and molecular weights of proteins. Polymers have emerged as a leading choice for designing effective protein delivery systems, but identifying a suitable polymer for a given protein is complicated by the complexity of both proteins and polymers. To address this challenge, a fluorescence-based high-throughput screening platform called ProMatch is developed to efficiently collect data on protein-polymer interactions, followed by in vivo and in vitro experiments with rational design. Using this approach to streamline polymer selection for targeted protein delivery, candidate polymers from commercially available options are identified and a polyhexamethylene biguanide (PHMB)-based system for delivering proteins to white adipose tissue as a treatment for obesity is developed. A branched polyethylenimine (bPEI)-based system for neuron-specific protein delivery to stimulate optic nerve regeneration is also developed. The high-throughput screening methodology expedites identification of promising polymer candidates for tissue-specific protein delivery systems, thereby providing a platform to develop innovative protein-based therapeutics.
Affiliation(s)
- Xiaodan Li
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Nanhu Brain-Computer Interface Institute, Hangzhou, 311100, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, 310058, China
| | - Yanming Zuo
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Nanhu Brain-Computer Interface Institute, Hangzhou, 311100, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, 310058, China
| | - Xurong Lin
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Lingang Laboratory, Shanghai, 200031, China
| | - Binjie Guo
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Lingang Laboratory, Shanghai, 200031, China
| | - Haohan Jiang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Lingang Laboratory, Shanghai, 200031, China
| | - Naiyu Guan
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Nanhu Brain-Computer Interface Institute, Hangzhou, 311100, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, 310058, China
| | - Hanyu Zheng
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Lingang Laboratory, Shanghai, 200031, China
| | - Yan Huang
- Department of Hepatobiliary and Pancreatic Surgery, Affiliated Hospital of Nantong University, Medical School of Nantong University, Nantong, 226001, China
| | - Xiaosong Gu
- Key Laboratory of Neuroregeneration of Jiangsu and Ministry of Education, Nantong University, Nantong, Jiangsu, 226001, P. R. China
| | - Bin Yu
- Key Laboratory of Neuroregeneration of Jiangsu and Ministry of Education, Nantong University, Nantong, Jiangsu, 226001, P. R. China
| | - Xuhua Wang
- Department of Neurobiology and Department of Rehabilitation Medicine, First Affiliated Hospital, Zhejiang University School of Medicine, Hangzhou, Zhejiang Province, 310003, P. R. China
- Nanhu Brain-Computer Interface Institute, Hangzhou, 311100, China
- Liangzhu Laboratory, MOE Frontier Science Center for Brain Science and Brain-Machine Integration, State Key Laboratory of Brain-Machine Intelligence, Zhejiang University, 1369 West Wenyi Road, Hangzhou, 311121, China
- NHC and CAMS Key Laboratory of Medical Neurobiology, Zhejiang University, Hangzhou, 310058, China
- Lingang Laboratory, Shanghai, 200031, China
- Co-Innovation Center of Neuroregeneration, Nantong University, Nantong, Jiangsu, 226001, P. R. China
|
31
|
Zhong J, Chen Y, Shi H, Zhou T, Wang C, Guo Z, Liang Y, Zhang Q, Sun M. Identification and functional analysis of terpene synthases revealing the secrets of aroma formation in Chrysanthemum aromaticum. Int J Biol Macromol 2024; 279:135377. [PMID: 39244131 DOI: 10.1016/j.ijbiomac.2024.135377] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2024] [Revised: 08/18/2024] [Accepted: 09/04/2024] [Indexed: 09/09/2024]
Abstract
C. aromaticum is widely cultivated for its aromatic, medicinal, and tea-applicable properties, earning the nickname 'lavender in composite'. Terpenoids are the major compounds of C. aromaticum fragrance. To reveal the molecular mechanisms of terpenoid biosynthesis in C. aromaticum, NGS and SMRT sequencing were employed to identify the key terpene synthase genes. A total of 59,903 non-redundant transcripts were obtained by the transcriptome analysis. Twenty-nine terpene synthase genes (TPSs) were identified, and phylogenetic analysis showed that they belong to four subfamilies of terpene synthases. Five CaTPSs were successfully cloned. Subcellular localization showed they were present in the nucleus and cytosol. Structure models of five terpene synthases were predicted, and molecular docking results showed good binding affinities with FPP/GPP. In vitro enzymatic tests showed that CaTPS7, CaTPS8, CaTPS10 and CaTPS20 could catalyze substrates to produce terpenoids. CaTPS7 and CaTPS20 were both able to effectively convert the precursor FPP into caryophyllene. CaTPS8 could convert FPP to trans-nerolidol and nerolidyl acetate, while CaTPS10 could convert FPP to elemene and aristolochene. This study lays the groundwork for further research to depict the terpenoid metabolic network in C. aromaticum. These identified terpene synthase genes could be introduced into cultivated chrysanthemums to enhance their fragrance.
Affiliation(s)
- Jian Zhong
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Yuyuan Chen
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Huajin Shi
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Tongjun Zhou
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Chen Wang
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Ziyu Guo
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Yilin Liang
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Qixiang Zhang
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
| | - Ming Sun
- State Key Laboratory of Efficient Production of Forest Resources, National Engineering Research Center for Floriculture, Beijing Key Laboratory of Ornamental Plants Germplasm Innovation and Molecular Breeding, Beijing Laboratory of Urban and Rural Ecological Environment, Key Laboratory of Genetics and Breeding in Forest Trees and Ornamental Plants of Ministry of Education, School of Landscape Architecture, Beijing Forestry University, Beijing 100083, China
|
32
|
Tripp A, Braun M, Wieser F, Oberdorfer G, Lechner H. Click, Compute, Create: A Review of Web-based Tools for Enzyme Engineering. Chembiochem 2024; 25:e202400092. [PMID: 38634409 DOI: 10.1002/cbic.202400092] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 04/14/2024] [Accepted: 04/15/2024] [Indexed: 04/19/2024]
Abstract
Enzyme engineering, though pivotal across various biotechnological domains, is often plagued by its time-consuming and labor-intensive nature. This review aims to offer an overview of supportive in silico methodologies for this demanding endeavor. Starting from methods to predict protein structures, to classification of their activity and even the discovery of new enzymes, we continue by describing tools used to increase the thermostability and production yields of selected targets. Subsequently, we discuss computational methods to modulate both the activity and the selectivity of enzymes. Last, we present recent approaches based on cutting-edge machine learning methods to redesign enzymes. With the exception of the last chapter, there is a strong focus on methods easily accessible via web interfaces or simple Python scripts and therefore readily usable by a diverse and broad community.
Affiliation(s)
- Adrian Tripp
  - Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Markus Braun
  - Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Florian Wieser
  - Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
- Gustav Oberdorfer
  - Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
  - BioTechMed, Graz, Austria
- Horst Lechner
  - Institute of Biochemistry, Graz University of Technology, Petersgasse 12/2, 8010, Graz, Austria
  - BioTechMed, Graz, Austria
|
33
|
Schafer JW, Porter LL. AlphaFold2's training set powers its predictions of fold-switched conformations. bioRxiv 2024:2024.10.11.617857. [PMID: 39803493 PMCID: PMC11722258 DOI: 10.1101/2024.10.11.617857] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/23/2025]
Abstract
AlphaFold2 (AF2), a deep-learning-based model that predicts protein structures from their amino acid sequences, has recently been used to predict multiple protein conformations. In some cases, AF2 has successfully predicted both dominant and alternative conformations of fold-switching proteins, which remodel their secondary and tertiary structures in response to cellular stimuli. Whether AF2 has learned enough protein folding principles to reliably predict alternative conformations outside of its training set is unclear. Here, we address this question by assessing whether CFold, an implementation of the AF2 network trained on a more limited subset of experimentally determined protein structures, predicts alternative conformations of eight fold switchers from six protein families. Previous work suggests that AF2 predicted these alternative conformations by memorizing them during training. Unlike AF2, CFold's training set contains only one of these alternative conformations. Despite sampling 1300-4400 structures/protein with various sequence sampling techniques, CFold predicted only one alternative structure outside of its training set accurately and with high confidence, while also generating experimentally inconsistent structures with higher confidence. Though these results indicate that AF2's current success in predicting alternative conformations of fold switchers stems largely from its training data, results from a sequence pruning technique suggest developments that could lead to a more reliable generative model in the future.
Affiliation(s)
- Joseph W. Schafer
  - National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
- Lauren L. Porter
  - National Library of Medicine, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD, 20894, USA
  - National Heart, Lung, and Blood Institute, Biochemistry and Biophysics Center, National Institutes of Health, Bethesda, MD, 20892, USA
|
34
|
Benavides TL, Montelione GT. Integrative Modeling of Protein-Polypeptide Complexes by Bayesian Model Selection using AlphaFold and NMR Chemical Shift Perturbation Data. bioRxiv 2024:2024.09.19.613999. [PMID: 39345459 PMCID: PMC11430059 DOI: 10.1101/2024.09.19.613999] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/01/2024]
Abstract
Protein-polypeptide interactions, including those involving intrinsically-disordered peptides and intrinsically-disordered regions of protein binding partners, are crucial for many biological functions. However, experimental structure determination of protein-peptide complexes can be challenging. Computational methods, while promising, generally require experimental data for validation and refinement. Here we present CSP_Rank, an integrated modeling approach to determine the structures of protein-peptide complexes. This method combines AlphaFold2 (AF2) enhanced sampling methods with a Bayesian conformational selection process based on experimental Nuclear Magnetic Resonance (NMR) Chemical Shift Perturbation (CSP) data and AF2 confidence metrics. Using a curated dataset of 108 protein-peptide complexes from the Biological Magnetic Resonance Data Bank (BMRB), we observe that while AF2 typically yields models with excellent consistency with experimental CSP data, applying enhanced sampling followed by data-guided conformational selection routinely results in ensembles of structures with improved agreement with NMR observables. For two systems, we cross-validate the CSP-selected models using independently acquired nuclear Overhauser effect (NOE) NMR data and demonstrate how CSP and NMR can be combined using our Bayesian framework for model selection. CSP_Rank is a novel method for integrative modeling of protein-peptide complexes and has broad implications for studies of protein-peptide interactions and aiding in understanding their biological functions.
Affiliation(s)
- Tiburon L. Benavides
  - Department of Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
- Gaetano T. Montelione
  - Department of Chemistry and Chemical Biology, Center for Biotechnology and Interdisciplinary Sciences, Rensselaer Polytechnic Institute, Troy, NY 12180 USA
|
35
|
Raisinghani N, Alshahrani M, Gupta G, Verkhivker G. Predicting Mutation-Induced Allosteric Changes in Structures and Conformational Ensembles of the ABL Kinase Using AlphaFold2 Adaptations with Alanine Sequence Scanning. Int J Mol Sci 2024; 25:10082. [PMID: 39337567 PMCID: PMC11432724 DOI: 10.3390/ijms251810082] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/14/2024] [Revised: 09/18/2024] [Accepted: 09/18/2024] [Indexed: 09/30/2024] Open
Abstract
Despite the success of AlphaFold2 approaches in predicting single protein structures, these methods showed intrinsic limitations in predicting multiple functional conformations of allosteric proteins and have been challenged to accurately capture the effects of single point mutations that induce significant structural changes. We examined several implementations of AlphaFold2 methods to predict conformational ensembles for state-switching mutants of the ABL kinase. The results revealed that a combination of randomized alanine sequence masking with shallow multiple sequence alignment subsampling can significantly expand the conformational diversity of the predicted structural ensembles and capture shifts in populations of the active and inactive ABL states. Consistent with the NMR experiments, the predicted conformational ensembles for the M309L/L320I and M309L/H415P ABL mutants that perturb the regulatory spine networks featured an increased population of the fully closed inactive state. The proposed adaptation of AlphaFold can reproduce the experimentally observed mutation-induced redistributions in the relative populations of the active and inactive ABL states and capture the effects of regulatory mutations on allosteric structural rearrangements of the kinase domain. The ensemble-based network analysis complemented AlphaFold predictions by revealing allosteric hotspots that correspond to state-switching mutational sites, which may explain the global effect of regulatory mutations on structural changes between the ABL states. This study suggested that attention-based learning of long-range dependencies between sequence positions in homologous folds and deciphering patterns of allosteric interactions may further augment the predictive abilities of AlphaFold methods for modeling of alternative protein states, conformational ensembles and mutation-induced structural transformations.
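To make the input-preparation idea in this abstract concrete, the following is a minimal Python sketch of combining shallow MSA subsampling with randomized alanine masking. It is not the authors' pipeline: the abstract does not state the alignment depth, the masking fraction, or which rows are masked, so `depth`, `fraction`, `n_runs`, masking every retained row, and all function names are assumptions made for illustration. No AlphaFold2 call is made; the sketch only generates perturbed alignments that could be passed to a predictor.

```python
import random

def subsample_msa(msa, depth, rng):
    """Keep the query (first row) plus a random subset of the other rows.

    `depth` is the assumed total number of rows in the shallow alignment.
    """
    query, others = msa[0], msa[1:]
    k = max(0, min(depth - 1, len(others)))
    return [query] + rng.sample(others, k)

def mask_with_alanine(sequence, fraction, rng):
    """Replace a random fraction of non-gap positions with alanine ('A')."""
    positions = [i for i, aa in enumerate(sequence) if aa != "-"]
    n_mask = int(len(positions) * fraction)
    chosen = set(rng.sample(positions, n_mask))
    return "".join("A" if i in chosen else aa for i, aa in enumerate(sequence))

def make_inputs(msa, depth=8, fraction=0.1, n_runs=5, seed=0):
    """Build `n_runs` perturbed shallow alignments (hypothetical defaults)."""
    rng = random.Random(seed)
    inputs = []
    for _ in range(n_runs):
        shallow = subsample_msa(msa, depth, rng)
        # Assumption: every retained row is masked independently.
        inputs.append([mask_with_alanine(seq, fraction, rng) for seq in shallow])
    return inputs
```

Each perturbed alignment would be submitted as a separate prediction run, and the resulting structures pooled into an ensemble; how the pooled structures are assigned to active or inactive states is outside the scope of this sketch.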
Affiliation(s)
- Nishank Raisinghani
  - Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Mohammed Alshahrani
  - Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Grace Gupta
  - Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
- Gennady Verkhivker
  - Keck Center for Science and Engineering, Schmid College of Science and Technology, Chapman University, Orange, CA 92866, USA
  - Department of Biomedical and Pharmaceutical Sciences, Chapman University School of Pharmacy, Irvine, CA 92618, USA
|
36
|
Bhatt R, Wang A, Durrant JD. Teaching old docks new tricks with machine learning enhanced ensemble docking. Sci Rep 2024; 14:20722. [PMID: 39237737 PMCID: PMC11377811 DOI: 10.1038/s41598-024-71699-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2024] [Accepted: 08/30/2024] [Indexed: 09/07/2024] Open
Abstract
We here introduce Ensemble Optimizer (EnOpt), a machine-learning tool to improve the accuracy and interpretability of ensemble virtual screening (VS). Ensemble VS is an established method for predicting protein/small-molecule (ligand) binding. Unlike traditional VS, which focuses on a single protein conformation, ensemble VS better accounts for protein flexibility by predicting binding to multiple protein conformations. Each compound is thus associated with a spectrum of scores (one score per protein conformation) rather than a single score. To effectively rank and prioritize the molecules for further evaluation (including experimental testing), researchers must select which protein conformations to consider and how best to map each compound's spectrum of scores to a single value, decisions that are system-specific. EnOpt uses machine learning to address these challenges. We perform benchmark VS to show that for many systems, EnOpt ranking distinguishes active compounds from inactive or decoy molecules more effectively than traditional ensemble VS methods. To encourage broad adoption, we release EnOpt free of charge under the terms of the MIT license.
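The score-mapping problem described in this abstract can be illustrated with a short Python sketch. It shows two conventional ways of collapsing a compound's spectrum of docking scores (one per protein conformation) into a single ranking value: the best score and the mean score. This is not EnOpt's machine-learning approach; the function names, the example data, and the convention that lower scores indicate stronger predicted binding are all assumptions.

```python
from statistics import mean

def aggregate_scores(score_spectrum, how="best"):
    """Collapse one compound's per-conformation scores to a single value.

    Assumes lower (more negative) scores mean stronger predicted binding.
    """
    if how == "best":
        return min(score_spectrum)
    if how == "mean":
        return mean(score_spectrum)
    raise ValueError(f"unknown aggregation: {how}")

def rank_compounds(scores_by_compound, how="best"):
    """Return compound names ordered from best to worst aggregated score."""
    return sorted(
        scores_by_compound,
        key=lambda name: aggregate_scores(scores_by_compound[name], how),
    )

# Hypothetical docking scores: one value per protein conformation.
example_scores = {
    "cmpd_A": [-7.1, -9.3, -6.0],
    "cmpd_B": [-8.0, -8.2, -8.1],
    "cmpd_C": [-5.5, -6.1, -5.9],
}

print(rank_compounds(example_scores, "best"))
print(rank_compounds(example_scores, "mean"))
```

With these made-up numbers the two aggregations disagree about whether `cmpd_A` or `cmpd_B` ranks first, which is the kind of system-specific choice the abstract says must otherwise be made by hand.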
Affiliation(s)
- Roshni Bhatt
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Ann Wang
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
- Jacob D Durrant
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, 15260, USA
|
37
|
Wang J, Lu X, Zhuge B, Zong H. Enhancing the catalytic efficiency of M32 carboxypeptidase by semi-rational design and its applications in food taste improvement. J Sci Food Agric 2024; 104:7375-7385. [PMID: 38666395 DOI: 10.1002/jsfa.13558] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2024] [Revised: 04/02/2024] [Accepted: 04/26/2024] [Indexed: 05/09/2024]
Abstract
BACKGROUND Carboxypeptidase is an exopeptidase that hydrolyzes amino acids at the C-terminal end of the peptide chain and has a wide range of applications in food. However, in industrial applications, the relatively low catalytic efficiency of carboxypeptidases is one of the main limiting factors for industrialization. RESULTS The study has enhanced the catalytic efficiency of Bacillus megaterium M32 carboxypeptidase (BmeCPM32) through semi-rational design. Firstly, the specific activity of the optimal mutant, BmeCPM32-M2, obtained through single-site mutagenesis and combinatorial mutagenesis, was 2.2-fold higher than that of the wild type (187.9 versus 417.8 U mg⁻¹), and the catalytic efficiency was 2.9-fold higher (110.14 versus 325.75 s⁻¹ mmol⁻¹). Secondly, compared to the wild type, BmeCPM32-M2 exhibited a 1.8-fold increase in half-life at 60 °C, with no significant changes in its enzymatic properties (optimal pH, optimal temperature). Finally, BmeCPM32-M2 significantly increased the umami intensity of soy protein isolate hydrolysate by 55% and reduced bitterness by 83%, indicating its potential in developing tasty protein components. CONCLUSION Our research has revealed that the strategy based on protein sequence evolution and computational residue mutation energy led to an improved catalytic efficiency of BmeCPM32. Molecular dynamics simulations have revealed that a smaller substrate binding pocket and increased enzyme-substrate affinity are the reasons for the enhanced catalytic efficiency. Furthermore, the number of hydrogen bonds and solvent and surface area may contribute to the improvement of thermostability. Finally, the de-bittering effect of BmeCPM32-M2 in soy protein isolate hydrolysate suggests its potential in developing palatable protein components. © 2024 Society of Chemical Industry.
Affiliation(s)
- Jinjiang Wang
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Research Centre of Industrial Microbiology, School of Biotechnology, Jiangnan University, Wuxi, China
- Xinyao Lu
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Research Centre of Industrial Microbiology, School of Biotechnology, Jiangnan University, Wuxi, China
- Bin Zhuge
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Research Centre of Industrial Microbiology, School of Biotechnology, Jiangnan University, Wuxi, China
- Hong Zong
  - Key Laboratory of Industrial Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Key Laboratory of Carbohydrate Chemistry and Biotechnology, Ministry of Education, School of Biotechnology, Jiangnan University, Wuxi, China
  - Research Centre of Industrial Microbiology, School of Biotechnology, Jiangnan University, Wuxi, China
|
38
|
Pons JL, Reys V, Grand F, Moreau V, Gracy J, Exner TE, Labesse G. @TOME 3.0: Interfacing Protein Structure Modeling and Ligand Docking. J Mol Biol 2024; 436:168704. [PMID: 39237192 DOI: 10.1016/j.jmb.2024.168704] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 01/31/2024] [Revised: 07/02/2024] [Accepted: 07/09/2024] [Indexed: 09/07/2024]
Abstract
Knowledge of protein-ligand complexes is essential for efficient drug design. Virtual docking can provide important information on putative complexes, but it is still far from being simultaneously fast and accurate. Receptors are flexible and adapt to the incoming small molecules, while docking is highly sensitive to small conformational deviations. Conformational ensembles provide a means to simulate protein flexibility. However, modeling multiple protein structures for many targets is seldom connected to ligand screening in an efficient and straightforward manner. @TOME-3 is an updated version of our former pipeline @TOME-2, in which protein structure modeling is now directly interfaced with flexible ligand docking. Sequence-sequence profile comparisons identify suitable PDB templates for structure modeling, and ligands from these templates are used to deduce binding sites to be screened. In addition, bound ligands can be used as pharmacophoric restraints during the virtual docking. The latter is performed by PLANTS, while the docking poses are analysed through multiple chemoinformatics functions. This unique combination of tools allows rapid and efficient ligand docking on multiple receptor conformations in parallel. @TOME-3 is freely available on the web at https://atome.cbs.cnrs.fr.
Affiliation(s)
- Jean-Luc Pons
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
- Victor Reys
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
- François Grand
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
- Violaine Moreau
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
- Jerôme Gracy
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
- Thomas E Exner
  - Seven Past Nine d.o.o., Hribljane 10, 1380 Cerknica, Slovenia
- Gilles Labesse
  - A.B.C.I.S, CNRS UMR5048 - INSERM U1054 - Université de Montpellier 29, Rue de Navacelles, 34090 Montpellier Cedex, France
|
39
|
Nithin C, Fornari RP, Pilla SP, Wroblewski K, Zalewski M, Madaj R, Kolinski A, Macnar JM, Kmiecik S. Exploring protein functions from structural flexibility using CABS-flex modeling. Protein Sci 2024; 33:e5090. [PMID: 39194135 PMCID: PMC11350595 DOI: 10.1002/pro.5090] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2024] [Revised: 05/06/2024] [Accepted: 06/10/2024] [Indexed: 08/29/2024]
Abstract
Understanding protein function often necessitates characterizing the flexibility of protein structures. However, simulating protein flexibility poses significant challenges due to the complex dynamics of protein systems, requiring extensive computational resources and accurate modeling techniques. In response to these challenges, the CABS-flex method has been developed as an efficient modeling tool that combines coarse-grained simulations with all-atom detail. Available both as a web server and a standalone package, CABS-flex is dedicated to a wide range of users. The web server version offers an accessible interface for straightforward tasks, while the standalone command-line program is designed for advanced users, providing additional features, analytical tools, and support for handling large systems. This paper examines the application of CABS-flex across various structure-function studies, facilitating investigations into the interplay among protein structure, dynamics, and function in diverse research fields. We present an overview of the current status of the CABS-flex methodology, highlighting its recent advancements, practical applications, and forthcoming challenges.
Affiliation(s)
- Chandran Nithin
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Rocco Peter Fornari
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Smita P. Pilla
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Karol Wroblewski
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Mateusz Zalewski
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Rafał Madaj
  - Institute of Evolutionary Biology, Biological and Chemical Research Centre, Faculty of Biology, University of Warsaw, Warsaw, Poland
- Andrzej Kolinski
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Joanna M. Macnar
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
  - Present address: Ryvu Therapeutics, Cracow, Poland
- Sebastian Kmiecik
  - Biological and Chemical Research Centre, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
|
40
|
Liu Q, Meng X, Song Z, Shao Y, Zhao Y, Fang R, Huo Y, Zhang L. Insect-transmitted plant virus balances its vertical transmission through regulating Rab1-mediated receptor localization. Cell Rep 2024; 43:114571. [PMID: 39093698 DOI: 10.1016/j.celrep.2024.114571] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 06/23/2024] [Accepted: 07/17/2024] [Indexed: 08/04/2024] Open
Abstract
Rice stripe virus (RSV) establishes infection in the ovaries of its vector insect, Laodelphax striatellus. We demonstrate that RSV infection delays ovarian maturation by inhibiting membrane localization of the vitellogenin receptor (VgR), thereby reducing the vitellogenin (Vg) accumulation essential for egg development. We identify the host protein L. striatellus Rab1 protein (LsRab1), which directly interacts with RSV nucleocapsid protein (NP) within nurse cells. LsRab1 is required for VgR surface localization and ovarian Vg accumulation. RSV inhibits LsRab1 function through two mechanisms: NP binding LsRab1 prevents GTP binding, and NP binding LsRab1-GTP complexes stimulates GTP hydrolysis, forming an inactive LsRab1 form. Through this dual inhibition, RSV infection prevents LsRab1 from facilitating VgR trafficking to the cell membrane, leading to inefficient Vg uptake. The Vg-VgR pathway is present in most oviparous animals, and the mechanisms detailed here provide insights into the vertical transmission of other insect-transmitted viruses of medical and agricultural importance.
Affiliation(s)
- Qing Liu
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Beijing 100049, China
- Xiangyi Meng
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Beijing 100049, China
- Zhiyu Song
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Beijing 100049, China
- Ying Shao
  - College of Veterinary Medicine, Shanxi Agricultural University, Jinzhong, Shanxi Province 030801, China
- Yao Zhao
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; University of the Chinese Academy of Sciences, Beijing 100049, China
- Rongxiang Fang
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China; University of the Chinese Academy of Sciences, Beijing 100049, China
- Yan Huo
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
- Lili Zhang
  - State Key Laboratory of Plant Genomics, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China; CAS Center for Excellence in Biotic Interactions, University of Chinese Academy of Sciences, Beijing 100049, China
|
41
|
Guan X, Tang QY, Ren W, Chen M, Wang W, Wolynes PG, Li W. Predicting protein conformational motions using energetic frustration analysis and AlphaFold2. Proc Natl Acad Sci U S A 2024; 121:e2410662121. [PMID: 39163334 PMCID: PMC11363347 DOI: 10.1073/pnas.2410662121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2024] [Accepted: 07/16/2024] [Indexed: 08/22/2024] Open
Abstract
Proteins perform their biological functions through motion. Although high-throughput prediction of the three-dimensional static structures of proteins has proved feasible using deep-learning-based methods, predicting the conformational motions remains a challenge. Purely data-driven machine learning methods encounter difficulty in addressing such motions because available laboratory data on conformational motions are still limited. In this work, we develop a method for generating protein allosteric motions by integrating physical energy landscape information into deep-learning-based methods. We show that local energetic frustration, which represents a quantification of the local features of the energy landscape governing protein allosteric dynamics, can be utilized to empower AlphaFold2 (AF2) to predict protein conformational motions. Starting from ground state static structures, this integrative method generates alternative structures as well as pathways of protein conformational motions, using a progressive enhancement of the energetic frustration features in the input multiple sequence alignment sequences. For the model protein adenylate kinase, we show that the generated conformational motions are consistent with available experimental and molecular dynamics simulation data. Applying the method to two other proteins, KaiB and ribose-binding protein, which involve large-amplitude conformational changes, also successfully generates the alternative conformations. We also show how to extract overall features of the AF2 energy landscape topography, which has been considered by many to be a black box. Incorporating physical knowledge into deep-learning-based structure prediction algorithms provides a useful strategy to address the challenges of dynamic structure prediction of allosteric proteins.
Affiliation(s)
- Xingyue Guan
  - Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
- Qian-Yuan Tang
  - Department of Physics, Hong Kong Baptist University, Kowloon Tong, Hong Kong Special Administrative Region 999077, China
- Weitong Ren
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
- Wei Wang
  - Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
- Peter G. Wolynes
  - Center for Theoretical Biological Physics, Rice University, Houston, TX 77005
- Wenfei Li
  - Department of Physics, National Laboratory of Solid State Microstructure, Nanjing University, Nanjing 210093, China
  - Wenzhou Key Laboratory of Biophysics, Wenzhou Institute, University of Chinese Academy of Sciences, Wenzhou, Zhejiang 325000, China
|
42
|
Chakravarty D, Schafer JW, Chen EA, Thole JF, Ronish LA, Lee M, Porter LL. AlphaFold predictions of fold-switched conformations are driven by structure memorization. Nat Commun 2024; 15:7296. [PMID: 39181864 PMCID: PMC11344769 DOI: 10.1038/s41467-024-51801-z] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/13/2024] [Accepted: 08/19/2024] [Indexed: 08/27/2024] Open
Abstract
Recent work suggests that AlphaFold (AF), a deep learning-based model that can accurately infer protein structure from sequence, may discern important features of folded protein energy landscapes, defined by the diversity and frequency of different conformations in the folded state. Here, we test the limits of its predictive power on fold-switching proteins, which assume two structures with regions of distinct secondary and/or tertiary structure. We find that (1) AF is a weak predictor of fold switching and (2) some of its successes result from memorization of training-set structures rather than learned protein energetics. Combining >280,000 models from several implementations of AF2 and AF3, a 35% success rate was achieved for fold switchers likely in AF's training sets. AF2's confidence metrics selected against models consistent with experimentally determined fold-switching structures and failed to discriminate between low and high energy conformations. Further, AF captured only one out of seven experimentally confirmed fold switchers outside of its training sets despite extensive sampling of an additional ~280,000 models. Several observations indicate that AF2 has memorized structural information during training, and AF3 misassigns coevolutionary restraints. These limitations constrain the scope of successful predictions, highlighting the need for physically based methods that readily predict multiple protein conformations.
Affiliation(s)
- Devlina Chakravarty
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Joseph W Schafer
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Ethan A Chen
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Joseph F Thole
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
  - Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Leslie A Ronish
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
  - Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
- Myeongsang Lee
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
- Lauren L Porter
  - National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, 20894, USA
  - Biochemistry and Biophysics Center, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
|
43
|
Kovalevskiy O, Mateos-Garcia J, Tunyasuvunakool K. AlphaFold two years on: Validation and impact. Proc Natl Acad Sci U S A 2024; 121:e2315002121. [PMID: 39133843 PMCID: PMC11348012 DOI: 10.1073/pnas.2315002121] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 08/29/2024] Open
Abstract
Two years on from the initial release of AlphaFold, we have seen its widespread adoption as a structure prediction tool. Here, we discuss some of the latest work based on AlphaFold, with a particular focus on its use within the structural biology community. This encompasses use cases like speeding up structure determination itself, enabling new computational studies, and building new tools and workflows. We also look at the ongoing validation of AlphaFold, as its predictions continue to be compared against large numbers of experimental structures to further delineate the model's capabilities and limitations.
|
44
|
Correa Marrero M, Jänes J, Baptista D, Beltrao P. Integrating Large-Scale Protein Structure Prediction into Human Genetics Research. Annu Rev Genomics Hum Genet 2024; 25:123-140. [PMID: 38621234 DOI: 10.1146/annurev-genom-120622-020615] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/17/2024]
Abstract
The last five years have seen impressive progress in deep learning models applied to protein research. Most notably, sequence-based structure predictions have seen transformative gains in the form of AlphaFold2 and related approaches. Millions of missense protein variants in the human population lack annotations, and these computational methods are a valuable means to prioritize variants for further analysis. Here, we review the recent progress in deep learning models applied to the prediction of protein structure and protein variants, with particular emphasis on their implications for human genetics and health. Improved prediction of protein structures facilitates annotations of the impact of variants on protein stability, protein-protein interaction interfaces, and small-molecule binding pockets. Moreover, it contributes to the study of host-pathogen interactions and the characterization of protein function. As genome sequencing in large cohorts becomes increasingly prevalent, we believe that better integration of state-of-the-art protein informatics technologies into human genetics research is of paramount importance.
Affiliation(s)
- Miguel Correa Marrero
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland;
| | - Jürgen Jänes
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland;
| | | | - Pedro Beltrao
- Instituto Gulbenkian de Ciência, Oeiras, Portugal
- SIB Swiss Institute of Bioinformatics, Lausanne, Switzerland
- Institute of Molecular Systems Biology, Department of Biology, ETH Zurich, Zurich, Switzerland;
| |
|
45
|
Agarwal V, McShan AC. The power and pitfalls of AlphaFold2 for structure prediction beyond rigid globular proteins. Nat Chem Biol 2024; 20:950-959. [PMID: 38907110 PMCID: PMC11956457 DOI: 10.1038/s41589-024-01638-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/22/2023] [Accepted: 04/29/2024] [Indexed: 06/23/2024]
Abstract
Artificial intelligence-driven advances in protein structure prediction in recent years have raised the question: has the protein structure-prediction problem been solved? Here, with a focus on nonglobular proteins, we highlight the many strengths and potential weaknesses of DeepMind's AlphaFold2 in the context of its biological and therapeutic applications. We summarize the subtleties associated with evaluation of AlphaFold2 model quality and reliability using the predicted local distance difference test (pLDDT) and predicted aligned error (PAE) values. We highlight various classes of proteins that AlphaFold2 can be applied to and the caveats involved. Concrete examples of how AlphaFold2 models can be integrated with experimental data in the form of small-angle X-ray scattering (SAXS), solution NMR, cryo-electron microscopy (cryo-EM) and X-ray diffraction are discussed. Finally, we highlight the need to move beyond structure prediction of rigid, static structural snapshots toward conformational ensembles and alternate biologically relevant states. The overarching theme is that careful consideration is due when using AlphaFold2-generated models to generate testable hypotheses and structural models, rather than treating predicted models as de facto ground truth structures.
Affiliation(s)
- Vinayak Agarwal
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA, USA.
| | - Andrew C McShan
- School of Chemistry and Biochemistry, Georgia Institute of Technology, Atlanta, GA, USA.
| |
|
46
|
Berksoz M, Atilgan C. Allosteric modulation of fluorescence revealed by hydrogen bond dynamics in a genetically encoded maltose biosensor. Proteins 2024; 92:923-932. [PMID: 38572606 DOI: 10.1002/prot.26688] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/18/2023] [Revised: 03/02/2024] [Accepted: 03/22/2024] [Indexed: 04/05/2024]
Abstract
Genetically encoded fluorescent biosensors (GEFBs) have proved to be reliable tracers for many metabolites and cellular processes. In the simplest case, a fluorescent protein (FP) is genetically fused to a sensing protein which undergoes a conformational change upon ligand binding. This drives a rearrangement in the chromophore environment and changes the spectral properties of the FP. Structural determinants of successful biosensors are revealed only in hindsight when the crystal structures of both ligand-bound and ligand-free forms are available. This makes the development of new biosensors for desired analytes a long trial-and-error process. In the current study, we conducted μs-long all-atom molecular dynamics (MD) simulations of a maltose biosensor in both the apo (dark) and holo (bright) forms. We performed detailed hydrogen bond occupancy analyses to shed light on the mechanism of ligand-induced conformational change in the sensor protein and its allosteric effect on the chromophore environment. We find that two strong indicators for distinguishing bright and dark states of biosensors are substantial changes in hydrogen bond dynamics in the system and in the solvent accessibility of the chromophore.
Affiliation(s)
- Melike Berksoz
- Faculty of Engineering and Natural Sciences, Sabanci University, Turkey
| | - Canan Atilgan
- Faculty of Engineering and Natural Sciences, Sabanci University, Turkey
| |
|
47
|
Romagnoli A, Rexha J, Perta N, Di Cristofano S, Borgognoni N, Venturini G, Pignotti F, Raimondo D, Borsello T, Di Marino D. Peptidomimetics design and characterization: Bridging experimental and computer-based approaches. Prog Mol Biol Transl Sci 2024; 212:279-327. [PMID: 40122649 DOI: 10.1016/bs.pmbts.2024.07.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2025]
Abstract
Peptidomimetics, designed to mimic peptide biological activity with more drug-like properties, are increasingly pivotal in medicinal chemistry. They offer enhanced systemic delivery, cell penetration, target specificity, and protection against peptidases when compared to their native peptide counterparts. Already utilized in treating diverse diseases like neurodegenerative disorders, cancer and infectious diseases, their future in medicine seems bright, with many peptidomimetics in clinical trials or development stages. Peptidomimetics are well-suited for addressing disturbed protein-protein interactions (PPIs), which often underlie various pathologies. Structural biology and computational methods like molecular dynamics simulations facilitate rational design, whereas machine learning algorithms accelerate protein structure prediction, enabling efficient drug development. Experimental validation via various spectroscopic, biophysical, and biochemical assays confirms computational predictions and guides further optimization. Peptidomimetics, with their tailored constrained structures, represent a frontier in drug design focused on targeting PPIs. In this overview, we present a comprehensive landscape of peptidomimetics, encompassing perspectives on involvement in pathologies, chemical strategies, and methodologies for their characterization, spanning in silico, in vitro and in cell approaches. With increasing interest from pharmaceutical sectors, peptidomimetics hold promise for revolutionizing therapeutic approaches, marking a new era of precision drug discovery.
Affiliation(s)
- Alice Romagnoli
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy; Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy.
| | - Jesmina Rexha
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy; Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy
| | - Nunzio Perta
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy; Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy
| | | | - Noemi Borgognoni
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy; Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy
| | - Gloria Venturini
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy
| | - Francesco Pignotti
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy
| | - Domenico Raimondo
- Department of Molecular Medicine, Sapienza University of Rome, Rome, Italy; National Biodiversity Future Center (NBFC), Rome, Italy
| | - Tiziana Borsello
- Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy; Department of Pharmacological and Biomolecular Sciences, University of Milan, Milan, Italy.
| | - Daniele Di Marino
- Department of Life and Environmental Sciences, Polytechnic University of Marche, Ancona, Italy; New York-Marche Structural Biology Centre (NY-MaSBiC), Polytechnic University of Marche, Ancona, Italy; Neuronal Death and Neuroprotection Unit, Department of Neuroscience, Mario Negri Institute for Pharmacological Research-IRCCS, Milan, Italy
| |
|
48
|
Ivashchenko SD, Shulga DA, Ivashchenko VD, Zinovev EV, Vlasov AV. In silico studies of the open form of human tissue transglutaminase. Sci Rep 2024; 14:15981. [PMID: 38987418 PMCID: PMC11236986 DOI: 10.1038/s41598-024-66348-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/07/2024] [Accepted: 07/01/2024] [Indexed: 07/12/2024]
Abstract
Human tissue transglutaminase (tTG) is an intriguing multifunctional enzyme involved in various diseases, including celiac disease and neurological disorders. Although a number of tTG inhibitors have been developed, the molecular determinants governing ligand binding remain incomplete due to the lack of high-resolution structural data in the vicinity of its active site. In this study, we obtained the complete high-resolution model of tTG by in silico methods based on available PDB structures. We discovered significant differences in the active site architecture between our model and known tTG models, revealing an additional loop which affects the ligand binding affinity. We assembled a library of new potential tTG inhibitors based on the obtained complete model of the enzyme. Our library substantially expands the spectrum of possible drug candidates targeting tTG and encompasses twelve molecular scaffolds, eleven of which are novel and exhibit higher binding affinity than already known ones, according to our in silico studies. The results of this study open new directions for structure-based drug design of tTG inhibitors, offering the complete protein model and suggesting a wide range of new compounds for further experimental validation.
Affiliation(s)
- S D Ivashchenko
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia, 141701
- Laboratory of Microbiology, BIOTECH University, Moscow, Russia, 125080
| | - D A Shulga
- Department of Chemistry, Moscow State University, Moscow, Russia, 119991
| | - V D Ivashchenko
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia, 141701
| | - E V Zinovev
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia, 141701
| | - A V Vlasov
- Moscow Institute of Physics and Technology, Dolgoprudny, Russia, 141701.
- Laboratory of Microbiology, BIOTECH University, Moscow, Russia, 125080.
- Joint Institute for Nuclear Research, Dubna, Russia, 141980.
| |
|
49
|
Koehl P, Navaza R, Tekpinar M, Delarue M. MinActionPath2: path generation between different conformations of large macromolecular assemblies by action minimization. Nucleic Acids Res 2024; 52:W256-W263. [PMID: 38783081 PMCID: PMC11223808 DOI: 10.1093/nar/gkae421] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/12/2024] [Revised: 04/25/2024] [Accepted: 05/07/2024] [Indexed: 05/25/2024]
Abstract
Recent progress in solving macromolecular structures and assemblies by cryogenic electron microscopy techniques enables sampling of their conformations in different states that are relevant to their biological function. Knowing the transition path between these conformations would provide new avenues for drug discovery. While the experimental study of transition paths is intrinsically difficult, in silico methods can be used to generate an initial guess for those paths. The Elastic Network Model (ENM) and a coarse-grained (CG) representation of the structures are among the most popular models for exploring such paths. Here we propose an update to our software platform MinActionPath that generates non-linear transition paths based on ENM and CG models, using action minimization to solve the equations of motion. The new website enables the study of large structures such as ribosomes or entire virus envelopes. It provides direct visualization of the trajectories along with quantitative analyses of their behaviors at http://dynstr.pasteur.fr/servers/minactionpath/minactionpath2_submission.
Affiliation(s)
- Patrice Koehl
- Department of Computer Science and Genome Centre, University of California, Davis, CA 95616, USA
| | - Rafael Navaza
- Plateforme de Cristallographie, C2RT, Institut Pasteur, Université Paris Cité, UMR 3528 du CNRS, 75015 Paris, France
| | - Mustafa Tekpinar
- Unité Architecture et Dynamique des Macromolécules Biologiques, Institut Pasteur, Université Paris Cité, UMR 3528 du CNRS, 75015 Paris, France
| | - Marc Delarue
- Unité Architecture et Dynamique des Macromolécules Biologiques, Institut Pasteur, Université Paris Cité, UMR 3528 du CNRS, 75015 Paris, France
| |
|
50
|
Park H, Patel P, Haas R, Huerta EA. APACE: AlphaFold2 and advanced computing as a service for accelerated discovery in biophysics. Proc Natl Acad Sci U S A 2024; 121:e2311888121. [PMID: 38913887 PMCID: PMC11228474 DOI: 10.1073/pnas.2311888121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 12/25/2023] [Indexed: 06/26/2024]
Abstract
The prediction of protein 3D structure from amino acid sequence is a computational grand challenge in biophysics and plays a key role in robust protein structure prediction algorithms, from drug discovery to genome interpretation. The advent of AI models, such as AlphaFold, is revolutionizing applications that depend on robust protein structure prediction algorithms. To maximize the impact, and ease the usability, of these AI tools we introduce APACE, AlphaFold2 and advanced computing as a service, a computational framework that effectively handles this AI model and its TB-size database to conduct accelerated protein structure prediction analyses in modern supercomputing environments. We deployed APACE in the Delta and Polaris supercomputers and quantified its performance for accurate protein structure predictions using four exemplar proteins: 6AWO, 6OAN, 7MEZ, and 6D6U. Using up to 300 ensembles, distributed across 200 NVIDIA A100 GPUs, we found that APACE is up to two orders of magnitude faster than off-the-shelf AlphaFold2 implementations, reducing time-to-solution from weeks to minutes. This computational approach may be readily linked with robotics laboratories to automate and accelerate scientific discovery.
Affiliation(s)
- Hyun Park
- Data Science and Learning Division, Argonne National Laboratory, Lemont, IL 60439
- Theoretical and Computational Biophysics Group, Beckman Institute for Advanced Science and Technology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- Center for Biophysics and Quantitative Biology, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Parth Patel
- Data Science and Learning Division, Argonne National Laboratory, Lemont, IL 60439
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - Roland Haas
- National Center for Supercomputing Applications, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| | - E A Huerta
- Data Science and Learning Division, Argonne National Laboratory, Lemont, IL 60439
- Department of Computer Science, The University of Chicago, Chicago, IL 60637
- Department of Physics, University of Illinois at Urbana-Champaign, Urbana, IL 61801
| |
|