1
Christoffer C, Kagaya Y, Verburgt J, Terashi G, Shin WH, Jain A, Sarkar D, Aderinwale T, Maddhuri Venkata Subramaniya SR, Wang X, Zhang Z, Zhang Y, Kihara D. Integrative Protein Assembly With LZerD and Deep Learning in CAPRI 47-55. Proteins 2025. [PMID: 40095385 DOI: 10.1002/prot.26818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Accepted: 02/18/2025] [Indexed: 03/19/2025]
Abstract
We report the performance of the protein complex prediction approaches of our group and their results in CAPRI Rounds 47-55, excluding the joint CASP Rounds 50 and 54, as well as the special COVID-19 Round 51. Our approaches integrated classical pipelines developed in our group as well as more recently developed deep learning pipelines. In the cases of human group prediction, we surveyed the literature to find information to integrate into the modeling, such as assayed interface residues. In addition to any literature information, generated complex models were selected by a rank aggregation of statistical scoring functions, by generative model confidence, or by expert inspection. In these CAPRI rounds, our human group successfully modeled eight interfaces and achieved the top quality level among the submissions for all of them, including two where no other group did. We note that components of our modeling pipelines have become increasingly unified within deep learning approaches. Finally, we discuss several case studies that illustrate successful and unsuccessful modeling using our approaches.
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Rosen Center for Advanced Computing, Purdue University, West Lafayette, Indiana, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Jacob Verburgt
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Woong-Hee Shin
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - College of Medicine, Korea University, Seoul, South Korea
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - Purdue University Institute for Cancer Research, Purdue University, West Lafayette, Indiana, USA
2
Ding X, Chen X, Sullivan EE, Shay TF, Gradinaru V. Fast, accurate ranking of engineered proteins by target-binding propensity using structure modeling. Mol Ther 2024; 32:1687-1700. [PMID: 38582966 PMCID: PMC11184338 DOI: 10.1016/j.ymthe.2024.04.003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 02/08/2024] [Accepted: 04/03/2024] [Indexed: 04/08/2024] Open
Abstract
Deep-learning-based methods for protein structure prediction have achieved unprecedented accuracy, yet their utility in the engineering of protein-based binders remains constrained due to a gap between the ability to predict the structures of candidate proteins and the ability to prioritize proteins by their potential to bind to a target. To bridge this gap, we introduce Automated Pairwise Peptide-Receptor Analysis for Screening Engineered proteins (APPRAISE), a method for predicting the target-binding propensity of engineered proteins. After generating structural models of engineered proteins competing for binding to a target using an established structure prediction tool such as AlphaFold-Multimer or ESMFold, APPRAISE performs a rapid (under 1 CPU second per model) scoring analysis that takes into account biophysical and geometrical constraints. As proof-of-concept cases, we demonstrate that APPRAISE can accurately classify receptor-dependent vs. receptor-independent adeno-associated viral vectors and diverse classes of engineered proteins such as miniproteins targeting the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike, nanobodies targeting a G-protein-coupled receptor, and peptides that specifically bind to transferrin receptor or programmed death-ligand 1 (PD-L1). APPRAISE is accessible through a web-based notebook interface using Google Colaboratory (https://tiny.cc/APPRAISE). With its accuracy, interpretability, and generalizability, APPRAISE promises to expand the utility of protein structure prediction and accelerate protein engineering for biomedical applications.
Affiliation(s)
- Xiaozhe Ding
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Xinhong Chen
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Erin E Sullivan
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Timothy F Shay
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
- Viviana Gradinaru
  - Division of Biology and Biological Engineering, California Institute of Technology, 1200 E California Boulevard, Pasadena, CA 91125, USA
3
Corso G, Deng A, Fry B, Polizzi N, Barzilay R, Jaakkola T. Deep Confident Steps to New Pockets: Strategies for Docking Generalization. arXiv 2024; arXiv:2402.18396v1. [PMID: 38463508 PMCID: PMC10925391] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Accurate blind docking has the potential to lead to new biological breakthroughs, but for this promise to be realized, docking methods must generalize well across the proteome. Existing benchmarks, however, fail to rigorously assess generalizability. Therefore, we develop DockGen, a new benchmark based on the ligand-binding domains of proteins, and we show that existing machine learning-based docking models have very weak generalization abilities. We carefully analyze the scaling laws of ML-based docking and show that, by scaling data and model size, as well as integrating synthetic data strategies, we are able to significantly increase the generalization capacity and set new state-of-the-art performance across benchmarks. Further, we propose Confidence Bootstrapping, a new training paradigm that solely relies on the interaction between diffusion and confidence models and exploits the multi-resolution generation process of diffusion models. We demonstrate that Confidence Bootstrapping significantly improves the ability of ML-based docking methods to dock to unseen protein classes, edging closer to accurate and generalizable blind docking methods.
Affiliation(s)
- Benjamin Fry
  - Dana-Farber Cancer Institute and Harvard Medical School
4
Shin WH, Kihara D. PL-PatchSurfer3: Improved Structure-Based Virtual Screening for Structure Variation Using 3D Zernike Descriptors. bioRxiv 2024; 2024.02.22.581511. [PMID: 38464318 PMCID: PMC10925112 DOI: 10.1101/2024.02.22.581511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Structure-based virtual screening (SBVS) is a widely used method in in silico drug discovery. It requires a receptor structure or binding site to predict the binding pose and fitness of a ligand. Therefore, the performance of SBVS is affected by the protein conformation. The most frequently used method in SBVS is protein-ligand docking, which relies on atomic distance-based scoring functions. Hence, docking programs are highly sensitive to variation in receptor structure, and conformational change has been reported to significantly degrade their performance. To address the problem, we previously introduced a novel SBVS program named PL-PatchSurfer. This program makes use of molecular surface patches and the Zernike descriptor. The surfaces of the pocket and ligand are segmented into several patches by the program. These patches are then mapped with physico-chemical properties such as shape and electrostatic potential before being converted into the Zernike descriptor, which is rotationally invariant. Complementarity between the protein and the ligand is assessed by comparing the descriptors and the geometric distribution of the patches in the molecules. A benchmarking study showed that PL-PatchSurfer2 was able to screen active molecules quickly regardless of receptor structure change. However, the program could not achieve high performance for targets in which hydrogen bonding is important, such as nuclear hormone receptors. In this paper, we present the newer version of PL-PatchSurfer, PL-PatchSurfer3, which incorporates two new features: a change in the definition of hydrogen bond complementarity and consideration of visibility, which contains curvature information of a patch. Our evaluation demonstrates that the new program outperforms its predecessor and other SBVS methods while retaining its characteristic tolerance to receptor structure changes.
Interested individuals can access the program at kiharalab.org/plps3.
Affiliation(s)
- Woong-Hee Shin
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
- Daisuke Kihara
  - Department of Biological Science, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Center for Cancer Research, Purdue University, West Lafayette, IN, USA
5
Zhang Y, Wang X, Zhang Z, Huang Y, Kihara D. Assessment of Protein-Protein Docking Models Using Deep Learning. Methods Mol Biol 2024; 2780:149-162. [PMID: 38987469 DOI: 10.1007/978-1-0716-3985-6_10] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/12/2024]
Abstract
Protein-protein interactions are involved in almost all processes in a living cell and determine the biological functions of proteins. To obtain mechanistic understandings of protein-protein interactions, the tertiary structures of protein complexes have been determined by biophysical experimental methods, such as X-ray crystallography and cryogenic electron microscopy. However, as experimental methods are costly in resources, many computational methods have been developed that model protein complex structures. One of the difficulties in computational protein complex modeling (protein docking) is to select the most accurate models among many models that are usually generated by a docking method. This article reviews advances in protein docking model assessment methods, focusing on recent developments that apply deep learning to several network architectures.
Affiliation(s)
- Yuanyuan Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Yunhan Huang
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, IN, USA
6
Christoffer C, Kihara D. Modeling protein-nucleic acid complexes with extremely large conformational changes using Flex-LZerD. Proteomics 2023; 23:e2200322. [PMID: 36529945 PMCID: PMC10448949 DOI: 10.1002/pmic.202200322] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 11/01/2022] [Revised: 12/08/2022] [Accepted: 12/13/2022] [Indexed: 12/23/2022]
Abstract
Proteins and nucleic acids are key components in many processes in living cells, and interactions between proteins and nucleic acids are often crucial pathway components. In many cases, large flexibility of proteins as they interact with nucleic acids is key to their function. To understand the mechanisms of these processes, it is necessary to consider the 3D atomic structures of such protein-nucleic acid complexes. When such structures are not yet experimentally determined, protein docking can be used to computationally generate useful structure models. However, such docking has long had the limitation that the consideration of flexibility is usually limited to small movements or to small structures. We previously developed a method of flexible protein docking which could model ordered proteins which undergo large-scale conformational changes, which we also showed was compatible with nucleic acids. Here, we elaborate on the ability of that pipeline, Flex-LZerD, to model specifically interactions between proteins and nucleic acids, and demonstrate that Flex-LZerD can model more interactions and types of conformational change than previously shown.
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - Purdue University Center for Cancer Research, Purdue University, West Lafayette, Indiana, USA
7
Chiliveri SC, Shen Y, Baber JL, Ying J, Sagar V, Wistow G, Anfinrud P, Bax A. Experimental NOE, Chemical Shift, and Proline Isomerization Data Provide Detailed Insights into Amelotin Oligomerization. J Am Chem Soc 2023; 145:18063-18074. [PMID: 37548612 PMCID: PMC10436275 DOI: 10.1021/jacs.3c05710] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/05/2023] [Indexed: 08/08/2023]
Abstract
Amelotin is an intrinsically disordered protein (IDP) rich in Pro residues and is involved in hydroxyapatite mineralization. It rapidly oligomerizes under physiological conditions of pH and pressure but reverts to its monomeric IDP state at elevated pressure. We identified a 105-residue segment of the protein that becomes ordered upon oligomerization, and we used pressure-jump NMR spectroscopy to measure long-range NOE contacts that exist exclusively in the oligomeric NMR-invisible state. The kinetics of oligomerization and dissociation were probed at the residue-specific level, revealing that the oligomerization process is initiated in the C-terminal half of the segment. Using pressure-jump NMR, the degree of order in the oligomer at the sites of Pro residues was probed by monitoring changes in cis/trans equilibria relative to the IDP state after long-term equilibration under oligomerizing conditions. Whereas most Pro residues revert to trans in the oligomeric state, Pro-49 favors a cis configuration and three Pro residues retain an unchanged cis fraction, pointing to their local lack of order in the oligomeric state. NOE contacts and secondary 13C chemical shifts in the oligomeric state indicate the presence of an 11-residue α-helix, preceded by a small intramolecular antiparallel β-sheet, with slower formation of long-range intermolecular interactions to N-terminal residues. Although none of the models generated by AlphaFold2 for the amelotin monomer was consistent with experimental data, subunits of a hexamer generated by AlphaFold-Multimer satisfied intramolecular NOE and chemical shift data and may provide a starting point for developing atomic models for the oligomeric state.
Affiliation(s)
- Sai Chaitanya Chiliveri
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Yang Shen
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- James L. Baber
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Jinfa Ying
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Vatsala Sagar
  - Section on Molecular Structure and Function, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Graeme Wistow
  - Section on Molecular Structure and Function, National Eye Institute, National Institutes of Health, Bethesda, Maryland 20892, United States
- Philip Anfinrud
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
- Ad Bax
  - Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, United States
8
Kurisaki I, Suzuki M. Simulation toolkits at the molecular scale for trans-scale thermal signaling. Comput Struct Biotechnol J 2023; 21:2547-2557. [PMID: 37102156 PMCID: PMC10123322 DOI: 10.1016/j.csbj.2023.03.040] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/19/2023] [Revised: 03/22/2023] [Accepted: 03/23/2023] [Indexed: 04/28/2023] Open
Abstract
Thermogenesis is a physiological activity of releasing heat that originates from intracellular biochemical reactions. Recent experimental studies discovered that externally applied heat changes intracellular signaling locally, resulting in global changes in cell morphology and signaling. Therefore, we hypothesize an inevitable contribution of thermogenesis in modulating biological system functions throughout the spatial scales from molecules to individual organisms. One key issue in examining the hypothesis, namely "trans-scale thermal signaling," resides at the molecular scale: the amount of heat released via individual reactions and the mechanism by which the heat is employed for cellular function operations. This review introduces atomistic simulation toolkits for studying the mechanisms of thermal signaling processes at the molecular scale, which are hardly accessible even to today's state-of-the-art experimental methodologies. We consider biological processes and biomolecules as potential heat sources in cells, such as ATP/GTP hydrolysis and multiple biopolymer complex formation and disassembly. Microscopic heat release could be related to mesoscopic processes via thermal conductivity and thermal conductance. Additionally, theoretical simulations to estimate these thermal properties in biological membranes and proteins are introduced. Finally, we envisage the future direction of this research field.
Affiliation(s)
- Ikuo Kurisaki
  - Waseda Research Institute for Science and Engineering, Waseda University, Bldg. No.55, S Tower, 4th Floor, 3-4-1 Okubo, Shinjuku-ku, Tokyo 169-8555, Japan
  - Corresponding authors.
- Madoka Suzuki
  - Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
  - Corresponding authors.