1
Christoffer C, Kagaya Y, Verburgt J, Terashi G, Shin WH, Jain A, Sarkar D, Aderinwale T, Maddhuri Venkata Subramaniya SR, Wang X, Zhang Z, Zhang Y, Kihara D. Integrative Protein Assembly With LZerD and Deep Learning in CAPRI 47-55. Proteins 2025. [PMID: 40095385 DOI: 10.1002/prot.26818] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/26/2024] [Accepted: 02/18/2025] [Indexed: 03/19/2025]
Abstract
We report the performance of the protein complex prediction approaches of our group and their results in CAPRI Rounds 47-55, excluding the joint CASP Rounds 50 and 54, as well as the special COVID-19 Round 51. Our approaches integrated classical pipelines developed in our group as well as more recently developed deep learning pipelines. In the cases of human group prediction, we surveyed the literature to find information to integrate into the modeling, such as assayed interface residues. In addition to any literature information, generated complex models were selected by a rank aggregation of statistical scoring functions, by generative model confidence, or by expert inspection. In these CAPRI rounds, our human group successfully modeled eight interfaces and achieved the top quality level among the submissions for all of them, including two where no other group did. We note that components of our modeling pipelines have become increasingly unified within deep learning approaches. Finally, we discuss several case studies that illustrate successful and unsuccessful modeling using our approaches.
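The abstract above mentions selecting models by a rank aggregation of statistical scoring functions but does not say which aggregation rule was used. As an illustration only, the sketch below assumes a simple rank-sum rule with lower scores being better; the function name `rank_sum_aggregate`, the score names, and the model identifiers are all hypothetical.

```python
from typing import Dict, List


def rank_sum_aggregate(scores: Dict[str, Dict[str, float]]) -> List[str]:
    """Order models by the sum of their ranks across scoring functions.

    scores[function_name][model_id] is a score where lower is better
    (an assumption made for this sketch).
    """
    totals: Dict[str, int] = {}
    for per_model in scores.values():
        # Rank the models under this one scoring function (1 = best).
        ordered = sorted(per_model, key=per_model.get)
        for rank, model_id in enumerate(ordered, start=1):
            totals[model_id] = totals.get(model_id, 0) + rank
    # Smallest rank sum first; ties are broken by model id for determinism.
    return sorted(totals, key=lambda m: (totals[m], m))


# Hypothetical scores for three candidate models under three functions.
example = {
    "score_a": {"m1": -10.0, "m2": -12.0, "m3": -5.0},
    "score_b": {"m1": 0.2, "m2": 0.5, "m3": 0.9},
    "score_c": {"m1": -3.0, "m2": -1.0, "m3": -2.0},
}
ranking = rank_sum_aggregate(example)
print(ranking)
```

Rank-based aggregation avoids having to put differently scaled scoring functions on a common scale, which is one plausible reason to prefer it over averaging raw scores.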
Affiliation(s)
- Charles Christoffer
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Rosen Center for Advanced Computing, Purdue University, West Lafayette, Indiana, USA
- Yuki Kagaya
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Jacob Verburgt
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Genki Terashi
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Woong-Hee Shin
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - College of Medicine, Korea University, Seoul, South Korea
- Anika Jain
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Daipayan Sarkar
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
- Tunde Aderinwale
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Xiao Wang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Zicong Zhang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Yuanyuan Zhang
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
- Daisuke Kihara
  - Department of Computer Science, Purdue University, West Lafayette, Indiana, USA
  - Department of Biological Sciences, Purdue University, West Lafayette, Indiana, USA
  - Purdue University Institute for Cancer Research, Purdue University, West Lafayette, Indiana, USA
2
Shin WH, Kihara D. PL-PatchSurfer3: Improved Structure-Based Virtual Screening for Structure Variation Using 3D Zernike Descriptors. bioRxiv 2024:2024.02.22.581511. [PMID: 38464318 PMCID: PMC10925112 DOI: 10.1101/2024.02.22.581511] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/12/2024]
Abstract
Structure-based virtual screening (SBVS) is a widely used method in in silico drug discovery. It requires a receptor structure or binding site to predict the binding pose and fitness of a ligand, so its performance is affected by the protein conformation. The most frequently used SBVS method is protein-ligand docking, which relies on atomic distance-based scoring functions. Docking programs are therefore highly sensitive to variation in receptor structure, and conformational change has been reported to significantly reduce their performance. To address this problem, we introduced a novel SBVS program named PL-PatchSurfer, which uses molecular surface patches and the Zernike descriptor. The program segments the surfaces of the pocket and the ligand into several patches. Each patch is mapped with physico-chemical properties such as shape and electrostatic potential and then converted into a Zernike descriptor, which is rotationally invariant. Complementarity between the protein and the ligand is assessed by comparing the descriptors and the geometric distribution of the patches in the two molecules. A benchmarking study showed that PL-PatchSurfer2 could quickly screen active molecules regardless of changes in receptor structure. However, the program did not achieve high performance for targets in which hydrogen bonding is important, such as nuclear hormone receptors. In this paper, we present a newer version of PL-PatchSurfer, PL-PatchSurfer3, which incorporates two new features: a changed definition of hydrogen bond complementarity and consideration of visibility, which captures curvature information of a patch. Our evaluation demonstrates that the new program outperforms its predecessor and other SBVS methods while retaining its characteristic tolerance to receptor structure changes. The program is available at kiharalab.org/plps3.
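The abstract describes comparing rotationally invariant patch descriptors between pocket and ligand. The sketch below illustrates only that comparison step, assuming each patch is already represented by a fixed-length numeric vector and that Euclidean distance is the dissimilarity measure. It does not compute Zernike descriptors, it ignores the geometric distribution of patches, and it is not the PL-PatchSurfer scoring function; the names `descriptor_distance` and `best_matches` and all values are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def descriptor_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two descriptor vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("descriptors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def best_matches(
    pocket: List[Sequence[float]], ligand: List[Sequence[float]]
) -> List[Tuple[int, int, float]]:
    """For each ligand patch, find the closest pocket patch.

    Returns (ligand_index, pocket_index, distance) tuples.
    """
    matches = []
    for i, lig_desc in enumerate(ligand):
        distances = [descriptor_distance(lig_desc, p) for p in pocket]
        j = min(range(len(pocket)), key=distances.__getitem__)
        matches.append((i, j, distances[j]))
    return matches


# Hypothetical two-component descriptors for two pocket and two ligand patches.
pocket_patches = [[1.0, 0.0], [0.0, 1.0]]
ligand_patches = [[0.0, 0.9], [0.8, 0.0]]
matches = best_matches(pocket_patches, ligand_patches)
for lig_idx, pocket_idx, dist in matches:
    print(f"ligand patch {lig_idx} -> pocket patch {pocket_idx} (d={dist:.2f})")
```

Because the descriptors are rotationally invariant, a comparison of this kind needs no prior superposition of the two molecules.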
Affiliation(s)
- Woong-Hee Shin
  - Department of Biomedical Informatics, Korea University College of Medicine, Seoul, Republic of Korea
- Daisuke Kihara
  - Department of Biological Science, Purdue University, West Lafayette, IN, USA
  - Department of Computer Science, Purdue University, West Lafayette, IN, USA
  - Center for Cancer Research, Purdue University, West Lafayette, IN, USA
3
Sledzieski S, Devkota K, Singh R, Cowen L, Berger B. TT3D: Leveraging precomputed protein 3D sequence models to predict protein-protein interactions. Bioinformatics 2023; 39:btad663. [PMID: 37897686 PMCID: PMC10640393 DOI: 10.1093/bioinformatics/btad663] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 05/26/2023] [Revised: 09/24/2023] [Accepted: 10/27/2023] [Indexed: 10/30/2023]
Abstract
MOTIVATION: High-quality computational structural models are now precomputed and available for nearly every protein in UniProt. However, the best way to leverage these models to predict which pairs of proteins interact in a high-throughput manner is not immediately clear. The recent Foldseek method of van Kempen et al. encodes the structural information of distances and angles along the protein backbone into a linear string of the same length as the protein string, using tokens from a 21-letter discretized structural alphabet (3Di).
RESULTS: We show that using both the amino acid sequence and the 3Di sequence generated by Foldseek as inputs to our recent deep-learning method, Topsy-Turvy, substantially improves the performance of predicting protein-protein interactions cross-species. Thus TT3D (Topsy-Turvy 3D) presents a way to reuse all the computational effort going into producing high-quality structural models from sequence, while being sufficiently lightweight so that high-quality binary protein-protein interaction predictions across all protein pairs can be made genome-wide.
AVAILABILITY AND IMPLEMENTATION: TT3D is available at https://github.com/samsledje/D-SCRIPT. An archived version of the code at time of submission can be found at https://zenodo.org/records/10037674.
Affiliation(s)
- Samuel Sledzieski
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
- Kapil Devkota
  - Department of Computer Science, Tufts University, 177 College Avenue, Medford, MA 02155, United States
- Rohit Singh
  - Department of Biostatistics & Bioinformatics, Duke University, Durham, NC 27705, United States
  - Department of Cell Biology, Duke University, Durham, NC 27705, United States
- Lenore Cowen
  - Department of Computer Science, Tufts University, 177 College Avenue, Medford, MA 02155, United States
- Bonnie Berger
  - Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, MA 02139, United States
  - Department of Mathematics, Massachusetts Institute of Technology, 77 Massachusetts Avenue, Cambridge, MA 02139, United States