1
Gao M, Skolnick J. Predicting protein interactions of the kinase Lck critical to T cell modulation. Structure 2024; 32:2168-2179.e2. [PMID: 39368461 PMCID: PMC11560573 DOI: 10.1016/j.str.2024.09.010] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/05/2024] [Revised: 08/19/2024] [Accepted: 09/10/2024] [Indexed: 10/07/2024]
Abstract
Protein-protein interactions (PPIs) play pivotal roles in directing T cell fate. One key player is the non-receptor tyrosine protein kinase Lck that helps to transduce T cell activation signals. Lck is mediated by other proteins via interactions that are inadequately understood. Here, we use the deep learning method AF2Complex to predict PPIs involving Lck, by screening it against ∼1,000 proteins implicated in immune responses, followed by extensive structural modeling for selected interactions. Remarkably, we describe how Lck may be specifically targeted by a palmitoyltransferase using a phosphotyrosine motif. We uncover "hotspot" interactions between Lck and the tyrosine phosphatase CD45, leading to a significant conformational shift of Lck for activation. Lastly, we present intriguing interactions between the phosphotyrosine-binding domain of Lck and the cytoplasmic tail of the immune checkpoint LAG3 and propose a molecular mechanism for its inhibitory role. Together, this multifaceted study provides valuable insights into T cell regulation and signaling.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA; AgnistaBio Inc, Palo Alto, CA 94301, USA.
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332, USA.
2
Gao M, Skolnick J. Improved deep learning prediction of antigen-antibody interactions. Proc Natl Acad Sci U S A 2024; 121:e2410529121. [PMID: 39361651 PMCID: PMC11474075 DOI: 10.1073/pnas.2410529121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/26/2024] [Accepted: 09/04/2024] [Indexed: 10/05/2024] Open
Abstract
Identifying antibodies that neutralize specific antigens is crucial for developing effective immunotherapies, but this task remains challenging for many target antigens. The rise of deep learning-based computational approaches presents a promising avenue to address this challenge. Here, we assess the performance of a deep learning approach through two benchmark tests aimed at predicting antibodies for the receptor-binding domain of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) spike protein. Three different strategies for constructing input sequence alignments are employed for predicting structural models of antigen-antibody complexes. In our initial testing set, which comprises known experimental structures, these strategies collectively yield a significant top-ranked prediction for 61% of cases and a success rate of 47%. Notably, one strategy that utilizes the sequences of known antigen binders outperforms the other two, achieving a precision of 90% in a subsequent test set of ~1,000 antibodies, balanced between true and control antibodies for the antigen, albeit with a lower recall of 25%. Our results underscore the potential of integrating deep learning methods with single B cell sequencing techniques to enhance the prediction accuracy of antigen-antibody interactions.
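The precision and recall figures quoted in this abstract can be related to raw counts with a short calculation. The sketch below is illustrative only: the split of the ~1,000 antibodies into 500 true binders and 500 controls, and the individual counts, are hypothetical values chosen to reproduce roughly 90% precision and 25% recall; they are not taken from the paper.

```python
def precision_recall(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, recall) from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


# Hypothetical counts (assumption, not from the paper): a balanced set of
# 500 true binders and 500 control antibodies.
n_true = 500
tp = 125           # true binders predicted as binders (25% of 500)
fn = n_true - tp   # true binders that were missed
fp = 14            # control antibodies wrongly predicted as binders

precision, recall = precision_recall(tp, fp, fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Under these assumed counts, a high-precision, low-recall predictor flags few antibodies, but most of those it flags are true binders.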
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332
- AgnistaBio Inc., Palo Alto, CA 94301
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, GA 30332
3
Meador K, Castells-Graells R, Aguirre R, Sawaya MR, Arbing MA, Sherman T, Senarathne C, Yeates TO. A suite of designed protein cages using machine learning and protein fragment-based protocols. Structure 2024; 32:751-765.e11. [PMID: 38513658 PMCID: PMC11162342 DOI: 10.1016/j.str.2024.02.017] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/19/2023] [Revised: 01/22/2024] [Accepted: 02/23/2024] [Indexed: 03/23/2024]
Abstract
Designed protein cages and related materials provide unique opportunities for applications in biotechnology and medicine, but their creation remains challenging. Here, we apply computational approaches to design a suite of tetrahedrally symmetric, self-assembling protein cages. For the generation of docked conformations, we emphasize a protein fragment-based approach, while for sequence design of the de novo interface, a comparison of knowledge-based and machine learning protocols highlights the power and increased experimental success achieved using ProteinMPNN. An analysis of design outcomes provides insights for improving interface design protocols, including prioritizing fragment-based motifs, balancing interface hydrophobicity and polarity, and identifying preferred polar contact patterns. In all, we report five structures for seven protein cages, along with two structures of intermediate assemblies, with the highest resolution reaching 2.0 Å using cryo-EM. This set of designed cages adds substantially to the body of available protein nanoparticles, and to methodologies for their creation.
Affiliation(s)
- Kyle Meador
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Roman Aguirre
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Michael R Sawaya
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA
- Mark A Arbing
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA
- Trent Sherman
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Chethaka Senarathne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA
- Todd O Yeates
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA 90095, USA; UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA 90095, USA.
4
Graef J, Ehrt C, Reim T, Rarey M. Database-Driven Identification of Structurally Similar Protein-Protein Interfaces. J Chem Inf Model 2024; 64:3332-3349. [PMID: 38470439 PMCID: PMC11040719 DOI: 10.1021/acs.jcim.3c01462] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/11/2023] [Revised: 02/26/2024] [Accepted: 02/26/2024] [Indexed: 03/13/2024]
Abstract
Analyzing the similarity of protein interfaces in protein-protein interactions gives new insights into protein function and assists in discovering new drugs. Usually, tools that assess the similarity focus on the interactions between two protein interfaces, while sometimes we only have one predicted interface. Herein, we present PiMine, a database-driven protein interface similarity search. It compares interface residues of one or two interacting chains by calculating and searching tetrahedral geometric patterns of α-carbon atoms and calculating physicochemical and shape-based similarity. On a dedicated, tailor-made dataset, we show that PiMine outperforms commonly used comparison tools in terms of early enrichment when considering interfaces of sequentially and structurally unrelated proteins. In an application example, we demonstrate its usability for protein interaction partner prediction by comparing predicted interfaces to known protein-protein interfaces.
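To illustrate what a tetrahedral geometric pattern of α-carbon atoms can look like in practice, the sketch below turns every set of four Cα positions into a key built from its six sorted, binned pairwise distances, which is unchanged by rotation and translation. This is a generic illustration, not PiMine's actual algorithm: the function names, the `bin_width` parameter, and the toy coordinates are all assumptions, and sorting the distances discards chirality and vertex correspondence.

```python
import itertools
import math


def tetrahedron_key(points, bin_width=1.0):
    """Rotation- and translation-invariant key for four 3D points.

    The key is the sorted tuple of the six pairwise distances, each
    discretized into bins of width `bin_width` (hypothetical parameter).
    """
    dists = sorted(math.dist(a, b) for a, b in itertools.combinations(points, 2))
    return tuple(round(d / bin_width) for d in dists)


def tetrahedral_patterns(ca_coords, bin_width=1.0):
    """Set of keys for every 4-subset of the given C-alpha coordinates."""
    return {
        tetrahedron_key(quad, bin_width)
        for quad in itertools.combinations(ca_coords, 4)
    }


# Toy interface of five C-alpha positions (hypothetical coordinates, in angstroms).
interface_a = [(0, 0, 0), (4, 0, 0), (0, 5, 0), (0, 0, 6), (3, 3, 3)]
# The same interface shifted in space: its pattern set should be identical.
interface_b = [(x + 10, y - 2, z + 7) for x, y, z in interface_a]

shared = tetrahedral_patterns(interface_a) & tetrahedral_patterns(interface_b)
print(len(shared), "shared tetrahedral patterns")
```

A database search could index such keys so that interfaces sharing many patterns are retrieved as candidates before a more expensive physicochemical and shape comparison.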
Affiliation(s)
- Joel Graef
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Christiane Ehrt
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Thorben Reim
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
- Matthias Rarey
- Universität Hamburg, ZBH - Center for Bioinformatics, Albert-Einstein-Ring 8-10, 22761 Hamburg, Germany
5
Ozden B, Şamiloğlu E, Özsan A, Erguven M, Yükrük C, Koşaca M, Oktayoğlu M, Menteş M, Arslan N, Karakülah G, Barlas AB, Savaş B, Karaca E. Benchmarking the accuracy of structure-based binding affinity predictors on Spike-ACE2 deep mutational interaction set. Proteins 2024; 92:529-539. [PMID: 37991066 DOI: 10.1002/prot.26645] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Revised: 10/25/2023] [Accepted: 11/13/2023] [Indexed: 11/23/2023]
Abstract
Since the start of the COVID-19 pandemic, a huge effort has been devoted to understanding the Spike (SARS-CoV-2)-ACE2 recognition mechanism. To this end, two deep mutational scanning studies traced the impact of all possible mutations across the receptor binding domain (RBD) of Spike and the catalytic domain of human ACE2. By concentrating on the interface mutations of these experimental data, we benchmarked six commonly used structure-based binding affinity predictors (FoldX, EvoEF1, MutaBind2, SSIPe, HADDOCK, and UEP). These predictors were selected based on their user-friendliness, accessibility, and speed. As a result of our benchmarking efforts, we observed that none of the methods could generate a meaningful correlation with the experimental binding data. The best correlation is achieved by FoldX (R = -0.51). When we simplified the prediction problem to a binary classification, that is, whether a mutation is enriching or depleting the binding, we showed that the highest accuracy is achieved by FoldX with a 64% success rate. Surprisingly, on this set, simple energetic scoring functions performed significantly better than the ones using extra evolutionary-based terms, as in MutaBind2 and SSIPe. Furthermore, we demonstrated that recent AI approaches, mmCSM-PPI and TopNetTree, yielded comparable performances to the force field-based techniques. These observations suggest plenty of room to improve the binding affinity predictors in guessing the variant-induced binding profile changes of a host-pathogen system, such as Spike-ACE2. To aid such improvements we provide our benchmarking data at https://github.com/CSB-KaracaLab/RBD-ACE2-MutBench with the option to visualize our mutant models at https://rbd-ace2-mutbench.github.io/.
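The two evaluation measures named in this abstract, a correlation coefficient and a binary enriching-versus-depleting success rate, can be sketched in a few lines. The data and sign conventions below are assumptions for illustration only (a negative predicted score is read as stronger binding, a positive measured value as enrichment); they are not the benchmark's data or its exact protocol.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def binary_accuracy(predicted, measured):
    """Fraction of mutations whose predicted direction matches the measured one.

    Assumed convention: predicted < 0 means the predictor calls the mutation
    enriching; measured > 0 means the experiment found it enriching.
    """
    hits = sum((p < 0) == (m > 0) for p, m in zip(predicted, measured))
    return hits / len(predicted)


# Hypothetical scores for six interface mutations (not real benchmark data).
predicted_score = [-1.2, 0.8, 0.3, -0.4, 1.5, -0.2]
measured_enrichment = [0.6, -0.9, 0.2, 0.1, -1.1, -0.3]

r = pearson_r(predicted_score, measured_enrichment)
accuracy = binary_accuracy(predicted_score, measured_enrichment)
print(f"R = {r:.2f}, binary accuracy = {accuracy:.0%}")
```

Under this convention a well-performing predictor would give a negative R, which is consistent with the negative sign of the correlation reported for FoldX.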
Affiliation(s)
- Burcu Ozden
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Eda Şamiloğlu
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Atakan Özsan
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Mehmet Erguven
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Can Yükrük
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Mehdi Koşaca
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Melis Oktayoğlu
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Muratcan Menteş
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Nazmiye Arslan
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Gökhan Karakülah
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Ayşe Berçin Barlas
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Büşra Savaş
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
- Ezgi Karaca
- Izmir Biomedicine and Genome Center, Dokuz Eylul University Health Campus, Izmir, Turkey
- Izmir International Biomedicine and Genome Institute, Dokuz Eylül University, Izmir, Turkey
6
Sayin AZ, Abali Z, Senyuz S, Cankara F, Gursoy A, Keskin O. Conformational diversity and protein-protein interfaces in drug repurposing in Ras signaling pathway. Sci Rep 2024; 14:1239. [PMID: 38216592 PMCID: PMC10786864 DOI: 10.1038/s41598-023-50913-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 12/27/2023] [Indexed: 01/14/2024] Open
Abstract
We focus on drug repurposing in the Ras signaling pathway, considering structural similarities of protein-protein interfaces. The interfaces formed by physically interacting proteins are found from PDB if available and via PRISM (PRotein Interaction by Structural Matching) otherwise. The structural coverage of these interactions has been increased from 21 to 92% using PRISM. Multiple conformations of each protein are used to include protein dynamics and diversity. Next, we find FDA-approved drugs bound to structurally similar protein-protein interfaces. The results suggest that HIV protease inhibitors tipranavir, indinavir, and saquinavir may bind to EGFR and ERBB3/HER3 interface. Tipranavir and indinavir may also bind to EGFR and ERBB2/HER2 interface. Additionally, a drug used in Alzheimer's disease can bind to RAF1 and BRAF interface. Hence, we propose a methodology to find drugs to be potentially used for cancer using a dataset of structurally similar protein-protein interface clusters rather than pockets in a systematic way.
Affiliation(s)
- Ahenk Zeynep Sayin
- Department of Chemical and Biological Engineering, College of Engineering, Koc University, Rumeli Feneri Yolu Sariyer, 34450, Istanbul, Turkey
- Zeynep Abali
- Graduate School of Science and Engineering, Computational Sciences and Engineering, Koc University, 34450, Istanbul, Turkey
- Simge Senyuz
- Graduate School of Science and Engineering, Computational Sciences and Engineering, Koc University, 34450, Istanbul, Turkey
- Fatma Cankara
- Graduate School of Science and Engineering, Computational Sciences and Engineering, Koc University, 34450, Istanbul, Turkey
- Attila Gursoy
- Department of Computer Engineering, Koc University, 34450, Istanbul, Turkey
- Ozlem Keskin
- Department of Chemical and Biological Engineering, College of Engineering, Koc University, Rumeli Feneri Yolu Sariyer, 34450, Istanbul, Turkey.
7
Tsishyn M, Pucci F, Rooman M. Quantification of biases in predictions of protein-protein binding affinity changes upon mutations. Brief Bioinform 2023; 25:bbad491. [PMID: 38197311 PMCID: PMC10777193 DOI: 10.1093/bib/bbad491] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/30/2023] [Revised: 10/02/2023] [Accepted: 12/05/2023] [Indexed: 01/11/2024] Open
Abstract
Understanding the impact of mutations on protein-protein binding affinity is a key objective for a wide range of biotechnological applications and for shedding light on disease-causing mutations, which are often located at protein-protein interfaces. Over the past decade, many computational methods using physics-based and/or machine learning approaches have been developed to predict how protein binding affinity changes upon mutations. They all claim to achieve astonishing accuracy on both training and test sets, with performances on standard benchmarks such as SKEMPI 2.0 that seem overly optimistic. Here we benchmarked eight well-known and well-used predictors and identified their biases and dataset dependencies, using not only SKEMPI 2.0 as a test set but also deep mutagenesis data on the severe acute respiratory syndrome coronavirus 2 spike protein in complex with the human angiotensin-converting enzyme 2. We showed that, even though most of the tested methods reach a significant degree of robustness and accuracy, they suffer from limited generalizability properties and struggle to predict unseen mutations. Interestingly, the generalizability problems are more severe for pure machine learning approaches, while physics-based methods are less affected by this issue. Moreover, undesirable prediction biases toward specific mutation properties, the most marked being toward destabilizing mutations, are also observed and should be carefully considered by method developers. We conclude from our analyses that there is room for improvement in the prediction models and suggest ways to check, assess and improve their generalizability and robustness.
Affiliation(s)
- Matsvei Tsishyn
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
- Fabrizio Pucci
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
- Marianne Rooman
- Computational Biology and Bioinformatics, Université Libre de Bruxelles, Roosevelt Ave, 1050, Brussels, Belgium
- Interuniversity Institute of Bioinformatics in Brussels, Brussels, Belgium
8
Meador K, Castells-Graells R, Aguirre R, Sawaya MR, Arbing MA, Sherman T, Senarathne C, Yeates TO. A Suite of Designed Protein Cages Using Machine Learning Algorithms and Protein Fragment-Based Protocols. bioRxiv 2023:2023.10.09.561468. [PMID: 37873110 PMCID: PMC10592684 DOI: 10.1101/2023.10.09.561468] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 10/25/2023]
Abstract
Designed protein cages and related materials provide unique opportunities for applications in biotechnology and medicine, while methods for their creation remain challenging and unpredictable. In the present study, we apply new computational approaches to design a suite of new tetrahedrally symmetric, self-assembling protein cages. For the generation of docked poses, we emphasize a protein fragment-based approach, while for de novo interface design, a comparison of computational protocols highlights the power and increased experimental success achieved using the machine learning program ProteinMPNN. In relating information from docking and design, we observe that agreement between fragment-based sequence preferences and ProteinMPNN sequence inference correlates with experimental success. Additional insights for designing polar interactions are highlighted by experimentally testing larger and more polar interfaces. In all, using X-ray crystallography and cryo-EM, we report five structures for seven protein cages, with atomic resolution in the best case reaching 2.0 Å. We also report structures of two incompletely assembled protein cages, providing unique insights into one type of assembly failure. The new set of designed cages and their structures add substantially to the body of available protein nanoparticles, and to methodologies for their creation.
Affiliation(s)
- Kyle Meador
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Roman Aguirre
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Michael R. Sawaya
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
- Mark A. Arbing
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
- Trent Sherman
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Chethaka Senarathne
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- Todd O. Yeates
- Department of Chemistry and Biochemistry, University of California, Los Angeles, CA, USA 90095
- UCLA-DOE Institute for Genomics and Proteomics, Los Angeles, CA, USA 90095
9
Shin WH, Kumazawa K, Imai K, Hirokawa T, Kihara D. Quantitative comparison of protein-protein interaction interface using physicochemical feature-based descriptors of surface patches. Front Mol Biosci 2023; 10:1110567. [PMID: 36814641 PMCID: PMC9939524 DOI: 10.3389/fmolb.2023.1110567] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 11/29/2022] [Accepted: 01/24/2023] [Indexed: 02/09/2023] Open
Abstract
Driving mechanisms of many biological functions in a cell include physical interactions of proteins. As protein-protein interactions (PPIs) are also important in disease development, PPIs have been highlighted in the pharmaceutical industry as possible therapeutic targets in recent years. To understand the variety of protein-protein interactions in a proteome, it is essential to establish a method that can identify similarity and dissimilarity between protein-protein interactions for inferring the binding of similar molecules, including drugs and other proteins. In this study, we developed a novel method, PPI-Surfer, which compares and quantifies similarity of local surface regions of protein-protein interactions. PPI-Surfer represents a protein-protein interaction surface with overlapping surface patches, each of which is described with a three-dimensional Zernike descriptor (3DZD), a compact mathematical representation of a 3D function. 3DZD captures both the 3D shape and physicochemical properties of the protein surface. The performance of PPI-Surfer was benchmarked on datasets of protein-protein interactions, where we were able to show that PPI-Surfer finds similar potential drug binding regions that do not share sequence and structure similarity. PPI-Surfer is available at https://kiharalab.org/ppi-surfer.
Affiliation(s)
- Woong-Hee Shin
- Department of Chemistry Education, Sunchon National University, Suncheon, South Korea; Department of Advanced Components and Materials Engineering, Sunchon National University, Suncheon, South Korea
- Keiko Kumazawa
- Pharmaceutical Discovery Research Laboratories, Teijin Pharma Limited, Tokyo, Japan
- Kenichiro Imai
- Cellular and Molecular Biotechnology Research Institute, National Institute of Advanced Industrial Science and Technology, Tokyo, Japan
- Takatsugu Hirokawa
- Division of Biomedical Science, Faculty of Medicine, University of Tsukuba, Tsukuba, Japan; Transborder Medical Research Center, University of Tsukuba, Tsukuba, Japan
- Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, United States; Department of Computer Science, Purdue University, West Lafayette, IN, United States; Center for Cancer Research, Purdue University, West Lafayette, IN, United States; *Correspondence: Daisuke Kihara
10
Mosalaganti S, Obarska-Kosinska A, Siggel M, Taniguchi R, Turoňová B, Zimmerli CE, Buczak K, Schmidt FH, Margiotta E, Mackmull MT, Hagen WJH, Hummer G, Kosinski J, Beck M. AI-based structure prediction empowers integrative structural analysis of human nuclear pores. Science 2022; 376:eabm9506. [PMID: 35679397 DOI: 10.1126/science.abm9506] [Citation(s) in RCA: 180] [Impact Index Per Article: 60.0] [Indexed: 12/22/2022]
Abstract
INTRODUCTION: The eukaryotic nucleus protects the genome and is enclosed by the two membranes of the nuclear envelope. Nuclear pore complexes (NPCs) perforate the nuclear envelope to facilitate nucleocytoplasmic transport. With a molecular weight of ∼120 MDa, the human NPC is one of the largest protein complexes. Its ~1000 proteins are taken in multiple copies from a set of about 30 distinct nucleoporins (NUPs). They can be roughly categorized into two classes. Scaffold NUPs contain folded domains and form a cylindrical scaffold architecture around a central channel. Intrinsically disordered NUPs line the scaffold and extend into the central channel, where they interact with cargo complexes. The NPC architecture is highly dynamic. It responds to changes in nuclear envelope tension with conformational breathing that manifests in dilation and constriction movements. Elucidating the scaffold architecture, ultimately at atomic resolution, will be important for gaining a more precise understanding of NPC function and dynamics but imposes a substantial challenge for structural biologists.
RATIONALE: Considerable progress has been made toward this goal by a joint effort in the field. A synergistic combination of complementary approaches has turned out to be critical. In situ structural biology techniques were used to reveal the overall layout of the NPC scaffold that defines the spatial reference for molecular modeling. High-resolution structures of many NUPs were determined in vitro. Proteomic analysis and extensive biochemical work unraveled the interaction network of NUPs. Integrative modeling has been used to combine the different types of data, resulting in a rough outline of the NPC scaffold. Previous structural models of the human NPC, however, were patchy and limited in accuracy owing to several challenges: (i) Many of the high-resolution structures of individual NUPs have been solved from distantly related species and, consequently, do not comprehensively cover their human counterparts. (ii) The scaffold is interconnected by a set of intrinsically disordered linker NUPs that are not straightforwardly accessible to common structural biology techniques. (iii) The NPC scaffold intimately embraces the fused inner and outer nuclear membranes in a distinctive topology and cannot be studied in isolation. (iv) The conformational dynamics of scaffold NUPs limits the resolution achievable in structure determination.
RESULTS: In this study, we used artificial intelligence (AI)-based prediction to generate an extensive repertoire of structural models of human NUPs and their subcomplexes. The resulting models cover various domains and interfaces that so far remained structurally uncharacterized. Benchmarking against previous and unpublished x-ray and cryo-electron microscopy structures revealed unprecedented accuracy. We obtained well-resolved cryo-electron tomographic maps of both the constricted and dilated conformational states of the human NPC. Using integrative modeling, we fitted the structural models of individual NUPs into the cryo-electron microscopy maps. We explicitly included several linker NUPs and traced their trajectory through the NPC scaffold. We elucidated in great detail how membrane-associated and transmembrane NUPs are distributed across the fusion topology of both nuclear membranes. The resulting architectural model increases the structural coverage of the human NPC scaffold by about twofold. We extensively validated our model against both earlier and new experimental data. The completeness of our model has enabled microsecond-long coarse-grained molecular dynamics simulations of the NPC scaffold within an explicit membrane environment and solvent. These simulations reveal that the NPC scaffold prevents the constriction of the otherwise stable double-membrane fusion pore to small diameters in the absence of membrane tension.
CONCLUSION: Our 70-MDa atomically resolved model covers >90% of the human NPC scaffold. It captures conformational changes that occur during dilation and constriction. It also reveals the precise anchoring sites for intrinsically disordered NUPs, the identification of which is a prerequisite for a complete and dynamic model of the NPC. Our study exemplifies how AI-based structure prediction may accelerate the elucidation of subcellular architecture at atomic resolution. [Figure: see text].
Affiliation(s)
- Shyamal Mosalaganti
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany.,Life Sciences Institute and Department of Cell and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA
| | - Agnieszka Obarska-Kosinska
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,European Molecular Biology Laboratory Hamburg, 22607 Hamburg, Germany
| | - Marc Siggel
- European Molecular Biology Laboratory Hamburg, 22607 Hamburg, Germany.,Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Centre for Structural Systems Biology, 22607 Hamburg, Germany
| | - Reiya Taniguchi
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Beata Turoňová
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Christian E Zimmerli
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Katarzyna Buczak
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Florian H Schmidt
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Erica Margiotta
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Marie-Therese Mackmull
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Wim J H Hagen
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| | - Gerhard Hummer
- Department of Theoretical Biophysics, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Institute of Biophysics, Goethe University Frankfurt, 60438 Frankfurt am Main, Germany
| | - Jan Kosinski
- Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany.,European Molecular Biology Laboratory Hamburg, 22607 Hamburg, Germany.,Centre for Structural Systems Biology, 22607 Hamburg, Germany
| | - Martin Beck
- Department of Molecular Sociology, Max Planck Institute of Biophysics, 60438 Frankfurt am Main, Germany.,Structural and Computational Biology Unit, European Molecular Biology Laboratory, 69117 Heidelberg, Germany
| |
|
11
|
Gupta S, Azadvari N, Hosseinzadeh P. Design of Protein Segments and Peptides for Binding to Protein Targets. BIODESIGN RESEARCH 2022; 2022:9783197. [PMID: 37850124 PMCID: PMC10521657 DOI: 10.34133/2022/9783197] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 12/29/2021] [Accepted: 03/16/2022] [Indexed: 10/19/2023] Open
Abstract
Recent years have witnessed a rise in methods for accurate prediction of structure and design of novel functional proteins. Design of functional protein fragments and peptides occupies a small, albeit unique, space within the general field of protein design. While the smaller size of these peptides allows for more exhaustive computational methods, the flexibility of their structures, the sparsity of data compared to proteins, and the presence of noncanonical building blocks pose additional challenges to their design. This review summarizes the current advances in the design of protein fragments and peptides for binding to targets and discusses the challenges in the field, with an eye toward future directions.
Affiliation(s)
- Suchetana Gupta
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| | - Noora Azadvari
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| | - Parisa Hosseinzadeh
- Knight Campus Center for Accelerating Scientific Impact, University of Oregon, Eugene OR 97403, USA
| |
|
12
|
Gao M, Nakajima An D, Parks JM, Skolnick J. AF2Complex predicts direct physical interactions in multimeric proteins with deep learning. Nat Commun 2022; 13:1744. [PMID: 35365655 PMCID: PMC8975832 DOI: 10.1038/s41467-022-29394-2] [Citation(s) in RCA: 147] [Impact Index Per Article: 49.0] [Received: 11/08/2021] [Accepted: 03/15/2022] [Indexed: 12/20/2022] Open
Abstract
Accurate descriptions of protein-protein interactions are essential for understanding biological systems. Remarkably accurate atomic structures have been recently computed for individual proteins by AlphaFold2 (AF2). Here, we demonstrate that the same neural network models from AF2 developed for single protein sequences can be adapted to predict the structures of multimeric protein complexes without retraining. In contrast to common approaches, our method, AF2Complex, does not require paired multiple sequence alignments. It achieves higher accuracy than some complex protein-protein docking strategies and provides a significant improvement over AF-Multimer, a development of AlphaFold for multimeric proteins. Moreover, we introduce metrics for predicting direct protein-protein interactions between arbitrary protein pairs and validate AF2Complex on some challenging benchmark sets and the E. coli proteome. Lastly, using the cytochrome c biogenesis system I as an example, we present high-confidence models of three sought-after assemblies formed by eight members of this system.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Atlanta, GA, USA.
| | - Davi Nakajima An
- School of Computer Science, Georgia Institute of Technology, Atlanta, GA, USA
| | - Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA
| | - Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Atlanta, GA, USA.
| |
|
13
|
Gao M, Nakajima An D, Skolnick J. Deep learning-driven insights into super protein complexes for outer membrane protein biogenesis in bacteria. eLife 2022; 11:82885. [PMID: 36576775 PMCID: PMC9797188 DOI: 10.7554/elife.82885] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 08/22/2022] [Accepted: 11/28/2022] [Indexed: 12/29/2022] Open
Abstract
To reach their final destinations, outer membrane proteins (OMPs) of gram-negative bacteria undertake an eventful journey beginning in the cytosol. Multiple molecular machines, chaperones, proteases, and other enzymes facilitate the translocation and assembly of OMPs. These helpers usually associate, often transiently, forming large protein assemblies. They are not well understood due to experimental challenges in capturing and characterizing protein-protein interactions (PPIs), especially transient ones. Using AF2Complex, we introduce a high-throughput, deep learning pipeline to identify PPIs within the Escherichia coli cell envelope and apply it to several proteins from an OMP biogenesis pathway. Among the top confident hits obtained from screening ~1500 envelope proteins, we find not only expected interactions but also unexpected ones with profound implications. Subsequently, we predict atomic structures for these protein complexes. These structures, typically of high confidence, explain experimental observations and lead to mechanistic hypotheses for how a chaperone assists a nascent, precursor OMP emerging from a translocon, how another chaperone prevents it from aggregating and docks to a β-barrel assembly port, and how a protease performs quality control. This work presents a general strategy for investigating biological pathways by using structural insights gained from deep learning-based predictions.
Affiliation(s)
- Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
| | - Davi Nakajima An
- School of Computer Science, Georgia Institute of Technology, Atlanta, United States
| | - Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, United States
| |
|
14
|
Gao M, Lund-Andersen P, Morehead A, Mahmud S, Chen C, Chen X, Giri N, Roy RS, Quadir F, Effler TC, Prout R, Abraham S, Elwasif W, Haas NQ, Skolnick J, Cheng J, Sedova A. High-Performance Deep Learning Toolbox for Genome-Scale Prediction of Protein Structure and Function. WORKSHOP ON MACHINE LEARNING IN HPC ENVIRONMENTS 2021; 2021:46-57. [PMID: 35112110 PMCID: PMC8802329 DOI: 10.1109/mlhpc54614.2021.00010] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022]
Abstract
Computational biology is one of many scientific disciplines ripe for innovation and acceleration with the advent of high-performance computing (HPC). In recent years, the field of machine learning has also seen significant benefits from adopting HPC practices. In this work, we present a novel HPC pipeline that incorporates various machine-learning approaches for structure-based functional annotation of proteins on the scale of whole genomes. Our pipeline makes extensive use of deep learning and provides computational insights into best practices for training advanced deep-learning models for high-throughput data such as proteomics data. We showcase methodologies our pipeline currently supports and detail future tasks for our pipeline to envelop, including large-scale sequence comparison using SAdLSA and prediction of protein tertiary structures using AlphaFold2.
Affiliation(s)
- Mu Gao
- Georgia Institute of Technology, Atlanta, GA
| | | | | | | | - Chen Chen
- University of Missouri, Columbia, MO
| | - Xiao Chen
- University of Missouri, Columbia, MO
| | | | | | | | | | - Ryan Prout
- Oak Ridge National Laboratory, Oak Ridge, TN
| | | | | | | | | | | | - Ada Sedova
- Oak Ridge National Laboratory, Oak Ridge, TN
| |
|
15
|
Skolnick J, Gao M, Zhou H, Singh S. AlphaFold 2: Why It Works and Its Implications for Understanding the Relationships of Protein Sequence, Structure, and Function. J Chem Inf Model 2021; 61:4827-4831. [PMID: 34586808 DOI: 10.1021/acs.jcim.1c01114] [Citation(s) in RCA: 125] [Impact Index Per Article: 31.3] [Indexed: 02/06/2023]
Abstract
AlphaFold 2 (AF2) was the star of CASP14, the last biennial structure prediction experiment. Using novel deep learning, AF2 predicted the structures of many difficult protein targets at or near experimental resolution. Here, we present our perspective of why AF2 works and show that it is a very sophisticated fold recognition algorithm that exploits the completeness of the library of single domain PDB structures. It has also learned local side chain packing rearrangements that enable it to refine proteins to high resolution. The benefits and limitations of its ability to predict the structures of many more proteins at or close to atomic detail are discussed.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Mu Gao
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Hongyi Zhou
- Center for the Study of Systems Biology, School of Biological Sciences, Georgia Institute of Technology, Atlanta, Georgia 30332, United States
| | - Suresh Singh
- Twilight Design, 4 Adams Road, Kendall Park, New Jersey 08824, United States
| |
|
16
|
Malhotra S, Joseph AP, Thiyagalingam J, Topf M. Assessment of protein-protein interfaces in cryo-EM derived assemblies. Nat Commun 2021; 12:3399. [PMID: 34099703 PMCID: PMC8184972 DOI: 10.1038/s41467-021-23692-x] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/17/2020] [Accepted: 04/14/2021] [Indexed: 02/05/2023] Open
Abstract
Structures of macromolecular assemblies derived from cryo-EM maps often contain errors that become more abundant with decreasing resolution. Despite efforts in the cryo-EM community to develop metrics for map and atomistic model validation, thus far, no specific scoring metrics have been applied systematically to assess the interface between the assembly subunits. Here, we comprehensively assessed protein-protein interfaces in macromolecular assemblies derived by cryo-EM. To this end, we developed Protein Interface-score (PI-score), a density-independent machine learning-based metric, trained using the features of protein-protein interfaces in crystal structures. We evaluated 5873 interfaces in 1053 PDB-deposited cryo-EM models (including SARS-CoV-2 complexes), as well as the models submitted to CASP13 cryo-EM targets and the EM model challenge. We further inspected the interfaces associated with low-scores and found that some of those, especially in intermediate-to-low resolution (worse than 4 Å) structures, were not captured by density-based assessment scores. A combined score incorporating PI-score and fit-to-density score showed discriminatory power, allowing our method to provide a powerful complementary assessment tool for the ever-increasing number of complexes solved by cryo-EM.
Affiliation(s)
- Sony Malhotra
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London, UK; Scientific Computing Department, Science and Technology Facilities Council, Didcot, UK
| | - Agnel Praveen Joseph
- Scientific Computing Department, Science and Technology Facilities Council, Didcot, UK
| | - Jeyan Thiyagalingam
- Scientific Computing Department, Science and Technology Facilities Council, Didcot, UK
| | - Maya Topf
- Institute of Structural and Molecular Biology, Department of Biological Sciences, Birkbeck College, University of London, London, UK; Centre for Structural Systems Biology, Leibniz-Institut für Experimentelle Virologie and Universitätsklinikum Hamburg-Eppendorf (UKE), Hamburg, Germany
| |
|
17
|
Huang X, Zheng W, Pearce R, Zhang Y. SSIPe: accurately estimating protein-protein binding affinity change upon mutations using evolutionary profiles in combination with an optimized physical energy function. Bioinformatics 2020; 36:2429-2437. [PMID: 31830252 DOI: 10.1093/bioinformatics/btz926] [Citation(s) in RCA: 34] [Impact Index Per Article: 6.8] [Received: 07/27/2019] [Revised: 11/08/2019] [Accepted: 12/09/2019] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Most proteins perform their biological functions through interactions with other proteins in cells. Amino acid mutations, especially those occurring at protein interfaces, can change the stability of protein-protein interactions (PPIs) and impact their functions, which may cause various human diseases. Quantitative estimation of the binding affinity changes (ΔΔGbind) caused by mutations can provide critical information for protein function annotation and genetic disease diagnoses. RESULTS: We present SSIPe, which combines protein interface profiles, collected from structural and sequence homology searches, with a physics-based energy function for accurate ΔΔGbind estimation. To offset the statistical limits of the PPI structure and sequence databases, amino acid-specific pseudocounts were introduced to enhance the profile accuracy. SSIPe was evaluated on large-scale experimental data containing 2204 mutations from 177 proteins, where training and test datasets were stringently separated with the sequence identity between proteins from the two datasets below 30%. The Pearson correlation coefficient between estimated and experimental ΔΔGbind was 0.61 with a root-mean-square-error of 1.93 kcal/mol, which was significantly better than the other methods. Detailed data analyses revealed that the major advantage of SSIPe over other traditional approaches lies in the novel combination of the physical energy function with the new knowledge-based interface profile. SSIPe also considerably outperformed a former profile-based method (BindProfX) due to the newly introduced sequence profiles and optimized pseudocount technique that allows for consideration of amino acid-specific prior mutation probabilities. AVAILABILITY AND IMPLEMENTATION: Web-server/standalone program, source code and datasets are freely available at https://zhanglab.ccmb.med.umich.edu/SSIPe and https://github.com/tommyhuangthu/SSIPe. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
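The two evaluation metrics quoted in this abstract, the Pearson correlation coefficient and the root-mean-square error between estimated and experimental ΔΔGbind, can be illustrated with a minimal sketch. This is not code from SSIPe; the function names and the sample values are hypothetical and chosen only to show how the two numbers are computed.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))

# Hypothetical ddG values in kcal/mol (illustrative only, not from the paper).
estimated = [0.5, 1.2, -0.3, 2.0]
experimental = [0.7, 1.0, 0.1, 2.4]

print(f"Pearson r = {pearson_r(estimated, experimental):.2f}")
print(f"RMSE      = {rmse(estimated, experimental):.2f} kcal/mol")
```

The correlation measures how well the ranking and trend of mutations are reproduced, while the RMSE measures the absolute error in energy units, which is why abstracts of this kind usually report both.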
Affiliation(s)
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics
| | - Robin Pearce
- Department of Computational Medicine and Bioinformatics
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics.,Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA
| |
|
18
|
Tandon H, Melarkode Vattekatte A, Srinivasan N, Sandhya S. Molecular and Structural Basis of Cross-Reactivity in M. tuberculosis Toxin-Antitoxin Systems. Toxins (Basel) 2020; 12:E481. [PMID: 32751054 PMCID: PMC7472061 DOI: 10.3390/toxins12080481] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/14/2020] [Revised: 06/21/2020] [Accepted: 06/23/2020] [Indexed: 01/12/2023] Open
Abstract
The Mycobacterium tuberculosis genome encodes over 80 toxin-antitoxin (TA) systems. While each toxin interacts with its cognate antitoxin, the abundance of TA systems presents an opportunity for potential non-cognate interactions. TA systems mediate manifold interactions to manage the pathogenicity and stress response network of the cell, and non-cognate interactions may play vital roles as well. To address whether non-cognate and heterologous interactions are feasible and to understand the structural basis of their interactions, we have performed comprehensive computational analyses on the available 3D structures and generated structural models of paralogous M. tuberculosis VapBC and MazEF TA systems. For a majority of the TA systems, we show that non-cognate toxin-antitoxin interactions are structurally incompatible except for complexes like VapBC15 and VapBC11, which show similar interfaces and potential for cross-reactivity. For TA systems which have been experimentally shown earlier to disfavor non-cognate interactions, we demonstrate that they are structurally and stereo-chemically incompatible. For selected TA systems, our detailed structural analysis identifies specificity conferring residues. Thus, our work improves the current understanding of TA interfaces and generates a hypothesis based on congenial binding site, geometric complementarity, and chemical nature of interfaces. Overall, our work offers a structure-based explanation for non-cognate toxin-antitoxin interactions in M. tuberculosis.
Affiliation(s)
- Himani Tandon
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
| | - Akhila Melarkode Vattekatte
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
- Biologie Intégrée du Globule Rouge UMR_S1134, INSERM, Université Paris, Université de la Réunion, Université des Antilles, F-75739 Paris, France
- Laboratoire d’Excellence GR-Ex, F-75739 Paris, France
- Faculté des Sciences et Technologies, Saint Denis Messag, F-97715 La Réunion, France
- Institut National de la Transfusion Sanguine (INTS), F-75739 Paris, France
| | - Narayanaswamy Srinivasan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
| | - Sankaran Sandhya
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore 560012, India
| |
|
19
|
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Chloé Quignot
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Raphael Guerois
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| |
|
20
|
Mirabello C, Wallner B. Topology independent structural matching discovers novel templates for protein interfaces. Bioinformatics 2019; 34:i787-i794. [PMID: 30423106 DOI: 10.1093/bioinformatics/bty587] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/18/2022] Open
Abstract
Motivation: Protein-protein interactions (PPI) are essential for the function of the cellular machinery. The rapid growth of protein-protein complexes with known 3D structures offers a unique opportunity to study PPI to gain crucial insights into protein function and the causes of many diseases. In particular, it would be extremely useful to compare interaction surfaces of monomers, as this would enable the pinpointing of potential interaction surfaces based solely on the monomer structure, without the need to predict the complete complex structure. While there are many structural alignment algorithms for individual proteins, very few have been developed for protein interfaces, and none that can align only the interface residues to other interfaces or surfaces of interacting monomer subunits in a topology independent (non-sequential) manner. Results: We present InterComp, a method for topology and sequence-order independent structural comparisons. The method is general and can be applied to various structural comparison applications. By representing residues as independent points in space rather than as a sequence of residues, InterComp can be applied to a wide range of problems including interface-surface comparisons and interface-interface comparisons. We demonstrate a use-case by applying InterComp to find similar protein interfaces on the surface of proteins. We show that InterComp pinpoints the correct interface for almost half of the targets (283 of 586) when considering the top 10 hits, and for 24% of the top 1, even when no templates can be found with regular sequence-order dependent structural alignment methods. Availability and implementation: The source code and the datasets are available at: http://wallnerlab.org/InterComp. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping SE, Sweden
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology, Linköping University, Linköping SE, Sweden
| |
|
21
|
Verma R, Pandit SB. Unraveling the structural landscape of intra-chain domain interfaces: Implication in the evolution of domain-domain interactions. PLoS One 2019; 14:e0220336. [PMID: 31374091 PMCID: PMC6677297 DOI: 10.1371/journal.pone.0220336] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 04/12/2019] [Accepted: 07/12/2019] [Indexed: 12/22/2022] Open
Abstract
Intra-chain domain interactions are known to play a significant role in the function and stability of multidomain proteins. These interactions are mediated through a physical interaction at domain-domain interfaces (DDIs). With a motivation to understand evolution of interfaces, we have investigated similarities among DDIs. Even though interfaces of protein-protein interactions (PPIs) have been previously studied by structurally aligning interfaces, similar analyses have not yet been performed on DDIs of either multidomain proteins or PPIs. For studying the structural landscape of DDIs, we have used iAlign to structurally align intra-chain domain interfaces of domains. The interface alignment of spatially constrained domains (due to inter-domain linkers) showed that ~88% of these could identify a structurally matching interface having similar C-alpha geometry and contact pattern even though the aligned domain pairs are not structurally related. Moreover, the mean interface similarity score (IS-score) is 0.307, which is higher than the average random IS-score (0.207), suggesting that domain interfaces are not random. The structural space of DDIs is highly connected, as ~84% of all possible directed edges among interfaces are found to have a path length of at most 8 when 0.26 is the IS-score threshold. At this threshold, ~83% of interfaces form the largest strongly connected component. This suggests that the structural space of intra-chain domain interfaces is degenerate and highly connected, as has been found in PPI interfaces. Interestingly, searching for structural neighbors of inter-chain interfaces among intra-chain interfaces showed that ~86% could find a statistically significant match to an intra-chain interface with a mean IS-score of 0.311. This implies that domain interfaces are degenerate whether formed within a protein or between proteins. The interface degeneracy is most likely due to the limited number of possible ways of packing secondary structures. In principle, interface similarities can be exploited to accurately model domain interfaces in structure prediction of multidomain proteins.
Affiliation(s)
- Rivi Verma
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
| | - Shashi Bhushan Pandit
- Department of Biological Sciences, Indian Institute of Science Education and Research, Mohali, India
| |
|
22
|
The Symmetric Difference Distance: A New Way to Evaluate the Evolution of Interfaces along Molecular Dynamics Trajectories; Application to Influenza Hemagglutinin. Symmetry (Basel) 2019. [DOI: 10.3390/sym11050662] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Indexed: 11/16/2022] Open
Abstract
We propose a new and easy approach to evaluate structural dissimilarities between frames generated by molecular dynamics, and we test this methodology on human hemagglutinin. This protein is responsible for the entry of the influenza virus into the host cell by endocytosis, and this virus causes seasonal epidemics of infectious disease, which are estimated to result in hundreds of thousands of deaths each year around the world. We computed the three interfaces between the three protomers of the hemagglutinin H1 homotrimer (PDB code: 1RU7) for each of its conformations generated from molecular dynamics simulation. For each conformation, we considered the set of residues involved in the union of these three interfaces. The dissimilarity between each pair of conformations was measured with our new methodology, the symmetric difference distance between the associated sets of residues. The main advantages of the full procedure are: (i) it is parameter free; (ii) no spatial alignment is needed; and (iii) it is simple enough that it can be implemented by a beginner in programming. It is shown to be a relevant tool to follow the evolution of the conformation along the molecular dynamics trajectories.
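The distance described in this abstract can be sketched in a few lines. The sketch assumes that the symmetric difference distance between two conformations is the number of residues belonging to exactly one of the two interface residue sets; the abstract does not spell out the formula, so that definition, the function names, and the residue labels below are illustrative assumptions rather than the authors' implementation.

```python
from itertools import combinations

def symmetric_difference_distance(residues_a, residues_b):
    """Count residues present in exactly one of the two sets (assumed definition)."""
    return len(set(residues_a) ^ set(residues_b))

def pairwise_distances(frames):
    """Symmetric matrix of distances between every pair of frames."""
    n = len(frames)
    matrix = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = symmetric_difference_distance(frames[i], frames[j])
        matrix[i][j] = matrix[j][i] = d
    return matrix

# Hypothetical interface residue sets for three trajectory frames,
# labelled "chain:residue number" (illustrative only).
frames = [
    {"A:101", "A:102", "B:45"},
    {"A:101", "B:45", "B:46"},
    {"A:101", "A:102", "B:45"},
]

matrix = pairwise_distances(frames)
for row in matrix:
    print(row)
```

Because the comparison operates on sets of residue labels, no structural superposition and no tunable parameter is involved, which matches advantages (i) and (ii) listed in the abstract.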
|
23
|
Pearce R, Huang X, Setiawan D, Zhang Y. EvoDesign: Designing Protein-Protein Binding Interactions Using Evolutionary Interface Profiles in Conjunction with an Optimized Physical Energy Function. J Mol Biol 2019; 431:2467-2476. [PMID: 30851277 DOI: 10.1016/j.jmb.2019.02.028] [Citation(s) in RCA: 47] [Impact Index Per Article: 7.8] [Received: 11/03/2018] [Revised: 02/10/2019] [Accepted: 02/26/2019] [Indexed: 01/19/2023]
Abstract
EvoDesign (https://zhanglab.ccmb.med.umich.edu/EvoDesign) is an online server system for protein design. The method uses evolutionary profiles to guide the sequence search simulation and demonstrated significant advantages over physics-based approaches in terms of more accurately designing proteins that adopt desired target folds. Despite the success, the previous EvoDesign program focused only on monomer protein design, which limited its ability and usefulness in terms of designing functional proteins. In this work, we propose a new EvoDesign server, which extends the principles of evolution-based design to design protein-protein interactions. Starting from a two-chain complex structure, structurally similar interfaces are identified from known protein-protein interaction databases. An interface evolutionary profile is then constructed from a multiple sequence alignment of the interface analogies, which is combined with a newly developed, atomic-level physical energy function to guide the replica-exchange Monte Carlo simulation search. The purpose of the server is to redesign the specified complex chain to increase its stability and binding affinity for the other chain in the complex. With the improved scope and accuracy of the methodology, the new EvoDesign pipeline should become a useful online tool for functional protein design and drug discovery studies.
Affiliation(s)
- Robin Pearce
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Xiaoqiang Huang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Dani Setiawan
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Biological Chemistry, University of Michigan, Ann Arbor, MI 48109, USA; Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
24
|
Budowski-Tal I, Kolodny R, Mandel-Gutfreund Y. A Novel Geometry-Based Approach to Infer Protein Interface Similarity. Sci Rep 2018; 8:8192. [PMID: 29844500 PMCID: PMC5974305 DOI: 10.1038/s41598-018-26497-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Received: 05/24/2017] [Accepted: 05/10/2018] [Indexed: 11/21/2022] Open
Abstract
The protein interface is key to understanding protein function, providing vital insight into how proteins interact with each other and with other molecules. Over the years, many computational methods to compare protein structures have been developed, yet evaluating interface similarity remains a very difficult task. Here, we present PatchBag – a geometry-based method for efficient comparison of protein surfaces and interfaces. PatchBag is a Bag-Of-Words approach, which represents complex objects as vectors, enabling highly efficient searches for interface similarity. Using a novel framework for evaluating interface similarity, we show that PatchBag performance is comparable to state-of-the-art alignment-based structural comparison methods. The great advantage of PatchBag is that it does not rely on sequence or fold information, thus enabling the detection of similarities between interfaces in unrelated proteins. We propose that PatchBag can help reveal novel evolutionary and functional relationships between protein interfaces.
Affiliation(s)
- Inbal Budowski-Tal
- Faculty of Biology, Technion, Israel Institute of Technology, Haifa, 3200003, Israel.,Department of Computer Science, University of Haifa, Mount Carmel, Haifa, 3498838, Israel
| | - Rachel Kolodny
- Department of Computer Science, University of Haifa, Mount Carmel, Haifa, 3498838, Israel.
| | - Yael Mandel-Gutfreund
- Faculty of Biology, Technion, Israel Institute of Technology, Haifa, 3200003, Israel.
| |
|
25
|
Radu L, Schoenwetter E, Braun C, Marcoux J, Koelmel W, Schmitt DR, Kuper J, Cianférani S, Egly JM, Poterszman A, Kisker C. The intricate network between the p34 and p44 subunits is central to the activity of the transcription/DNA repair factor TFIIH. Nucleic Acids Res 2017; 45:10872-10883. [PMID: 28977422 PMCID: PMC5737387 DOI: 10.1093/nar/gkx743] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 05/02/2017] [Revised: 08/10/2017] [Accepted: 08/23/2017] [Indexed: 01/29/2023] Open
Abstract
The general transcription factor IIH (TFIIH) is a multi-protein complex whose 10 subunits are engaged in an intricate protein-protein interaction network. This network is critical for the regulation of TFIIH's transcription and DNA repair activities, which are so far little understood on a molecular level. In this study, we focused on the p44 and the p34 subunits, which are central for the structural integrity of core-TFIIH. We solved crystal structures of a complex formed by the p34 N-terminal vWA and p44 C-terminal zinc binding domains from Chaetomium thermophilum and from Homo sapiens. Intriguingly, our functional analyses clearly revealed the presence of a second interface located in the C-terminal zinc binding region of p34, which can rescue a disrupted interaction between the p34 vWA and the p44 RING domain. In addition, we demonstrate that the C-terminal zinc binding domain of p34 assumes a central role with respect to the stability and function of TFIIH. Our data reveal a redundant interaction network within core-TFIIH, which may serve to minimize the susceptibility to mutational impairment. This provides first insights into why no mutations in the p34 or p44 TFIIH-core subunits have so far been identified that would lead to the hallmark nucleotide excision repair syndromes xeroderma pigmentosum or trichothiodystrophy.
Affiliation(s)
- Laura Radu
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, UMR 7104 CNRS/Inserm/UdS, BP163, 67404 Illkirch Cedex, C.U. Strasbourg, France
| | - Elisabeth Schoenwetter
- Rudolf Virchow Center for Experimental Biomedicine, Institute for Structural Biology, University of Würzburg, 97080 Würzburg, Germany
| | - Cathy Braun
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, UMR 7104 CNRS/Inserm/UdS, BP163, 67404 Illkirch Cedex, C.U. Strasbourg, France
| | - Julien Marcoux
- Laboratoire de Spectrométrie de Masse Bio-Organique, Université de Strasbourg, CNRS, IPHC UMR 7178, 25 rue Becquerel, 67087 Strasbourg, France
| | - Wolfgang Koelmel
- Rudolf Virchow Center for Experimental Biomedicine, Institute for Structural Biology, University of Würzburg, 97080 Würzburg, Germany
| | - Dominik R. Schmitt
- Rudolf Virchow Center for Experimental Biomedicine, Institute for Structural Biology, University of Würzburg, 97080 Würzburg, Germany
| | - Jochen Kuper
- Rudolf Virchow Center for Experimental Biomedicine, Institute for Structural Biology, University of Würzburg, 97080 Würzburg, Germany
| | - Sarah Cianférani
- Laboratoire de Spectrométrie de Masse Bio-Organique, Université de Strasbourg, CNRS, IPHC UMR 7178, 25 rue Becquerel, 67087 Strasbourg, France
| | - Jean M. Egly
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, UMR 7104 CNRS/Inserm/UdS, BP163, 67404 Illkirch Cedex, C.U. Strasbourg, France
| | - Arnaud Poterszman
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, UMR 7104 CNRS/Inserm/UdS, BP163, 67404 Illkirch Cedex, C.U. Strasbourg, France
| | - Caroline Kisker
- Rudolf Virchow Center for Experimental Biomedicine, Institute for Structural Biology, University of Würzburg, 97080 Würzburg, Germany
| |
|
26
|
Goodacre N, Edwards N, Danielsen M, Uetz P, Wu C. Predicting nsSNPs that Disrupt Protein-Protein Interactions Using Docking. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2017; 14:1082-1093. [PMID: 26812731 DOI: 10.1109/tcbb.2016.2520931] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
The human genome contains a large number of protein polymorphisms due to individual genome variation. How many of these polymorphisms lead to altered protein-protein interaction is unknown. We have developed a method to address this question. The intersection of the SKEMPI database (of affinity constants among interacting proteins) and CAPRI 4.0 docking benchmark was docked using HADDOCK, leading to a training set of 166 mutant pairs. A random forest classifier based on the differences in resulting docking scores between the 166 mutant pairs and their wild-types was used to distinguish between variants that have either completely or partially lost binding ability. Fifty percent of non-binders were correctly predicted with a false discovery rate of only 2 percent. The model was tested on a set of 15 HIV-1-human and seven human-human glioblastoma-related mutant protein pairs: 50 percent of combined non-binders were correctly predicted with a false discovery rate of 10 percent. The model was also used to identify 10 protein-protein interactions between human proteins and their HIV-1 partners that are likely to be abolished by rare non-synonymous single-nucleotide polymorphisms (nsSNPs). These nsSNPs may represent novel and potentially therapeutically valuable targets for anti-viral therapy by disruption of viral binding.
|
27
|
Maheshwari S, Brylinski M. Across-proteome modeling of dimer structures for the bottom-up assembly of protein-protein interaction networks. BMC Bioinformatics 2017; 18:257. [PMID: 28499419 PMCID: PMC5427563 DOI: 10.1186/s12859-017-1675-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 01/11/2017] [Accepted: 05/03/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Deciphering complete networks of interactions between proteins is the key to comprehending cellular regulatory mechanisms. A significant effort has been devoted to expanding the coverage of the proteome-wide interaction space at the molecular level. Although a growing body of research shows that protein docking can, in principle, be used to predict biologically relevant interactions, the accuracy of the across-proteome identification of interacting partners and the selection of near-native complex structures still need to be improved. RESULTS: In this study, we developed a new method to discover and model protein interactions employing an exhaustive all-to-all docking strategy. This approach integrates molecular modeling, structural bioinformatics, machine learning, and functional annotation filters in order to provide interaction data for the bottom-up assembly of protein interaction networks. Encouragingly, the success rates for dimer modeling are 57.5 and 48.7% when experimental and computer-generated monomer structures are employed, respectively. Further, our protocol correctly identifies 81% of protein-protein interactions at the expense of only a 19% false positive rate. As a proof of concept, 61,913 protein-protein interactions were confidently predicted and modeled for the proteome of E. coli. Finally, we validated our method against the human immune disease pathway. CONCLUSIONS: Protein docking supported by evolutionary restraints and machine learning can be used to reliably identify and model biologically relevant protein assemblies at the proteome scale. Moreover, the accuracy of the identification of protein-protein interactions is improved by considering only those protein pairs co-localized in the same cellular compartment and involved in the same biological process. The modeling protocol described in this communication can be applied to detect protein-protein interactions in other organisms and pathways as well as to construct dimer structures and estimate the confidence of protein interactions experimentally identified with high-throughput techniques.
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, USA.
| |
|
28
|
Mirabello C, Wallner B. InterPred: A pipeline to identify and model protein-protein interactions. Proteins 2017; 85:1159-1170. [DOI: 10.1002/prot.25280] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 10/07/2016] [Revised: 02/27/2017] [Accepted: 03/01/2017] [Indexed: 12/22/2022]
Affiliation(s)
- Claudio Mirabello
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
| | - Björn Wallner
- Division of Bioinformatics, Department of Physics, Chemistry and Biology; Linköping University; Linköping 581 83 Sweden
| |
|
29
|
Brender JR, Shultis D, Khattak NA, Zhang Y. An Evolution-Based Approach to De Novo Protein Design. Methods Mol Biol 2017; 1529:243-264. [PMID: 27914055 PMCID: PMC5667548 DOI: 10.1007/978-1-4939-6637-0_12] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Indexed: 01/19/2023]
Abstract
EvoDesign is a computational algorithm that allows the rapid creation of new protein sequences that are compatible with specific protein structures. As such, it can be used to optimize protein stability, to resculpt the protein surface to eliminate undesired protein-protein interactions, and to optimize protein-protein binding. A major distinguishing feature of EvoDesign in comparison to other protein design programs is the use of evolutionary information in the design process to guide the sequence search toward native-like sequences known to adopt structurally similar folds as the target. The observed frequencies of amino acids in specific positions in the structure in the form of structural profiles collected from proteins with similar folds and complexes with similar interfaces can implicitly capture many subtle effects that are essential for correct folding and protein-binding interactions. As a result of the inclusion of evolutionary information, the sequences designed by EvoDesign have native-like folding and binding properties not seen by other physics-based design methods. In this chapter, we describe how EvoDesign can be used to redesign proteins with a focus on the computational and experimental procedures that can be used to validate the designs.
|
30
|
Xiong P, Zhang C, Zheng W, Zhang Y. BindProfX: Assessing Mutation-Induced Binding Affinity Change by Protein Interface Profiles with Pseudo-Counts. J Mol Biol 2016; 429:426-434. [PMID: 27899282 DOI: 10.1016/j.jmb.2016.11.022] [Citation(s) in RCA: 87] [Impact Index Per Article: 9.7] [Received: 08/17/2016] [Revised: 11/22/2016] [Accepted: 11/23/2016] [Indexed: 11/27/2022]
Abstract
Understanding how gene-level mutations affect the binding affinity of protein-protein interactions is a key issue of protein engineering. Due to the complexity of the problem, using physical force fields to predict the mutation-induced binding free-energy change remains challenging. In this work, we present a renewed approach to calculate the impact of gene mutations on the binding affinity through the structure-based profiling of protein-protein interfaces, where the binding free-energy change (ΔΔG) is counted as the logarithm of relative probability of mutant amino acids over wild-type ones in the interface alignment matrix; three pseudo-counts are introduced to alleviate the limit of the current interface library. Compared with a previous profile score that was based on the log-odds likelihood calculation, the correlation between predicted and experimental ΔΔG of single-site mutations is increased in this approach from 0.33 to 0.68. The structure-based profile score is found to be complementary to the physical potentials, where a linear combination of the profile score with the FoldX potential could increase the ΔΔG correlation from 0.46 to 0.74. It is also shown that the profile score is robust for counting the coupling effect of multiple individual mutations. For the mutations involving more than two mutation sites where the correlation between FoldX and experimental data vanishes, the profile-based calculation retains a strong correlation with the experimental measurements.
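The profile score described in this abstract, a logarithm of the relative probability of mutant over wild-type amino acids in an alignment column, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the BindProfX implementation: the single additive `pseudo_count` stands in for the three pseudo-counts the abstract mentions but does not define, the sign convention (positive means the mutant residue is rarer than the wild type) is assumed, and the score is left in natural-log units rather than scaled to kcal/mol. The function names and the example column are hypothetical.

```python
from math import log

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_probabilities(column, pseudo_count=1.0):
    """Estimate amino-acid probabilities at one interface alignment column.

    `column` holds the residues observed at this position across aligned
    interfaces. A uniform additive pseudo-count (an assumption made here)
    keeps unobserved residues from getting zero probability.
    """
    total = len(column) + pseudo_count * len(AMINO_ACIDS)
    return {aa: (column.count(aa) + pseudo_count) / total for aa in AMINO_ACIDS}


def profile_score(column, wild_type, mutant, pseudo_count=1.0):
    """Log-ratio score for a point mutation at one alignment column.

    Assumed sign convention: the score is positive when the mutant residue
    is less frequent than the wild-type residue in the column.
    """
    probs = column_probabilities(column, pseudo_count)
    return -log(probs[mutant] / probs[wild_type])


# Hypothetical column: six aligned interfaces, mostly leucine at this position.
column = "LLLLIV"

score_l_to_a = profile_score(column, "L", "A")  # alanine never observed here
score_l_to_i = profile_score(column, "L", "I")  # isoleucine observed once
```

With the pseudo-count, an unobserved substitution such as L to A still receives a finite score, and it scores higher than the conservative L to I substitution that does occur in the column.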
Affiliation(s)
- Peng Xiong
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Chengxin Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Wei Zheng
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA
| | - Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
| |
|
31
|
Cao C, Xu S. Improving the performance of the PLB index for ligand-binding site prediction using dihedral angles and the solvent-accessible surface area. Sci Rep 2016; 6:33232. [PMID: 27619067 PMCID: PMC5020399 DOI: 10.1038/srep33232] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/01/2016] [Accepted: 08/23/2016] [Indexed: 12/02/2022] Open
Abstract
Protein ligand-binding site prediction is highly important for protein function determination and structure-based drug design. Over the past twenty years, dozens of computational methods have been developed to address this problem. Soga et al. identified ligand cavities based on the preferences of amino acids for the ligand-binding site (RA) and proposed the propensity for ligand binding (PLB) index to rank the cavities on the protein surface. However, we found that residues exhibit different RAs in response to changes in solvent exposure. Furthermore, previous studies have suggested that some dihedral angles of amino acids in specific regions of the Ramachandran plot are preferred at the functional sites of proteins. Based on these discoveries, the amino acid solvent-accessible surface area and dihedral angles were combined with the RA and PLB to obtain two new indexes, multi-factor RA (MF-RA) and multi-factor PLB (MF-PLB). MF-PLB, PLB and other methods were tested using two benchmark databases and two particular ligand-binding sites. The results show that MF-PLB can improve the success rate of PLB for both ligand-bound and ligand-unbound structures, particularly for top choice prediction.
Affiliation(s)
- Chen Cao
- College of Computer Science and Technology, Jilin University, Changchun, Jilin, China
- Key Laboratory of Symbol Computation and Knowledge Engineering of the Ministry of Education, Jilin University, Changchun, Jilin, China
| | - Shutan Xu
- Department of Biochemistry and Molecular Biology, Institute of Bioinformatics, University of Georgia, Athens, GA, USA
| |
|
32
|
Yu J, Guerois R. PPI4DOCK: large scale assessment of the use of homology models in free docking over more than 1000 realistic targets. Bioinformatics 2016; 32:3760-3767. [PMID: 27551106 DOI: 10.1093/bioinformatics/btw533] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.7] [Received: 05/02/2016] [Revised: 07/22/2016] [Accepted: 08/10/2016] [Indexed: 12/14/2022] Open
Abstract
MOTIVATION: Protein-protein docking methods are of great importance for understanding interactomes at the structural level. It has become increasingly appealing to use not only experimental structures but also homology models of unbound subunits as input for docking simulations. So far we are missing a large scale assessment of the success of rigid-body free docking methods on homology models. RESULTS: We explored how we could benefit from comparative modelling of unbound subunits to expand docking benchmark datasets. Starting from a collection of 3157 non-redundant, high X-ray resolution heterodimers, we developed the PPI4DOCK benchmark containing 1417 docking targets based on unbound homology models. Rigid-body docking by Zdock showed that for 1208 cases (85.2%), at least one correct decoy was generated, emphasizing the efficiency of rigid-body docking in generating correct assemblies. Overall, the PPI4DOCK benchmark contains a large set of realistic cases and provides new ground for assessing docking and scoring methodologies. AVAILABILITY AND IMPLEMENTATION: Benchmark sets can be downloaded from http://biodev.cea.fr/interevol/ppi4dock/ CONTACT: guerois@cea.fr SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jinchao Yu
- Institute for Integrative Biology of the Cell (I2BC), IBITECS, CEA, CNRS, Univ Paris-Sud, Université Paris-Saclay, F-91198, Gif-sur-Yvette, France
| | - Raphaël Guerois
- Institute for Integrative Biology of the Cell (I2BC), IBITECS, CEA, CNRS, Univ Paris-Sud, Université Paris-Saclay, F-91198, Gif-sur-Yvette, France
| |
|
33
|
Keskin O, Tuncbag N, Gursoy A. Predicting Protein–Protein Interactions from the Molecular to the Proteome Level. Chem Rev 2016; 116:4884-909. [PMID: 27074302 DOI: 10.1021/acs.chemrev.5b00683] [Citation(s) in RCA: 221] [Impact Index Per Article: 24.6] [Indexed: 01/17/2023]
Affiliation(s)
| | - Nurcan Tuncbag
- Graduate School of Informatics, Department of Health Informatics, Middle East Technical University, 06800 Ankara, Turkey
| | | |
|
34
|
Lee HS, Im W. G-LoSA: An efficient computational tool for local structure-centric biological studies and drug design. Protein Sci 2016; 25:865-76. [PMID: 26813336 DOI: 10.1002/pro.2890] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 11/29/2015] [Revised: 01/20/2016] [Accepted: 01/21/2016] [Indexed: 11/11/2022]
Abstract
Molecular recognition by proteins mostly occurs in a local region on the protein surface. Thus, an efficient computational method for accurate characterization of protein local structural conservation is necessary to better understand biology and drug design. We present a novel local structure alignment tool, G-LoSA. G-LoSA aligns protein local structures in a sequence order independent way and provides a GA-score, a chemical feature-based and size-independent structure similarity score. Our benchmark validation shows the robust performance of G-LoSA on local structures of diverse sizes and characteristics, demonstrating its universal applicability to local structure-centric comparative biology studies. In particular, G-LoSA is highly effective in detecting conserved local regions on the entire surface of a given protein. In addition, the applications of G-LoSA to identifying template ligands and predicting ligand and protein binding sites illustrate its strong potential for computer-aided drug design. We hope that G-LoSA can be a useful computational method for exploring interesting biological problems through large-scale comparison of protein local structures and facilitating drug discovery research and development. G-LoSA is freely available to academic users at http://im.compbio.ku.edu/GLoSA/.
Affiliation(s)
- Hui Sun Lee
- Higuchi Biosciences Center, University of Kansas, Lawrence, Kansas, 66047
| | - Wonpil Im
- Department of Molecular Biosciences and Center for Computational Biology, University of Kansas, Lawrence, Kansas, 66047
| |
|
35
|
Abstract
Native proteins perform an amazing variety of biochemical functions, including enzymatic catalysis, and can engage in protein-protein and protein-DNA interactions that are essential for life. A key question is how special are these functional properties of proteins. Are they extremely rare, or are they an intrinsic feature? Comparison to the properties of compact conformations of artificially generated compact protein structures selected for thermodynamic stability but not any type of function, the artificial (ART) protein library, demonstrates that a remarkable number of the properties of native-like proteins are recapitulated. These include the complete set of small molecule ligand-binding pockets and most protein-protein interfaces. ART structures are predicted to be capable of weakly binding metabolites and cover a significant fraction of metabolic pathways, with the most enriched pathways including ancient ones such as glycolysis. Native-like active sites are also found in ART proteins. A small fraction of ART proteins are predicted to have strong protein-protein and protein-DNA interactions. Overall, it appears that biochemical function is an intrinsic feature of proteins which nature has significantly optimized during evolution. These studies raise questions as to the relative roles of specificity and promiscuity in the biochemical function and control of cells that need investigation.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| | - Mu Gao
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| | - Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, Atlanta, GA, USA
| |
|
36
|
Shivashankar N, Patil S, Bhosle A, Chandra N, Natarajan V. MS3ALIGN: an efficient molecular surface aligner using the topology of surface curvature. BMC Bioinformatics 2016; 17:26. [PMID: 26753741 PMCID: PMC4710026 DOI: 10.1186/s12859-015-0874-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 06/15/2015] [Accepted: 12/15/2015] [Indexed: 11/17/2022] Open
Abstract
Background: Aligning similar molecular structures is an important step in the process of bio-molecular structure and function analysis. Molecular surfaces are simple representations of molecular structure that are easily constructed from various forms of molecular data such as 3D atomic coordinates (PDB) and Electron Microscopy (EM) data. Methods: We present a Multi-Scale Morse-Smale Molecular-Surface Alignment tool, MS3ALIGN, which aligns molecular surfaces based on significant protrusions on the molecular surface. The input is a pair of molecular surfaces represented as triangle meshes. A key advantage of MS3ALIGN is computational efficiency, which is achieved because it processes only a few carefully chosen protrusions on the molecular surface. Furthermore, the alignments are partial in nature and therefore allow inexact surfaces to be aligned. Results: The method is evaluated in four settings. First, we establish performance using known alignments with varying overlap and noise values. Second, we compare the method with SurfComp, an existing surface alignment method. We show that we are able to determine alignments reported by SurfComp, as well as report relevant alignments not found by SurfComp. Third, we validate the ability of MS3ALIGN to determine alignments in the case of structurally dissimilar binding sites. Fourth, we demonstrate the ability of MS3ALIGN to align iso-surfaces derived from cryo-electron microscopy scans. Conclusions: We have presented an algorithm that aligns molecular surfaces based on the topology of surface curvature. A webserver and standalone software implementation of the algorithm is available at http://vgl.serc.iisc.ernet.in/ms3align. Electronic supplementary material: The online version of this article (doi:10.1186/s12859-015-0874-8) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Nithin Shivashankar
- Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560012, India.
| | - Sonali Patil
- Department of Computer Science and Automation, Indian Institute of Science, Bangalore, 560012, India
| | - Amrisha Bhosle
- Department of Biochemistry, Indian Institute of Science, Bangalore, 560012, India
| | - Nagasuma Chandra
- Department of Biochemistry, Indian Institute of Science, Bangalore, 560012, India
| | - Vijay Natarajan
- Department of Computer Science and Automation, and Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore, 560012, India.
| |
|
37
|
Cui X, Naveed H, Gao X. Finding optimal interaction interface alignments between biological complexes. Bioinformatics 2015; 31:i133-41. [PMID: 26072475 PMCID: PMC4765866 DOI: 10.1093/bioinformatics/btv242] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Indexed: 12/02/2022] Open
Abstract
Motivation: Biological molecules perform their functions through interactions with other molecules. Structure alignment of interaction interfaces between biological complexes is an indispensable step in detecting their structural similarities, which are keys to understanding their evolutionary histories and functions. Although various structure alignment methods have been developed to successfully assess the similarities of protein structures or certain types of interaction interfaces, existing alignment tools cannot directly align arbitrary types of interfaces formed by protein, DNA or RNA molecules. Specifically, they require a ‘blackbox preprocessing’ to standardize interface types and chain identifiers. Yet their performance is limited and sometimes unsatisfactory. Results: Here we introduce a novel method, PROSTA-inter, that automatically determines and aligns interaction interfaces between two arbitrary types of complex structures. Our method uses sequentially remote fragments to search for the optimal superimposition. The optimal residue matching problem is then formulated as a maximum weighted bipartite matching problem to detect the optimal sequence order-independent alignment. Benchmark evaluation on all non-redundant protein–DNA complexes in PDB shows significant performance improvement of our method over TM-align and iAlign (with the ‘blackbox preprocessing’). Two case studies where our method discovers, for the first time, structural similarities between two pairs of functionally related protein–DNA complexes are presented. We further demonstrate the power of our method on detecting structural similarities between a protein–protein complex and a protein–RNA complex, which is biologically known as a protein–RNA mimicry case. Availability and implementation: The PROSTA-inter web-server is publicly available at http://www.cbrc.kaust.edu.sa/prosta/. Contact: xin.gao@kaust.edu.sa
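The maximum weighted bipartite matching formulation mentioned in this abstract can be illustrated with a small Python sketch. It is not PROSTA-inter's code: the weight matrix is invented, the function name is hypothetical, and the exhaustive search over permutations is chosen only because it is easy to verify by hand. A practical implementation would use a polynomial-time assignment solver such as the Hungarian algorithm.

```python
from itertools import permutations


def max_weight_matching(weights):
    """Exhaustively find the one-to-one pairing with the largest total weight.

    weights[i][j] is a (made-up) similarity score for pairing residue i of
    interface A with residue j of interface B. Assumes interface A has no
    more residues than interface B. Feasible only for tiny inputs, since the
    number of candidate pairings grows factorially.
    """
    n_rows = len(weights)
    n_cols = len(weights[0])
    best_score = float("-inf")
    best_pairs = []
    for cols in permutations(range(n_cols), n_rows):
        score = sum(weights[i][j] for i, j in enumerate(cols))
        if score > best_score:
            best_score = score
            best_pairs = list(enumerate(cols))
    return best_score, best_pairs


# Hypothetical residue-pair scores after superimposing two interfaces.
weights = [
    [0.1, 0.9, 0.3],
    [0.2, 0.3, 0.8],
    [0.7, 0.4, 0.1],
]

best_score, best_pairs = max_weight_matching(weights)
```

In this example the best pairing is (0, 1), (1, 2), (2, 0): residue indices are matched out of sequential order, which is what a sequence order-independent alignment permits.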
Affiliation(s)
- Xuefeng Cui
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Hammad Naveed
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| | - Xin Gao
- Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
| |
|
38
|
Maheshwari S, Brylinski M. Predicted binding site information improves model ranking in protein docking using experimental and computer-generated target structures. BMC Structural Biology 2015; 15:23. [PMID: 26597230 PMCID: PMC4657198 DOI: 10.1186/s12900-015-0050-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 05/21/2015] [Accepted: 10/30/2015] [Indexed: 01/10/2023]
Abstract
Background: Protein-protein interactions (PPIs) mediate the vast majority of biological processes; therefore, significant efforts have been directed to investigating PPIs to fully comprehend cellular functions. Predicting complex structures is critical to revealing the molecular mechanisms by which proteins operate. Despite recent advances in the development of new methods to model macromolecular assemblies, most current methodologies are designed to work with experimentally determined protein structures. However, because only computer-generated models are available for a large number of proteins in a given genome, computational tools should tolerate structural inaccuracies in order to perform the genome-wide modeling of PPIs. Results: To address this problem, we developed eRankPPI, an algorithm for the identification of near-native conformations generated by protein docking using experimental structures as well as protein models. The scoring function implemented in eRankPPI employs multiple features including interface probability estimates calculated by eFindSitePPI and a novel contact-based symmetry score. In comparative benchmarks using representative datasets of homo- and hetero-complexes, we show that eRankPPI consistently outperforms state-of-the-art algorithms, improving the success rate by ~10%. Conclusions: eRankPPI was designed to bridge the gap between the volume of sequence data, the evidence of binary interactions, and the atomic details of pharmacologically relevant protein complexes. Tolerating structure imperfections in computer-generated models opens up a possibility to conduct the exhaustive structure-based reconstruction of PPI networks across proteomes. The methods and datasets used in this study are available at www.brylinski.org/erankppi.
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA.
| | - Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, 70803, USA; Center for Computation & Technology, Louisiana State University, Baton Rouge, LA, 70803, USA.
| |
|
39
|
Brender JR, Zhang Y. Predicting the Effect of Mutations on Protein-Protein Binding Interactions through Structure-Based Interface Profiles. PLoS Comput Biol 2015; 11:e1004494. [PMID: 26506533 PMCID: PMC4624718 DOI: 10.1371/journal.pcbi.1004494] [Citation(s) in RCA: 101] [Impact Index Per Article: 10.1] [Received: 03/25/2015] [Accepted: 08/06/2015] [Indexed: 11/18/2022] Open
Abstract
The formation of protein-protein complexes is essential for proteins to perform their physiological functions in the cell. Mutations that prevent the proper formation of the correct complexes can have serious consequences for the associated cellular processes. Since experimental determination of protein-protein binding affinity remains difficult when performed on a large scale, computational methods for predicting the consequences of mutations on binding affinity are highly desirable. We show that a scoring function based on interface structure profiles collected from analogous protein-protein interactions in the PDB is a powerful predictor of protein binding affinity changes upon mutation. As a standalone feature, the difference between the interface profile scores of the mutant and wild-type proteins has an accuracy equivalent to the best all-atom potentials, despite being two orders of magnitude faster once the profile has been constructed. Due to its unique sensitivity in collecting the evolutionary profiles of analogous binding interactions and the high speed of calculation, the interface profile score has additional advantages as a complementary feature to combine with physics-based potentials for improving the accuracy of composite scoring approaches. By incorporating the sequence-derived and residue-level coarse-grained potentials with the interface structure profile score, a composite model was constructed through random forest training, which generates a Pearson correlation coefficient >0.8 between the predicted and observed binding free-energy changes upon mutation. This accuracy is comparable to, or in most cases better than, that of the current best methods, but does not require high-resolution full-atomic models of the mutant structures. The binding interface profiling approach should find useful application in human-disease mutation recognition and protein interface design studies.
Few proteins carry out their tasks in isolation. Instead, proteins combine with each other in complicated ways that can be affected by either the natural genetic variation that occurs among people or by disease-causing mutations such as those that occur in cancer or in genetic disorders. To understand how these mutations affect our health, it is necessary to understand how mutations can affect the strength of the interactions that bind proteins together. This is a difficult task to do in a laboratory on a large scale, and scientists are increasingly turning to computational methods to predict these effects in advance. We show that by looking at the multiple alignments of similar protein-protein complex structures at the interface regions, new constraints based on the evolution of the three-dimensional structures of proteins can be derived to predict which mutations are compatible with two proteins interacting and which are not.
Affiliation(s)
- Jeffrey R. Brender
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, Michigan, United States of America
- Department of Biological Chemistry, University of Michigan, Ann Arbor, Michigan, United States of America
|
40
|
Krull F, Korff G, Elghobashi-Meinhardt N, Knapp EW. ProPairs: A Data Set for Protein–Protein Docking. J Chem Inf Model 2015; 55:1495-507. [DOI: 10.1021/acs.jcim.5b00082] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.0] [Indexed: 12/17/2022]
Affiliation(s)
- Florian Krull
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Gerrit Korff
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Nadia Elghobashi-Meinhardt
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
- Ernst-Walter Knapp
- Institute of Chemistry and Biochemistry, Freie Universität Berlin, Fabeckstrasse 36a, 14195 Berlin, Germany
|
41
|
Pang B, Schlessman D, Kuang X, Zhao N, Shyu D, Korkin D, Shyu CR. An Integrated Approach to Sequence-Independent Local Alignment of Protein Binding Sites. IEEE/ACM Trans Comput Biol Bioinform 2015; 12:298-308. [PMID: 26357218 DOI: 10.1109/tcbb.2014.2355208] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/05/2023]
Abstract
Accurate alignment of protein-protein binding sites can aid in protein docking studies and constructing templates for predicting structure of protein complexes, along with in-depth understanding of evolutionary and functional relationships. However, over the past three decades, structural alignment algorithms have focused predominantly on global alignments with little effort on the alignment of local interfaces. In this paper, we introduce the PBSalign (Protein-protein Binding Site alignment) method, which integrates techniques in graph theory, 3D localized shape analysis, geometric scoring, and utilization of physicochemical and geometrical properties. Computational results demonstrate that PBSalign is capable of identifying similar homologous and analogous binding sites accurately and performing alignments with better geometric match measures than existing protein-protein interface comparison tools. The proportion of better alignment quality generated by PBSalign is 46, 56, and 70 percent more than iAlign as judged by the average match index (MI), similarity index (SI), and structural alignment score (SAS), respectively. PBSalign provides the life science community an efficient and accurate solution to binding-site alignment while striking the balance between topological details and computational complexity.
|
42
|
Cheng S, Zhang Y, Brooks CL. PCalign: a method to quantify physicochemical similarity of protein-protein interfaces. BMC Bioinformatics 2015; 16:33. [PMID: 25638036 PMCID: PMC4339745 DOI: 10.1186/s12859-015-0471-x] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 06/09/2014] [Accepted: 01/15/2015] [Indexed: 02/07/2023] Open
Abstract
Background: Structural comparison of protein-protein interfaces provides valuable insights into the functional relationship between proteins, which may not solely arise from shared evolutionary origin. The few methods that exist for such comparative studies have focused on structural models determined at atomic resolution, and may miss interesting patterns present in large macromolecular complexes that are typically solved by low-resolution techniques.
Results: We developed a coarse-grained method, PCalign, to quantitatively evaluate physicochemical similarities between a given pair of protein-protein interfaces. This method uses an order-independent algorithm, geometric hashing, to superimpose the backbone atoms of a given pair of interfaces, and provides a normalized scoring function, PC-score, to account for the extent of overlap in terms of both geometric and chemical characteristics. We demonstrate that PCalign outperforms existing methods, and additionally facilitates comparative studies across models of different resolutions, which are not accommodated by existing methods. Furthermore, we illustrate potential application of our method to recognize interesting biological relationships masked by apparent lack of structural similarity.
Conclusions: PCalign is a useful method for recognizing shared chemical and spatial patterns among protein-protein interfaces. It outperforms existing methods for high-quality data, and additionally facilitates comparison across structural models with different levels of detail, with proven robustness against noise.
Affiliation(s)
- Shanshan Cheng
- Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, MI, USA.
- Yang Zhang
- Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, MI, USA.
- Department of Biological Chemistry, Medical School, University of Michigan, Ann Arbor, MI, USA.
- Charles L Brooks
- Department of Computational Medicine and Bioinformatics, Medical School, University of Michigan, Ann Arbor, MI, USA.
- Department of Chemistry, University of Michigan, Ann Arbor, MI, USA.
- Biophysics Program, University of Michigan, Ann Arbor, MI, USA.
|
43
|
Maheshwari S, Brylinski M. Prediction of protein-protein interaction sites from weakly homologous template structures using meta-threading and machine learning. J Mol Recognit 2015; 28:35-48. [DOI: 10.1002/jmr.2410] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 02/26/2014] [Revised: 06/19/2014] [Accepted: 06/27/2014] [Indexed: 11/11/2022]
Affiliation(s)
- Surabhi Maheshwari
- Department of Biological Sciences; Louisiana State University; Baton Rouge LA 70803 USA
- Michal Brylinski
- Department of Biological Sciences; Louisiana State University; Baton Rouge LA 70803 USA
- Center for Computation & Technology; Louisiana State University; Baton Rouge LA 70803 USA
|
44
|
Skolnick J, Gao M, Zhou H. On the role of physics and evolution in dictating protein structure and function. Isr J Chem 2014; 54:1176-1188. [PMID: 25484448 PMCID: PMC4255337 DOI: 10.1002/ijch.201400013] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Indexed: 12/17/2022]
Abstract
How many of the structural and functional properties of proteins are inherent? Computer simulations provide a powerful tool to address this question. A series of studies on QS, quasi-spherical, compact polypeptides which lack any secondary structure; ART, artificial, proteins comprised of compact homopolypeptides with protein-like secondary structure; and PDB, native, single domain proteins shows that essentially all native global folds, pockets and protein-protein interfaces are in the ART library. This suggests that many protein properties are inherent and that evolution is involved in fine-tuning. The completeness of the space of ligand binding pockets and protein-protein interfaces suggests that promiscuous interactions are intrinsic to proteins and that the capacity to perform the biochemistry of life at low level does not require evolution. If so, this has profound consequences for the origin of life.
Affiliation(s)
- Jeffrey Skolnick
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
- Mu Gao
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
- Hongyi Zhou
- Center for the Study of Systems Biology, School of Biology, Georgia Institute of Technology, 250 14th Street NW, Atlanta, GA 30318, USA
|
45
|
Rueda M, Orozco M, Totrov M, Abagyan R. BioSuper: a web tool for the superimposition of biomolecules and assemblies with rotational symmetry. BMC Struct Biol 2013; 13:32. [PMID: 24330655 PMCID: PMC3924234 DOI: 10.1186/1472-6807-13-32] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Received: 06/21/2013] [Accepted: 12/03/2013] [Indexed: 12/02/2022]
Abstract
Background: Most of the proteins in the Protein Data Bank (PDB) are oligomeric complexes consisting of two or more subunits that associate by rotational or helical symmetries. Despite the myriad of superimposition tools in the literature, we could not find any able to account for rotational symmetry and display the graphical results in the web browser.
Results: BioSuper is a free web server that superimposes and calculates the root mean square deviation (RMSD) of protein complexes displaying rotational symmetry. To the best of our knowledge, BioSuper is the first tool of its kind that provides immediate interactive visualization of the graphical results in the browser, biomolecule generator capabilities, different levels of atom selection, and sequence-dependent and structure-based superimposition types, and it is the only web tool that takes into account the equivalence of atoms in side chains displaying symmetry ambiguity. BioSuper uses ICM program functionality as a core for the superimpositions and displays the results as text, HTML tables and 3D interactive molecular objects that can be visualized in the browser or on Android and iOS platforms with a free plugin.
Conclusions: BioSuper is a fast and functional tool that allows for pairwise superimposition of proteins and assemblies displaying rotational symmetry. The web server was created out of our own frustration when attempting to superimpose flexible oligomers. We strongly believe that its user-friendly and functional design will be of great interest to structural and computational biologists who need to superimpose oligomeric proteins (or any protein). The BioSuper web server is freely available to all users at http://ablab.ucsd.edu/BioSuper.
Affiliation(s)
- Ruben Abagyan
- Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093, USA.
|
46
|
Bhaskara RM, Padhi A, Srinivasan N. Accurate prediction of interfacial residues in two-domain proteins using evolutionary information: implications for three-dimensional modeling. Proteins 2013; 82:1219-34. [PMID: 24375512 DOI: 10.1002/prot.24486] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 06/19/2013] [Revised: 11/04/2013] [Accepted: 11/19/2013] [Indexed: 01/08/2023]
Abstract
With the preponderance of multidomain proteins in eukaryotic genomes, it is essential to recognize the constituent domains and their functions. Often function involves communication across the domain interfaces, and knowledge of the interacting sites is essential to our understanding of the structure-function relationship. Using evolutionary information extracted from homologous domains in at least two diverse domain architectures (single and multidomain), we predict the interface residues corresponding to domains from two-domain proteins. We also use information from the three-dimensional structures of individual domains of two-domain proteins to train a naïve Bayes classifier model to predict the interfacial residues. Our predictions are highly accurate (∼85%) and specific (∼95%) to the domain-domain interfaces. This method is specific to multidomain proteins that contain domains found in more than one protein architectural context. Using predicted residues to constrain domain-domain interaction, rigid-body docking was able to provide us with accurate full-length protein structures with correct orientation of domains. We believe that these results can be of considerable interest toward rational protein and interaction design, apart from providing us with valuable information on the nature of interactions.
|
47
|
Bhaskara RM, de Brevern AG, Srinivasan N. Understanding the role of domain–domain linkers in the spatial orientation of domains in multi-domain proteins. J Biomol Struct Dyn 2013; 31:1467-80. [DOI: 10.1080/07391102.2012.743438] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Indexed: 10/27/2022]
|
48
|
Jalencas X, Mestres J. Identification of Similar Binding Sites to Detect Distant Polypharmacology. Mol Inform 2013; 32:976-90. [PMID: 27481143 DOI: 10.1002/minf.201300082] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 05/01/2013] [Accepted: 07/29/2013] [Indexed: 01/19/2023]
Abstract
The ability of small molecules to interact with multiple proteins is referred to as polypharmacology. This property is often linked to the therapeutic action of drugs but it is known also to be responsible for many of their side effects. Because of its importance, the development of computational methods that can predict drug polypharmacology has become an important line of research that led recently to the identification of many novel targets for known drugs. Nowadays, the majority of these methods are based on measuring the similarity of a query molecule against the hundreds of thousands of molecules for which pharmacological data on thousands of proteins are available in public sources. However, similarity-based methods are inherently biased by the chemical coverage offered by the active molecules present in those public repositories, which limits significantly their capacity to predict interactions with proteins structurally and functionally unrelated to any of the already known targets for drugs. It is in this respect that structure-based methods aiming at identifying similar binding sites may offer an alternative complementary means to ligand-based methods for detecting distant polypharmacology. The different existing approaches to binding site detection, representation, comparison, and fragmentation are reviewed and recent successful applications presented.
Affiliation(s)
- Xavier Jalencas
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Research Institute & University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
- Jordi Mestres
- Systems Pharmacology, Research Program on Biomedical Informatics (GRIB), IMIM Hospital del Mar Research Institute & University Pompeu Fabra, Parc de Recerca Biomèdica, Doctor Aiguader 88, 08003 Barcelona, Catalonia, Spain
|
49
|
Protein structure alignment beyond spatial proximity. Sci Rep 2013; 3:1448. [PMID: 23486213 PMCID: PMC3596798 DOI: 10.1038/srep01448] [Citation(s) in RCA: 98] [Impact Index Per Article: 8.2] [Received: 11/15/2012] [Accepted: 02/25/2013] [Indexed: 11/08/2022] Open
Abstract
Protein structure alignment is a fundamental problem in computational structural biology. Many programs have been developed for automatic protein structure alignment, but most of them align two protein structures purely on the basis of geometric similarity, without considering evolutionary and functional relationships. As such, these programs may generate structure alignments that are not very biologically meaningful from the evolutionary perspective. This paper presents a novel method, DeepAlign, for automatic pairwise protein structure alignment. DeepAlign aligns two protein structures using not only spatial proximity of equivalent residues (after rigid-body superposition), but also evolutionary relationship and hydrogen-bonding similarity. Experimental results show that DeepAlign can generate structure alignments much more consistent with manually curated alignments than other automatic tools, especially when the proteins under consideration are remote homologs. These results imply that, in addition to geometric similarity, evolutionary information and hydrogen-bonding similarity are essential to aligning two protein structures.
|
50
|
Brylinski M. Unleashing the power of meta-threading for evolution/structure-based function inference of proteins. Front Genet 2013; 4:118. [PMID: 23802014 PMCID: PMC3686302 DOI: 10.3389/fgene.2013.00118] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/20/2013] [Accepted: 06/04/2013] [Indexed: 01/17/2023] Open
Abstract
Protein threading is widely used in the prediction of protein structure and the subsequent functional annotation. Most threading approaches employ similar criteria for the template identification for use in both protein structure and function modeling. Using structure similarity alone might result in a high false positive rate in protein function inference, which suggests that selecting functional templates should be subject to a different set of constraints. In this study, we extend the functionality of eThread, a recently developed approach to meta-threading, focusing on the optimal selection of functional templates. We optimized the selection of template proteins to cover a broad spectrum of protein molecular function: ligand, metal, inorganic cluster, protein, and nucleic acid binding. In large-scale benchmarks, we demonstrate that the recognition rates in identifying templates that bind molecular partners in similar locations are very high, typically 70-80%, at the expense of a relatively low false positive rate. eThread also provides useful insights into the chemical properties of binding molecules and the structural features of binding. For instance, the sensitivity in recognizing similar protein-binding interfaces is 58% at only 18% false positive rate. Furthermore, in comparative analysis, we demonstrate that meta-threading supported by machine learning outperforms single-threading approaches in functional template selection. We show that meta-threading effectively detects many facets of protein molecular function, even in a low-sequence identity regime. The enhanced version of eThread is freely available as a webserver and stand-alone software at http://www.brylinski.org/ethread.
Affiliation(s)
- Michal Brylinski
- Department of Biological Sciences, Louisiana State University, Baton Rouge, LA, USA
- Center for Computation and Technology, Louisiana State University, Baton Rouge, LA, USA
|