1
Li H, Nithin C, Kmiecik S, Huang SY. Computational methods for modeling protein-protein interactions in the AI era: Current status and future directions. Drug Discov Today 2025; 30:104382. [PMID: 40398752 DOI: 10.1016/j.drudis.2025.104382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2025] [Revised: 04/30/2025] [Accepted: 05/14/2025] [Indexed: 05/23/2025]
Abstract
The modeling of protein-protein interactions (PPIs) has been revolutionized by artificial intelligence, with deep learning and end-to-end frameworks such as AlphaFold and its derivatives now dominating the field. This review surveys the current computational landscape for predicting protein complex structures, outlining the role of traditional docking approaches as well as focusing on recent advances in AI-driven methods. We discuss key challenges, including protein flexibility, reliance on co-evolutionary signals, modeling of large assemblies, and interactions involving intrinsically disordered regions (IDRs). Recent innovations aimed at improving sampling diversity, integrating experimental data, and enhancing robustness are also highlighted. Although classical methods remain relevant in specific contexts, the continued evolution of AI-based tools offers transformative potential for structural biology. These advances are poised to deepen our understanding of biomolecular interactions and accelerate the design of therapeutic interventions.
Affiliation(s)
- Hao Li
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
- Chandran Nithin
- University of Warsaw, Biological and Chemical Research Centre, Faculty of Chemistry, Warsaw, Poland
- Sebastian Kmiecik
- University of Warsaw, Biological and Chemical Research Centre, Faculty of Chemistry, Warsaw, Poland
- Sheng-You Huang
- School of Physics, Huazhong University of Science and Technology, Wuhan, Hubei 430074, China
2
Mahmoudi I, Quignot C, Martins C, Andreani J. Structural comparison of homologous protein-RNA interfaces reveals widespread overall conservation contrasted with versatility in polar contacts. PLoS Comput Biol 2024; 20:e1012650. [PMID: 39625988 PMCID: PMC11642956 DOI: 10.1371/journal.pcbi.1012650] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Revised: 12/13/2024] [Accepted: 11/18/2024] [Indexed: 12/14/2024] Open
Abstract
Protein-RNA interactions play a critical role in many cellular processes and pathologies. However, experimental determination of protein-RNA structures is still challenging, therefore computational tools are needed for the prediction of protein-RNA interfaces. Although evolutionary pressures can be exploited for structural prediction of protein-protein interfaces, and recent deep learning methods using protein multiple sequence alignments have radically improved the performance of protein-protein interface structural prediction, protein-RNA structural prediction is lagging behind, due to the scarcity of structural data and the flexibility involved in these complexes. To study the evolution of protein-RNA interface structures, we first identified a large and diverse dataset of 2,022 pairs of structurally homologous interfaces (termed structural interologs). We leveraged this unique dataset to analyze the conservation of interface contacts among structural interologs based on the properties of involved amino acids and nucleotides. We uncovered that 73% of distance-based contacts and 68% of apolar contacts are conserved on average, and the strong conservation of these contacts occurs even in distant homologs with sequence identity below 20%. Distance-based contacts are also much more conserved compared to what we had found in a previous study of homologous protein-protein interfaces. In contrast, hydrogen bonds, salt bridges, and π-stacking interactions are very versatile in pairs of protein-RNA interologs, even for close homologs with high interface sequence identity. We found that almost half of the non-conserved distance-based contacts are linked to a small proportion of interface residues that no longer make interface contacts in the interolog, a phenomenon we term "interface switching out". 
We also examined possible recovery mechanisms for non-conserved hydrogen bonds and salt bridges, uncovering diverse scenarios of switching out, change in amino acid chemical nature, intermolecular and intramolecular compensations. Our findings provide insights for integrating evolutionary signals into predictive protein-RNA structural modeling methods.
Affiliation(s)
- Ikram Mahmoudi
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Carla Martins
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
- Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
3
Zhu YN, He J, Wang J, Guo W, Liu H, Song Z, Kang L. Parental experiences orchestrate locust egg hatching synchrony by regulating nuclear export of precursor miRNA. Nat Commun 2024; 15:4328. [PMID: 38773155 PMCID: PMC11109280 DOI: 10.1038/s41467-024-48658-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 06/14/2023] [Accepted: 05/08/2024] [Indexed: 05/23/2024] Open
Abstract
Parental experiences can affect the phenotypic plasticity of offspring. In locusts, the population density that adults experience regulates the number and hatching synchrony of their eggs, contributing to locust outbreaks. However, the pathway of signal transmission from parents to offspring remains unclear. Here, we find that transcription factor Forkhead box protein N1 (FOXN1) responds to high population density and activates the polypyrimidine tract-binding protein 1 (Ptbp1) in locusts. FOXN1-PTBP1 serves as an upstream regulator of miR-276, a miRNA to control egg-hatching synchrony. PTBP1 boosts the nucleo-cytoplasmic transport of pre-miR-276 in a "CU motif"-dependent manner, by collaborating with the primary exportin protein exportin 5 (XPO5). Enhanced nuclear export of pre-miR-276 elevates miR-276 expression in terminal oocytes, where FOXN1 activates Ptbp1 and leads to egg-hatching synchrony in response to high population density. Additionally, PTBP1-prompted nuclear export of pre-miR-276 is conserved in insects, implying a ubiquitous mechanism to mediate transgenerational effects.
Affiliation(s)
- Ya Nan Zhu
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- Jing He
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Jiawen Wang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Wei Guo
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Hongran Liu
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Zhuoran Song
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- Le Kang
- State Key Laboratory of Integrated Management of Pest Insects and Rodents, Institute of Zoology, Chinese Academy of Sciences, Beijing, 100101, China
- University of Chinese Academy of Sciences, Beijing, 100101, China
- College of Life Science, Hebei University, Baoding, Hebei, 071002, China
4
Larrea-Sebal A, Jebari-Benslaiman S, Galicia-Garcia U, Jose-Urteaga AS, Uribe KB, Benito-Vicente A, Martín C. Predictive Modeling and Structure Analysis of Genetic Variants in Familial Hypercholesterolemia: Implications for Diagnosis and Protein Interaction Studies. Curr Atheroscler Rep 2023; 25:839-859. [PMID: 37847331 PMCID: PMC10618353 DOI: 10.1007/s11883-023-01154-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 09/15/2023] [Indexed: 10/18/2023]
Abstract
PURPOSE OF REVIEW: Familial hypercholesterolemia (FH) is a hereditary condition characterized by elevated levels of low-density lipoprotein cholesterol (LDL-C), which increases the risk of cardiovascular disease if left untreated. This review aims to discuss the role of bioinformatics tools in evaluating the pathogenicity of missense variants associated with FH. Specifically, it highlights the use of predictive models based on protein sequence, structure, evolutionary conservation, and other relevant features in identifying genetic variants within LDLR, APOB, and PCSK9 genes that contribute to FH. RECENT FINDINGS: In recent years, various bioinformatics tools have emerged as valuable resources for analyzing missense variants in FH-related genes. Tools such as REVEL, Varity, and CADD use diverse computational approaches to predict the impact of genetic variants on protein function. These tools consider factors such as sequence conservation, structural alterations, and receptor binding to aid in interpreting the pathogenicity of identified missense variants. While these predictive models offer valuable insights, the accuracy of predictions can vary, especially for proteins with unique characteristics that might not be well represented in the databases used for training. This review emphasizes the significance of utilizing bioinformatics tools for assessing the pathogenicity of FH-associated missense variants. Despite their contributions, a definitive diagnosis of a genetic variant necessitates functional validation through in vitro characterization or cascade screening. This step ensures the precise identification of FH-related variants, leading to more accurate diagnoses. Integrating genetic data with reliable bioinformatics predictions and functional validation can enhance our understanding of the genetic basis of FH, enabling improved diagnosis, risk stratification, and personalized treatment for affected individuals.
The comprehensive approach outlined in this review promises to advance the management of this inherited disorder, potentially leading to better health outcomes for those affected by FH.
Affiliation(s)
- Asier Larrea-Sebal
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- Fundación Biofisika Bizkaia, 48940, Leioa, Spain
- Shifa Jebari-Benslaiman
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- Unai Galicia-Garcia
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- Ane San Jose-Urteaga
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Kepa B Uribe
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Asier Benito-Vicente
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
- César Martín
- Department of Biochemistry and Molecular Biology, Universidad del País Vasco UPV/EHU, 48080, Bilbao, Spain
- Department of Molecular Biophysics, Biofisika Institute, University of Basque Country and Consejo Superior de Investigaciones Científicas (UPV/EHU, CSIC), 48940, Leioa, Spain
5
Schweke H, Xu Q, Tauriello G, Pantolini L, Schwede T, Cazals F, Lhéritier A, Fernandez-Recio J, Rodríguez-Lumbreras LÁ, Schueler-Furman O, Varga JK, Jiménez-García B, Réau MF, Bonvin A, Savojardo C, Martelli PL, Casadio R, Tubiana J, Wolfson H, Oliva R, Barradas-Bautista D, Ricciardelli T, Cavallo L, Venclovas Č, Olechnovič K, Guerois R, Andreani J, Martin J, Wang X, Kihara D, Marchand A, Correia B, Zou X, Dey S, Dunbrack R, Levy E, Wodak S. Discriminating physiological from non-physiological interfaces in structures of protein complexes: A community-wide study. Proteomics 2023; 23:e2200323. [PMID: 37365936 PMCID: PMC10937251 DOI: 10.1002/pmic.202200323] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 02/04/2023] [Revised: 05/11/2023] [Accepted: 05/11/2023] [Indexed: 06/28/2023]
Abstract
Reliably scoring and ranking candidate models of protein complexes and assigning their oligomeric state from the structure of the crystal lattice represent outstanding challenges. A community-wide effort was launched to tackle these challenges. The latest resources on protein complexes and interfaces were exploited to derive a benchmark dataset consisting of 1677 homodimer protein crystal structures, including a balanced mix of physiological and non-physiological complexes. The non-physiological complexes in the benchmark were selected to bury a similar or larger interface area than their physiological counterparts, making it more difficult for scoring functions to differentiate between them. Next, 252 functions for scoring protein-protein interfaces previously developed by 13 groups were collected and evaluated for their ability to discriminate between physiological and non-physiological complexes. A simple consensus score generated using the best performing score of each of the 13 groups, and a cross-validated Random Forest (RF) classifier were created. Both approaches showed excellent performance, with an area under the Receiver Operating Characteristic (ROC) curve of 0.93 and 0.94, respectively, outperforming individual scores developed by different groups. Additionally, AlphaFold2 engines recalled the physiological dimers with significantly higher accuracy than the non-physiological set, lending support to the reliability of our benchmark dataset annotations. Optimizing the combined power of interface scoring functions and evaluating it on challenging benchmark datasets appears to be a promising strategy.
Affiliation(s)
- Julia K. Varga
- Hebrew University of Jerusalem Institute for Medical Research Israel-Canada
- Jérôme Tubiana
- Tel Aviv University Blavatnik School of Computer Science
- Haim Wolfson
- Tel Aviv University Blavatnik School of Computer Science
- Xiaoqin Zou
- Dalton Cardiovascular Research Center, Institute for Data Science and Informatics, University of Missouri
6
Shuvo MH, Karim M, Roche R, Bhattacharya D. PIQLE: protein-protein interface quality estimation by deep graph learning of multimeric interaction geometries. Bioinformatics Advances 2023; 3:vbad070. [PMID: 37351310 PMCID: PMC10281963 DOI: 10.1093/bioadv/vbad070] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2023] [Revised: 05/17/2023] [Accepted: 06/01/2023] [Indexed: 06/24/2023]
Abstract
Motivation: Accurate modeling of protein-protein interaction interfaces is essential for high-quality protein complex structure prediction. Existing approaches for estimating the quality of a predicted protein complex structural model utilize only the physicochemical properties or energetic contributions of the interacting atoms, ignoring evolutionary information or inter-atomic multimeric geometries, including interaction distance and orientations. Results: Here, we present PIQLE, a deep graph learning method for protein-protein interface quality estimation. PIQLE leverages multimeric interaction geometries and evolutionary information along with sequence- and structure-derived features to estimate the quality of individual interactions between the interfacial residues using a multi-head graph attention network and then probabilistically combines the estimated quality for scoring the overall interface. Experimental results show that PIQLE consistently outperforms existing state-of-the-art methods including DProQA, TRScore, GNN-DOVE and DOVE on multiple independent test datasets across a wide range of evaluation metrics. Our ablation study and comparison with the self-assessment module of AlphaFold-Multimer repurposed for protein complex scoring reveal that the performance gains are connected to the effectiveness of the multi-head graph attention network in leveraging multimeric interaction geometries and evolutionary information along with other sequence- and structure-derived features adopted in PIQLE. Availability and implementation: An open-source software implementation of PIQLE is freely available at https://github.com/Bhattacharya-Lab/PIQLE. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Affiliation(s)
- Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
- Mohimenul Karim
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
- Rahmatullah Roche
- Department of Computer Science, Virginia Tech, Blacksburg, VA 24061, USA
7
Flores RMA, Pantaleão SQ, Araujo SC, Malpartida HMG, Honorio KM. Structural analysis of factors related to FAM3C/ILEI dimerization and identification of inhibitor candidates targeting cancer treatment. Comput Biol Chem 2023; 104:107869. [PMID: 37068312 DOI: 10.1016/j.compbiolchem.2023.107869] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Revised: 04/05/2023] [Accepted: 04/09/2023] [Indexed: 04/19/2023]
Abstract
FAM3 is a superfamily of four cytokines that maintain a single globular structure β-β-α of three classes: FAM3A, B, C and D. FAM3C was the first member of this family related to cancer and is functionally characterized as an essential factor for the epithelial-mesenchymal transition (EMT), leading to late delays in tumor progression. Due to its crucial role in EMT and metastasis, FAM3C has been termed an interleukin-like EMT (ILEI) inducer. There are several studies on the part of FAM3C in the progression of cancer and other diseases. However, little is known about its cellular receptors and possible inhibitors. In this study, based on in silico approaches, we performed structural analyses of factors related to FAM3C/ILEI dimerization. We also identified four possible inhibitor candidates, which are expected to be exciting prototypes and could be submitted to future biological tests targeting cancer treatment.
Affiliation(s)
- Simone Queiroz Pantaleão
- Center for Mathematics, Computing, and Cognition, Federal University of ABC, 09210-170 Santo André, SP, Brazil
- Sheila Cruz Araujo
- Center for Sciences Natural and Human, Federal University of ABC, 09210-170 Santo André, SP, Brazil
- Kathia Maria Honorio
- Center for Sciences Natural and Human, Federal University of ABC, 09210-170 Santo André, SP, Brazil; School of Arts, Sciences and Humanities, University of São Paulo, 03828-0000 São Paulo, SP, Brazil
8
Barradas-Bautista D, Almajed A, Oliva R, Kalnis P, Cavallo L. Improving classification of correct and incorrect protein-protein docking models by augmenting the training set. Bioinformatics Advances 2023; 3:vbad012. [PMID: 36789292 PMCID: PMC9923443 DOI: 10.1093/bioadv/vbad012] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 11/27/2022] [Revised: 01/20/2023] [Accepted: 02/01/2023] [Indexed: 02/04/2023]
Abstract
Motivation: Protein-protein interactions drive many relevant biological events, such as infection, replication and recognition. To control or engineer such events, we need to access the molecular details of the interaction provided by experimental 3D structures. However, such experiments take time and are expensive; moreover, the current technology cannot keep up with the high discovery rate of new interactions. Computational modeling, like protein-protein docking, can help to fill this gap by generating docking poses. Protein-protein docking generally consists of two parts, sampling and scoring. The sampling is an exhaustive search of the tridimensional space. The caveat of the sampling is that it generates a large number of incorrect poses, producing a highly unbalanced dataset. This limits the utility of the data to train machine learning classifiers. Results: Using weak supervision, we developed a data augmentation method that we named hAIkal. Using hAIkal, we increased the labeled training data to train several algorithms. We trained and obtained different classifiers; the best classifier has 81% accuracy and 0.51 Matthews' correlation coefficient on the test set, surpassing the state-of-the-art scoring functions. Availability and implementation: Docking models from Benchmark 5 are available at https://doi.org/10.5281/zenodo.4012018. Processed tabular data are available at https://repository.kaust.edu.sa/handle/10754/666961. Google colab is available at https://colab.research.google.com/drive/1vbVrJcQSf6_C3jOAmZzgQbTpuJ5zC1RP?usp=sharing. Supplementary information: Supplementary data are available at Bioinformatics Advances online.
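Because the abstract above reports both accuracy and Matthews' correlation coefficient (MCC) on a highly unbalanced dataset, a minimal sketch of the two metrics may help show why they can disagree. The function names and the confusion-matrix counts are illustrative assumptions, not values or code from the paper.

```python
import math


def accuracy(tp: int, fp: int, tn: int, fn: int) -> float:
    """Fraction of all poses that were classified correctly."""
    total = tp + fp + tn + fn
    if total == 0:
        raise ValueError("confusion matrix is empty")
    return (tp + tn) / total


def matthews_corrcoef(tp: int, fp: int, tn: int, fn: int) -> float:
    """Standard MCC from confusion-matrix counts.

    Returns 0.0 when a row or column of the matrix is empty, a common
    convention for the otherwise undefined 0/0 case.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical unbalanced set: 10 correct poses among 1000 docking models.
# A classifier that rejects every pose gets every negative right.
always_negative = dict(tp=0, fp=0, tn=990, fn=10)
print(accuracy(**always_negative), matthews_corrcoef(**always_negative))
```

In this hypothetical case the trivial classifier scores high on accuracy while its MCC is zero, which is the usual argument for reporting MCC alongside accuracy when incorrect poses dominate.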
Affiliation(s)
- Ali Almajed
- Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Romina Oliva
- Department of Sciences and Technologies, University of Naples “Parthenope”, I-80143 Naples, Italy
- Panos Kalnis
- Computer, Electrical and Mathematical Science and Engineering Division, Kaust Extreme Computing Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
- Luigi Cavallo
- Physical Sciences and Engineering Division, Kaust Catalysis Center, King Abdullah University of Science and Technology (KAUST), Thuwal 23955-6900, Saudi Arabia
9
Jung Y, Geng C, Bonvin AMJJ, Xue LC, Honavar VG. MetaScore: A Novel Machine-Learning-Based Approach to Improve Traditional Scoring Functions for Scoring Protein-Protein Docking Conformations. Biomolecules 2023; 13:121. [PMID: 36671507 PMCID: PMC9855734 DOI: 10.3390/biom13010121] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 12/01/2022] [Revised: 12/22/2022] [Accepted: 12/26/2022] [Indexed: 01/11/2023] Open
Abstract
Protein-protein interactions play a ubiquitous role in biological function. Knowledge of the three-dimensional (3D) structures of the complexes they form is essential for understanding the structural basis of those interactions and how they orchestrate key cellular processes. Computational docking has become an indispensable alternative to the expensive and time-consuming experimental approaches for determining the 3D structures of protein complexes. Despite recent progress, identifying near-native models from a large set of conformations sampled by docking (the so-called scoring problem) still has considerable room for improvement. We present MetaScore, a new machine-learning-based approach to improve the scoring of docked conformations. MetaScore utilizes a random forest (RF) classifier trained to distinguish near-native from non-native conformations using their protein-protein interfacial features. The features include physicochemical properties, energy terms, interaction-propensity-based features, geometric properties, interface topology features, evolutionary conservation, and also scores produced by traditional scoring functions (SFs). MetaScore scores docked conformations by simply averaging the score produced by the RF classifier with that produced by any traditional SF. We demonstrate that (i) MetaScore consistently outperforms each of the nine traditional SFs included in this work in terms of success rate and hit rate evaluated over conformations ranked among the top 10; (ii) an ensemble method, MetaScore-Ensemble, that combines 10 variants of MetaScore obtained by combining the RF score with each of the traditional SFs outperforms each of the MetaScore variants. We conclude that the performance of traditional SFs can be improved upon by using machine learning to judiciously leverage protein-protein interfacial features and by using ensemble methods to combine multiple scoring functions.
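The averaging step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say how the traditional SF score is put on a common scale with the RF output, so the min-max rescaling, the `sf_lower_is_better` flag, and all function names here are hypothetical choices rather than the authors' implementation.

```python
from typing import List, Sequence


def min_max(values: Sequence[float]) -> List[float]:
    """Rescale values to [0, 1] (assumed normalization, not from the paper)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def averaged_scores(
    rf_probs: Sequence[float],
    sf_scores: Sequence[float],
    sf_lower_is_better: bool = True,
) -> List[float]:
    """Average an RF near-native probability with a rescaled traditional SF score."""
    if len(rf_probs) != len(sf_scores):
        raise ValueError("one RF probability and one SF score per conformation")
    sf = min_max(sf_scores)
    if sf_lower_is_better:
        # Energy-like scores: flip so that higher always means better.
        sf = [1.0 - s for s in sf]
    return [(p + s) / 2.0 for p, s in zip(rf_probs, sf)]


def rank_top(scores: Sequence[float], k: int = 10) -> List[int]:
    """Indices of the k best conformations, best first."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


# Hypothetical inputs: three docked conformations.
combined = averaged_scores([0.9, 0.2, 0.6], [-50.0, -10.0, -30.0])
print(rank_top(combined, k=2))
```

Ranking on the averaged value lets a conformation rise only if both the classifier and the traditional score agree to some degree, which matches the intent stated in the abstract of improving on each traditional SF rather than replacing it.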
Affiliation(s)
- Yong Jung
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Cunliang Geng
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Alexandre M. J. J. Bonvin
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Li C. Xue
- Bijvoet Centre for Biomolecular Research, Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Center for Molecular and Biomolecular Informatics, Radboudumc, Greet Grooteplein 26-28, 6525 GA Nijmegen, The Netherlands
- Vasant G. Honavar
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16802, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Clinical and Translational Sciences Institute, Pennsylvania State University, University Park, PA 16802, USA
- College of Information Sciences & Technology, Pennsylvania State University, University Park, PA 16802, USA
- Institute for Computational and Data Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA 16823, USA
10
Launay R, Teppa E, Esque J, André I. Modeling Protein Complexes and Molecular Assemblies Using Computational Methods. Methods Mol Biol 2023; 2553:57-77. [PMID: 36227539 DOI: 10.1007/978-1-0716-2617-7_4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/16/2023]
Abstract
Many biological molecules are assembled into supramolecular complexes that are necessary to perform functions in the cell. Better understanding and characterization of these molecular assemblies are thus essential to further elucidate molecular mechanisms and key protein-protein interactions that could be targeted to modulate the protein binding affinity or develop new binders. Experimental access to structural information on these supramolecular assemblies is often hampered by the size of these systems that make their recombinant production and characterization rather difficult. Computational methods combining both structural data, molecular modeling techniques, and sequence coevolution information can thus offer a good alternative to gain access to the structural organization of protein complexes and assemblies. Herein, we present some computational methods to predict structural models of the protein partners, to search for interacting regions using coevolution information, and to build molecular assemblies. The approach is exemplified using a case study to model the succinate-quinone oxidoreductase heterocomplex.
Affiliation(s)
- Romain Launay
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Elin Teppa
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Jérémy Esque
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
- Isabelle André
- Toulouse Biotechnology Institute, TBI, Université de Toulouse, CNRS, INRAE, INSA, Toulouse Cedex 04, France
11
Benincore-Flórez E, El-Azaz J, Solarte GA, Rodríguez A, Reyes LH, Alméciga-Díaz CJ, Cardona C. Iduronate-2-sulfatase interactome: Validation by Yeast Two-Hybrid Assay. Heliyon 2022; 8:e09031. [PMID: 35284671 PMCID: PMC8913312 DOI: 10.1016/j.heliyon.2022.e09031] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/16/2021] [Revised: 12/08/2021] [Accepted: 02/24/2022] [Indexed: 11/25/2022] Open
Abstract
Mucopolysaccharidosis type II (MPS II), also known as Hunter syndrome, is a rare X-linked recessive disease caused by a deficiency of the lysosomal enzyme iduronate-2-sulfatase (IDS), which leads to intracellular accumulation of nonmetabolized glycosaminoglycans such as heparan sulfate and dermatan sulfate. This accumulation causes severe damage to several tissues, principally the central nervous system. Previously, we identified 187 IDS-protein interactions in the mouse brain. To validate a subset of these interactions, we selected and cloned the coding regions of 10 candidate genes to perform a targeted yeast two-hybrid assay. The results identified physical interactions of IDS with LSAMP and SYT1. Although the physiological relevance of these complexes is unknown, recent advances suggest that these interactions could be involved in vesicular trafficking of IDS, through the interaction with SYT1, and in the formation of a transcytosis module between the cellular components of the blood-brain barrier (BBB), through the interaction with LSAMP. These results may shed light on the role of IDS in cellular homeostasis and may also contribute to the understanding of MPS II physiopathology and the development of novel therapeutic strategies to transport recombinant IDS through the brain endothelial cells toward the brain parenchyma.
|
12
|
Dong X, Wang J, Wang Z, Shi P, Bian L. Mutation and evolution of metallo-beta-lactamase CphA under the selective pressure of biapenem continuous concentration gradient. J Inorg Biochem 2022; 230:111776. [DOI: 10.1016/j.jinorgbio.2022.111776] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/31/2021] [Revised: 02/18/2022] [Accepted: 02/22/2022] [Indexed: 11/25/2022]
|
13
|
Verburgt J, Kihara D. Benchmarking of structure refinement methods for protein complex models. Proteins 2022; 90:83-95. [PMID: 34309909 PMCID: PMC8671191 DOI: 10.1002/prot.26188] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/19/2021] [Revised: 06/24/2021] [Accepted: 07/22/2021] [Indexed: 01/03/2023]
Abstract
Protein structure docking is the process in which the quaternary structure of a protein complex is predicted from individual tertiary structures of the protein subunits. Protein docking is typically performed in two main steps. The subunits are first docked while keeping them rigid to form the complex, which is then followed by structure refinement. Structure refinement is crucial for practical use of computational protein docking models, as it is aimed at correcting conformations of interacting residues and atoms at the interface. Here, we benchmarked the performance of eight existing protein structure refinement methods in refinement of protein complex models. We show that the fraction of native contacts between subunits is by far the most straightforward metric to improve. However, backbone-dependent metrics based on the root-mean-square deviation proved more difficult to improve via refinement.
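The fraction of native contacts mentioned in this abstract can be illustrated with a short sketch. This is not the authors' benchmarking code: the function names, the dictionary input layout (residue id mapped to a list of atom coordinates), and the 5 Å distance cutoff are assumptions chosen for illustration.

```python
import math
from itertools import product


def interface_contacts(chain_a, chain_b, cutoff=5.0):
    """Return residue pairs (i, j) that have any two atoms within `cutoff` angstroms.

    chain_a, chain_b: dict mapping a residue id to a list of (x, y, z) tuples.
    The 5.0 default is an assumed cutoff, not a value taken from the abstract.
    """
    contacts = set()
    for (i, atoms_i), (j, atoms_j) in product(chain_a.items(), chain_b.items()):
        if any(math.dist(p, q) <= cutoff for p in atoms_i for q in atoms_j):
            contacts.add((i, j))
    return contacts


def fnat(native_contacts, model_contacts):
    """Fraction of native interface contacts that are reproduced in the model."""
    if not native_contacts:
        raise ValueError("the native complex has no interface contacts")
    return len(native_contacts & model_contacts) / len(native_contacts)
```

Because the measure counts residue pairs rather than atomic positions, a refinement step can raise it by repacking the interface without moving the backbone much, which fits the contrast with RMSD-based metrics drawn in the abstract.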
Affiliation(s)
- Jacob Verburgt
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
- Purdue University Center for Cancer Research, Purdue University, West Lafayette, IN, 47907, USA
| |
|
14
|
Barradas-Bautista D, Cao Z, Vangone A, Oliva R, Cavallo L. A random forest classifier for protein-protein docking models. BIOINFORMATICS ADVANCES 2021; 2:vbab042. [PMID: 36699405 PMCID: PMC9710594 DOI: 10.1093/bioadv/vbab042] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 11/05/2021] [Revised: 11/11/2021] [Accepted: 12/06/2021] [Indexed: 01/28/2023]
Abstract
Herein, we present the results of a machine learning approach we developed to single out correct 3D docking models of protein-protein complexes obtained by popular docking software. To this aim, we generated 3 × 10^4 docking models for each of the 230 complexes in the protein-protein benchmark, version 5, using three different docking programs (HADDOCK, FTDock and ZDOCK), for a cumulative set of ≈ 7 × 10^6 docking models. Three different machine learning approaches (Random Forest, Support Vector Machine and Perceptron) were used to train classifiers with 158 different scoring functions (features). The Random Forest algorithm outperformed the other two algorithms and was selected for further optimization. Using a feature selection algorithm and optimizing the random forest hyperparameters allowed us to train and validate a random forest classifier, named COnservation Driven Expert System (CoDES). Testing of CoDES on independent datasets, as well as results of its comparative performance with machine learning methods recently developed in the field for the scoring of docking decoys, confirm its state-of-the-art ability to discriminate correct from incorrect decoys both in terms of global parameters and in terms of decoys ranked at the top positions.
Supplementary information: Supplementary data are available at Bioinformatics Advances online.
Software and data availability statement: The docking models are available at https://doi.org/10.5281/zenodo.4012018. The programs underlying this article will be shared on request to the corresponding authors.
Affiliation(s)
- Didier Barradas-Bautista
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| | - Zhen Cao
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| | - Anna Vangone
- Pharma Research and Early Development, Therapeutic Modalities, Roche Innovation Center Munich Large Molecule Research, 82377 Penzberg, Germany
| | - Romina Oliva
- Department of Sciences and Technologies, University Parthenope of Naples, Centro Direzionale Isola C4, I-80143 Naples, Italy
| | - Luigi Cavallo
- Kaust Catalysis Center, Physical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), 23955-6900 Thuwal, Saudi Arabia
| |
|
15
|
Abstract
The biological significance of proteins has drawn the scientific community to explore their characteristics. These studies shed light on the interaction patterns and functions of proteins in a living body. Because reliable experimental techniques are difficult to apply in practice, computational methods have been introduced for interaction prediction. Automated methods reduce these difficulties but cannot yet replace experimental studies, as the field is still evolving. Interaction prediction is a critical problem that needs highly accurate results, but none of the existing methods yet offers reliable performance comparable to experimental results. This article aims to assess the existing computational docking algorithms, their challenges, and future scope. Blind docking techniques are quite helpful when no information other than the individual structures is available. As more and more complex structures are added to different databases, information-driven approaches can be a good alternative. Artificial intelligence, which already dominates major fields, is expected to take over this domain very shortly.
|
16
|
Distinct RPA domains promote recruitment and the helicase-nuclease activities of Dna2. Nat Commun 2021; 12:6521. [PMID: 34764291 PMCID: PMC8586334 DOI: 10.1038/s41467-021-26863-y] [Citation(s) in RCA: 14] [Impact Index Per Article: 3.5] [Received: 03/24/2021] [Accepted: 10/21/2021] [Indexed: 01/25/2023] Open
Abstract
The Dna2 helicase-nuclease functions in concert with the replication protein A (RPA) in DNA double-strand break repair. Using ensemble and single-molecule biochemistry, coupled with structure modeling, we demonstrate that the stimulation of S. cerevisiae Dna2 by RPA is not a simple consequence of Dna2 recruitment to single-stranded DNA. The large RPA subunit Rfa1 alone can promote the Dna2 nuclease activity, and we identified mutations in a helix embedded in the N-terminal domain of Rfa1 that specifically disrupt this capacity. The same RPA mutant is instead fully functional to recruit Dna2 and promote its helicase activity. Furthermore, we found residues located on the outside of the central DNA-binding OB-fold domain Rfa1-A, which are required to promote the Dna2 motor activity. Our experiments thus unexpectedly demonstrate that different domains of Rfa1 regulate Dna2 recruitment, and its nuclease and helicase activities. Consequently, the identified separation-of-function RPA variants are compromised to stimulate Dna2 in the processing of DNA breaks. The results explain phenotypes of replication-proficient but radiation-sensitive RPA mutants and illustrate the unprecedented functional interplay of RPA and Dna2. An enzymatic ensemble including Dna2 functions in DNA end resection; the function of the single-stranded DNA binding protein RPA in this complex has been underappreciated. Here the authors employ molecular modeling, biochemistry, and single molecule biophysics to reveal RPA directly promotes Dna2 recruitment, nuclease and helicase activities.
|
17
|
Merikhian P, Darvishi B, Jalili N, Esmailinejad MR, Khatibi AS, Kalbolandi SM, Salehi M, Mosayebzadeh M, Barough MS, Majidzadeh-A K, Yadegari F, Rahbarizadeh F, Farahmand L. Recombinant nanobody against MUC1 tandem repeats inhibits growth, invasion, metastasis, and vascularization of spontaneous mouse mammary tumors. Mol Oncol 2021; 16:485-507. [PMID: 34694686 PMCID: PMC8763658 DOI: 10.1002/1878-0261.13123] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 01/24/2021] [Revised: 06/20/2021] [Accepted: 10/19/2021] [Indexed: 11/11/2022] Open
Abstract
Alteration in glycosylation pattern of MUC1 mucin tandem repeats during carcinomas has been shown to negatively affect adhesive properties of malignant cells and enhance tumor invasiveness and metastasis. In addition, MUC1 overexpression is closely interrelated with angiogenesis, making it a great target for immunotherapy. Alongside, easier interaction of nanobodies (single-domain antibodies) with their antigens, compared to conventional antibodies, is usually associated with superior desirable results. Herein, we evaluated the preclinical efficacy of a recombinant nanobody against MUC1 tandem repeats in suppressing tumor growth, angiogenesis, invasion, and metastasis. Expressed nanobody demonstrated specificity only toward MUC1-overexpressing cancer cells and could internalize in cancer cell lines. The IC50 values (the concentration at which the nanobody exerted half of its maximal inhibitory effect) of the anti-MUC1 nanobody against MUC1-positive human cancer cell lines ranged from 1.2 to 14.3 nM. Similar concentrations could also effectively induce apoptosis in MUC1-positive cancer cells but not in normal cells or MUC1-negative human cancer cells. Immunohistochemical staining of spontaneously developed mouse breast tumors prior to in vivo studies confirmed cross-reactivity of nanobody with mouse MUC1 despite large structural dissimilarities between mouse and human MUC1 tandem repeats. In vivo, a dose of 3 µg nanobody per gram of body weight in tumor-bearing mice could attenuate tumor progression and suppress excessive circulating levels of IL-1a, IL-2, IL-10, IL-12, and IL-17A pro-inflammatory cytokines. Also, a significant decline in expression of Ki-67, MMP9, and VEGFR2 biomarkers, as well as vasculogenesis, was evident in immunohistochemically stained tumor sections of anti-MUC1 nanobody-treated mice. In conclusion, the anti-MUC1 tandem repeat nanobody of the present study could effectively overcome tumor growth, invasion, and metastasis.
Affiliation(s)
- Parnaz Merikhian
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Behrad Darvishi
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Neda Jalili
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | | | - Azadeh Sharif Khatibi
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Shima Moradi Kalbolandi
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Malihe Salehi
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Marjan Mosayebzadeh
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Mahdieh Shokrollahi Barough
- Cancer Immunotherapy and Regenerative Medicine, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Keivan Majidzadeh-A
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Fatemeh Yadegari
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| | - Fatemeh Rahbarizadeh
- Department of Medical Biotechnology, Faculty of Medical Sciences, Tarbiat Modares University, Tehran, Iran
| | - Leila Farahmand
- Recombinant Proteins Department, Breast Cancer Research Center, Motamed Cancer Institute, ACECR, Tehran, Iran
| |
|
18
|
Quadir F, Roy RS, Soltanikazemi E, Cheng J. DeepComplex: A Web Server of Predicting Protein Complex Structures by Deep Learning Inter-chain Contact Prediction and Distance-Based Modelling. Front Mol Biosci 2021; 8:716973. [PMID: 34497831 PMCID: PMC8419425 DOI: 10.3389/fmolb.2021.716973] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Received: 05/29/2021] [Accepted: 08/12/2021] [Indexed: 11/13/2022] Open
Abstract
Proteins interact to form complexes. Predicting the quaternary structure of protein complexes is useful for protein function analysis, protein engineering, and drug design. However, few user-friendly tools leveraging the latest deep learning technology for inter-chain contact prediction and the distance-based modelling to predict protein quaternary structures are available. To address this gap, we develop DeepComplex, a web server for predicting structures of dimeric protein complexes. It uses deep learning to predict inter-chain contacts in a homodimer or heterodimer. The predicted contacts are then used to construct a quaternary structure of the dimer by the distance-based modelling, which can be interactively viewed and analysed. The web server is freely accessible and requires no registration. It can be easily used by providing a job name and an email address along with the tertiary structure for one chain of a homodimer or two chains of a heterodimer. The output webpage provides the multiple sequence alignment, predicted inter-chain residue-residue contact map, and predicted quaternary structure of the dimer. DeepComplex web server is freely available at http://tulip.rnet.missouri.edu/deepcomplex/web_index.html.
Affiliation(s)
| | | | | | - Jianlin Cheng
- Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO, United States
| |
|
19
|
Quignot C, Postic G, Bret H, Rey J, Granger P, Murail S, Chacón P, Andreani J, Tufféry P, Guerois R. InterEvDock3: a combined template-based and free docking server with increased performance through explicit modeling of complex homologs and integration of covariation-based contact maps. Nucleic Acids Res 2021; 49:W277-W284. [PMID: 33978743 PMCID: PMC8265070 DOI: 10.1093/nar/gkab358] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 03/13/2021] [Revised: 04/09/2021] [Accepted: 04/23/2021] [Indexed: 12/19/2022] Open
Abstract
The InterEvDock3 protein docking server exploits the constraints of evolution by multiple means to generate structural models of protein assemblies. The server takes as input either several sequences or 3D structures of proteins known to interact. It returns a set of 10 consensus candidate complexes, together with interface predictions to guide further experimental validation interactively. Three key novelties were implemented in InterEvDock3 to help obtain more reliable models: users can (i) generate template-based structural models of assemblies using close and remote homologs of known 3D structure, detected through an automated search protocol, (ii) select the assembly models most consistent with contact maps from external methods that implement covariation-based contact prediction with or without deep learning and (iii) exploit a novel coevolution-based scoring scheme at atomic level, which leads to significantly higher free docking success rates. The performance of the server was validated on two large free docking benchmark databases, containing respectively 230 unbound targets (Weng dataset) and 812 models of unbound targets (PPI4DOCK dataset). Its effectiveness has also been proven on a number of challenging examples. The InterEvDock3 web interface is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock3/.
Affiliation(s)
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Guillaume Postic
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Hélène Bret
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Julien Rey
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Pierre Granger
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Samuel Murail
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Pablo Chacón
- Department of Biological Physical Chemistry, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
| | - Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| | - Pierre Tufféry
- Université de Paris, CNRS UMR 8251, INSERM U1133, RPBS, Paris 75205, France
| | - Raphaël Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198 Gif-sur-Yvette, France
| |
|
20
|
Mishra SK, Cooper CJ, Parks JM, Mitchell JC. Hotspot Coevolution Is a Key Identifier of Near-Native Protein Complexes. J Phys Chem B 2021; 125:6058-6067. [PMID: 34077660 DOI: 10.1021/acs.jpcb.0c11525] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Protein-protein interactions play a key role in mediating numerous biological functions, with more than half the proteins in living organisms existing as either homo- or hetero-oligomeric assemblies. Protein subunits that form oligomers minimize the free energy of the complex, but exhaustive computational search-based docking methods have not comprehensively addressed the challenge of distinguishing a natively bound complex from non-native forms. Current protein docking approaches address this problem by sampling multiple binding modes in proteins and scoring each mode, with the lowest-energy (or highest scoring) binding mode being regarded as a near-native complex. However, high-scoring modes often match poorly with the true bound form, suggesting a need for improvement of the scoring function. In this study, we propose a scoring function, KFC-E, that accounts for both conservation and coevolution of putative binding hotspot residues at protein-protein interfaces. We tested KFC-E on four benchmark sets of unbound examples and two benchmark sets of bound examples, with the results demonstrating a clear improvement over scores that examine conservation and coevolution across the entire interface.
Affiliation(s)
- Sambit K Mishra
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
| | - Connor J Cooper
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
| | - Jerry M Parks
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
| | - Julie C Mitchell
- Biosciences Division, Oak Ridge National Laboratory, 1 Bethel Valley Road, Oak Ridge, Tennessee 37831-6038, United States
| |
|
21
|
Quignot C, Granger P, Chacón P, Guerois R, Andreani J. Atomic-level evolutionary information improves protein-protein interface scoring. Bioinformatics 2021; 37:3175-3181. [PMID: 33901284 DOI: 10.1093/bioinformatics/btab254] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 10/26/2020] [Revised: 03/20/2021] [Accepted: 04/19/2021] [Indexed: 12/11/2022] Open
Abstract
MOTIVATION The crucial role of protein interactions and the difficulty in characterising them experimentally strongly motivate the development of computational approaches for structural prediction. Even when protein-protein docking samples correct models, current scoring functions struggle to discriminate them from incorrect decoys. The previous incorporation of conservation and coevolution information has shown promise for improving protein-protein scoring. Here, we present a novel strategy to integrate atomic-level evolutionary information into different types of scoring functions to improve their docking discrimination. RESULTS We applied this general strategy to our residue-level statistical potential from InterEvScore and to two atomic-level scores, SOAP-PP and Rosetta interface score (ISC). Including evolutionary information from as few as ten homologous sequences improves the top 10 success rates of the individual atomic-level scores SOAP-PP and Rosetta ISC by 6 and 13.5 percentage points, respectively, on a large benchmark of 752 docking cases. The best individual homology-enriched score reaches a top 10 success rate of 34.4%. A consensus approach based on the complementarity between different homology-enriched scores further increases the top 10 success rate to 40%. AVAILABILITY All data used for benchmarking and scoring results, as well as a Singularity container of the pipeline, are available at http://biodev.cea.fr/interevol/interevdata/. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
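The "top 10 success rate" reported in this abstract is commonly understood as the percentage of docking cases for which at least one acceptable model appears among the 10 best-scored decoys. A minimal sketch of that calculation follows; the function name and input layout are hypothetical, and the criterion that decides whether a model counts as acceptable is assumed to have been applied beforehand.

```python
def top_n_success_rate(ranked_cases, n=10):
    """Percentage of cases with at least one acceptable model in the top n.

    ranked_cases: dict mapping a case id to a list of booleans ordered from
    best-scored to worst-scored decoy; True marks an acceptable model.
    (Hypothetical layout; the acceptance criterion is not defined here.)
    """
    if not ranked_cases:
        raise ValueError("no docking cases supplied")
    hits = sum(1 for labels in ranked_cases.values() if any(labels[:n]))
    return 100.0 * hits / len(ranked_cases)
```

Under this definition, a gain of 6 percentage points on 752 cases corresponds to roughly 45 additional cases with an acceptable model in the top 10.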
Affiliation(s)
- Chloé Quignot
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pierre Granger
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Pablo Chacón
- Department of Biological Chemical Physics, Rocasolano Institute of Physical Chemistry C.S.I.C, Madrid, Spain
| | - Raphael Guerois
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| | - Jessica Andreani
- Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), 91198, Gif-sur-Yvette, France
| |
|
22
|
Nagy B, Polak M, Ozohanics O, Zambo Z, Szabo E, Hubert A, Jordan F, Novaček J, Adam-Vizi V, Ambrus A. Structure of the dihydrolipoamide succinyltransferase (E2) component of the human alpha-ketoglutarate dehydrogenase complex (hKGDHc) revealed by cryo-EM and cross-linking mass spectrometry: Implications for the overall hKGDHc structure. Biochim Biophys Acta Gen Subj 2021; 1865:129889. [PMID: 33684457 DOI: 10.1016/j.bbagen.2021.129889] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Received: 11/17/2020] [Revised: 02/05/2021] [Accepted: 03/02/2021] [Indexed: 12/19/2022]
Abstract
BACKGROUND The human mitochondrial alpha-ketoglutarate dehydrogenase complex (hKGDHc) converts KG to succinyl-CoA and NADH. Malfunction of and reactive oxygen species generation by the hKGDHc as well as its E1-E2 subcomplex are implicated in neurodegenerative disorders, ischemia-reperfusion injury, E3-deficiency and cancers. METHODS We performed cryo-EM, cross-linking mass spectrometry (CL-MS) and molecular modeling analyses to determine the structure of the E2 component of the hKGDHc (hE2k); hE2k transfers a succinyl group to CoA and forms the structural core of hKGDHc. We also assessed the overall structure of the hKGDHc by negative-stain EM and modeling. RESULTS We report the 2.9 Å resolution cryo-EM structure of the hE2k component. The cryo-EM map comprises density for hE2k residues 151-386 (the entire inner core catalytic domain plus a few additional residues), while residues 1-150 are not observed due to the inherent flexibility of the N-terminal region. The structure of the latter segment was also determined by CL-MS and homology modeling. Negative-stain EM on in vitro assembled hKGDHc and previous data were used to build a putative overall structural model of the hKGDHc. CONCLUSIONS The E2 core of the hKGDHc is composed of 24 hE2k chains organized in octahedral (8 × 3 type) assembly. Each lipoyl domain is oriented towards the core domain of an adjacent chain in the hE2k homotrimer. hE1k and hE3 are most likely tethered at the edges and faces, respectively, of the cubic hE2k assembly. GENERAL SIGNIFICANCE The revealed structural information will support future pharmacological targeting of the hKGDHc.
Affiliation(s)
- Balint Nagy
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Martin Polak
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Oliver Ozohanics
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Zsofia Zambo
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Eszter Szabo
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Agnes Hubert
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Frank Jordan
- Department of Chemistry, Rutgers, The State University of New Jersey, Newark, NJ, USA
| | - Jiří Novaček
- Central European Institute of Technology, Masaryk University, Brno, Czech Republic
| | - Vera Adam-Vizi
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary
| | - Attila Ambrus
- Department of Biochemistry, Institute of Biochemistry and Molecular Biology, Semmelweis University, Budapest, Hungary.
| |
|
23
|
Eismann S, Townshend RJL, Thomas N, Jagota M, Jing B, Dror RO. Hierarchical, rotation-equivariant neural networks to select structural models of protein complexes. Proteins 2020; 89:493-501. [PMID: 33289162 DOI: 10.1002/prot.26033] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 04/16/2020] [Revised: 10/10/2020] [Accepted: 11/21/2020] [Indexed: 12/16/2022]
Abstract
Predicting the structure of multi-protein complexes is a grand challenge in biochemistry, with major implications for basic science and drug discovery. Computational structure prediction methods generally leverage predefined structural features to distinguish accurate structural models from less accurate ones. This raises the question of whether it is possible to learn characteristics of accurate models directly from atomic coordinates of protein complexes, with no prior assumptions. Here we introduce a machine learning method that learns directly from the 3D positions of all atoms to identify accurate models of protein complexes, without using any precomputed physics-inspired or statistical terms. Our neural network architecture combines multiple ingredients that together enable end-to-end learning from molecular structures containing tens of thousands of atoms: a point-based representation of atoms, equivariance with respect to rotation and translation, local convolutions, and hierarchical subsampling operations. When used in combination with previously developed scoring functions, our network substantially improves the identification of accurate structural models among a large set of possible models. Our network can also be used to predict the accuracy of a given structural model in absolute terms. The architecture we present is readily applicable to other tasks involving learning on 3D structures of large atomic systems.
Affiliation(s)
- Stephan Eismann
- Department of Applied Physics, Stanford University, Stanford, California, USA; Department of Computer Science, Stanford University, Stanford, California, USA
| | | | - Nathaniel Thomas
- Department of Physics, Stanford University, Stanford, California, USA
| | - Milind Jagota
- Department of Computer Science, Stanford University, Stanford, California, USA; Department of Electrical Engineering, Stanford University, Stanford, California, USA
| | - Bowen Jing
- Department of Computer Science, Stanford University, Stanford, California, USA
| | - Ron O Dror
- Department of Computer Science, Stanford University, Stanford, California, USA; Department of Structural Biology, Stanford University, Stanford, California, USA; Department of Molecular and Cellular Physiology, Stanford University, Stanford, California, USA; Institute for Computational and Mathematical Engineering, Stanford University, Stanford, California, USA
| |
|
24
|
Launay G, Ohue M, Prieto Santero J, Matsuzaki Y, Hilpert C, Uchikoga N, Hayashi T, Martin J. Evaluation of CONSRANK-Like Scoring Functions for Rescoring Ensembles of Protein–Protein Docking Poses. Front Mol Biosci 2020; 7:559005. [PMID: 33195406 PMCID: PMC7641601 DOI: 10.3389/fmolb.2020.559005] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/04/2020] [Accepted: 09/28/2020] [Indexed: 11/13/2022] Open
Abstract
Scoring is a challenging step in protein–protein docking, where typically thousands of solutions are generated. In this study, we sought to investigate the contribution of consensus rescoring, as introduced by Oliva et al. (2013) with the CONSRANK method, where the set of solutions is used to build statistics in order to identify recurrent solutions. We explore several ways to perform consensus-based rescoring on the ZDOCK decoy set for Benchmark 4. We show that the information of the interface size is critical for successful rescoring in this context, but that consensus rescoring in itself performs less well than traditional physics-based evaluation. The results of physics-based and consensus-based rescoring are partially overlapping, supporting the use of a combination of these approaches.
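The idea of using the decoy set itself to build statistics can be sketched as follows. This is a simplified illustration of consensus rescoring in general, not the CONSRANK implementation or any of the variants evaluated in the study: the function name, the representation of each decoy as a set of inter-chain residue contacts, and the normalisation by the number of contacts are all assumptions.

```python
from collections import Counter


def consensus_scores(decoys):
    """Score each decoy by how often its contacts recur across the whole set.

    decoys: list of sets of (residue_a, residue_b) inter-chain contacts
    (hypothetical input layout). A contact's frequency is the fraction of
    decoys containing it; a decoy's score is the mean frequency of its contacts.
    """
    if not decoys:
        return []
    n = len(decoys)
    counts = Counter(contact for decoy in decoys for contact in decoy)
    scores = []
    for decoy in decoys:
        if decoy:
            scores.append(sum(counts[c] for c in decoy) / (n * len(decoy)))
        else:
            scores.append(0.0)  # a decoy with no contacts carries no consensus signal
    return scores
```

Dividing by the number of contacts is only one possible design choice. How the interface size enters the score is the aspect the abstract reports as critical, so other normalisations may behave quite differently.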
Affiliation(s)
- Guillaume Launay
- CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
| | - Masahito Ohue
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
- *Correspondence: Masahito Ohue,
| | - Julia Prieto Santero
- CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
| | - Yuri Matsuzaki
- Tokyo Tech Academy for Leadership, Tokyo Institute of Technology, Tokyo, Japan
| | - Cécile Hilpert
- CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
| | - Nobuyuki Uchikoga
- Department of Network Design, School of Interdisciplinary Mathematical Sciences, Meiji University, Tokyo, Japan
| | - Takanori Hayashi
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, Tokyo, Japan
| | - Juliette Martin
- CNRS, UMR 5086 Molecular Microbiology and Structural Biochemistry, University of Lyon, Lyon, France
- Juliette Martin,
|
25
|
Phongsavanh X, Al-Qatabi N, Shaban MS, Khoder-Agha F, El Asri M, Comisso M, Guérois R, Mirande M. How HIV-1 Integrase Associates with Human Mitochondrial Lysyl-tRNA Synthetase. Viruses 2020; 12:v12101202. [PMID: 33096929 PMCID: PMC7589778 DOI: 10.3390/v12101202] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/22/2020] [Revised: 10/14/2020] [Accepted: 10/20/2020] [Indexed: 01/13/2023] Open
Abstract
Replication of human immunodeficiency virus type 1 (HIV-1) requires the packaging of tRNALys,3 from the host cell into the new viral particles. The GagPol viral polyprotein precursor associates with mitochondrial lysyl-tRNA synthetase (mLysRS) in a complex with tRNALys, an essential step to initiate reverse transcription in the virions. The C-terminal integrase moiety of GagPol is essential for its association with mLysRS. We show that integrases from HIV-1 and HIV-2 bind mLysRS with the same efficiency. In this work, we have undertaken to probe the three-dimensional (3D) architecture of the complex of integrase with mLysRS. We first established that the C-terminal domain (CTD) of integrase is the major interacting domain with mLysRS. Using the pBpa-photo crosslinking approach, inter-protein cross-links were observed involving amino acid residues located at the surface of the catalytic domain of mLysRS and of the CTD of integrase. In parallel, using molecular docking simulation, a single structural model of the complex was found to outscore other alternative conformations. Consistent with crosslinking experiments, this structural model was further probed experimentally. Five compensatory mutations in the two partners were successfully designed, which supports the validity of the model. The complex highlights that binding of integrase could stabilize the tRNALys:mLysRS interaction.
|
26
|
Tanemura KA, Pei J, Merz KM. Refinement of pairwise potentials via logistic regression to score protein-protein interactions. Proteins 2020; 88:1559-1568. [PMID: 32729132 DOI: 10.1002/prot.25973] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/25/2020] [Revised: 05/17/2020] [Accepted: 06/14/2020] [Indexed: 12/20/2022]
Abstract
Protein-protein interactions (PPIs) are ubiquitous and functionally of great importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into the sampling of docking poses to produce at least one near-native structure, and then to evaluate the vast candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model on pairwise potentials to refine the task of protein quaternary structure native structure detection among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model yielded the optimal performance for native detection among decoys compared with conventional scoring functions, while exhibiting lesser performance for the detection of low root mean square deviation decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. The scripts used are available at https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
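One way to read the "comparison concept" in this abstract is sketched below in plain Python. This is an assumption-laden illustration, not the authors' scripts: the feature vectors are made-up numbers standing in for pairwise-potential descriptors (not KECSA2 output), and the antisymmetric construction (native minus decoy labeled 1, decoy minus native labeled 0) is one plausible way to balance the classes. A small logistic regression is then trained by gradient descent.

```python
import math

def comparison_descriptors(native, decoys):
    """Build a balanced training set from one native and many decoy vectors."""
    X, y = [], []
    for d in decoys:
        X.append([a - b for a, b in zip(native, d)])  # native minus decoy
        y.append(1)
        X.append([b - a for a, b in zip(native, d)])  # decoy minus native
        y.append(0)
    return X, y

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Batch gradient descent on the logistic loss (no bias term, since the
    descriptors are antisymmetric by construction)."""
    w = [0.0] * len(X[0])
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi))) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def prefers_first(w, a, b):
    """True if the model judges structure a more native-like than b."""
    z = sum(wj * (aj - bj) for wj, aj, bj in zip(w, a, b))
    return sigmoid(z) > 0.5

# Hypothetical two-feature energies (lower is better); not real data.
native = [-5.0, -3.0]
decoys = [[-1.0, -2.0], [-2.0, 0.0], [0.0, -1.0]]
X, y = comparison_descriptors(native, decoys)
w = train_logistic_regression(X, y)
```

With this construction every native/decoy pair contributes one positive and one negative example, so the classes are balanced regardless of how many decoys exist per native structure.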
Affiliation(s)
- Kiyoto A Tanemura
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
| | - Jun Pei
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
| | - Kenneth M Merz
- Department of Chemistry, Michigan State University, East Lansing, Michigan, USA
|
27
|
Rosell M, Fernández-Recio J. Docking approaches for modeling multi-molecular assemblies. Curr Opin Struct Biol 2020; 64:59-65. [PMID: 32615514 PMCID: PMC7324114 DOI: 10.1016/j.sbi.2020.05.016] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2020] [Revised: 05/13/2020] [Accepted: 05/21/2020] [Indexed: 12/12/2022]
Abstract
Computational docking approaches aim to overcome the limited availability of experimental structural data on protein-protein interactions, which are key in biology. The field is rapidly moving from the traditional docking methodologies for modeling of binary complexes to more integrative approaches using template-based, data-driven modeling of multi-molecular assemblies. We will review here the predictive capabilities of current docking methods in blind conditions, based on the results from the most recent community-wide blind experiments. Integration of template-based and ab initio docking approaches is emerging as the optimal strategy for modeling protein complexes and multimolecular assemblies. We will also review the new methodological advances on ab initio docking and integrative modeling.
Affiliation(s)
- Mireia Rosell
- Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de La Rioja - Gobierno de La Rioja, 26007 Logroño, Spain
| | - Juan Fernández-Recio
- Barcelona Supercomputing Center (BSC), 08034 Barcelona, Spain; Instituto de Ciencias de la Vid y del Vino (ICVV), CSIC - Universidad de La Rioja - Gobierno de La Rioja, 26007 Logroño, Spain.
|
28
|
Roel-Touris J, Bonvin AM. Coarse-grained (hybrid) integrative modeling of biomolecular interactions. Comput Struct Biotechnol J 2020; 18:1182-1190. [PMID: 32514329 PMCID: PMC7264466 DOI: 10.1016/j.csbj.2020.05.002] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/27/2020] [Revised: 04/23/2020] [Accepted: 05/06/2020] [Indexed: 12/23/2022] Open
Abstract
The computational modeling field has vastly evolved over the past decades. The early developments of simplified protein systems represented a stepping stone towards establishing more efficient approaches to sample intricate conformational landscapes. Downscaling the level of resolution of biomolecules to coarser representations allows for studying protein structure, dynamics and interactions that are not accessible by classical atomistic approaches. The combination of different resolutions, namely hybrid modeling, has also proved to be an alternative when mixed levels of detail are required. In this review, we provide an overview of coarse-grained/hybrid models focusing on their applicability in the modeling of biomolecular interactions. We give a detailed list of ready-to-use modeling software for studying biomolecular interactions allowing various levels of coarse-graining and provide examples of complexes determined by integrative coarse-grained/hybrid approaches in combination with experimental information.
|
29
|
Andreani J, Quignot C, Guerois R. Structural prediction of protein interactions and docking using conservation and coevolution. WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2020. [DOI: 10.1002/wcms.1470] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 12/12/2022]
Affiliation(s)
- Jessica Andreani
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Chloé Quignot
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
| | - Raphael Guerois
- Université Paris‐Saclay CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC) Gif‐sur‐Yvette France
|
30
|
Geng C, Jung Y, Renaud N, Honavar V, Bonvin AMJJ, Xue LC. iScore: a novel graph kernel-based function for scoring protein-protein docking models. Bioinformatics 2020; 36:112-121. [PMID: 31199455 PMCID: PMC6956772 DOI: 10.1093/bioinformatics/btz496] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2018] [Revised: 05/08/2019] [Accepted: 06/11/2019] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION Protein complexes play critical roles in many aspects of biological functions. Three-dimensional (3D) structures of protein complexes are critical for gaining insights into structural bases of interactions and their roles in the biomolecular pathways that orchestrate key cellular processes. Because of the expense and effort associated with experimental determinations of 3D protein complex structures, computational docking has evolved as a valuable tool to predict 3D structures of biomolecular complexes. Despite recent progress, reliably distinguishing near-native docking conformations from a large number of candidate conformations, the so-called scoring problem, remains a major challenge. RESULTS Here we present iScore, a novel approach to scoring docked conformations that combines HADDOCK energy terms with a score obtained using a graph representation of the protein-protein interfaces and a measure of evolutionary conservation. It achieves a scoring performance competitive with, or superior to, that of state-of-the-art scoring functions on two independent datasets: (i) Docking software-specific models and (ii) the CAPRI score set generated by a wide variety of docking approaches (i.e. docking software-non-specific). iScore ranks among the top scoring approaches on the CAPRI score set (13 targets) when compared with the 37 scoring groups in CAPRI. The results demonstrate the utility of combining evolutionary, topological and energetic information for scoring docked conformations. This work represents the first successful demonstration of graph kernels to protein interfaces for effective discrimination of near-native and non-native conformations of protein complexes. AVAILABILITY AND IMPLEMENTATION The iScore code is freely available from GitHub: https://github.com/DeepRank/iScore (DOI: 10.5281/zenodo.2630567). The docking models used are available from SBGrid (https://data.sbgrid.org/dataset/684).
SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Cunliang Geng
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
| | - Yong Jung
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16823, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
| | - Nicolas Renaud
- Netherlands eScience Center, Amsterdam 1098 XG, The Netherlands
| | - Vasant Honavar
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- Artificial Intelligence Research Laboratory, Pennsylvania State University, University Park, PA 16823, USA
- Huck Institutes of the Life Sciences, Pennsylvania State University, University Park, PA 16802, USA
- Center for Big Data Analytics and Discovery Informatics, Pennsylvania State University, University Park, PA 16823, USA
- Institute for Cyberscience, University Park, PA 16802, USA
- Clinical and Translational Sciences Institute, University Park, PA 16802, USA
- College of Information Sciences & Technology, Pennsylvania State University, University Park, PA 16802, USA
| | - Alexandre M J J Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
| | - Li C Xue
- Bijvoet Center for Biomolecular Research, Faculty of Science – Chemistry, Utrecht University, Utrecht 3584 CH, The Netherlands
|
31
|
Renaud N, Jung Y, Honavar V, Geng C, Bonvin AM, Xue LC. iScore: An MPI supported software for ranking protein-protein docking models based on a random walk graph kernel and support vector machines. SOFTWAREX 2020; 11:100462. [PMID: 35419466 PMCID: PMC9005067 DOI: 10.1016/j.softx.2020.100462] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 06/14/2023]
Abstract
Computational docking is a promising tool to model three-dimensional (3D) structures of protein-protein complexes, which provides fundamental insights into protein functions in cellular life. Singling out near-native models from the huge pool of generated docking models (referred to as the scoring problem) remains a major challenge in computational docking. We recently published iScore, a novel graph kernel-based scoring function. iScore ranks docking models based on their interface graph similarities to the training interface graph set. iScore uses a support vector machine approach with random-walk graph kernels to classify and rank protein-protein interfaces. Here, we present the software for iScore. The software provides executable scripts that fully automate the computational workflow. In addition, the creation and analysis of the interface graph can be distributed across different processes using Message Passing Interface (MPI) and can be offloaded to GPUs thanks to dedicated CUDA kernels.
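The random-walk graph kernel mentioned in this abstract can be illustrated with a generic, textbook-style Python sketch. It is not the iScore code: the graph encoding (a list of node labels plus an edge list), the `decay` and `max_len` parameters, and the exact-label matching rule are assumptions chosen for brevity. The kernel counts label-matching walks shared by two graphs by walking on their direct product graph and down-weighting longer walks.

```python
def product_graph(labels1, edges1, labels2, edges2):
    """Direct product graph: node pairs with equal labels, joined when both
    underlying graphs have the corresponding (undirected) edge."""
    nodes = [(i, j) for i, a in enumerate(labels1)
             for j, b in enumerate(labels2) if a == b]
    index = {p: k for k, p in enumerate(nodes)}
    # Store undirected edges in both directions.
    e1 = set(edges1) | {(v, u) for u, v in edges1}
    e2 = set(edges2) | {(v, u) for u, v in edges2}
    adj = [[] for _ in nodes]
    for (i, j), k in index.items():
        for (u, v) in e1:
            if u != i:
                continue
            for (s, t) in e2:
                if s == j and (v, t) in index:
                    adj[k].append(index[(v, t)])
    return nodes, adj

def random_walk_kernel(g1, g2, decay=0.5, max_len=3):
    """Sum over walk lengths 0..max_len of decay**n times the number of
    common label-matching walks of length n."""
    (labels1, edges1), (labels2, edges2) = g1, g2
    nodes, adj = product_graph(labels1, edges1, labels2, edges2)
    walks = [1.0] * len(nodes)   # walks of length 0 ending at each node
    total = sum(walks)
    for n in range(1, max_len + 1):
        nxt = [0.0] * len(nodes)
        for k, neighbours in enumerate(adj):
            for m in neighbours:
                nxt[m] += walks[k]
        walks = nxt
        total += decay ** n * sum(walks)
    return total

# Toy interface graphs with made-up node labels; not real interfaces.
g = (["A", "B"], [(0, 1)])
h = (["C", "D"], [(0, 1)])
```

In a classification setting, kernel values computed between all pairs of interface graphs could be assembled into a matrix and passed to a support vector machine that accepts precomputed kernels; that step is omitted here.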
Affiliation(s)
- Nicolas Renaud
- Netherlands eScience Center, Science Park 140, 1098 XG, Amsterdam, The Netherlands
| | - Yong Jung
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
| | - Vasant Honavar
- Bioinformatics & Genomics Graduate Program, Pennsylvania State University, University Park, PA 16802, USA
- College of Information Science & Technology, Pennsylvania State University, University Park, PA 16802, USA
| | - Cunliang Geng
- Netherlands eScience Center, Science Park 140, 1098 XG, Amsterdam, The Netherlands
- Bijvoet Centre for Biomolecular Research Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Alexandre M.J.J. Bonvin
- Bijvoet Centre for Biomolecular Research Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
| | - Li C. Xue
- Bijvoet Centre for Biomolecular Research Faculty of Science - Chemistry, Utrecht University, Padualaan 8, 3584 CH Utrecht, The Netherlands
- Center for Molecular and Biomolecular Informatics, Radboudumc, Nijmegen, The Netherlands
|
32
|
Nadaradjane AA, Quignot C, Traoré S, Andreani J, Guerois R. Docking proteins and peptides under evolutionary constraints in Critical Assessment of PRediction of Interactions rounds 38 to 45. Proteins 2019; 88:986-998. [PMID: 31746034 DOI: 10.1002/prot.25857] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/02/2019] [Revised: 11/13/2019] [Accepted: 11/15/2019] [Indexed: 01/25/2023]
Abstract
Computational structural prediction of macromolecular interactions is a fundamental tool toward the global understanding of cellular processes. The Critical Assessment of PRediction of Interactions (CAPRI) community-wide experiment provides excellent opportunities for blind testing computational docking methods and includes original targets, thus widening the range of docking applications. Our participation in CAPRI rounds 38 to 45 enabled us to expand the way we include evolutionary information in structural predictions beyond our standard free docking InterEvDock pipeline. InterEvDock integrates a coarse-grained potential that accounts for interface coevolution based on joint multiple sequence alignments of two protein partners (co-alignments). However, even though such co-alignments could be built for none of the CAPRI targets in rounds 38 to 45, including host-pathogen and protein-oligosaccharide complexes and a redesigned interface, we identified multiple strategies that can be used to incorporate evolutionary constraints, which helped us to identify the most likely macromolecular binding modes. These strategies include template-based modeling where only local adjustments should be applied when query-template sequence identity is above 30% and larger perturbations are needed below this threshold; covariation-based structure prediction for individual protein partners; and the identification of evolutionarily conserved and structurally recurrent anchoring interface motifs. Overall, we submitted correct predictions among the top 5 models for 12 out of 19 interface challenges, including four High- and five Medium-quality predictions. Our top 20 models included correct predictions for three out of the five targets we missed in the top 5, including two targets for which misleading biological data led us to downgrade correct free docking models.
Affiliation(s)
- Aravindan Arun Nadaradjane
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex, France
| | - Chloé Quignot
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex, France
| | - Seydou Traoré
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex, France
| | - Jessica Andreani
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex, France
| | - Raphaël Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, University of Paris-Sud, Université Paris-Saclay, Gif-sur-Yvette Cedex, France
|
33
|
Dapkūnas J, Kairys V, Olechnovič K, Venclovas Č. Template-based modeling of diverse protein interactions in CAPRI rounds 38-45. Proteins 2019; 88:939-947. [PMID: 31697420 DOI: 10.1002/prot.25845] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/08/2019] [Accepted: 11/03/2019] [Indexed: 11/09/2022]
Abstract
Structures of proteins complexed with other proteins, peptides, or ligands are essential for investigation of molecular mechanisms. However, the experimental structures of protein complexes of interest are often not available. Therefore, computational methods are widely used to predict these structures, and, of those methods, template-based modeling is the most successful. In the rounds 38-45 of the Critical Assessment of PRediction of Interactions (CAPRI), we applied template-based modeling for 9 of 11 protein-protein and protein-peptide interaction targets, resulting in medium and high-quality models for six targets. For the protein-oligosaccharide docking targets, we used constraints derived from template structures, and generated models of at least acceptable quality for most of the targets. Apparently, high flexibility of oligosaccharide molecules was the main cause preventing us from obtaining models of higher quality. We also participated in the CAPRI scoring challenge, the goal of which was to identify the highest quality models from a large pool of decoys. In this experiment, we tested VoroMQA, a scoring method based on interatomic contact areas. The results showed VoroMQA to be quite effective in scoring strongly binding and obligatory protein complexes, but less successful in the case of transient interactions. We extensively used manual intervention in both CAPRI modeling and scoring experiments. This oftentimes allowed us to select the correct templates from available alternatives and to limit the search space during the model scoring.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Visvaldas Kairys
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
|
34
|
Investigation into Early Steps of Actin Recognition by the Intrinsically Disordered N-WASP Domain V. Int J Mol Sci 2019; 20:ijms20184493. [PMID: 31514372 PMCID: PMC6770570 DOI: 10.3390/ijms20184493] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/31/2019] [Accepted: 09/08/2019] [Indexed: 12/21/2022] Open
Abstract
Cellular regulation or signaling processes are mediated by many proteins which often have one or several intrinsically disordered regions (IDRs). These IDRs generally serve as binders to different proteins with high specificity. In many cases, IDRs undergo a disorder-to-order transition upon binding, following a mechanism between two possible pathways: induced fit or conformational selection. Since these mechanisms contribute differently to the kinetics of IDR associations, it is important to investigate them in order to gain insight into the physical factors that determine the biomolecular recognition process. The verprolin homology domain (V) of the Neural Wiskott-Aldrich Syndrome Protein (N-WASP), involved in the regulation of actin polymerization, is a typical example of an IDR. It is composed of two WH2 motifs, each being able to bind one actin molecule. In this study, we investigated the early steps of the recognition process of actin by the WH2 motifs of N-WASP domain V. Using docking calculations and molecular dynamics simulations, our study shows that actin is first recognized by the N-WASP domain V regions which have the highest propensity to form transient α-helices. The WH2 motif consensus sequences "LKKV" subsequently bind to actin through large conformational changes of the disordered domain V.
|
35
|
Quignot C, Rey J, Yu J, Tufféry P, Guerois R, Andreani J. InterEvDock2: an expanded server for protein docking using evolutionary and biological information from homology models and multimeric inputs. Nucleic Acids Res 2019; 46:W408-W416. [PMID: 29741647 PMCID: PMC6030979 DOI: 10.1093/nar/gky377] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/31/2018] [Accepted: 05/02/2018] [Indexed: 12/15/2022] Open
Abstract
Computational protein docking is a powerful strategy to predict structures of protein-protein interactions and provides crucial insights for the functional characterization of macromolecular cross-talks. We previously developed InterEvDock, a server for ab initio protein docking based on rigid-body sampling followed by consensus scoring using physics-based and statistical potentials, including the InterEvScore function specifically developed to incorporate co-evolutionary information in docking. InterEvDock2 is a major evolution of InterEvDock which allows users to submit input sequences – not only structures – and multimeric inputs and to specify constraints for the pairwise docking process based on previous knowledge about the interaction. For this purpose, we added modules in InterEvDock2 for automatic template search and comparative modeling of the input proteins. The InterEvDock2 pipeline was benchmarked on 812 complexes for which unbound homology models of the two partners and co-evolutionary information are available in the PPI4DOCK database. InterEvDock2 identified a correct model among the top 10 consensus in 29% of these cases (compared to 15–24% for individual scoring functions) and at least one correct interface residue among 10 predicted in 91% of these cases. InterEvDock2 is thus a unique protein docking server, designed to be useful for the experimental biology community. The InterEvDock2 web interface is available at http://bioserv.rpbs.univ-paris-diderot.fr/services/InterEvDock2/.
Collapse
Affiliation(s)
- Chloé Quignot
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette cedex, France
| | - Julien Rey
- INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, RPBS, Paris 75205, France
| | - Jinchao Yu
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette cedex, France
| | - Pierre Tufféry
- INSERM UMR-S 973, Université Paris Diderot, Sorbonne Paris Cité, RPBS, Paris 75205, France
| | - Raphaël Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette cedex, France
| | - Jessica Andreani
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette cedex, France
|
36
|
Dapkūnas J, Olechnovič K, Venclovas Č. Structural modeling of protein complexes: Current capabilities and challenges. Proteins 2019; 87:1222-1232. [PMID: 31294859 DOI: 10.1002/prot.25774] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Revised: 06/21/2019] [Accepted: 07/06/2019] [Indexed: 12/27/2022]
Abstract
Proteins frequently interact with each other, and the knowledge of structures of the corresponding protein complexes is necessary to understand how they function. Computational methods are increasingly used to provide structural models of protein complexes. Not surprisingly, community-wide Critical Assessment of protein Structure Prediction (CASP) experiments have recently started monitoring the progress in this research area. We participated in CASP13 with the aim to evaluate our current capabilities in modeling of protein complexes and to gain a better understanding of factors that exert the largest impact on these capabilities. To model protein complexes in CASP13, we applied template-based modeling, free docking and hybrid techniques that enabled us to generate models of the topmost quality for 27 of 42 multimers. If templates for protein complexes could be identified, we modeled the structures with reasonable accuracy by straightforward homology modeling. If only partial templates were available, it was nevertheless possible to predict the interaction interfaces correctly or to generate acceptable models for protein complexes by combining template-based modeling with docking. If no templates were available, we used rigid-body docking with limited success. However, in some free docking models, despite the incorrect subunit orientation and missed interface contacts, the approximate location of protein binding sites was identified correctly. Apparently, our overall performance in docking was limited by the quality of monomer models and by the imperfection of scoring methods. The impact of human intervention on our results in modeling of protein complexes was significant indicating the need for improvements of automatic methods.
Affiliation(s)
- Justas Dapkūnas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Kliment Olechnovič
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
| | - Česlovas Venclovas
- Institute of Biotechnology, Life Sciences Center, Vilnius University, Vilnius, Lithuania
|
37
|
Plundrich NJ, Cook BT, Maleki SJ, Fourches D, Lila MA. Binding of peanut allergen Ara h 2 with Vaccinium fruit polyphenols. Food Chem 2019; 284:287-295. [PMID: 30744860 DOI: 10.1016/j.foodchem.2019.01.081] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2018] [Revised: 01/08/2019] [Accepted: 01/08/2019] [Indexed: 01/30/2023]
Abstract
The potential for 42 different polyphenols found in Vaccinium fruits to bind to peanut allergen Ara h 2 and inhibit IgE binding epitopes was investigated using cheminformatics techniques. Out of 12 predicted binders, delphinidin-3-glucoside, cyanidin-3-glucoside, procyanidin C1, and chlorogenic acid were further evaluated in vitro. Circular dichroism, UV-Vis spectroscopy, and immunoblotting determined their capacity to (i) bind to Ara h 2, (ii) induce protein secondary structural changes, and (iii) inhibit IgE binding epitopes. UV-Vis spectroscopy clearly indicated that procyanidin C1 and chlorogenic acid interacted with Ara h 2, and circular dichroism results suggested that interactions with these polyphenols resulted in changes to Ara h 2 secondary structures. Immunoblotting showed that procyanidin C1 and chlorogenic acid bound to Ara h 2 significantly decreased the IgE binding capacity by 37% and 50%, respectively. These results suggest that certain polyphenols can inhibit IgE recognition of Ara h 2 by obstructing linear IgE epitopes.
Affiliation(s)
- Nathalie J Plundrich
- Plants for Human Health Institute, Department of Food, Bioprocessing and Nutrition Sciences, North Carolina State University, North Carolina Research Campus, Kannapolis, NC 28081, USA
| | - Bethany T Cook
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
| | - Soheila J Maleki
- United States Department of Agriculture-Agricultural Research Service-Southern Regional Research Center, New Orleans, LA 70124, USA
| | - Denis Fourches
- Department of Chemistry, Bioinformatics Research Center, North Carolina State University, Raleigh, NC 27695, USA
| | - Mary Ann Lila
- Plants for Human Health Institute, Department of Food, Bioprocessing and Nutrition Sciences, North Carolina State University, North Carolina Research Campus, Kannapolis, NC 28081, USA.
|
38
|
Geng C, Xue LC, Roel‐Touris J, Bonvin AMJJ. Finding the ΔΔG spot: Are predictors of binding affinity changes upon mutations in protein–protein interactions ready for it? WILEY INTERDISCIPLINARY REVIEWS-COMPUTATIONAL MOLECULAR SCIENCE 2019. [DOI: 10.1002/wcms.1410] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]
Affiliation(s)
- Cunliang Geng
- Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry Utrecht University Utrecht The Netherlands
| | - Li C. Xue
- Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry Utrecht University Utrecht The Netherlands
| | - Jorge Roel‐Touris
- Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry Utrecht University Utrecht The Netherlands
| | - Alexandre M. J. J. Bonvin
- Bijvoet Center for Biomolecular Research, Faculty of Science—Chemistry Utrecht University Utrecht The Netherlands
|
39
|
Dittrich J, Schmidt D, Pfleger C, Gohlke H. Converging a Knowledge-Based Scoring Function: DrugScore2018. J Chem Inf Model 2018; 59:509-521. [DOI: 10.1021/acs.jcim.8b00582] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Indexed: 11/29/2022]
Affiliation(s)
- Jonas Dittrich
- Mathematisch-Naturwissenschaftliche Fakultät, Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Denis Schmidt
- Mathematisch-Naturwissenschaftliche Fakultät, Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Christopher Pfleger
- Mathematisch-Naturwissenschaftliche Fakultät, Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
| | - Holger Gohlke
- Mathematisch-Naturwissenschaftliche Fakultät, Institut für Pharmazeutische und Medizinische Chemie, Heinrich-Heine-Universität Düsseldorf, 40225 Düsseldorf, Germany
- John von Neumann Institute for Computing (NIC), Jülich Supercomputing Centre (JSC) & Institute for Complex Systems−Structural Biochemistry (ICS-6), Forschungszentrum Jülich GmbH, 52425 Jülich, Germany
|
40
|
Hu J, Liu HF, Sun J, Wang J, Liu R. Integrating co-evolutionary signals and other properties of residue pairs to distinguish biological interfaces from crystal contacts. Protein Sci 2018; 27:1723-1735. [PMID: 29931702 DOI: 10.1002/pro.3448] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Received: 01/22/2018] [Revised: 04/21/2018] [Accepted: 05/16/2018] [Indexed: 12/25/2022]
Abstract
It remains challenging to accurately discriminate between biological and crystal interfaces. Most existing analyses and algorithms have focused on features derived from a single side of the interface, and less attention has been paid to the properties of residue pairs across protein interfaces. To address this problem, we defined a novel co-evolutionary feature for homodimers by integrating direct coupling analysis and image processing techniques. The residue pairs across biological homodimeric interfaces were significantly enriched in co-evolving residues compared to those across crystal contacts, resulting in a promising classification accuracy with areas under the curve (AUCs) of >0.85. Considering the availability of the co-evolutionary feature, we also designed other residue-pair-based features that were useful for both homodimers and heterodimers. The most informative residue pairs were identified to reflect the interaction preferences across protein interfaces. Regarding the other extant properties, we designed new descriptors at the interface residue level as well as at the pairwise contact level. Extensive validation showed that these single properties can be used to identify biological interfaces with AUCs ranging from 0.60 to 0.88. By integrating the co-evolutionary feature with other residue-pair-based properties, our final prediction model achieved excellent performance, with AUCs of >0.91 on different datasets. Compared to existing methods, our algorithm not only yielded better or comparable results but also provided complementary information. An easy-to-use web server is freely accessible at http://liulab.hzau.edu.cn/RPAIAnalyst.
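The AUC values quoted in this abstract can be read as the probability that a randomly chosen biological interface receives a higher score than a randomly chosen crystal contact. A minimal Python sketch of that calculation follows; the function name `roc_auc` and the toy scores are illustrative assumptions and are not taken from the paper or from the RPAIAnalyst server.

```python
from typing import Sequence


def roc_auc(positive_scores: Sequence[float], negative_scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    Counts the fraction of (positive, negative) pairs in which the
    positive example scores higher; ties count as half a win.
    """
    if not positive_scores or not negative_scores:
        raise ValueError("both classes need at least one score")
    wins = 0.0
    for pos in positive_scores:
        for neg in negative_scores:
            if pos > neg:
                wins += 1.0
            elif pos == neg:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical classifier scores: biological interfaces vs. crystal contacts.
biological = [0.9, 0.8, 0.4]
crystal = [0.7, 0.3, 0.2]
auc = roc_auc(biological, crystal)  # 8 of the 9 pairs are ordered correctly
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.60 to 0.91 values should be judged.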
Affiliation(s)
- Jian Hu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China; College of Biomedical Engineering, South-Central University for Nationalities, Wuhan, 430074, P. R. China
| | - Hui-Fang Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| | - Jun Sun
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| | - Jia Wang
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
| | - Rong Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
|
41
|
Berto A, Yu J, Morchoisne-Bolhy S, Bertipaglia C, Vallee R, Dumont J, Ochsenbein F, Guerois R, Doye V. Disentangling the molecular determinants for Cenp-F localization to nuclear pores and kinetochores. EMBO Rep 2018; 19:embr.201744742. [PMID: 29632243 DOI: 10.15252/embr.201744742] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.7] [Received: 07/05/2017] [Revised: 03/02/2018] [Accepted: 03/08/2018] [Indexed: 11/09/2022] Open
Abstract
Cenp-F is a multifaceted protein implicated in cancer and developmental pathologies. The Cenp-F C-terminal region contains overlapping binding sites for numerous proteins that contribute to its functions throughout the cell cycle. Here, we focus on the nuclear pore protein Nup133 that interacts with Cenp-F both at nuclear pores in prophase and at kinetochores in mitosis, and on the kinase Bub1, known to contribute to Cenp-F targeting to kinetochores. By combining in silico structural modeling and yeast two-hybrid assays, we generate an interaction model between a conserved helix within the Nup133 β-propeller and a short leucine zipper-containing dimeric segment of Cenp-F. We thereby create mutants affecting the Nup133/Cenp-F interface and show that they prevent Cenp-F localization to the nuclear envelope, but not to kinetochores. Conversely, a point mutation within an adjacent leucine zipper affecting the kinetochore targeting of Cenp-F KT-core domain impairs its interaction with Bub1, but not with Nup133, identifying Bub1 as the direct KT-core binding partner of Cenp-F. Finally, we show that Cenp-E redundantly contributes together with Bub1 to the recruitment of Cenp-F to kinetochores.
Affiliation(s)
- Alessandro Berto
- Institut Jacques Monod, UMR7592, CNRS, Université Paris Diderot, Sorbonne Paris Cité, Paris, France; Ecole Doctorale Structure et Dynamique des Systèmes Vivants (#577), Univ Paris Sud, Université Paris-Saclay, Orsay, France
| | - Jinchao Yu
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris Sud, Université Paris-Saclay, Gif sur Yvette, France
| | | | - Chiara Bertipaglia
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
| | - Richard Vallee
- Department of Pathology and Cell Biology, Columbia University, New York, NY, USA
| | - Julien Dumont
- Institut Jacques Monod, UMR7592, CNRS, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
| | - Francoise Ochsenbein
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris Sud, Université Paris-Saclay, Gif sur Yvette, France
| | - Raphael Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ Paris Sud, Université Paris-Saclay, Gif sur Yvette, France
| | - Valérie Doye
- Institut Jacques Monod, UMR7592, CNRS, Université Paris Diderot, Sorbonne Paris Cité, Paris, France
|
42
|
Mercer RCC, Daude N, Dorosh L, Fu ZL, Mays CE, Gapeshina H, Wohlgemuth SL, Acevedo-Morantes CY, Yang J, Cashman NR, Coulthart MB, Pearson DM, Joseph JT, Wille H, Safar JG, Jansen GH, Stepanova M, Sykes BD, Westaway D. A novel Gerstmann-Sträussler-Scheinker disease mutation defines a precursor for amyloidogenic 8 kDa PrP fragments and reveals N-terminal structural changes shared by other GSS alleles. PLoS Pathog 2018; 14:e1006826. [PMID: 29338055 PMCID: PMC5786331 DOI: 10.1371/journal.ppat.1006826] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 06/06/2017] [Revised: 01/26/2018] [Accepted: 12/18/2017] [Indexed: 11/29/2022] Open
Abstract
To explore pathogenesis in a young Gerstmann-Sträussler-Scheinker Disease (GSS) patient, the corresponding mutation, an eight-residue duplication in the hydrophobic region (HR), was inserted into the wild type mouse PrP gene. Transgenic (Tg) mouse lines expressing this mutation (Tg.HRdup) developed spontaneous neurologic syndromes and brain extracts hastened disease in low-expressor Tg.HRdup mice, suggesting de novo formation of prions. While Tg.HRdup mice exhibited spongiform change, PrP aggregates and the anticipated GSS hallmark of a proteinase K (PK)-resistant 8 kDa fragment deriving from the center of PrP, the LGGLGGYV insertion also imparted alterations in PrP's unstructured N-terminus, resulting in a 16 kDa species following thermolysin exposure. This species comprises a plausible precursor to the 8 kDa PK-resistant fragment and its detection in adolescent Tg.HRdup mice suggests that an early start to accumulation could account for early disease of the index case. A 16 kDa thermolysin-resistant signature was also found in GSS patients with P102L, A117V, H187R and F198S alleles and has coordinates similar to GSS stop codon mutations. Our data suggest a novel shared pathway of GSS pathogenesis that is fundamentally distinct from that producing structural alterations in the C-terminus of PrP, as observed in other prion diseases such as Creutzfeldt-Jakob Disease and scrapie.
Prion diseases can be sporadic, infectious or genetic. The central event of all prion diseases is the structural conversion of the cellular prion protein (PrPC) to its disease associated conformer, PrPSc. Gerstmann-Sträussler-Scheinker Disease (GSS) is a genetic prion disease presenting as a multi-systemic neurological syndrome. A novel mutation, an eight amino acid insertion, was discovered in a young GSS patient. We created transgenic mice expressing this mutation and found that they recapitulate key features of the disease; namely PrP deposition in the brain and a low molecular weight proteinase K (PK) resistant internal PrP fragment. While structural investigations did not reveal a gross alteration in the conformation of this mutant PrP, the insertion lying at the boundary of the globular domain causes alterations in the unstructured amino terminal portion of the protein such that it becomes resistant to digestion by the enzyme thermolysin. We demonstrate by kinetic analysis and sequential digestion that this novel thermolysin resistant species is a precursor to the pathognomonic PK resistant fragment. Analysis of samples from other GSS patients revealed this same signature, suggesting a common molecular pathway.
Affiliation(s)
- Robert C. C. Mercer
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
- Department of Medicine (Neurology), University of Alberta, Edmonton, Alberta, Canada
| | - Nathalie Daude
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
| | - Lyudmyla Dorosh
- National Research Council of Canada, Edmonton, Alberta, Canada
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
| | - Ze-Lin Fu
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
- Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada
| | - Charles E. Mays
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
| | - Hristina Gapeshina
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
| | - Serene L. Wohlgemuth
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
| | | | - Jing Yang
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
| | - Neil R. Cashman
- Brain Research Centre, University of British Columbia, Vancouver, British Columbia, Canada
| | - Michael B. Coulthart
- Canadian Creutzfeldt-Jakob Disease Surveillance System, Centre for Foodborne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Ottawa, Ontario, Canada
| | - Dawn M. Pearson
- Department of Clinical Neurosciences, University of Calgary, Calgary, Alberta, Canada
| | - Jeffrey T. Joseph
- Hotchkiss Brain Institute and Calgary Laboratory Services, University of Calgary, Calgary, Alberta, Canada
| | - Holger Wille
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
- Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada
| | - Jiri G. Safar
- Departments of Pathology and Neurology, School of Medicine Case Western Reserve University, Cleveland, Ohio, United States of America
| | - Gerard H. Jansen
- Canadian Creutzfeldt-Jakob Disease Surveillance System, Centre for Foodborne, Environmental and Zoonotic Infectious Diseases, Public Health Agency of Canada, Ottawa, Ontario, Canada
- Division of Anatomical Pathology, University of Ottawa, Ottawa, Ontario, Canada
| | - Maria Stepanova
- National Research Council of Canada, Edmonton, Alberta, Canada
- Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada
| | - Brian D. Sykes
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
- Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada
| | - David Westaway
- Centre for Prions and Protein Folding Diseases, University of Alberta, Edmonton, Alberta, Canada
- Department of Medicine (Neurology), University of Alberta, Edmonton, Alberta, Canada
- Department of Biochemistry, University of Alberta, Edmonton, Alberta, Canada
|
43
|
Abstract
The structural modeling of protein complexes by docking simulations has been attracting increasing interest with the rise of proteomics and of the number of experimentally identified binary interactions. Structures of unbound partners, either modeled or experimentally determined, can be used as input to sample as extensively as possible all putative binding modes and single out the most plausible ones. At the scoring step, evolutionary information contained in the joint multiple sequence alignments of both partners can provide key insights to recognize correct interfaces. Here, we describe a computational protocol based on the InterEvDock web server to exploit coevolution constraints in protein-protein docking methods. We provide methodology guidelines to prepare the input protein structures and generate improved alignments. We also explain how to extract and use the information returned by the server through the analysis of two representative examples.
Affiliation(s)
- Aravindan Arun Nadaradjane
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France
| | - Raphael Guerois
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France.
| | - Jessica Andreani
- Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Univ. Paris-Sud, Université Paris-Saclay, 91198, Gif-sur-Yvette Cedex, France.
|
44
|
Feng T, Chen F, Kang Y, Sun H, Liu H, Li D, Zhu F, Hou T. HawkRank: a new scoring function for protein-protein docking based on weighted energy terms. J Cheminform 2017; 9:66. [PMID: 29282565 PMCID: PMC5745212 DOI: 10.1186/s13321-017-0254-7] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 09/19/2017] [Accepted: 12/14/2017] [Indexed: 01/09/2023] Open
Abstract
Deciphering the structural determinants of protein–protein interactions (PPIs) is essential to gain a deep understanding of many important biological functions in living cells. Computational approaches for the structural modeling of PPIs, such as protein–protein docking, are much needed to complement existing experimental techniques. The reliability of a protein–protein docking method depends on the ability of the scoring function to accurately distinguish near-native binding structures from a huge number of decoys. In this study, we developed HawkRank, a novel scoring function designed for the sampling stage of protein–protein docking, which sums the contributions from several energy terms, including van der Waals potentials, electrostatic potentials and desolvation potentials. First, based on the solvation free energies predicted by the Generalized Born model for ~800 proteins, a SASA (solvent accessible surface area)-based solvation model was developed, which gives the aqueous solvation free energies for proteins by summing the contributions of 21 atom types. Then, the van der Waals potentials and electrostatic potentials based on the Amber ff14SB force field were computed. Finally, the HawkRank scoring function was derived by determining the optimal weights for five energy terms based on the training set. Here, MSR (modified success rate), a novel protein–protein scoring quality index, was used to assess the performance of HawkRank and three other popular protein–protein scoring functions, namely ZRANK, FireDock and dDFIRE. The results show that HawkRank outperformed the other three scoring functions according to the total number of hits and MSR. HawkRank is available at http://cadd.zju.edu.cn/programs/hawkrank.
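As an illustration of the weighted-sum idea described in this abstract, the Python sketch below ranks docking poses by a linear combination of energy terms. The term names, the weights, the pose values, and the convention that a lower score is better are all assumptions made for the example; they are not the published HawkRank terms or parameters.

```python
from typing import Dict, List, Mapping


def weighted_score(terms: Mapping[str, float], weights: Mapping[str, float]) -> float:
    """Return the weighted sum of energy terms for a single docking pose."""
    missing = set(weights) - set(terms)
    if missing:
        raise KeyError(f"missing energy terms: {sorted(missing)}")
    return sum(weights[name] * terms[name] for name in weights)


def rank_poses(poses: Mapping[str, Mapping[str, float]],
               weights: Mapping[str, float]) -> List[str]:
    """Sort pose identifiers from lowest (assumed best) to highest score."""
    scores: Dict[str, float] = {
        pose_id: weighted_score(terms, weights) for pose_id, terms in poses.items()
    }
    return sorted(scores, key=lambda pose_id: scores[pose_id])


# Hypothetical weights and per-pose energy terms (arbitrary units).
example_weights = {"vdw": 1.0, "electrostatic": 0.5, "desolvation": 2.0}
example_poses = {
    "pose_a": {"vdw": -10.0, "electrostatic": -4.0, "desolvation": 1.0},
    "pose_b": {"vdw": -5.0, "electrostatic": -2.0, "desolvation": 0.5},
}
ranking = rank_poses(example_poses, example_weights)
```

In a setting like the one the abstract describes, the weights would be fitted on a training set of complexes rather than chosen by hand as they are here.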
Affiliation(s)
- Ting Feng
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Fu Chen
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Yu Kang
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Huiyong Sun
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Hui Liu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Dan Li
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Feng Zhu
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China
| | - Tingjun Hou
- College of Pharmaceutical Sciences, Zhejiang University, Hangzhou, 310058, Zhejiang, China; State Key Lab of CAD&CG, Zhejiang University, Hangzhou, 310058, Zhejiang, China
|
45
|
Membrane proteins structures: A review on computational modeling tools. Biochimica et Biophysica Acta - Biomembranes 2017; 1859:2021-2039. [DOI: 10.1016/j.bbamem.2017.07.008] [Citation(s) in RCA: 62] [Impact Index Per Article: 7.8] [Received: 03/23/2017] [Revised: 07/04/2017] [Accepted: 07/13/2017] [Indexed: 01/02/2023]
|
46
|
Abstract
Co-evolution techniques were originally conceived to assist in protein structure prediction by inferring pairs of residues that share spatial proximity. However, the functional relationships that can be extrapolated from co-evolution have also proven to be useful in a wide array of structural bioinformatics applications. These techniques are a powerful way to extract structural and functional information in a sequence-rich world.
|
47
|
Peterson LX, Kim H, Esquivel-Rodriguez J, Roy A, Han X, Shin WH, Zhang J, Terashi G, Lee M, Kihara D. Human and server docking prediction for CAPRI round 30-35 using LZerD with combined scoring functions. Proteins 2017; 85:513-527. [PMID: 27654025 PMCID: PMC5313330 DOI: 10.1002/prot.25165] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Received: 07/17/2016] [Revised: 09/09/2016] [Accepted: 09/15/2016] [Indexed: 12/12/2022]
Abstract
We report the performance of protein-protein docking predictions by our group for recent rounds of the Critical Assessment of Prediction of Interactions (CAPRI), a community-wide assessment of state-of-the-art docking methods. Our prediction procedure uses a protein-protein docking program named LZerD developed in our group. LZerD represents a protein surface with 3D Zernike descriptors (3DZD), which are based on a mathematical series expansion of a 3D function. The appropriate soft representation of protein surface with 3DZD makes the method more tolerant to conformational change of proteins upon docking, which adds an advantage for unbound docking. Docking was guided by interface residue prediction performed with BindML and cons-PPISP as well as literature information when available. The generated docking models were ranked by a combination of scoring functions, including PRESCO, which evaluates the native-likeness of residues' spatial environments in structure models. First, we discuss the overall performance of our group in the CAPRI prediction rounds and investigate the reasons for unsuccessful cases. Then, we examine the performance of several knowledge-based scoring functions and their combinations for ranking docking models. It was found that the quality of a pool of docking models generated by LZerD, that is whether or not the pool includes near-native models, can be predicted by the correlation of multiple scores. Although the current analysis used docking models generated by LZerD, findings on scoring functions are expected to be universally applicable to other docking methods. Proteins 2017; 85:513-527. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Lenna X. Peterson
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Hyungrae Kim
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | | | - Amitava Roy
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Medicinal Chemistry and Molecular Pharmacology, Purdue University, West Lafayette, IN, 47907, USA
- Bioinformatics and Computational Biosciences Branch, Rocky Mountain Laboratories, NIAID, National Institutes of Health, Hamilton, Montana 59840, USA
| | - Xusi Han
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Woong-Hee Shin
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Jian Zhang
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
| | - Genki Terashi
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- School of Pharmacy, Kitasato University, Minato-Ku, Tokyo, 108-8641, Japan
| | - Matt Lee
- Lilly Biotechnology Center San Diego, 10300 Campus Point Drive, San Diego, CA, 92121, USA
| | - Daisuke Kihara
- Department of Biological Sciences, Purdue University, West Lafayette, IN, 47907, USA
- Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
|
48
|
Petoukhov MV, Tuukkanen A. SAS-Based Structural Modelling and Model Validation. Advances in Experimental Medicine and Biology 2017; 1009:87-105. [PMID: 29218555 DOI: 10.1007/978-981-10-6038-0_6] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 12/11/2022]
Abstract
Small angle scattering of X-rays (SAXS) and neutrons (SANS) is a structural technique to study disordered systems with chaotic orientations of scattering inhomogeneities at low resolution. Solutions of biological macromolecules are an important example of such systems. Rapid development in the methodology for solution scattering data interpretation and model building during the last two decades brought the analysis far beyond the determination of just a few overall structural parameters (which was the only possibility in the past) and ensured SAS a firm position in the methods palette of the modern life sciences. The advances in the methodology include ab initio approaches for shape and domain structure restoration from scattering curves without a priori structural knowledge, classification and validation of the models, and evaluation of the potential ambiguity associated with the reconstruction. In rigid body and hybrid modelling approaches, solution scattering is synergistically used with other structural techniques, utilizing complementary information such as atomic models of the components, intramolecular contacts, and subunit orientations for the reconstruction of complex systems. The usual requirement of sample monodispersity has been relaxed recently, and the technique can now address such systems as weakly bound oligomers and transient complexes. These state-of-the-art methods are described together with examples of their applications and possible ways of post-processing the models.
Affiliation(s)
- Maxim V Petoukhov
- Hamburg Unit, European Molecular Biology Laboratory, EMBL c/o DESY, Notkestrasse 85, 22607, Hamburg, Germany; Federal Scientific Research Centre "Crystallography and Photonics", RAS, Leninsky prospect 59, 119333, Moscow, Russia; A.N. Frumkin Institute of Physical Chemistry and Electrochemistry, RAS, Leninsky prospect 31, 119071, Moscow, Russia
| | - Anne Tuukkanen
- Hamburg Unit, European Molecular Biology Laboratory, EMBL c/o DESY, Notkestrasse 85, 22607, Hamburg, Germany
|
49
|
Conservation of coevolving protein interfaces bridges prokaryote-eukaryote homologies in the twilight zone. Proc Natl Acad Sci U S A 2016; 113:15018-15023. [PMID: 27965389 DOI: 10.1073/pnas.1611861114] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.6] [Indexed: 11/18/2022] Open
Abstract
Protein-protein interactions are fundamental for the proper functioning of the cell. As a result, protein interaction surfaces are subject to strong evolutionary constraints. Recent developments have shown that residue coevolution provides accurate predictions of heterodimeric protein interfaces from sequence information. So far these approaches have been limited to the analysis of families of prokaryotic complexes for which large multiple sequence alignments of homologous sequences can be compiled. We explore the hypothesis that coevolution points to structurally conserved contacts at protein-protein interfaces, which can be reliably projected to homologous complexes with distantly related sequences. We introduce a domain-centered protocol to study the interplay between residue coevolution and structural conservation of protein-protein interfaces. We show that sequence-based coevolutionary analysis systematically identifies residue contacts at prokaryotic interfaces that are structurally conserved at the interface of their eukaryotic counterparts. In turn, this allows the prediction of conserved contacts at eukaryotic protein-protein interfaces with high confidence using solely mutational patterns extracted from prokaryotic genomes. Even in the context of high divergence in sequence (the twilight zone), where standard homology modeling of protein complexes is unreliable, our approach provides sequence-based accurate information about specific details of protein interactions at the residue level. Selected examples of the application of prokaryotic coevolutionary analysis to the prediction of eukaryotic interfaces further illustrate the potential of this approach.
|
50
|
Lensink MF, Velankar S, Wodak SJ. Modeling protein-protein and protein-peptide complexes: CAPRI 6th edition. Proteins 2016; 85:359-377. [PMID: 27865038 DOI: 10.1002/prot.25215] [Citation(s) in RCA: 162] [Impact Index Per Article: 18.0] [Received: 09/01/2016] [Revised: 10/07/2016] [Accepted: 10/10/2016] [Indexed: 12/19/2022]
Abstract
We present the sixth report evaluating the performance of methods for predicting the atomic resolution structures of protein complexes offered as targets to the community-wide initiative on the Critical Assessment of Predicted Interactions (CAPRI). The evaluation is based on a total of 20,670 predicted models for 8 protein-peptide complexes, a novel category of targets in CAPRI, and 12 protein-protein targets in CAPRI prediction Rounds held during the years 2013-2016. For two of the protein-protein targets, the focus was on the prediction of side-chain conformation and positions of interfacial water molecules. Seven of the protein-protein targets were particularly challenging owing to their multicomponent nature, to conformational changes at the binding site, or to a combination of both. Encouragingly, the very large multiprotein complex with the nucleosome was correctly predicted, and correct models were submitted for the protein-peptide targets, but not for some of the challenging protein-protein targets. Models of acceptable quality or better were obtained for 14 of the 20 targets, including medium quality models for 13 targets and high quality models for 8 targets, indicating tangible progress of present-day computational methods in modeling protein complexes with increased accuracy. Our evaluation suggests that the progress stems from better integration of different modeling tools with docking procedures, as well as the use of more sophisticated evolutionary information to score models. Nonetheless, adequate modeling of conformational flexibility in interacting proteins remains an important area with a crucial need for improvement. Proteins 2017; 85:359-377. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Marc F Lensink
- University of Lille, CNRS UMR8576 UGSF, Lille, 59000, France
| | - Sameer Velankar
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, United Kingdom
| | - Shoshana J Wodak
- VIB Structural Biology Research Center, VUB Pleinlaan 2, Brussels, 1050, Belgium
|