1
Feidakis CP, Krivak R, Hoksza D, Novotny M. AHoJ-DB: A PDB-wide Assignment of apo & holo Relationships Based on Individual Protein-Ligand Interactions. J Mol Biol 2024; 436:168545. [PMID: 38508305 DOI: 10.1016/j.jmb.2024.168545] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/31/2024] [Revised: 03/12/2024] [Accepted: 03/14/2024] [Indexed: 03/22/2024]
Abstract
A single protein structure is rarely sufficient to capture the conformational variability of a protein. Both bound and unbound (holo and apo) forms of a protein are essential for understanding its geometry and making meaningful comparisons. Nevertheless, docking or drug design studies often still consider only single protein structures in their holo form, which are for the most part rigid. With the recent explosion in the field of structural biology, large, curated datasets are urgently needed. Here, we use a previously developed application (AHoJ) to perform a comprehensive search for apo-holo pairs for 468,293 biologically relevant protein-ligand interactions across 27,983 proteins. In each search, the binding pocket is captured and mapped across existing structures within the same UniProt entry, and the mapped pockets are annotated as apo or holo, based on the presence or absence of ligands. We assemble the results into a database, AHoJ-DB (www.apoholo.cz/db), that captures the variability of proteins with identical sequences, thereby exposing the agents responsible for the observed differences in geometry. We report several metrics for each annotated pocket, and we also include binding pockets that form at the interface of multiple chains. Analysis of the database shows that about 24% of the binding sites occur at the interface of two or more chains and that less than 50% of the total binding sites processed have an apo form in the PDB. These results can be used to train and evaluate predictors, discover potentially druggable proteins, and reveal protein- and ligand-specific relationships that were previously obscured by intermittent or partial data. Availability: www.apoholo.cz/db.
Affiliation(s)
- Christos P Feidakis
  - Department of Cell Biology, Faculty of Science, Charles University, Prague 12843, Czech Republic.
- Radoslav Krivak
  - Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague 12116, Czech Republic; Institute of Organic Chemistry and Biochemistry of the Czech Academy of Sciences, Prague 16000, Czech Republic.
- David Hoksza
  - Department of Software Engineering, Faculty of Mathematics and Physics, Charles University, Prague 12116, Czech Republic.
- Marian Novotny
  - Department of Cell Biology, Faculty of Science, Charles University, Prague 12843, Czech Republic.
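The annotation step described in the abstract for this entry (mapped pockets labelled apo or holo by the presence or absence of ligands) can be illustrated with a minimal sketch. It is not the AHoJ implementation: the `MappedPocket` class, the `annotate` and `group_by_state` helpers, and the structure identifiers are all hypothetical, and the real tool's criteria for which ligands count are not reproduced here.

```python
from dataclasses import dataclass, field


@dataclass
class MappedPocket:
    """Hypothetical record for one pocket mapped onto one structure."""
    structure_id: str  # placeholder identifier, not a real PDB code
    ligands_in_pocket: list = field(default_factory=list)


def annotate(pocket):
    """Label a mapped pocket 'holo' if any ligand occupies it, else 'apo'."""
    return "holo" if pocket.ligands_in_pocket else "apo"


def group_by_state(pockets):
    """Collect structure identifiers under their apo/holo label."""
    groups = {"apo": [], "holo": []}
    for pocket in pockets:
        groups[annotate(pocket)].append(pocket.structure_id)
    return groups


# Illustrative input only; ligand names are arbitrary examples.
pockets = [
    MappedPocket("struct_A", ["HEM"]),
    MappedPocket("struct_B"),
    MappedPocket("struct_C", ["ATP", "MG"]),
]
result = group_by_state(pockets)
print(result)
```

Grouping by label makes it easy to check whether a given binding site has any apo counterpart at all, which is the kind of statistic the abstract reports.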
2
Singh PK, Stan RC. ThermoPCD: a database of molecular dynamics trajectories of antibody-antigen complexes at physiologic and fever-range temperatures. Database (Oxford) 2024; 2024:baae015. [PMID: 38502609 PMCID: PMC10950042 DOI: 10.1093/database/baae015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/25/2023] [Revised: 02/12/2024] [Accepted: 02/22/2024] [Indexed: 03/21/2024]
Abstract
Progression of various cancers and autoimmune diseases is associated with changes in systemic or local tissue temperatures, which may impact current therapies. The effect of fever-range and acute-inflammation-range temperatures on the stability and activity of antibodies relevant for cancers and autoimmunity is unknown. To produce molecular dynamics (MD) trajectories of immune complexes at relevant temperatures, we used the Research Collaboratory for Structural Bioinformatics (RCSB) database to identify 50 antibody:antigen complexes of interest, in addition to single antibodies and antigens, and deployed Groningen Machine for Chemical Simulations (GROMACS) to prepare and run the structures at different temperatures for 100-500 ns, using single or multiple random seeds. MD trajectories are freely available. Processed data include Protein Data Bank outputs for all files obtained every 50 ns, and binding free energy calculations for some of the immune complexes. Protocols for using the data are also available. Individual datasets have unique DOIs. We created a web interface, ThermoPCD, as a platform to explore the data. The outputs of ThermoPCD allow users to relate temperature-dependent changes in epitope:paratope interfaces to their binding free energies, or to compare them against their own experimentally derived binding affinities. ThermoPCD is a free-to-use database of immune complex trajectories at different temperatures that does not require registration and makes all the data available for download. Database URL: https://sites.google.com/view/thermopcd/home.
Affiliation(s)
- Puneet K Singh
  - Department of Basic Medical Science, Chonnam National University, Hwasun 58128, Republic of Korea.
- Razvan C Stan
  - Department of Basic Medical Science, Chonnam National University, Hwasun 58128, Republic of Korea.
3
D3PM: a comprehensive database for protein motions ranging from residue to domain. BMC Bioinformatics 2022; 23:70. [PMID: 35164668 PMCID: PMC8845362 DOI: 10.1186/s12859-022-04595-0] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/14/2020] [Accepted: 02/01/2022] [Indexed: 11/24/2022] Open
Abstract
Background: Knowledge of protein motions is important for understanding protein function. While currently available databases of protein motions mostly focus on overall domain motions, little attention is paid to local residue motions. Although relatively small in scale, local residue motions, especially those of residues in binding pockets, may play crucial roles in protein function and ligand binding. Results: A comprehensive protein motion database, namely D3PM, was constructed in this study to facilitate the analysis of protein motions. The protein motions in D3PM range from overall structural changes of macromolecules to local flip motions of binding pocket residues. Currently, D3PM contains 7679 proteins with overall motions and 3513 proteins with pocket residue motions. The motion patterns are classified into 4 types of overall structural changes and 5 types of pocket residue motions. Notably, we found that fewer than 15% of protein pairs show obvious overall conformational adaptations induced by ligand binding, while more than 50% of protein pairs show significant structural changes in ligand binding sites, indicating that ligand-induced conformational changes are drastic and mainly confined to the vicinity of ligand binding sites. Based on residue preferences in binding pockets, we classified amino acids into “pocketphilic” and “pocketphobic” residues, which should be helpful for pocket prediction and drug design. Conclusion: D3PM is a comprehensive database of protein motions ranging from residue to domain, which should be useful for exploring diverse protein motions and for understanding protein function and drug design. D3PM is available at www.d3pharma.com/D3PM/index.php. Supplementary Information: The online version contains supplementary material available at 10.1186/s12859-022-04595-0.
4
Vetrivel I, de Brevern AG, Cadet F, Srinivasan N, Offmann B. Structural variations within proteins can be as large as variations observed across their homologues. Biochimie 2019; 167:162-170. [PMID: 31560932 DOI: 10.1016/j.biochi.2019.09.013] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/25/2019] [Accepted: 09/18/2019] [Indexed: 10/26/2022]
Abstract
Understanding the structural plasticity of proteins is key to understanding the intricacies of their functions and mechanistic basis. In the current study, we analyzed the available multiple crystal structures of the same protein for structural differences. For this purpose we used a previously established abstraction of protein structures referred to as Protein Blocks (PBs). We also characterized the nature of the structural variations for a few proteins using molecular dynamics simulations. In both cases, the structural variations were summarized in the form of substitution matrices of PBs. We show that certain conformational states are preferentially replaced by other specific conformational states. Interestingly, these structural variations are highly similar to those previously observed across structures of homologous proteins (r² = 0.923) or across the ensemble of conformations from NMR data (r² = 0.919). Thus our study quantitatively shows that overall trends of structural changes in a given protein are nearly identical to the trends of structural differences that occur in the topologically equivalent positions in homologous proteins. Specific case studies are used to illustrate the nature of these structural variations.
Affiliation(s)
- Iyanar Vetrivel
  - Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France.
- Alexandre G de Brevern
  - INSERM UMR_S 1134, DSIMB Team, Laboratory of Excellence, GR-Ex, Univ Paris Diderot, Univ Sorbonne Paris Cité, INTS, 6 Rue Alexandre Cabanel, Paris, France.
- Frédéric Cadet
  - University of Paris, UMR_S1134, BIGR, Inserm, F-75015, Paris, France; DSIMB, UMR_S1134, BIGR, Inserm, Laboratory of Excellence GR-Ex, Faculty of Sciences and Technology, University of La Reunion, F-97715, Saint-Denis, France; PEACCEL, Protein Engineering Accelerator, 6 Square Albin Cachot, Box 42, 75013, Paris, France.
- Bernard Offmann
  - Université de Nantes, UFIP UMR 6286 CNRS, UFR Sciences et Techniques, 2 Chemin de La Houssinière, Nantes, France.
5
Marks C, Shi J, Deane CM. Predicting loop conformational ensembles. Bioinformatics 2018; 34:949-956. [PMID: 29136084 DOI: 10.1093/bioinformatics/btx718] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 07/13/2017] [Accepted: 11/09/2017] [Indexed: 12/23/2022] Open
Abstract
Motivation: Protein function is often facilitated by the existence of multiple stable conformations. Structure prediction algorithms need to be able to model these different conformations accurately and produce an ensemble of structures that represent a target's conformational diversity rather than just a single state. Here, we investigate whether current loop prediction algorithms are capable of this. We use the algorithms to predict the structures of loops with multiple experimentally determined conformations, and the structures of loops with only one conformation, and assess their ability to generate and select decoys that are close to any, or all, of the observed structures. Results: We find that while loops with only one known conformation are predicted well, conformationally diverse loops are modelled poorly, and in most cases the predictions returned by the methods do not resemble any of the known conformers. Our results contradict the often-held assumption that multiple native conformations will be present in the decoy set, making the production of accurate conformational ensembles impossible, and hence indicating that current methodologies are not well suited to prediction of conformationally diverse, often functionally important protein regions. Contact: marks@stats.ox.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Claire Marks
  - Department of Statistics, University of Oxford, Oxford OX1 3LB, UK.
- Jiye Shi
  - Department of Chemistry, UCB Pharma, Slough SL1 3WE, UK.
6
Fotoohifiroozabadi S, Mohamad MS, Deris S. NAHAL-Flex: A Numerical and Alphabetical Hinge Detection Algorithm for Flexible Protein Structure Alignment. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2018; 15:934-943. [PMID: 28534783 DOI: 10.1109/tcbb.2017.2705080] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 06/07/2023]
Abstract
Flexible proteins are proteins that undergo conformational changes in their structures. Protein flexibility analysis is critical for classifying and understanding protein functionality. For that analysis, the hinge areas where proteins show flexibility must be detected. To detect the location of the hinges, previous methods have utilized the three-dimensional (3D) structure of proteins, which is computationally expensive. To reduce the computational complexity, this study proposes a novel text-based method using structural alphabets (SAs) for detecting the hinge position, called NAHAL-Flex. Protein structures were encoded in a particular type of SA called the protein folding shape code (PFSC), which remains unaffected by location, scale, and rotation. The flexible regions of the proteins are the only places in which letter sequences can be distorted. With this knowledge, it is possible to find the longest alignment path of two letter sequences using a dynamic programming (DP) algorithm. Then, the proposed method looks for regions where the alphabet sequence is distorted to find the most probable hinge positions. In order to reduce the number of hinge positions, a genetic algorithm (GA) was utilized to find the best candidate hinge points. To evaluate the method's effectiveness, four different flexible and rigid protein databases, including two small datasets and two large datasets, were utilized. For the small datasets, the NAHAL-Flex method was comparable to state-of-the-art structural flexible alignment methods. The results for the large datasets show that NAHAL-Flex outperforms some well-known alignment methods, e.g., DaliLite, Matt, DeepAlign, and TM-align; NAHAL-Flex was both faster and more accurate than these methods.
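The idea of aligning two letter sequences with dynamic programming and then looking for distorted regions can be sketched as follows. This is only an illustration under stated assumptions: it uses a plain longest-common-subsequence alignment as a stand-in for the alignment path in the abstract, the function names `lcs_alignment` and `distorted_positions` are hypothetical, the example strings are arbitrary letters rather than real PFSC codes, and the genetic-algorithm refinement step is omitted.

```python
def lcs_alignment(a, b):
    """Return index pairs (i, j) of one longest common subsequence of a and b."""
    n, m = len(a), len(b)
    # dp[i][j] holds the LCS length of the suffixes a[i:] and b[j:].
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if a[i] == b[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])
    # Walk forward through the table to recover one optimal alignment path.
    pairs = []
    i = j = 0
    while i < n and j < m:
        if a[i] == b[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1
        else:
            j += 1
    return pairs


def distorted_positions(a, b):
    """Positions in a left unmatched by the alignment (candidate flexible regions)."""
    matched = {i for i, _ in lcs_alignment(a, b)}
    return [i for i in range(len(a)) if i not in matched]


# Two made-up letter encodings that differ only in a middle stretch.
seq_one = "AAABBBAAA"
seq_two = "AAACCCAAA"
candidates = distorted_positions(seq_one, seq_two)
print(candidates)
```

Working on one-dimensional letter strings keeps the cost at O(n·m) table cells, which is the kind of saving over 3D superposition that the abstract motivates.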
7
Halakou F, Kilic ES, Cukuroglu E, Keskin O, Gursoy A. Enriching Traditional Protein-protein Interaction Networks with Alternative Conformations of Proteins. Sci Rep 2017; 7:7180. [PMID: 28775330 PMCID: PMC5543104 DOI: 10.1038/s41598-017-07351-0] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 03/27/2017] [Accepted: 06/27/2017] [Indexed: 12/19/2022] Open
Abstract
Traditional Protein-Protein Interaction (PPI) networks, which use a node and edge representation, lack some valuable information about the mechanistic details of biological processes. Mapping protein structures to these PPI networks not only provides structural details of each interaction but also helps us to find mutually exclusive interactions. Yet it is not a comprehensive representation, as it neglects the conformational changes of proteins, which may lead to different interactions, functions, and downstream signalling. In this study, we proposed a new representation for structural PPI networks that takes into account the alternative conformations of proteins. We performed a large-scale study by creating a breast cancer metastasis network and equipping it with different conformers of proteins. Our results showed that although 88% of proteins in our network have at least two structures in the Protein Data Bank (PDB), only 22% of them have alternative conformations, and the remaining proteins have different regions saved in the PDB. However, using even this small set of alternative conformations we observed a considerable increase in our protein docking predictions. Our protein-protein interaction predictions increased from 54% to 76% using the alternative conformations. We also showed the benefits of investigating structural data and alternative conformations of proteins through three case studies.
Affiliation(s)
- Farideh Halakou
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.
- Emel Sen Kilic
  - Department of Chemical and Biological Engineering, Koc University, Istanbul, 34450, Turkey; Microbiology, Immunology and Cell Biology Department, West Virginia University, Morgantown, 26505, WV, USA.
- Engin Cukuroglu
  - Computational Sciences and Engineering, Graduate School of Sciences and Engineering, Koc University, Istanbul, 34450, Turkey.
- Ozlem Keskin
  - Department of Chemical and Biological Engineering, Koc University, Istanbul, 34450, Turkey.
- Attila Gursoy
  - Department of Computer Engineering, Koc University, Istanbul, 34450, Turkey.