1
Collins KW, Copeland MM, Kotthoff I, Singh A, Kundrotas PJ, Vakser IA. Dockground resource for protein recognition studies. Protein Sci 2022; 31:e4481. [PMID: 36281025 PMCID: PMC9667896 DOI: 10.1002/pro.4481] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 08/15/2022] [Revised: 10/19/2022] [Accepted: 10/20/2022] [Indexed: 12/13/2022]
Abstract
Structural information on protein-protein interactions is essential for characterization of life processes at the molecular level. Because only a small fraction of known protein interactions has experimentally determined structures, computational modeling of protein complexes (protein docking) has to fill the gap. The Dockground resource (http://dockground.compbio.ku.edu) provides a collection of datasets for the development and testing of protein docking techniques. Currently, Dockground contains datasets for the bound and the unbound (experimentally determined and simulated) protein structures, model-model complexes, docking decoys of experimentally determined and modeled proteins, and templates for comparative docking. The Dockground bound proteins dataset is a core set, from which other Dockground datasets are generated. It is devised as a relational PostgreSQL database containing information on experimentally determined protein-protein complexes. This report on the Dockground resource describes the current status of the datasets, new automated update procedures, and further development of the core datasets. We also present a new Dockground interactive web interface, which allows search by various parameters (such as release date, multimeric state, complex type, and structure resolution), visualization of the search results with a number of customizable parameters, and download of datasets with predefined levels of sequence and structure redundancy.
Affiliation(s)
- Ian Kotthoff
- Computational Biology Program, The University of Kansas, Kansas, USA
- Amar Singh
- Computational Biology Program, The University of Kansas, Kansas, USA
- Ilya A. Vakser
- Computational Biology Program, The University of Kansas, Kansas, USA
- Department of Molecular Biosciences, The University of Kansas, Kansas, USA
2
ScanNet: an interpretable geometric deep learning model for structure-based protein binding site prediction. Nat Methods 2022; 19:730-739. [DOI: 10.1038/s41592-022-01490-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 09/05/2021] [Accepted: 04/12/2022] [Indexed: 11/08/2022]
3
DOCKGROUND membrane protein-protein set. PLoS One 2022; 17:e0267531. [PMID: 35580077 PMCID: PMC9113569 DOI: 10.1371/journal.pone.0267531] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/23/2021] [Accepted: 04/10/2022] [Indexed: 11/19/2022]
Abstract
Membrane proteins are significantly underrepresented in the Protein Data Bank despite their essential role in cellular mechanisms and the major progress in experimental protein structure determination. Thus, computational approaches are especially valuable in the case of membrane proteins and their assemblies. The main focus in developing structure prediction techniques has been on soluble proteins, in part due to much greater availability of the structural data. Currently, structure prediction of protein complexes (protein docking) is a well-developed field of study. However, the generic protein docking approaches are not optimal for membrane proteins because of the differences in physicochemical environment and the spatial constraints imposed by the membranes. Thus, docking of membrane proteins requires specialized computational methods. Development and benchmarking of the membrane protein docking approaches have to be based on high-quality sets of membrane protein complexes. In this study we present a new dataset of 456 non-redundant alpha helical binary interfaces. The set is significantly larger and more representative than the previously developed sets. In the future, it will become the basis for the development of docking and scoring benchmarks, similar to the ones for soluble proteins in the Dockground resource (http://dockground.compbio.ku.edu).
4
Mezei M. Tools for Characterizing Proteins: Circular Variance, Mutual Proximity, Chameleon Sequences, and Subsequence Propensities. Methods Mol Biol 2022; 2405:39-61. [PMID: 35298807 DOI: 10.1007/978-1-0716-1855-4_2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
For the characterization of various aspects of protein structures, four useful concepts are discussed: chameleon sequences, circular variance, mutual proximity, and a subsequence-based foldability score. These concepts were used in estimating foldability of globular, intrinsically disordered and fold-switching proteins, properties of protein-protein interfaces, quantifying sphericity, helping to improve protein-protein docking scores, and estimating the effect of mutations on stability. A conjecture about the Achilles' heel of proteins is presented as well.
Affiliation(s)
- Mihaly Mezei
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, NY, USA.
5
Yasuo N, Ishida T, Sekijima M. Computer aided drug discovery review for infectious diseases with case study of anti-Chagas project. Parasitol Int 2021; 83:102366. [PMID: 33915269 DOI: 10.1016/j.parint.2021.102366] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 12/24/2020] [Revised: 03/23/2021] [Accepted: 04/07/2021] [Indexed: 01/09/2023]
Abstract
Neglected tropical diseases (NTDs) are parasitic and bacterial infections that are widespread, especially in the tropics, and cause health problems for about one billion people in 149 countries worldwide. The available therapeutic agents are dated: nifurtimox and benznidazole, for example, were developed in the 1960s to treat Chagas disease, and new drugs are desirable because of their side effects. Drug discovery takes 12 to 14 years and costs $2.6 billion; hence, computer aided drug discovery (CADD) technology is expected to reduce the time and cost. This paper describes our methods and results based on CADD, mainly for NTDs. An overview of databases, molecular simulation and pharmacophore modeling, contest-based drug discovery, and machine learning, together with their results, is presented herein.
Affiliation(s)
- Nobuaki Yasuo
- Academy for Convergence of Materials and Informatics (TAC-MI), Tokyo Institute of Technology, S6-23, 2-12-1, Ookayama, Meguro-ku, Tokyo, Japan.
- Takashi Ishida
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, W8-85, 2-12-1, Ookayama, Meguro-ku, Tokyo, Japan.
- Masakazu Sekijima
- Department of Computer Science, School of Computing, Tokyo Institute of Technology, 4259-J3-23, Nagatsuta-cho, Midori-ku, Yokohama, 226-8501, Japan.
6
Kundrotas PJ, Kotthoff I, Choi SW, Copeland MM, Vakser IA. Dockground Tool for Development and Benchmarking of Protein Docking Procedures. Methods Mol Biol 2020; 2165:289-300. [PMID: 32621232 DOI: 10.1007/978-1-0716-0708-4_17] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Indexed: 06/11/2023]
Abstract
Databases of protein-protein complexes are essential for the development of protein modeling/docking techniques. Such databases provide a knowledge base for docking algorithms, intermolecular potentials, search procedures, scoring functions, and refinement protocols. Development of docking techniques requires systematic validation of the modeling protocols on carefully curated benchmark sets of complexes. We present a description and a guide to the DOCKGROUND resource ( http://dockground.compbio.ku.edu ) for structural modeling of protein interactions. The resource integrates various datasets of protein complexes and other data for the development and testing of protein docking techniques. The sets include bound complexes, experimentally determined unbound, simulated unbound, model-model complexes, and docking decoys. The datasets are available to the user community through a Web interface.
Affiliation(s)
- Petras J Kundrotas
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
- Ian Kotthoff
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
- Sherman W Choi
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
- Matthew M Copeland
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
- Ilya A Vakser
- Computational Biology Program and Department of Molecular Biosciences, The University of Kansas, Lawrence, KS, USA.
7
McFarlane JMB, Krause KD, Paci I. Accelerated Structural Prediction of Flexible Protein–Ligand Complexes: The SLICE Method. J Chem Inf Model 2019; 59:5263-5275. [DOI: 10.1021/acs.jcim.9b00688] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 12/24/2022]
Affiliation(s)
- James M. B. McFarlane
- Department of Chemistry, University of Victoria, Victoria, British Columbia V8W 3V6, Canada
- Katherine D. Krause
- Department of Chemistry, University of Victoria, Victoria, British Columbia V8W 3V6, Canada
- Irina Paci
- Department of Chemistry, University of Victoria, Victoria, British Columbia V8W 3V6, Canada
8
Aker M, Ohanona S, Fisher S, Katsman E, Dvorkin S, Kopelowitz E, Goldstein M, Barnett-Itzhaki Z, Amitay M. CDB—a database for protein heterodimeric complexes. Protein Eng Des Sel 2018; 31:361-365. [DOI: 10.1093/protein/gzy030] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2018] [Accepted: 10/29/2018] [Indexed: 11/13/2022]
Affiliation(s)
- Malka Aker
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Shirly Ohanona
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Shira Fisher
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Efrat Katsman
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Shirit Dvorkin
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Efrat Kopelowitz
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Moshe Goldstein
- Department of Computer Science, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Zohar Barnett-Itzhaki
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
- Public Health Services, Ministry of Health, 39 Yirmiyahu Street, Jerusalem, Israel
- Moshe Amitay
- Department of Bioinformatics, Jerusalem College of Technology, 21 Havaad Haleumi Street, Jerusalem, Israel
9
Kundrotas PJ, Anishchenko I, Badal VD, Das M, Dauzhenka T, Vakser IA. Modeling CAPRI targets 110-120 by template-based and free docking using contact potential and combined scoring function. Proteins 2018; 86 Suppl 1:302-310. [PMID: 28905425 PMCID: PMC5820180 DOI: 10.1002/prot.25380] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.4] [Received: 06/27/2017] [Revised: 08/25/2017] [Accepted: 09/10/2017] [Indexed: 01/12/2023]
Abstract
The paper presents an analysis of our template-based and free docking predictions in the joint CASP12/CAPRI37 round. A new scoring function for template-based docking was developed, benchmarked on the Dockground resource, and applied to the targets. The results showed that the function successfully discriminates the incorrect docking predictions. In correctly predicted targets, the scoring function was complemented by other considerations, such as consistency of the oligomeric states among templates, similarity of the biological functions, and biological interface relevance. The scoring function still does not reliably distinguish biological from crystal packing interfaces, and needs further development for the docking of bundles of α-helices. In the case of the trimeric targets, sequence-based methods did not find common templates, despite similarity of the structures, suggesting complementary use of structure- and sequence-based alignments in comparative docking. The results showed that if a good docking template is found, an accurate model of the interface can be built even from largely inaccurate models of individual subunits. Free docking, by contrast, is very sensitive to the quality of the individual models. Nevertheless, our newly developed contact potential detected approximate locations of the binding sites.
Affiliation(s)
- Petras J. Kundrotas
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Varsha D. Badal
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Madhurima Das
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Taras Dauzhenka
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
- Ilya A. Vakser
- Center for Computational Biology and Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas 66045, USA
10
Kundrotas PJ, Anishchenko I, Dauzhenka T, Kotthoff I, Mnevets D, Copeland MM, Vakser IA. Dockground: A comprehensive data resource for modeling of protein complexes. Protein Sci 2017; 27:172-181. [PMID: 28891124 DOI: 10.1002/pro.3295] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 06/30/2017] [Revised: 09/06/2017] [Accepted: 09/07/2017] [Indexed: 12/28/2022]
Abstract
Characterization of life processes at the molecular level requires structural details of protein interactions. The number of experimentally determined structures of protein-protein complexes accounts only for a fraction of known protein interactions. This gap in structural description of the interactome has to be bridged by modeling. An essential part of the development of structural modeling/docking techniques for protein interactions is databases of protein-protein complexes. They are necessary for studying protein interfaces, providing a knowledge base for docking algorithms, and developing intermolecular potentials, search procedures, and scoring functions. Development of protein-protein docking techniques requires thorough benchmarking of different parts of the docking protocols on carefully curated sets of protein-protein complexes. We present a comprehensive description of the Dockground resource (http://dockground.compbio.ku.edu) for structural modeling of protein interactions, including previously unpublished unbound docking benchmark set 4, and the X-ray docking decoy set 2. The resource offers a variety of interconnected datasets of protein-protein complexes and other data for the development and testing of different aspects of protein docking methodologies. Based on protein-protein complexes extracted from the PDB biounit files, Dockground offers sets of X-ray unbound, simulated unbound, model, and docking decoy structures. All datasets are freely available for download, as a whole or selecting specific structures, through a user-friendly interface on one integrated website.
Affiliation(s)
- Petras J Kundrotas
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ivan Anishchenko
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Taras Dauzhenka
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ian Kotthoff
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Daniil Mnevets
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Matthew M Copeland
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Ilya A Vakser
- Center for Computational Biology, The University of Kansas, Lawrence, Kansas, 66045
- Department of Molecular Biosciences, The University of Kansas, Lawrence, Kansas, 66045
11
Mezei M. Rescore protein-protein docked ensembles with an interface contact statistics. Proteins 2016; 85:235-241. [PMID: 27862307 DOI: 10.1002/prot.25209] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 05/16/2016] [Revised: 10/28/2016] [Accepted: 10/30/2016] [Indexed: 11/09/2022]
Abstract
The recently developed statistical measure for the type of residue-residue contact at protein complex interfaces, based on a parameter-free definition of contact, has been used to define a contact score that is correlated with the likelihood of correctness of a proposed complex structure. When the contact scores of the native structure and of a set of model structures were compared, the proposed measure was shown to generally favor the native structure, but by itself it was not able to reliably score the native structure as the best. Adjusting the scores of redocking experiments with the contact score showed that the adjusted score was able to move the native-like structure up the ranking among the proposed complexes when the native-like structure was not ranked the best by the respective program. Tests on docking of unbound proteins compared the contact scores of the complexes with the contact score of the crystal structure, again showing the tendency of the contact score to favor native-like conformations. The possibility of using the contact score to improve the determination of biological dimers in a crystal structure was also explored. Proteins 2017; 85:235-241. © 2016 Wiley Periodicals, Inc.
Affiliation(s)
- Mihaly Mezei
- Department of Pharmacological Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, 10029