1
Akapo OO, Macnar JM, Kryś JD, Syed PR, Syed K, Gront D. In Silico Structural Modeling and Analysis of Interactions of Tremellomycetes Cytochrome P450 Monooxygenases CYP51s with Substrates and Azoles. Int J Mol Sci 2021; 22:7811. [PMID: 34360577 PMCID: PMC8346148 DOI: 10.3390/ijms22157811] [Received: 02/25/2021] [Revised: 05/21/2021] [Accepted: 05/25/2021] [Indexed: 11/16/2022]
Abstract
Cytochrome P450 monooxygenase CYP51 (sterol 14α-demethylase) is a well-known target of the azole drug fluconazole for treating cryptococcosis, a life-threatening fungal infection in immune-compromised patients in poor countries. Studies indicate that mutations in CYP51 confer fluconazole resistance on cryptococcal species. Despite the importance of CYP51 in these species, few studies on the structural analysis of CYP51 and its interactions with different azole drugs have been reported. We therefore performed in silico structural analysis of 11 CYP51s from cryptococcal species and other Tremellomycetes. Interactions of the 11 CYP51s with nine ligands (three substrates and six azoles) were modeled by Rosetta docking, using 10,000 combinations for each CYP51-ligand complex (11 CYP51s × 9 ligands = 99 complexes), and hierarchical agglomerative clustering was used to select the complexes. A web application for visualization of CYP51s' interactions with ligands was developed (http://bioshell.pl/azoledocking/). The study results indicated that Tremellomycetes CYP51s have a high preference for itraconazole, corroborating the in vitro effectiveness of itraconazole compared to fluconazole. Amino acids interacting with different ligands were found to be conserved across CYP51s, indicating that the procedure employed in this study is accurate and can be automated for studying P450-ligand interactions to cater for the growing number of P450s.
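The selection step named in this abstract, hierarchical agglomerative clustering of docked poses, can be illustrated with a short sketch. This is not the authors' Rosetta/BioShell pipeline: the function name, the average-linkage choice, the cutoff of 2.0, and the toy one-dimensional "poses" are all assumptions made for illustration. In practice the distance would be something like a ligand RMSD between poses.

```python
from itertools import combinations


def agglomerative_clusters(items, distance, cutoff):
    """Average-linkage agglomerative clustering (hypothetical helper).

    Starts with one cluster per item and repeatedly merges the two
    closest clusters until the smallest inter-cluster distance
    exceeds `cutoff`. Returns a list of clusters (lists of indices).
    """
    n = len(items)
    # Pairwise distances between items, computed once.
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = distance(items[i], items[j])

    clusters = [[i] for i in range(n)]

    def linkage(a, b):
        # Average distance over all cross-cluster pairs.
        return sum(d[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > 1:
        # Find the closest pair of clusters.
        (x, y), best = min(
            (((p, q), linkage(clusters[p], clusters[q]))
             for p, q in combinations(range(len(clusters)), 2)),
            key=lambda t: t[1],
        )
        if best > cutoff:
            break
        merged = clusters[x] + clusters[y]
        clusters = [c for k, c in enumerate(clusters) if k not in (x, y)]
        clusters.append(merged)
    return clusters


# Toy stand-in for docked poses: 1-D coordinates instead of structures.
poses = [0.0, 0.4, 0.9, 5.0, 5.3, 12.0]
clusters = agglomerative_clusters(poses, lambda a, b: abs(a - b), cutoff=2.0)
largest = max(clusters, key=len)
print(sorted(sorted(c) for c in clusters))
print(sorted(largest))
```

A common convention, assumed here rather than taken from the paper, is to treat the most populated cluster as the representative binding mode.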
Affiliation(s)
- Olufunmilayo Olukemi Akapo
  - Department of Biochemistry and Microbiology, Faculty of Science and Agriculture, University of Zululand, KwaDlangezwa 3886, South Africa
- Joanna M. Macnar
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Stefana Banacha 2C, 02-097 Warsaw, Poland
  - Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Justyna D. Kryś
  - Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Puleng Rosinah Syed
  - Department of Pharmaceutical Chemistry, College of Health Sciences, University of KwaZulu-Natal, Durban 4000, South Africa
- Khajamohiddin Syed
  - Department of Biochemistry and Microbiology, Faculty of Science and Agriculture, University of Zululand, KwaDlangezwa 3886, South Africa
- Dominik Gront
  - Biological and Chemical Research Center, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
2
Macnar JM, Szulc NA, Kryś JD, Badaczewska-Dawid AE, Gront D. BioShell 3.0: Library for Processing Structural Biology Data. Biomolecules 2020; 10:461. [PMID: 32188163 PMCID: PMC7175226 DOI: 10.3390/biom10030461] [Received: 01/12/2020] [Revised: 03/05/2020] [Accepted: 03/10/2020] [Indexed: 01/11/2023]
Abstract
BioShell is an open-source package for processing biological data, particularly focused on structural applications. The package provides parsers, data structures and algorithms for handling and analyzing macromolecular sequences, structures and sequence profiles. The most frequently used routines are accessible through a set of easy-to-use command line utilities for a Linux environment. Using the full functionality of the package requires knowledge of C++ or Python to assemble an application from this software library. Since the last publication, which announced version 2.0, the package has been greatly expanded and rewritten in the C++11 standard to improve its modularity and efficiency. A new testing platform has been implemented to continuously test the correctness and integrity of the package. More than two hundred test programs have been published to provide simple examples that can be used as templates. This makes BioShell an easy-to-use library that greatly speeds up development of bioinformatics applications and web services without compromising computational efficiency.
Affiliation(s)
- Joanna M. Macnar
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
  - College of Inter-Faculty Individual Studies in Mathematics and Natural Sciences, University of Warsaw, Stefana Banacha 2C, 02-097 Warsaw, Poland
- Natalia A. Szulc
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
  - Laboratory of Protein Metabolism, International Institute of Molecular and Cell Biology in Warsaw, 4 Ks. Trojdena Street, 02-109 Warsaw, Poland
- Justyna D. Kryś
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Aleksandra E. Badaczewska-Dawid
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
  - Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
3
Chowdhury S, Sen S, Banerjee A, Uversky VN, Maulik U, Chattopadhyay K. Network mapping of the conformational heterogeneity of SOD1 by deploying statistical cluster analysis of FTIR spectra. Cell Mol Life Sci 2019; 76:4145-4154. [PMID: 31011770 PMCID: PMC11105373 DOI: 10.1007/s00018-019-03108-2] [Received: 10/23/2018] [Revised: 04/12/2019] [Accepted: 04/15/2019] [Indexed: 02/02/2023]
Abstract
A crucial contribution to the heterogeneity of the conformational landscape of a protein comes from the way an intermediate relates to another intermediate state in its journey from the unfolded to folded or misfolded form. Unfortunately, it is extremely hard to decode this relatedness in a quantifiable manner. Here, we developed an application of statistical cluster analyses to explore the conformational heterogeneity of a metalloenzyme, human cytosolic copper-zinc superoxide dismutase (SOD1), using the inputs from infrared spectroscopy. This study provides a quantifiable picture of how conformational information at one particular site (for example, the copper-binding pocket) is related to the information at the second site (for example, the zinc-binding pocket), and how this relatedness is transferred to the global conformational information of the protein. The distance outputs were used to quantitatively generate a network capturing the folding sub-stages of SOD1.
Affiliation(s)
- Sourav Chowdhury
  - Protein Folding and Dynamics Laboratory, Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, 700032, India
  - Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, 02138, USA
- Sagnik Sen
  - Department of Computer Science, Jadavpur University, Kolkata, 700 032, India
- Amrita Banerjee
  - Protein Folding and Dynamics Laboratory, Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, 700032, India
  - Department of Chemistry, Hiralal Mazumdar Memorial College for Women, Dakshineswar, Kolkata, 700035, India
- Vladimir N Uversky
  - Department of Molecular Medicine and USF Health Byrd Alzheimer's Research Institute, Morsani College of Medicine, University of South Florida, 12901 Bruce B. Downs Blvd. MDC07, Tampa, FL, USA
  - Laboratory of New Methods in Biology, Institute for Biological Instrumentation, Russian Academy of Sciences, Pushchino, 142290, Moscow Region, Russia
- Ujjwal Maulik
  - Chemistry and Chemical Biology, Harvard University, 12 Oxford Street, Cambridge, MA, 02138, USA
- Krishnananda Chattopadhyay
  - Protein Folding and Dynamics Laboratory, Structural Biology and Bioinformatics Division, CSIR-Indian Institute of Chemical Biology, Kolkata, 700032, India
4
Kmiecik S, Kolinski A. One-Dimensional Structural Properties of Proteins in the Coarse-Grained CABS Model. Methods Mol Biol 2017; 1484:83-113. [PMID: 27787822 DOI: 10.1007/978-1-4939-6406-2_8] [Indexed: 12/14/2022]
Abstract
Despite the significant increase in computational power, molecular modeling of protein structure using classical all-atom approaches remains inefficient, at least for most of the protein targets in the focus of biomedical research. Perhaps the most successful strategy to overcome the inefficiency problem is multiscale modeling to merge all-atom and coarse-grained models. This chapter describes a well-established CABS coarse-grained protein model. The CABS (C-Alpha, C-Beta, and Side chains) model assumes a 2-4 united-atom representation of amino acids, knowledge-based force field (derived from the statistical regularities seen in known protein sequences and structures) and efficient Monte Carlo sampling schemes (MC dynamics, MC replica-exchange, and combinations). A particular emphasis is given to the unique design of the CABS force-field, which is largely defined using one-dimensional structural properties of proteins, including protein secondary structure. This chapter also presents CABS-based modeling methods, including multiscale tools for de novo structure prediction, modeling of protein dynamics and prediction of protein-peptide complexes. CABS-based tools are freely available at http://biocomp.chem.uw.edu.pl/tools.
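The Monte Carlo sampling schemes mentioned in this abstract (MC dynamics and replica exchange) rest on two standard acceptance rules, sketched below. These are the textbook Metropolis and replica-swap criteria in reduced units (k_B = 1), not code from CABS. The function names and the example energies and temperatures are hypothetical.

```python
import math
import random


def metropolis_accept(delta_e, temperature, rng=random):
    """Metropolis rule: accept a move with probability min(1, exp(-dE/T))."""
    if delta_e <= 0.0:
        return True  # downhill moves are always accepted
    return rng.random() < math.exp(-delta_e / temperature)


def swap_probability(e_i, t_i, e_j, t_j):
    """Probability of exchanging configurations between two replicas.

    Standard replica-exchange criterion in reduced units (k_B = 1):
        p = min(1, exp((1/T_i - 1/T_j) * (E_i - E_j)))
    """
    exponent = (1.0 / t_i - 1.0 / t_j) * (e_i - e_j)
    return 1.0 if exponent >= 0.0 else math.exp(exponent)


# A cold replica (T = 1) and a hot replica (T = 2); energies are made up.
always = swap_probability(-5.0, 1.0, -10.0, 2.0)     # hot replica found a lower energy
sometimes = swap_probability(-10.0, 1.0, -5.0, 2.0)  # cold replica already lower
print(always, sometimes)
```

The swap is certain when the hotter replica holds the lower-energy configuration, which is how low-energy structures migrate toward the coldest replica.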
Affiliation(s)
- Sebastian Kmiecik
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, Warszawa, 02-093, Poland
- Andrzej Kolinski
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, Warszawa, 02-093, Poland
5
Zhou J, Wishart DS. An improved method to detect correct protein folds using partial clustering. BMC Bioinformatics 2013; 14:11. [PMID: 23323835 PMCID: PMC3626854 DOI: 10.1186/1471-2105-14-11] [Received: 08/06/2012] [Accepted: 12/13/2012] [Indexed: 11/23/2022]
Abstract
Background: Structure-based clustering is commonly used to identify correct protein folds among candidate folds (also called decoys) generated by protein structure prediction programs. However, traditional clustering methods exhibit a poor runtime performance on large decoy sets. We hypothesized that a more efficient “partial” clustering approach in combination with an improved scoring scheme could significantly improve both the speed and performance of existing candidate selection methods. Results: We propose a new scheme that performs rapid but incomplete clustering on protein decoys. Our method detects structurally similar decoys (measured using either Cα RMSD or GDT-TS score) and extracts representatives from them without assigning every decoy to a cluster. We integrated our new clustering strategy with several different scoring functions to assess both the performance and speed in identifying correct or near-correct folds. Experimental results on 35 Rosetta decoy sets and 40 I-TASSER decoy sets show that our method can improve the correct fold detection rate as assessed by two different quality criteria. This improvement is significantly better than two recently published clustering methods, Durandal and Calibur-lite. Speed and efficiency testing shows that our method can handle much larger decoy sets and is up to 22 times faster than Durandal and Calibur-lite. Conclusions: The new method, named HS-Forest, avoids the computationally expensive task of clustering every decoy, yet still allows superior correct-fold selection. Its improved speed, efficiency and decoy-selection performance should enable structure prediction researchers to work with larger decoy sets and significantly improve their ab initio structure prediction performance.
Affiliation(s)
- Jianjun Zhou
  - JHK Co., Ltd., 2049 Heping Road, Shenzhen, Guangdong 518010, China
6
Gniewek P, Kolinski A, Jernigan RL, Kloczkowski A. Elastic network normal modes provide a basis for protein structure refinement. J Chem Phys 2012; 136:195101. [PMID: 22612113 DOI: 10.1063/1.4710986] [Indexed: 01/13/2023]
Abstract
It is well recognized that thermal motions of atoms in the protein native state, the fluctuations about the minimum of the global free energy, are well reproduced by simple elastic network models (ENMs) such as the anisotropic network model (ANM). Elastic network models represent protein dynamics as vibrations of a network of nodes (usually represented by positions of the heavy atoms, or by the Cα atoms only for coarse-grained representations) in which spatially close nodes are connected by harmonic springs. These models provide a reliable representation of the fluctuational dynamics of proteins and RNA, and explain various conformational changes in protein structures, including those important for ligand binding. In the present paper, we study the problem of protein structure refinement by analyzing thermal motions of proteins in non-native states. We represent the conformational space close to the native state by a set of decoys generated by the I-TASSER protein structure prediction server utilizing template-free modeling. The protein substates are selected by hierarchical structure clustering. The main finding is that thermal motions for some substates overlap significantly with the deformations necessary to reach the native state. Additionally, more mobile residues yield higher overlaps with the required deformations than do the less mobile ones. These findings suggest that structural refinement of poorly resolved protein models can be significantly enhanced by reduction of the conformational space to the motions imposed by the dominant normal modes.
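For reference, the harmonic network described in this abstract and the "overlap" between a normal mode and a required deformation are usually written as follows. These are the standard textbook forms, given as an assumption about notation rather than as the paper's exact definitions: γ is a uniform spring constant, r_c a distance cutoff, R_ij and R_ij^0 the instantaneous and reference distances between nodes i and j, v_k the k-th normal mode, and Δr the displacement from the decoy to the native structure.

```latex
V_{\mathrm{ANM}} = \frac{\gamma}{2} \sum_{i<j,\; R_{ij}^{0} \le r_c} \left( R_{ij} - R_{ij}^{0} \right)^{2},
\qquad
O_k = \frac{\left| \mathbf{v}_k \cdot \Delta\mathbf{r} \right|}{\lVert \mathbf{v}_k \rVert \, \lVert \Delta\mathbf{r} \rVert}
```

With this definition O_k ranges from 0 (the mode is orthogonal to the required deformation) to 1 (the mode points exactly along it).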
Affiliation(s)
- Pawel Gniewek
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
7
Kmiecik S, Gront D, Kouza M, Kolinski A. From coarse-grained to atomic-level characterization of protein dynamics: transition state for the folding of B domain of protein A. J Phys Chem B 2012; 116:7026-32. [PMID: 22486297 DOI: 10.1021/jp301720w] [Indexed: 11/28/2022]
Abstract
Atomic-level molecular dynamics simulations are widely used for the characterization of the structural dynamics of proteins; however, they are limited to shorter time scales than the duration of most of the relevant biological processes. Properly designed coarse-grained models that trade atomic resolution for efficient sampling allow access to much longer time-scales. In-depth understanding of the structural dynamics, however, must involve atomic details. In this study, we tested a method for the rapid reconstruction of all-atom models from α carbon atom positions in the application to convert a coarse-grained folding trajectory of a well described model system: the B domain of protein A. The results show that the method and the spatial resolution of the resulting coarse-grained models enable computationally inexpensive reconstruction of realistic all-atom models. Additionally, by means of structural clustering, we determined the most persistent ensembles of the key folding step, the transition state. Importantly, the analysis of the overall structural topologies suggests a dominant folding pathway. This, together with the all-atom characterization of the obtained ensembles, in the form of contact maps, matches the experimental results well.
Affiliation(s)
- Sebastian Kmiecik
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
8
Gront D, Kmiecik S, Blaszczyk M, Ekonomiuk D, Koliński A. Optimization of protein models. Wiley Interdisciplinary Reviews: Computational Molecular Science 2012. [DOI: 10.1002/wcms.1090] [Indexed: 12/25/2022]
Affiliation(s)
- Dominik Gront
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Sebastian Kmiecik
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Maciej Blaszczyk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Dariusz Ekonomiuk
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
- Andrzej Koliński
  - Laboratory of Theory of Biopolymers, Faculty of Chemistry, University of Warsaw, Warsaw, Poland
9
Berenger F, Shrestha R, Zhou Y, Simoncini D, Zhang KYJ. Durandal: fast exact clustering of protein decoys. J Comput Chem 2011; 33:471-4. [PMID: 22120171 DOI: 10.1002/jcc.21988] [Received: 06/20/2011] [Revised: 09/16/2011] [Accepted: 10/11/2011] [Indexed: 11/11/2022]
Abstract
In protein folding, clustering is commonly used as one way to identify the best decoy produced. Initializing the pairwise distance matrix for a large decoy set is computationally expensive. We have proposed a fast method that works even on large decoy sets. This method is implemented in software called Durandal. Durandal has been shown to be consistently faster than other software performing fast exact clustering. In some cases, Durandal can even outperform the speed of an approximate method. Durandal uses the triangle inequality to accelerate exact clustering, without compromising the distance function. Recently, we have further enhanced the performance of Durandal by incorporating a quaternion-based characteristic polynomial method that has increased the speed of Durandal by between 13% and 27% compared with the previous version. Durandal source code is available under the GNU General Public License at http://www.riken.jp/zhangiru/software/durandal_released_qcp.tgz. Alternatively, a compiled version of Durandal is also distributed with the nightly builds of the Phenix (http://www.phenix-online.org/) crystallographic software suite (Adams et al., Acta Crystallogr Sect D 2010, 66, 213).
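The idea of using the triangle inequality to avoid distance computations, while still giving exact results, can be sketched as follows. This is a simplified single-pivot illustration and not Durandal's actual algorithm. The function name, the cutoff, and the one-dimensional toy data are assumptions; a real application would use an RMSD between decoys as the metric.

```python
def count_neighbors(points, distance, cutoff, pivot_index=0):
    """Count, for each point, how many others lie within `cutoff`.

    Distances to one pivot are computed first. For any pair (a, b) and
    pivot p, the triangle inequality gives
        |d(a,p) - d(b,p)| <= d(a,b) <= d(a,p) + d(b,p),
    so many pairs can be decided without calling `distance` again.
    Returns (neighbor_counts, number_of_pair_distances_computed).
    """
    n = len(points)
    to_pivot = [distance(points[i], points[pivot_index]) for i in range(n)]
    counts = [0] * n
    computed = 0
    for a in range(n):
        for b in range(a + 1, n):
            lower = abs(to_pivot[a] - to_pivot[b])
            upper = to_pivot[a] + to_pivot[b]
            if lower > cutoff:
                continue              # provably too far apart
            if upper <= cutoff:
                within = True         # provably close enough
            else:
                computed += 1         # bounds are inconclusive
                within = distance(points[a], points[b]) <= cutoff
            if within:
                counts[a] += 1
                counts[b] += 1
    return counts, computed


# Toy metric space: points on a line with absolute difference as distance.
data = [0.0, 0.5, 1.0, 10.0, 10.4]
counts, computed = count_neighbors(data, lambda a, b: abs(a - b), cutoff=1.0)
print(counts, computed)
```

In this toy case only 2 of the 10 pairwise distances need to be evaluated explicitly, yet the neighbor counts are identical to those from a full distance matrix.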
Affiliation(s)
- Francois Berenger
  - Zhang Initiative Research Unit, Advanced Science Institute, RIKEN, 2-1 Hirosawa, Wako, Saitama 351-0198, Japan
10
Berenger F, Zhou Y, Shrestha R, Zhang KYJ. Entropy-accelerated exact clustering of protein decoys. Bioinformatics 2011; 27:939-45. [PMID: 21310747 DOI: 10.1093/bioinformatics/btr072] [Indexed: 11/14/2022]
Abstract
Motivation: Clustering is commonly used to identify the best decoy among many generated in protein structure prediction when using energy alone is insufficient. Calculation of the pairwise distance matrix for a large decoy set is computationally expensive. Typically, only a reduced set of decoys using energy filtering is subjected to clustering analysis. A fast clustering method for a large decoy set would be beneficial to protein structure prediction and this still poses a challenge. Results: We propose a method using propagation of geometric constraints to accelerate exact clustering, without compromising the distance measure. Our method can be used with any metric distance. Metrics that are expensive to compute and have known cheap lower and upper bounds will benefit most from the method. We compared our method's accuracy against published results from the SPICKER clustering software on 40 large decoy sets from the I-TASSER protein folding engine. We also performed some additional speed comparisons on six targets from the 'semfold' decoy set. In our tests, our method chose a better decoy than the energy criterion in 25 out of 40 cases versus 20 for SPICKER. Our method also was shown to be consistently faster than another fast software performing exact clustering named Calibur. In some cases, our approach can even outperform the speed of an approximate method. Availability: Our C++ software is released under the GNU General Public License. It can be downloaded from http://www.riken.jp/zhangiru/software/durandal_released.tgz.
Affiliation(s)
- Francois Berenger
  - Zhang Initiative Research Unit, Advanced Science Institute, RIKEN, Wako, Saitama, Japan
11
Kurcinski M, Kolinski A. Theoretical study of molecular mechanism of binding TRAP220 coactivator to Retinoid X Receptor alpha, activated by 9-cis retinoic acid. J Steroid Biochem Mol Biol 2010; 121:124-9. [PMID: 20398753 PMCID: PMC2906686 DOI: 10.1016/j.jsbmb.2010.03.086] [Received: 11/09/2009] [Accepted: 03/26/2010] [Indexed: 01/22/2023]
Abstract
A study of the molecular mechanism of conformational reorientation of the RXR-alpha ligand binding domain is presented. We employed CABS, a reduced model of protein dynamics, to model the folding pathways involved in binding 9-cis retinoic acid to the apo-RXR molecule and the TRAP220 peptide fragment to the holo form. Based on the results obtained, we also propose a sequential model of RXR activation by 9-cis retinoic acid and the TRAP220 coactivator. The methodology presented here may be used to investigate binding pathways of other NR/hormone/cofactor sets.
Affiliation(s)
- Mateusz Kurcinski
  - Department of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
12
Latek D, Kolinski A. Contact prediction in protein modeling: scoring, folding and refinement of coarse-grained models. BMC Struct Biol 2008; 8:36. [PMID: 18694501 PMCID: PMC2527566 DOI: 10.1186/1472-6807-8-36] [Received: 02/16/2008] [Accepted: 08/11/2008] [Indexed: 11/10/2022]
Abstract
Background: Several different methods for contact prediction succeeded within the Sixth Critical Assessment of Techniques for Protein Structure Prediction (CASP6). The most relevant were non-local contact predictions for targets from the most difficult categories: fold recognition-analogy and new fold. Such contacts could provide valuable structural information in case a template structure cannot be found in the PDB. Results: We described comprehensive tests of the effectiveness of contact data in various aspects of de novo modeling with CABS, an algorithm which was used successfully in CASP6 by the Kolinski-Bujnicki group. We used the predicted contacts in a simple scoring function for the post-simulation ranking of protein models and as a soft bias in the folding simulations and in the fold-refinement procedure. The latter approach turned out to be the most successful. The CABS force field used in the Replica Exchange Monte Carlo simulations cooperated with the true contacts and discriminated the false ones, which resulted in an improvement of the majority of Kolinski-Bujnicki's protein models. In the modeling we tested different sets of predicted contact data submitted to the CASP6 server. According to our results, the best performing were the contacts with the accuracy balanced with the coverage, obtained either from the best two predictors only or by a consensus from as many predictors as possible. Conclusion: Our tests have shown that theoretically predicted contacts can be very beneficial for protein structure prediction. Depending on the protein modeling method, a contact data set applied should be prepared with differently balanced coverage and accuracy of predicted contacts. Namely, high coverage of contact data is important for the model ranking and high accuracy for the folding simulations.
Affiliation(s)
- Dorota Latek
  - Faculty of Chemistry, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
13
14
Predicting the complex structure and functional motions of the outer membrane transporter and signal transducer FecA. Biophys J 2008; 94:2482-91. [PMID: 18178655 DOI: 10.1529/biophysj.107.116046] [Indexed: 11/18/2022]
Abstract
Escherichia coli requires an efficient transport and signaling system to successfully sequester iron from its environment. FecA, a TonB-dependent protein, serves a critical role in this process: first, it binds and transports iron in the form of ferric citrate, and second, it initiates a signaling cascade that results in the transcription of several iron transporter genes in interaction with inner membrane proteins. The structure of the plug and barrel domains and the periplasmic N-terminal domain (NTD) are separately available. However, the linker connecting the plug and barrel and the NTD domains is highly mobile, which may prevent the determination of the FecA structure as a whole assembly. Here, we reduce the conformation space of this linker into most probable structural models using the modeling tool CABS, then apply normal-mode analysis to investigate the motions of the whole structure of FecA by using elastic network models. We relate the FecA domain motions to the outer-inner membrane communication, which initiates transcription. We observe that the global motions of FecA assign flexibility to the TonB box and the NTD, and control the exposure of the TonB box for binding to the TonB inner membrane protein, suggesting how these motions relate to FecA function. Our simulations suggest the presence of a communication between the loops on both ends of the protein, a signaling mechanism by which a signal could be transmitted by conformational transitions in response to the binding of ferric citrate.
15
Kmiecik S, Kolinski A. Folding pathway of the b1 domain of protein G explored by multiscale modeling. Biophys J 2007; 94:726-36. [PMID: 17890394 PMCID: PMC2186257 DOI: 10.1529/biophysj.107.116095] [Indexed: 11/18/2022]
Abstract
The understanding of the folding mechanisms of single-domain proteins is an essential step in the understanding of protein folding in general. Recently, we developed a mesoscopic CA-CB side-chain protein model, which was successfully applied in protein structure prediction, studies of protein thermodynamics, and modeling of protein complexes. In this research, this model is employed in a detailed characterization of the folding process of a simple globular protein, the B1 domain of IgG-binding protein G (GB1). There is a vast body of experimental facts and theoretical findings for this protein. Performing unbiased, ab initio simulations, we demonstrated that the GB1 folding proceeds via the formation of an extended folding nucleus, followed by slow structure fine-tuning. Remarkably, a subset of native interactions drives the folding from the very beginning. The emerging comprehensive picture of GB1 folding perfectly matches and extends the previous experimental and theoretical studies.
Affiliation(s)
- Andrzej Kolinski
  - Address reprint requests to Andrzej Kolinski, Faculty of Chemistry, University of Warsaw, L. Pasteura 1, 02-093 Warsaw, Poland. Tel.: 48-022-8220211 ext. 320; Fax: 48-022 820221.
16
Latek D, Ekonomiuk D, Kolinski A. Protein structure prediction: combining de novo modeling with sparse experimental data. J Comput Chem 2007; 28:1668-76. [PMID: 17342709 DOI: 10.1002/jcc.20657] [Indexed: 11/09/2022]
Abstract
Routine structure prediction of new folds is still a challenging task for computational biology. The challenge is not only in the proper determination of the overall fold but also in building models of acceptable resolution, useful for modeling drug interactions and protein-protein complexes. In this work we propose and test a comprehensive approach to protein structure modeling supported by sparse, and relatively easy to obtain, experimental data. We focus on chemical shift-based restraints from NMR, although other sparse restraints could be easily included. In particular, we demonstrate that combining the typical NMR software with artificial intelligence-based prediction of secondary structure significantly enhances the accuracy of the restraints for molecular modeling. The computational procedure is based on the reduced representation approach implemented in the CABS modeling software, which proved to be a versatile tool for protein structure prediction during the CASP (CASP stands for critical assessment of techniques for protein structure prediction) experiments (see http://predictioncenter/CASP6/org). The method is successfully tested on a small set of representative globular proteins of different size and topology, including the two CASP6 targets, for which the required NMR data already exist. The method is implemented in a semi-automated pipeline applicable to large-scale structural annotation of genomic data. Here, we limit the computations to a relatively small set. This enabled, without a loss of generality, a detailed discussion of various factors determining the accuracy of the proposed approach to protein structure prediction.
Affiliation(s)
- Dorota Latek
  - Faculty of Chemistry, Warsaw University, Pasteura 1, 02-093 Warsaw, Poland
17
Abstract
Motivation: The number of known protein sequences is about a thousand times larger than the number of experimentally solved 3D structures. For more than half of the protein sequences a close or distant structural analog could be identified. The key starting point in classical comparative modeling is to generate the best possible sequence alignment with a template or templates. With decreasing sequence similarity, the number of errors in the alignments increases, and these errors are the main cause of the decreasing accuracy of the molecular models generated. Here we propose a new approach to comparative modeling, which does not require the implicit alignment: the model building phase explores geometric, evolutionary and physical properties of a template (or templates). Results: The proposed method requires prior identification of a template, although the initial sequence alignment is ignored. The model is built using a very efficient reduced representation search engine, CABS, to find the best possible superposition of the query protein onto the template represented as a 3D multi-featured scaffold. The criteria used include sequence similarity, predicted secondary structure consistency, local geometric features and hydrophobicity profile. For more difficult cases, the new method qualitatively outperforms existing schemes of comparative modeling. The algorithm unifies de novo modeling, 3D threading and sequence-based methods. The main idea is general and could be easily combined with other efficient modeling tools such as Rosetta, UNRES and others.
Affiliation(s)
- Andrzej Kolinski
- University of Warsaw, Faculty of Chemistry, Pasteura 1 02-093 Warsaw, Poland
|
18
|
Type II restriction endonuclease R.Eco29kI is a member of the GIY-YIG nuclease superfamily. BMC STRUCTURAL BIOLOGY 2007; 7:48. [PMID: 17626614 PMCID: PMC1952068 DOI: 10.1186/1472-6807-7-48] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/12/2007] [Accepted: 07/12/2007] [Indexed: 01/21/2023]
Abstract
Background The majority of experimentally determined crystal structures of Type II restriction endonucleases (REases) exhibit a common PD-(D/E)XK fold. Crystal structures have also been determined for single representatives of two other folds: PLD (R.BfiI) and half-pipe (R.PabI), and bioinformatics analyses supported by mutagenesis suggested that some REases belong to the HNH fold. Our previous bioinformatic analysis suggested that REase R.Eco29kI shares sequence similarities with one more unrelated nuclease superfamily, GIY-YIG; however, so far no experimental data were available to support this prediction. The determination of a crystal structure of the GIY-YIG domain of homing endonuclease I-TevI provided a template for modeling of R.Eco29kI and prompted us to validate the model experimentally. Results Using protein fold-recognition methods we generated a new alignment between R.Eco29kI and I-TevI, which suggested a reassignment of one of the putative catalytic residues. A theoretical model of R.Eco29kI was constructed to illustrate its predicted three-dimensional fold and organization of the active site, comprising amino acid residues Y49, Y76, R104, H108, E142, and N154. A series of mutants was constructed to generate amino acid substitutions of selected residues (Y49A, R104A, H108F, E142A and N154L) and the mutant proteins were examined for their ability to bind the DNA containing the Eco29kI site 5'-CCGCGG-3' and to catalyze the cleavage reaction. Experimental data reveal that residues Y49, R104, E142, H108, and N154 are important for the nuclease activity of R.Eco29kI, while H108 and N154 are also important for specific DNA binding by this enzyme. Conclusion Substitutions of residues Y49, R104, H108, E142 and N154 predicted by the model to be a part of the active site lead to mutant proteins with strong defects in the REase activity.
These results are in very good agreement with the structural model presented in this work and with our prediction that R.Eco29kI belongs to the GIY-YIG superfamily of nucleases. Our study provides the first experimental evidence for a Type IIP REase that does not belong to the PD-(D/E)XK or HNH superfamilies of nucleases, and is instead a member of the unrelated GIY-YIG superfamily.
|
19
|
Towards the high-resolution protein structure prediction. Fast refinement of reduced models with all-atom force field. BMC STRUCTURAL BIOLOGY 2007; 7:43. [PMID: 17603876 PMCID: PMC1933428 DOI: 10.1186/1472-6807-7-43] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 03/30/2007] [Accepted: 06/29/2007] [Indexed: 12/03/2022]
Abstract
Background Although experimental methods for determining protein structure are providing high resolution structures, they cannot keep pace with the rate at which amino acid sequences are resolved on the scale of entire genomes. For a considerable fraction of proteins whose structures will not be determined experimentally, computational methods can provide valuable information. The value of structural models in biological research depends critically on their quality. Development of high-accuracy computational methods that reliably generate near-experimental quality structural models is an important, unsolved problem in protein structure modeling. Results Large sets of structural decoys have been generated using the reduced conformational space protein modeling tool CABS. Subsequently, the reduced models were subjected to all-atom reconstruction. Then, the resulting detailed models were energy-minimized using a state-of-the-art all-atom force field, assuming fixed positions of the alpha carbons. It has been shown that a very short minimization leads to the proper ranking of the quality of the models (distance from the native structure), when the all-atom energy is used as the ranking criterion. Additionally, we performed a test on medium and low accuracy decoys built via classical methods of comparative modeling. The test placed our model evaluation procedure among the state-of-the-art protein model assessment methods. Conclusion These test computations show that large-scale high resolution protein structure prediction is possible, not only for small but also for large protein domains, and that it should be based on a hierarchical approach to the modeling protocol. We employed Molecular Mechanics with fixed alpha carbons to rank-order the all-atom models built on the scaffolds of the reduced models. Our tests show that a physics-based approach, usually considered computationally too demanding for large-scale applications, can be effectively used in such studies.
|
20
|
Kurcinski M, Kolinski A. Steps towards flexible docking: modeling of three-dimensional structures of the nuclear receptors bound with peptide ligands mimicking co-activators' sequences. J Steroid Biochem Mol Biol 2007; 103:357-60. [PMID: 17241780 DOI: 10.1016/j.jsbmb.2006.12.059] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Submit a Manuscript] [Subscribe] [Scholar Register] [Received: 11/30/2006] [Indexed: 11/22/2022]
Abstract
We developed a fully flexible docking method that uses a reduced lattice representation of protein molecules, adapted for modeling peptide-protein complexes. The CABS model (Carbon Alpha, Carbon Beta, Side Group) employed here incorporates three pseudo-atoms per residue (Calpha, Cbeta and the center of the side group) instead of a full-atom protein representation. The force field used by CABS was derived from statistical analysis of a non-redundant database of protein structures. Application of our method included modeling of the complexes between various nuclear receptors (NRs) and peptide co-activators, for which three-dimensional structures are known. We tried to rebuild the native state of the complexes, starting from separated components. Accuracy of the best obtained models, calculated as coordinate root-mean-square deviation (cRMSD) between the target and the modeled structures, was under 1 Å, which is competitive with experimental methods, such as crystallography or NMR. Forthcoming modeling studies should lead to a better understanding of the mechanisms of macromolecular assembly and explain co-activators' effects on receptor activity, especially on the vitamin D receptor and other nuclear receptors.
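The cRMSD accuracy measure quoted in this abstract can be illustrated with a minimal sketch. It assumes the two coordinate sets are already optimally superimposed and listed in the same atom order; the superposition step itself is not shown, and the function name `crmsd` is hypothetical rather than taken from the CABS software.

```python
import math


def crmsd(coords_a, coords_b):
    """Coordinate RMSD between two equally long lists of (x, y, z) tuples.

    Assumption: both structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared Euclidean distance between corresponding atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

For a reduced model, the input would typically be the alpha-carbon positions of the target and of the modeled structure, in Å.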
Affiliation(s)
- Mateusz Kurcinski
- Faculty of Chemistry, Warsaw University, Pasteura 1, 02-093 Warsaw, Poland
|
21
|
Kurcinski M, Kolinski A. Hierarchical modeling of protein interactions. J Mol Model 2007; 13:691-8. [PMID: 17297609 DOI: 10.1007/s00894-007-0177-8] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/15/2006] [Accepted: 01/18/2007] [Indexed: 11/27/2022]
Abstract
A novel approach to hierarchical peptide-protein and protein-protein docking is described and evaluated. The modeling procedure starts from a reduced space representation of proteins and peptides. Polypeptide chains are represented by strings of alpha-carbon beads restricted to a fine-mesh cubic lattice. Side chains are represented by up to two centers of interactions, corresponding to beta-carbons and the centers of mass of the remaining portions of the side groups, respectively. Additional pseudoatoms are located in the centers of the virtual bonds connecting consecutive alpha carbons. These pseudoatoms support a model of main-chain hydrogen bonds. Docking starts from a collection of random configurations of modeled molecules. Interacting molecules are flexible; however, higher accuracy models are obtained when the conformational freedom of one (the larger one) of the assembling molecules is limited by a set of weak distance restraints extracted from the experimental (or theoretically predicted) structures. Sampling is done by means of the Replica Exchange Monte Carlo method. Afterwards, the set of obtained structures is subjected to hierarchical clustering. Then, the centroids of the resulting clusters are used as scaffolds for the reconstruction of the atomic details. Finally, the all-atom models are energy minimized and scored using classical tools of molecular mechanics. The method is tested on a set of macromolecular assemblies consisting of proteins and peptides. It is demonstrated that the proposed approach to flexible docking could be successfully applied to prediction of protein-peptide and protein-protein interactions. The obtained models are almost always qualitatively correct, although usually of relatively low (or moderate) resolution. In spite of this limitation, the proposed method opens new possibilities for computational studies of macromolecular recognition and mechanisms of assembly of macromolecular complexes.
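The abstract does not spell out how replicas are exchanged, so the following is only a sketch of the textbook Metropolis-style swap criterion for Replica Exchange Monte Carlo, in reduced units with the Boltzmann constant set to 1. The function names and the neighbor-only swap schedule are assumptions made for illustration, not details of the CABS implementation.

```python
import math
import random


def swap_probability(energy_i, temp_i, energy_j, temp_j):
    """Acceptance probability for exchanging two replicas (k_B = 1 assumed)."""
    beta_i, beta_j = 1.0 / temp_i, 1.0 / temp_j
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0 else math.exp(delta)


def attempt_swaps(energies, temperatures, rng=random):
    """Attempt one sweep of swaps between neighboring temperature levels.

    energies[c] is the energy of conformation c; initially conformation k
    sits at temperatures[k]. Returns a list whose k-th entry is the index
    of the conformation assigned to temperatures[k] after the sweep.
    """
    order = list(range(len(energies)))
    for k in range(len(order) - 1):
        i, j = order[k], order[k + 1]
        p = swap_probability(energies[i], temperatures[k],
                             energies[j], temperatures[k + 1])
        if rng.random() < p:
            order[k], order[k + 1] = j, i
    return order
```

With this criterion a swap is always accepted when the colder replica holds the higher-energy conformation, which is what lets low-energy structures migrate toward the lowest temperature.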
Affiliation(s)
- Mateusz Kurcinski
- Faculty of Chemistry, Warsaw University, ul. Pasteura 1, 02-093, Warsaw, Poland
|
22
|
Stumpff-Kane AW, Feig M. A correlation-based method for the enhancement of scoring functions on funnel-shaped energy landscapes. Proteins 2006; 63:155-64. [PMID: 16397892 DOI: 10.1002/prot.20853] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/06/2022]
Abstract
A correlation-based approach is introduced for enhancing the ability of structure-scoring methods to identify and distinguish native-like conformations. The proposed method relies on a funnel-shaped scoring function that decreases steadily toward the native state. It takes advantage of the idea that the structure from a given ensemble that is closest to the native basin leads to the highest correlation coefficient between a given score and distance to that structure as an approximation of the native state for the entire ensemble. The method is applied successfully to a number of different test cases that not only demonstrate substantial improvements in the correlation of the score with the distance from the true native state but also result in the selection of more native-like structures than the original score.
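The core idea of this abstract can be sketched as follows: each structure in the ensemble is tried in turn as a stand-in for the native state, and the one giving the highest Pearson correlation between the score and the distance to it is selected. This is a simplified reading of the abstract only. The function names, the precomputed distance matrix `dist`, and the convention that lower scores are better are all assumptions.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def most_funnel_like_reference(scores, dist):
    """Return (index, r) of the structure whose distances correlate best with the scores.

    scores[m] is the score of structure m (lower is better, by assumption);
    dist[k][m] is a structural distance (e.g. RMSD) between structures k and m.
    """
    best_index, best_r = None, -math.inf
    for k in range(len(scores)):
        r = pearson(scores, dist[k])
        if r > best_r:
            best_index, best_r = k, r
    return best_index, best_r
```

The loop needs the full pairwise distance matrix, so the cost grows quadratically with ensemble size.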
Affiliation(s)
- Andrew W Stumpff-Kane
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824-1319, USA
|
23
|
Koliński A, Bujnicki JM. Generalized protein structure prediction based on combination of fold-recognition with de novo folding and evaluation of models. Proteins 2006; 61 Suppl 7:84-90. [PMID: 16187348 DOI: 10.1002/prot.20723] [Citation(s) in RCA: 85] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]
Abstract
To predict the tertiary structure of full-length sequences of all targets in CASP6, regardless of their potential category (from easy comparative modeling to fold recognition to apparent new folds) we used a novel combination of two very different approaches developed independently in our laboratories, which ranked quite well in different categories in CASP5. First, the GeneSilico metaserver was used to identify domains, predict secondary structure, and generate fold recognition (FR) alignments, which were converted to full-atom models using the "FRankenstein's Monster" approach for comparative modeling (CM) by recombination of protein fragments. Additional models generated "de novo" by fully automated servers were obtained from the CASP website. All these models were evaluated by VERIFY3D, and residues with scores better than 0.2 were used as a source of spatial restraints. Second, a new implementation of the lattice-based protein modeling tool CABS was used to carry out folding guided by the above-mentioned restraints with the Replica Exchange Monte Carlo sampling technique. Decoys generated in the course of simulation were subjected to average linkage hierarchical clustering. For a representative decoy from each cluster, a full-atom model was rebuilt. Finally, five models were selected for submission based on a combination of various criteria, including the size, density, and average energy of the corresponding cluster, and the visual evaluation of the full-atom structures and their relationship to the original templates. The combination of FRankenstein and CABS was one of the best-performing algorithms over all categories in CASP6 (it is important to note that our human intervention was very limited, and all steps in our method can be easily automated). We were able to generate a number of very good models, especially in the Comparative Modeling and New Folds categories.
Frequently, the best models were closer to the native structure than any of the templates used. The main problem we encountered was in the ranking of the final models (the only step of significant human intervention), due to insufficient computational power, which precluded the possibility of full-atom refinement and energy-based evaluation.
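The average linkage hierarchical clustering of decoys mentioned in this abstract can be sketched with a naive agglomerative loop over a precomputed pairwise distance matrix. The function name, the stopping rule (a fixed number of clusters), and the cubic-time search are illustrative assumptions, not details of the authors' implementation.

```python
def average_linkage(dist, n_clusters):
    """Naive agglomerative clustering with average linkage.

    dist[i][j] is the distance between decoys i and j (e.g. their cRMSD).
    Repeatedly merges the two clusters with the smallest mean inter-cluster
    distance until n_clusters remain. Returns a list of lists of indices.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average of all pairwise distances between the two clusters.
                total = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters
```

Cluster size then follows directly from the length of each returned list, which is one of the selection criteria the abstract names.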
|
24
|
Abstract
SUMMARY BioShell is a suite of programs performing common tasks accompanying protein structure modeling. The BioShell design is based on UNIX shell flexibility, and the suite should be used as an extension of the shell. Using BioShell, various molecular modeling procedures can be integrated in a single pipeline. AVAILABILITY The BioShell package can be downloaded from its website, http://biocomp.chem.uw.edu.pl/BioShell, which provides many examples and detailed documentation for the newest version.
Affiliation(s)
- Dominik Gront
- Faculty of Chemistry, Warsaw University, Pasteura 1, 02-093 Warsaw, Poland.
|
25
|
Plewczynska D, Kolinski A. Protein Folding with a Reduced Model and Inaccurate Short-Range Restraints. MACROMOL THEOR SIMUL 2005. [DOI: 10.1002/mats.200500020] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022]
|
26
|
Gront D, Hansmann UHE, Kolinski A. Exploring protein energy landscapes with hierarchical clustering. INTERNATIONAL JOURNAL OF QUANTUM CHEMISTRY 2005; 105:826-830. [PMID: 16479277 PMCID: PMC1366497 DOI: 10.1002/qua.20741] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 05/06/2023]
Abstract
In this work we present a new method for investigating local energy minima on a protein energy landscape. The CABS (CAlpha, CBeta and the center of mass of the Side chain) method was employed for generating protein models, but any other method could be used instead. Cα traces from an ensemble of models are hierarchically clustered with the HCPM (Hierarchical Clustering of Protein Models) method. The efficiency of this method for sampling and analyzing energy landscapes is shown.
|