51
Dawid AE, Gront D, Kolinski A. SURPASS Low-Resolution Coarse-Grained Protein Modeling. J Chem Theory Comput 2017; 13:5766-5779. [PMID: 28992694 DOI: 10.1021/acs.jctc.7b00642] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.1] [Indexed: 11/28/2022]
Abstract
Coarse-grained modeling of biomolecules has a very important role in molecular biology. In this work we present a novel SURPASS (Single United Residue per Pre-Averaged Secondary Structure fragment) model of proteins that can be an interesting alternative for existing coarse-grained models. The design of the model is unique and strongly supported by the statistical analysis of structural regularities characteristic for protein systems. Coarse-graining of protein chain structures assumes a single center of interactions per residue and accounts for preaveraged effects of four adjacent residue fragments. Knowledge-based statistical potentials encode complex interaction patterns of these fragments. Using the Replica Exchange Monte Carlo sampling scheme and a generic version of the SURPASS force field we performed test simulations of a representative set of single-domain globular proteins. The method samples a significant part of conformational space and reproduces protein structures, including native-like, with surprisingly good accuracy. Future extension of the SURPASS model on large biomacromolecular systems is briefly discussed.
Affiliation(s)
- Aleksandra E Dawid
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Dominik Gront
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
- Andrzej Kolinski
- Faculty of Chemistry, Biological and Chemical Research Center, University of Warsaw, Pasteura 1, 02-093 Warsaw, Poland
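The Replica Exchange Monte Carlo scheme named in the SURPASS abstract above works by periodically exchanging configurations between replicas simulated at different temperatures. The following is a minimal Python sketch of the standard Metropolis swap criterion. It is a generic illustration, not the SURPASS implementation: the function names, the list-based replica layout, and the use of energies as stand-ins for full configurations are all assumptions made here.

```python
import math
import random


def swap_probability(beta_i, beta_j, energy_i, energy_j):
    """Standard replica-exchange acceptance probability.

    beta = 1 / (k_B * T); energy_i and energy_j are the current energies of
    the replicas held at beta_i and beta_j.
    Returns min(1, exp((beta_i - beta_j) * (energy_i - energy_j))).
    """
    delta = (beta_i - beta_j) * (energy_i - energy_j)
    return 1.0 if delta >= 0.0 else math.exp(delta)


def attempt_neighbor_swaps(betas, energies, rng=random):
    """Attempt one sweep of swaps between neighbouring temperatures.

    energies[k] is the energy of the configuration currently held at inverse
    temperature betas[k]. Energies stand in for whole configurations
    (an assumption for brevity). Returns a new list; the input is not modified.
    """
    energies = list(energies)
    for k in range(len(betas) - 1):
        p = swap_probability(betas[k], betas[k + 1], energies[k], energies[k + 1])
        if rng.random() < p:
            energies[k], energies[k + 1] = energies[k + 1], energies[k]
    return energies
```

A swap that moves the lower-energy configuration to the colder replica is always accepted, while the reverse move is accepted with a probability that decays exponentially with the energy gap and the difference in inverse temperature.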
52
Hayashi T, Yasuda S, Škrbić T, Giacometti A, Kinoshita M. Unraveling protein folding mechanism by analyzing the hierarchy of models with increasing level of detail. J Chem Phys 2017; 147:125102. [DOI: 10.1063/1.4999376] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Indexed: 02/06/2023] Open
Affiliation(s)
- Tomohiko Hayashi
- Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611-0011, Japan
- Satoshi Yasuda
- Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611-0011, Japan
- Graduate School of Science, Chiba University, 1-33 Yayoi-cho, Inage, Chiba 263-8522, Japan
- Molecular Chirality Research Center, Chiba University, 1-33 Yayoi-cho, Inage, Chiba 263-8522, Japan
- Tatjana Škrbić
- Dipartimento di Scienze Molecolari e Nanosistemi, Università Ca’ Foscari Venezia, Edificio Alfa Campus Scientifico, Via Torino 155, Venezia-Mestre I-3010, Italy
- Achille Giacometti
- Dipartimento di Scienze Molecolari e Nanosistemi, Università Ca’ Foscari Venezia, Edificio Alfa Campus Scientifico, Via Torino 155, Venezia-Mestre I-3010, Italy
- Masahiro Kinoshita
- Institute of Advanced Energy, Kyoto University, Uji, Kyoto 611-0011, Japan
53
Li Z, Hirst JD. Quantitative first principles calculations of protein circular dichroism in the near-ultraviolet. Chem Sci 2017; 8:4318-4333. [PMID: 29163925 PMCID: PMC5637123 DOI: 10.1039/c7sc00586e] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Received: 02/07/2017] [Accepted: 03/23/2017] [Indexed: 11/30/2022] Open
Abstract
Vibrational structure in the near-UV circular dichroism (CD) spectra of proteins is an important source of information on protein conformation and can be exploited to study structure and folding. A fully quantitative theory of the relationship between protein conformation and optical spectroscopy would facilitate deeper interpretation of and insight into biophysical and simulation studies of protein dynamics and folding. We have developed new models of the aromatic side chain chromophores toluene, p-cresol and 3-methylindole, which incorporate ab initio calculations of the Franck-Condon effect into first principles calculations of CD using an exciton approach. The near-UV CD spectra of 40 proteins are calculated with the new parameter set and the correlation between the computed and the experimental intensity from 270 to 290 nm is much improved. The contribution of individual chromophores to the CD spectra has been calculated for several mutants and in many cases helps rationalize changes in their experimental spectra. Considering conformational flexibility by using families of NMR structures leads to further improvements for some proteins and illustrates an informative level of sensitivity to side chain conformation. In several cases, the near-UV CD calculations can distinguish the native protein structure from a set of computer-generated misfolded decoy structures.
Affiliation(s)
- Zhuo Li
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK
- Jonathan D Hirst
- School of Chemistry, University of Nottingham, University Park, Nottingham NG7 2RD, UK
54
Jing X, Dong Q. MQAPRank: improved global protein model quality assessment by learning-to-rank. BMC Bioinformatics 2017; 18:275. [PMID: 28545390 PMCID: PMC5445322 DOI: 10.1186/s12859-017-1691-z] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.6] [Received: 02/12/2017] [Accepted: 05/16/2017] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: Protein structure prediction has achieved a lot of progress during the last few decades and a greater number of models for a certain sequence can be predicted. Consequently, assessing the qualities of predicted protein models in perspective is one of the key components of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, which could be roughly divided into three categories: single methods, quasi-single methods and clustering (or consensus) methods. Although these methods achieve much success at different levels, accurate protein model quality assessment is still an open problem.
RESULTS: Here, we present the MQAPRank, a global protein model quality assessment program based on learning-to-rank. The MQAPRank first sorts the decoy models by using single method based on learning-to-rank algorithm to indicate their relative qualities for the target protein. And then it takes the first five models as references to predict the qualities of other models by using average GDT_TS scores between reference models and other models. Benchmarked on CASP11 and 3DRobot datasets, the MQAPRank achieved better performances than other leading protein model quality assessment methods. Recently, the MQAPRank participated in the CASP12 under the group name FDUBio and achieved the state-of-the-art performances.
CONCLUSIONS: The MQAPRank provides a convenient and powerful tool for protein model quality assessment with the state-of-the-art performances, it is useful for protein structure prediction and model quality assessment usages.
Affiliation(s)
- Xiaoyang Jing
- School of Computer Science, Fudan University, Shanghai, 200433 People’s Republic of China
- Qiwen Dong
- School of Data Science and Engineering, East China Normal University, Shanghai, 200062 People’s Republic of China
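The two-stage procedure described in the MQAPRank abstract above (rank decoys with a single-model score, take the top five as references, then score the remaining models by their average GDT_TS to those references) can be sketched in a few lines of Python. This is a hedged illustration rather than the authors' code: `single_score` and `pairwise_gdt_ts` are hypothetical callables supplied by the caller, and because the abstract does not say how the reference models themselves are scored, the sketch returns them separately and leaves them unscored.

```python
def quasi_single_scores(models, single_score, pairwise_gdt_ts, n_refs=5):
    """Score non-reference models by mean similarity to top-ranked references.

    models          : list of hashable model identifiers
    single_score    : hypothetical callable, model -> float (higher is better),
                      standing in for the learning-to-rank single-model score
    pairwise_gdt_ts : hypothetical callable, (model, model) -> float,
                      standing in for a GDT_TS comparison of two structures
    n_refs          : number of top-ranked models used as references

    Returns (references, scores), where scores maps each non-reference model
    to its average similarity to the references.
    """
    ranked = sorted(models, key=single_score, reverse=True)
    references = ranked[:n_refs]
    reference_set = set(references)
    scores = {}
    for model in models:
        if model in reference_set:
            continue  # handling of references is not specified in the abstract
        total = sum(pairwise_gdt_ts(model, ref) for ref in references)
        scores[model] = total / len(references)
    return references, scores


# Toy usage with made-up numbers (not data from the paper).
toy_single = {"a": 0.9, "b": 0.8, "c": 0.1}
toy_pairs = {("c", "a"): 0.4, ("c", "b"): 0.6}
toy_refs, toy_scores = quasi_single_scores(
    ["a", "b", "c"],
    single_score=toy_single.get,
    pairwise_gdt_ts=lambda m, r: toy_pairs[(m, r)],
    n_refs=2,
)
```

Passing the two scoring functions as parameters keeps the sketch independent of any particular structure-comparison tool.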
55
Li H, Lyu Q, Cheng J. A Template-Based Protein Structure Reconstruction Method Using Deep Autoencoder Learning. J Proteomics Bioinform 2016; 9:306-313. [PMID: 29081613 PMCID: PMC5658031 DOI: 10.4172/jpb.1000419] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/04/2022]
Abstract
Protein structure prediction is an important problem in computational biology, and is widely applied to various biomedical problems such as protein function study, protein design, and drug design. In this work, we developed a novel deep learning approach based on a deeply stacked denoising autoencoder for protein structure reconstruction. We applied our approach to a template-based protein structure prediction using only the 3D structural coordinates of homologous template proteins as input. The templates were identified for a target protein by a PSI-BLAST search. 3DRobot (a program that automatically generates diverse and well-packed protein structure decoys) was used to generate initial decoy models for the target from the templates. A stacked denoising autoencoder was trained on the decoys to obtain a deep learning model for the target protein. The trained deep model was then used to reconstruct the final structural model for the target sequence. With target proteins that have highly similar template proteins as benchmarks, the GDT-TS score of the predicted structures is greater than 0.7, suggesting that the deep autoencoder is a promising method for protein structure reconstruction.
Affiliation(s)
- Haiou Li
- Department of Computer Science and Technology, Soochow University, Suzhou, 215006, China
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Qiang Lyu
- Department of Computer Science and Technology, Soochow University, Suzhou, 215006, China
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
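The GDT-TS score quoted in the abstract above is conventionally the mean, over distance cutoffs of 1, 2, 4 and 8 Å, of the fraction of residues whose Cα atoms lie within that cutoff of the reference structure. The Python sketch below shows only that final averaging step. It assumes the per-residue distances have already been computed after a single superposition, whereas a full GDT implementation searches for the best superposition at each cutoff; the function name and the toy distances are illustrative.

```python
def gdt_ts(distances, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified GDT_TS on a 0-1 scale.

    distances : per-residue C-alpha distances (in angstroms) between a model
                and the reference structure, assumed precomputed after one
                superposition (a simplification of the full procedure)
    cutoffs   : distance thresholds in angstroms
    """
    if not distances:
        raise ValueError("at least one residue distance is required")
    n = len(distances)
    # Fraction of residues within each cutoff, then the mean over cutoffs.
    fractions = [sum(1 for d in distances if d <= c) / n for c in cutoffs]
    return sum(fractions) / len(fractions)


# Toy distances in angstroms (made up for illustration).
example_score = gdt_ts([0.5, 1.5, 3.0, 9.0])
```

On this 0-1 scale, the threshold of 0.7 mentioned in the abstract corresponds to 70 on the percentage scale that is also commonly used for GDT_TS.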
56
Cao R, Bhattacharya D, Hou J, Cheng J. DeepQA: improving the estimation of single protein model quality with deep belief networks. BMC Bioinformatics 2016; 17:495. [PMID: 27919220 PMCID: PMC5139030 DOI: 10.1186/s12859-016-1405-y] [Citation(s) in RCA: 112] [Impact Index Per Article: 12.4] [Received: 08/11/2016] [Accepted: 12/01/2016] [Indexed: 01/02/2023] Open
Abstract
BACKGROUND: Protein quality assessment (QA) useful for ranking and selecting protein models has long been viewed as one of the major challenges for protein tertiary structure prediction. Especially, estimating the quality of a single protein model, which is important for selecting a few good models out of a large model pool consisting of mostly low-quality models, is still a largely unsolved problem.
RESULTS: We introduce a novel single-model quality assessment method DeepQA based on deep belief network that utilizes a number of selected features describing the quality of a model from different perspectives, such as energy, physio-chemical characteristics, and structural information. The deep belief network is trained on several large datasets consisting of models from the Critical Assessment of Protein Structure Prediction (CASP) experiments, several publicly available datasets, and models generated by our in-house ab initio method. Our experiments demonstrate that deep belief network has better performance compared to Support Vector Machines and Neural Networks on the protein model quality assessment problem, and our method DeepQA achieves the state-of-the-art performance on CASP11 dataset. It also outperformed two well-established methods in selecting good outlier models from a large set of models of mostly low quality generated by ab initio modeling methods.
CONCLUSION: DeepQA is a useful deep learning tool for protein single model quality assessment and protein structure prediction. The source code, executable, document and training/test datasets of DeepQA for Linux is freely available to non-commercial users at http://cactus.rnet.missouri.edu/DeepQA/.
Affiliation(s)
- Renzhi Cao
- Department of Computer Science, Pacific Lutheran University, Tacoma, WA, 98447, USA
- Debswapna Bhattacharya
- Department of Electrical Engineering and Computer Science, Wichita State University, Wichita, KS, 67260, USA
- Jie Hou
- Department of Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Jianlin Cheng
- Department of Computer Science, University of Missouri, Columbia, MO, 65211, USA
- Informatics Institute, University of Missouri, Columbia, MO, 65211, USA
57
Jing X, Wang K, Lu R, Dong Q. Sorting protein decoys by machine-learning-to-rank. Sci Rep 2016; 6:31571. [PMID: 27530967 PMCID: PMC4987638 DOI: 10.1038/srep31571] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 04/21/2016] [Accepted: 07/26/2016] [Indexed: 11/18/2022] Open
Abstract
Much progress has been made in protein structure prediction during the last few decades. As the predicted models can span a broad range of accuracy, the accuracy of quality estimation becomes one of the key elements of successful protein structure prediction. Over the past years, a number of methods have been developed to address this issue, and these methods could be roughly divided into three categories: the single-model methods, clustering-based methods and quasi single-model methods. In this study, we first develop a single-model method, MQAPRank, based on the learning-to-rank algorithm, and then implement a quasi single-model method, Quasi-MQAPRank. The proposed methods are benchmarked on the 3DRobot and CASP11 datasets. The five-fold cross-validation on the 3DRobot dataset shows the proposed single-model method outperforms other methods whose outputs are taken as features of the proposed method, and the quasi single-model method can further enhance the performance. On the CASP11 dataset, the proposed methods also perform well compared with other leading methods in corresponding categories. In particular, the Quasi-MQAPRank method achieves a considerable performance on the CASP11 Best150 dataset.
Affiliation(s)
- Xiaoyang Jing
- School of Computer Science, Fudan University, Shanghai 200433, People’s Republic of China
- Kai Wang
- College of Animal Science and Technology, Jilin Agricultural University, Changchun 130118, People’s Republic of China
- Ruqian Lu
- School of Computer Science, Fudan University, Shanghai 200433, People’s Republic of China
- Qiwen Dong
- Institute for Data Science and Engineering, East China Normal University, Shanghai 200062, People’s Republic of China