Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 2014;31:999-1006. [PMID: 25431331 PMCID: PMC4382908 DOI: 10.1093/bioinformatics/btu791] [Citation(s) in RCA: 237] [Impact Index Per Article: 21.5] [Received: 09/25/2014] [Accepted: 11/22/2014] [Indexed: 12/13/2022]

Number

Cited by Other Article(s)

201

Xiong D, Zeng J, Gong H. A deep learning framework for improving long-range residue–residue contact prediction using a hierarchical strategy. Bioinformatics 2017;33:2675-2683. [DOI: 10.1093/bioinformatics/btx296] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.5] [Received: 10/18/2016] [Accepted: 05/02/2017] [Indexed: 12/31/2022] Open

202

Xu Q, Tang Q, Katsonis P, Lichtarge O, Jones D, Bovo S, Babbi G, Martelli PL, Casadio R, Lee GR, Seok C, Fenton AW, Dunbrack RL. Benchmarking predictions of allostery in liver pyruvate kinase in CAGI4. Hum Mutat 2017;38:1123-1131. [PMID: 28370845 DOI: 10.1002/humu.23222] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.8] [Received: 01/17/2017] [Revised: 03/16/2017] [Accepted: 03/24/2017] [Indexed: 12/22/2022]

203

Simkovic F, Ovchinnikov S, Baker D, Rigden DJ. Applications of contact predictions to structural biology. IUCrJ 2017;4:291-300. [PMID: 28512576 PMCID: PMC5414403 DOI: 10.1107/s2052252517005115] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.6] [Received: 12/12/2016] [Accepted: 04/03/2017] [Indexed: 06/07/2023]

Abstract

Evolutionary pressure on residue interactions, intramolecular or intermolecular, that are important for protein structure or function can lead to covariance between the two positions. Recent methodological advances allow much more accurate contact predictions to be derived from this evolutionary covariance signal. The practical application of contact predictions has largely been confined to structural bioinformatics, yet, as this work seeks to demonstrate, the data can be of enormous value to the structural biologist working in X-ray crystallography, cryo-EM or NMR. Integrative structural bioinformatics packages such as Rosetta can already exploit contact predictions in a variety of ways. The contribution of contact predictions begins at construct design, where structural domains may need to be expressed separately and contact predictions can help to predict domain limits. Structure solution by molecular replacement (MR) benefits from contact predictions in diverse ways: in difficult cases, more accurate search models can be constructed using ab initio modelling when predictions are available, while intermolecular contact predictions can allow the construction of larger, oligomeric search models. Furthermore, MR using supersecondary motifs or large-scale screens against the PDB can exploit information, such as the parallel or antiparallel nature of any β-strand pairing in the target, that can be inferred from contact predictions. Contact information will be particularly valuable in the determination of lower resolution structures by helping to assign sequence register. In large complexes, contact information may allow the identity of a protein responsible for a certain region of density to be determined and then assist in the orientation of an available model within that density. In NMR, predicted contacts can provide long-range information to extend the upper size limit of the technique in a manner analogous but complementary to experimental methods.
Finally, predicted contacts can distinguish between biologically relevant interfaces and mere lattice contacts in a final crystal structure, and have potential in the identification of functionally important regions and in foreseeing the consequences of mutations.

204

Chapman SD, Adami C, Wilke CO, B Kc D. The evolution of logic circuits for the purpose of protein contact map prediction. PeerJ 2017;5:e3139. [PMID: 28439455 PMCID: PMC5398280 DOI: 10.7717/peerj.3139] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 06/29/2016] [Accepted: 03/02/2017] [Indexed: 11/20/2022] Open

205

Kell DB. Evolutionary algorithms and synthetic biology for directed evolution: commentary on "on the mapping of genotype to phenotype in evolutionary algorithms" by Peter A. Whigham, Grant Dick, and James Maclaurin. Genetic Programming and Evolvable Machines 2017;18:373-378. [PMID: 29033669 PMCID: PMC5618731 DOI: 10.1007/s10710-017-9292-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Indexed: 05/20/2023]

206

Castelli M, Clementi N, Pfaff J, Sautto GA, Diotti RA, Burioni R, Doranz BJ, Dal Peraro M, Clementi M, Mancini N. A Biologically-validated HCV E1E2 Heterodimer Structural Model. Sci Rep 2017;7:214. [PMID: 28303031 PMCID: PMC5428263 DOI: 10.1038/s41598-017-00320-7] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.4] [Received: 10/04/2016] [Accepted: 02/21/2017] [Indexed: 12/14/2022] Open

207

Skwark MJ, Croucher NJ, Puranen S, Chewapreecha C, Pesonen M, Xu YY, Turner P, Harris SR, Beres SB, Musser JM, Parkhill J, Bentley SD, Aurell E, Corander J. Interacting networks of resistance, virulence and core machinery genes identified by genome-wide epistasis analysis. PLoS Genet 2017;13:e1006508. [PMID: 28207813 PMCID: PMC5312804 DOI: 10.1371/journal.pgen.1006508] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 08/03/2016] [Accepted: 11/24/2016] [Indexed: 12/05/2022] Open

Abstract

Recent advances in the scale and diversity of population genomic datasets for bacteria now provide the potential for genome-wide patterns of co-evolution to be studied at the resolution of individual bases. Here we describe a new statistical method, genomeDCA, which uses recent advances in computational structural biology to identify the polymorphic loci under the strongest co-evolutionary pressures. We apply genomeDCA to two large population data sets representing the major human pathogens Streptococcus pneumoniae (pneumococcus) and Streptococcus pyogenes (group A Streptococcus). For pneumococcus we identified 5,199 putative epistatic interactions between 1,936 sites. Over three-quarters of the links were between sites within the pbp2x, pbp1a and pbp2b genes, the sequences of which are critical in determining non-susceptibility to beta-lactam antibiotics. A network-based analysis found these genes were also coupled to that encoding dihydrofolate reductase, changes to which underlie trimethoprim resistance. Distinct from these antibiotic resistance genes, a large network component of 384 protein coding sequences encompassed many genes critical in basic cellular functions, while another distinct component included genes associated with virulence. The group A Streptococcus (GAS) data set population represents a clonal population with relatively little genetic variation and a high level of linkage disequilibrium across the genome. Despite this, we were able to pinpoint two RNA pseudouridine synthases, which were each strongly linked to a separate set of loci across the chromosome, representing biologically plausible targets of co-selection. The population genomic analysis method applied here identifies statistically significantly co-evolving locus pairs, potentially arising from fitness selection interdependence reflecting underlying protein-protein interactions, or genes whose product activities contribute to the same phenotype. 
This discovery approach greatly enhances the future potential of epistasis analysis for systems biology, and can complement genome-wide association studies as a means of formulating hypotheses for targeted experimental work.

Epistatic interactions between polymorphisms in DNA are recognized as important drivers of evolution in numerous organisms. Study of epistasis in bacteria has been hampered by the lack of densely sampled population genomic data, suitable statistical models and inference algorithms sufficiently powered for extremely high-dimensional parameter spaces. We introduce the first model-based method for genome-wide epistasis analysis and use two of the largest available bacterial population genome data sets on Streptococcus pneumoniae (the pneumococcus) and Streptococcus pyogenes (group A Streptococcus) to demonstrate its potential for biological discovery. Our approach reveals interacting networks of resistance, virulence and core machinery genes in the pneumococcus, which highlights putative candidates for novel drug targets. We also discover a number of plausible targets of co-selection in S. pyogenes linked to RNA pseudouridine synthases. Our method significantly enhances the future potential of epistasis analysis for systems biology, and can complement genome-wide association studies as a means of formulating hypotheses for targeted experimental work.

Affiliation(s)

Marcin J Skwark Department of Chemistry, Vanderbilt University, Nashville, TN, United States of America
Nicholas J Croucher Department of Infectious Disease Epidemiology, Imperial College London, London, United Kingdom
Santeri Puranen Department of Computer Science, Aalto University, Espoo, Finland
Claire Chewapreecha Department of Medicine, University of Cambridge, Cambridge, United Kingdom
Maiju Pesonen Department of Computer Science, Aalto University, Espoo, Finland
Ying Ying Xu Department of Computer Science, Aalto University, Espoo, Finland
Paul Turner Shoklo Malaria Research Unit, Mahidol-Oxford Tropical Medicine Research Unit, Faculty of Tropical Medicine, Mahidol University, Mae Sot, Thailand; Centre for Tropical Medicine, Nuffield Department of Medicine, University of Oxford, Oxford, United Kingdom
Simon R Harris Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Stephen B Beres Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, United States of America
James M Musser Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute, and Houston Methodist Hospital, Houston, Texas, United States of America; Departments of Pathology and Laboratory Medicine and Microbiology and Immunology, Weill Cornell Medical College, New York, New York, United States of America
Julian Parkhill Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Stephen D Bentley Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
Erik Aurell Department of Computational Biology, KTH-Royal Institute of Technology, Stockholm, Sweden; Departments of Applied Physics and Computer Science, Aalto University, Espoo, Finland; Institute of Theoretical Physics, Chinese Academy of Sciences, Beijing, China
Jukka Corander Pathogen Genomics, Wellcome Trust Sanger Institute, Cambridge, United Kingdom; Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland; Department of Biostatistics, University of Oslo, Oslo, Norway; Department of Veterinary Medicine, University of Cambridge, Cambridge, United Kingdom

208

Wang S, Sun S, Li Z, Zhang R, Xu J. Accurate De Novo Prediction of Protein Contact Map by Ultra-Deep Learning Model. PLoS Comput Biol 2017;13:e1005324. [PMID: 28056090 PMCID: PMC5249242 DOI: 10.1371/journal.pcbi.1005324] [Citation(s) in RCA: 589] [Impact Index Per Article: 73.6] [Received: 09/14/2016] [Revised: 01/20/2017] [Accepted: 12/20/2016] [Indexed: 12/02/2022] Open

Abstract

Motivation

Protein contacts contain key information for the understanding of protein structure and function and thus, contact prediction from sequence is an important problem. Recently exciting progress has been made on this problem, but the predicted contacts for proteins without many sequence homologs is still of low quality and not very useful for de novo structure prediction.

Method

This paper presents a new deep learning method that predicts contacts by integrating both evolutionary coupling (EC) and sequence conservation information through an ultra-deep neural network formed by two deep residual neural networks. The first residual network conducts a series of 1-dimensional convolutional transformation of sequential features; the second residual network conducts a series of 2-dimensional convolutional transformation of pairwise information including output of the first residual network, EC information and pairwise potential. By using very deep residual networks, we can accurately model contact occurrence patterns and complex sequence-structure relationship and thus, obtain higher-quality contact prediction regardless of how many sequence homologs are available for proteins in question.

Results

Our method greatly outperforms existing methods and leads to much more accurate contact-assisted folding. Tested on 105 CASP11 targets, 76 past CAMEO hard targets, and 398 membrane proteins, the average top L long-range prediction accuracy obtained by our method, one representative EC method CCMpred and the CASP11 winner MetaPSICOV is 0.47, 0.21 and 0.30, respectively; the average top L/10 long-range accuracy of our method, CCMpred and MetaPSICOV is 0.77, 0.47 and 0.59, respectively. Ab initio folding using our predicted contacts as restraints but without any force fields can yield correct folds (i.e., TMscore>0.6) for 203 of the 579 test proteins, while that using MetaPSICOV- and CCMpred-predicted contacts can do so for only 79 and 62 of them, respectively. Our contact-assisted models also have much better quality than template-based models especially for membrane proteins. The 3D models built from our contact prediction have TMscore>0.5 for 208 of the 398 membrane proteins, while those from homology modeling have TMscore>0.5 for only 10 of them. Further, even if trained mostly by soluble proteins, our deep learning method works very well on membrane proteins. In the recent blind CAMEO benchmark, our fully-automated web server implementing this method successfully folded 6 targets with a new fold and only 0.3L-2.3L effective sequence homologs, including one β protein of 182 residues, one α+β protein of 125 residues, one α protein of 140 residues, one α protein of 217 residues, one α/β of 260 residues and one α protein of 462 residues. Our method also achieved the highest F1 score on free-modeling targets in the latest CASP (Critical Assessment of Structure Prediction), although it was not fully implemented back then.

Availability

http://raptorx.uchicago.edu/ContactMap/

Protein contact prediction and contact-assisted folding has made good progress due to direct evolutionary coupling analysis (DCA). However, DCA is effective on only some proteins with a very large number of sequence homologs. To further improve contact prediction, we borrow ideas from deep learning, which has recently revolutionized object recognition, speech recognition and the GO game. Our deep learning method can model complex sequence-structure relationship and high-order correlation (i.e., contact occurrence patterns) and thus, improve contact prediction accuracy greatly. Our test results show that our method greatly outperforms the state-of-the-art methods regardless how many sequence homologs are available for a protein in question. Ab initio folding guided by our predicted contacts may fold many more test proteins than the other contact predictors. Our contact-assisted 3D models also have much better quality than homology models built from the training proteins, especially for membrane proteins. One interesting finding is that even trained mostly with soluble proteins, our method performs very well on membrane proteins. Recent blind CAMEO test confirms that our method can fold large proteins with a new fold and only a small number of sequence homologs.
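
As an illustration of the metric quoted in the abstract above, the following minimal sketch computes a "top L/k long-range accuracy", that is, the fraction of the L/k highest-scoring long-range predictions that are true contacts. It is not taken from the paper: the function name, the dictionary-based input format, and the sequence-separation cutoff of 24 residues (a commonly used convention for "long range") are assumptions made here for illustration.

```python
def top_k_precision(scores, true_contacts, length, k_div=10, min_sep=24):
    """Precision of the top L/k_div predictions with sequence separation >= min_sep.

    scores        -- dict mapping (i, j) residue pairs (i < j) to a predicted score
    true_contacts -- set of (i, j) pairs (i < j) that are contacts in the native structure
    length        -- sequence length L
    k_div, min_sep are illustrative defaults (assumptions, not from the paper).
    """
    # Keep only pairs far enough apart in sequence to count as long range.
    candidates = [(score, pair) for pair, score in scores.items()
                  if pair[1] - pair[0] >= min_sep]
    # Rank by predicted score, highest first.
    candidates.sort(key=lambda item: item[0], reverse=True)
    n_top = max(1, length // k_div)
    top = [pair for _, pair in candidates[:n_top]]
    if not top:
        return 0.0
    hits = sum(1 for pair in top if pair in true_contacts)
    # Divides by the number of pairs actually selected; some evaluations
    # divide by L/k even when fewer predictions are available.
    return hits / len(top)


if __name__ == "__main__":
    toy_scores = {(0, 30): 0.9, (1, 40): 0.8, (2, 45): 0.7,
                  (3, 35): 0.6, (5, 49): 0.5, (10, 12): 0.99, (0, 24): 0.1}
    toy_native = {(0, 30), (2, 45), (5, 49), (10, 12)}
    print(top_k_precision(toy_scores, toy_native, length=50, k_div=10))
```

Setting `k_div=1` gives the "top L" variant; other separation ranges can be evaluated by filtering on an upper bound as well.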

209

Rawi R, Mall R, Kunji K, El Anbari M, Aupetit M, Ullah E, Bensmail H. COUSCOus: improved protein contact prediction using an empirical Bayes covariance estimator. BMC Bioinformatics 2016;17:533. [PMID: 27978812 PMCID: PMC5159955 DOI: 10.1186/s12859-016-1400-3] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 06/17/2016] [Accepted: 12/01/2016] [Indexed: 11/13/2022] Open

Abstract

Background

The post-genomic era with its wealth of sequences gave rise to a broad range of protein residue-residue contact detecting methods. Although various coevolution methods such as PSICOV, DCA and plmDCA provide correct contact predictions, they do not completely overlap. Hence, new approaches and improvements of existing methods are needed to motivate further development and progress in the field. We present a new contact detecting method, COUSCOus, by combining the best shrinkage approach, the empirical Bayes covariance estimator and GLasso.

Results

Using the original PSICOV benchmark dataset, COUSCOus achieves mean accuracies of 0.74, 0.62 and 0.55 for the top L/10 predicted long, medium and short range contacts, respectively. In addition, COUSCOus attains mean areas under the precision-recall curves of 0.25, 0.29 and 0.30 for long, medium and short contacts and outperforms PSICOV. We also observed that COUSCOus outperforms PSICOV w.r.t. Matthew’s correlation coefficient criterion on full list of residue contacts. Furthermore, COUSCOus achieves on average 10% more gain in prediction accuracy compared to PSICOV on an independent test set composed of CASP11 protein targets. Finally, we showed that when using a simple random forest meta-classifier, by combining contact detecting techniques and sequence derived features, PSICOV predictions should be replaced by the more accurate COUSCOus predictions.

Conclusion

We conclude that the consideration of superior covariance shrinkage approaches will boost several research fields that apply the GLasso procedure, amongst the presented one of residue-residue contact prediction as well as fields such as gene network reconstruction.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-016-1400-3) contains supplementary material, which is available to authorized users.
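
The abstract above compares methods by Matthews correlation coefficient over the full list of residue pairs. A minimal sketch of that statistic follows. The helper names, the set-based input format and the minimum sequence separation of 6 are illustrative assumptions, not details taken from the paper.

```python
import math
from itertools import combinations


def confusion_counts(predicted, native, length, min_sep=6):
    """Count TP, FP, TN, FN over all residue pairs (i < j) with j - i >= min_sep.

    predicted and native are sets of (i, j) pairs; min_sep is an assumed cutoff.
    """
    tp = fp = tn = fn = 0
    for i, j in combinations(range(length), 2):
        if j - i < min_sep:
            continue
        is_pred = (i, j) in predicted
        is_true = (i, j) in native
        if is_pred and is_true:
            tp += 1
        elif is_pred:
            fp += 1
        elif is_true:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient; returns 0.0 when the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


if __name__ == "__main__":
    counts = confusion_counts({(0, 6), (1, 8)}, {(0, 6), (2, 9)}, length=10)
    print(counts, mcc(*counts))
```

Because contact maps are dominated by non-contacts, this coefficient is often preferred over plain accuracy, which would be high even for a predictor that reports no contacts at all.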

210

Adhikari B, Nowotny J, Bhattacharya D, Hou J, Cheng J. ConEVA: a toolbox for comprehensive assessment of protein contacts. BMC Bioinformatics 2016;17:517. [PMID: 27923350 PMCID: PMC5142288 DOI: 10.1186/s12859-016-1404-z] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 07/01/2016] [Accepted: 12/01/2016] [Indexed: 12/31/2022] Open

211

Alcock F, Stansfeld PJ, Basit H, Habersetzer J, Baker MA, Palmer T, Wallace MI, Berks BC. Assembling the Tat protein translocase. eLife 2016;5. [PMID: 27914200 PMCID: PMC5201420 DOI: 10.7554/elife.20718] [Citation(s) in RCA: 49] [Impact Index Per Article: 5.4] [Received: 08/22/2016] [Accepted: 11/29/2016] [Indexed: 12/18/2022] Open

212

Levy RM, Haldane A, Flynn WF. Potts Hamiltonian models of protein co-variation, free energy landscapes, and evolutionary fitness. Curr Opin Struct Biol 2016;43:55-62. [PMID: 27870991 DOI: 10.1016/j.sbi.2016.11.004] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 08/19/2016] [Accepted: 11/03/2016] [Indexed: 11/17/2022]

213

Assessing Predicted Contacts for Building Protein Three-Dimensional Models. Methods Mol Biol 2016. [PMID: 27787823 DOI: 10.1007/978-1-4939-6406-2_9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]

214

Schueler-Furman O, Wodak SJ. Computational approaches to investigating allostery. Curr Opin Struct Biol 2016;41:159-171. [PMID: 27607077 DOI: 10.1016/j.sbi.2016.06.017] [Citation(s) in RCA: 54] [Impact Index Per Article: 6.0] [Received: 06/16/2016] [Accepted: 06/23/2016] [Indexed: 01/01/2023]

215

Monastyrskyy B, D'Andrea D, Fidelis K, Tramontano A, Kryshtafovych A. New encouraging developments in contact prediction: Assessment of the CASP11 results. Proteins 2016;84 Suppl 1:131-44. [PMID: 26474083 PMCID: PMC4834069 DOI: 10.1002/prot.24943] [Citation(s) in RCA: 69] [Impact Index Per Article: 7.7] [Received: 05/18/2015] [Revised: 09/15/2015] [Accepted: 10/11/2015] [Indexed: 12/27/2022]

216

Li Q, Dahl DB, Vannucci M, Joo H, Tsai JW. KScons: a Bayesian approach for protein residue contact prediction using the knob-socket model of protein tertiary structure. Bioinformatics 2016;32:3774-3781. [PMID: 27559156 DOI: 10.1093/bioinformatics/btw553] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 03/01/2016] [Revised: 07/15/2016] [Accepted: 08/18/2016] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

By simplifying the many-bodied complexity of residue packing into patterns of simple pairwise secondary structure interactions between a single knob residue with a three-residue socket, the knob-socket construct allows a more direct incorporation of structural information into the prediction of residue contacts. By modeling the preferences between the amino acid composition of a socket and knob, we undertake an investigation of the knob-socket construct's ability to improve the prediction of residue contacts. The statistical model considers three priors and two posterior estimations to better understand how the input data affects predictions. This produces six implementations of KScons that are tested on three sets: PSICOV, CASP10 and CASP11. We compare against the current leading contact prediction methods.

RESULTS

The results demonstrate the usefulness as well as the limits of knob-socket based structural modeling of protein contacts. The construct is able to extract good predictions from known structural homologs, while its performance degrades when no homologs exist. Among our six implementations, KScons MST-MP (which uses the multiple structure alignment prior and marginal posterior incorporating structural homolog information) performs the best in all three prediction sets. An analysis of recall and precision finds that KScons MST-MP improves accuracy not only by improving identification of true positives, but also by decreasing the number of false positives. Over the CASP10 and CASP11 sets, KScons MST-MP performs better than the leading methods using only evolutionary coupling data, but not quite as well as the supervised learning methods of MetaPSICOV and CoinDCA-NN that incorporate a large set of structural features.

CONTACT

qiwei.li@rice.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

217

Simkovic F, Thomas JMH, Keegan RM, Winn MD, Mayans O, Rigden DJ. Residue contacts predicted by evolutionary covariance extend the application of ab initio molecular replacement to larger and more challenging protein folds. IUCrJ 2016;3:259-70. [PMID: 27437113 PMCID: PMC4937781 DOI: 10.1107/s2052252516008113] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 02/10/2016] [Accepted: 05/18/2016] [Indexed: 05/05/2023]

218

Pandini A, Morcos F, Khan S. The Gearbox of the Bacterial Flagellar Motor Switch. Structure 2016;24:1209-20. [PMID: 27345932 PMCID: PMC4938800 DOI: 10.1016/j.str.2016.05.012] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 02/22/2016] [Revised: 04/26/2016] [Accepted: 05/23/2016] [Indexed: 12/11/2022]

219

Bhattacharya D, Cao R, Cheng J. UniCon3D: de novo protein structure prediction using united-residue conformational search via stepwise, probabilistic sampling. Bioinformatics 2016;32:2791-9. [PMID: 27259540 PMCID: PMC5018369 DOI: 10.1093/bioinformatics/btw316] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.8] [Received: 03/13/2016] [Accepted: 05/15/2016] [Indexed: 12/20/2022] Open

Abstract

MOTIVATION

Recent experimental studies have suggested that proteins fold via stepwise assembly of structural units named 'foldons' through the process of sequential stabilization. Alongside, latest developments on computational side based on probabilistic modeling have shown promising direction to perform de novo protein conformational sampling from continuous space. However, existing computational approaches for de novo protein structure prediction often randomly sample protein conformational space as opposed to experimentally suggested stepwise sampling.

RESULTS

Here, we develop a novel generative, probabilistic model that simultaneously captures local structural preferences of backbone and side chain conformational space of polypeptide chains in a united-residue representation and performs experimentally motivated conditional conformational sampling via stepwise synthesis and assembly of foldon units that minimizes a composite physics and knowledge-based energy function for de novo protein structure prediction. The proposed method, UniCon3D, has been found to (i) sample lower energy conformations with higher accuracy than traditional random sampling in a small benchmark of 6 proteins; (ii) perform comparably with the top five automated methods on 30 difficult target domains from the 11th Critical Assessment of Protein Structure Prediction (CASP) experiment and on 15 difficult target domains from the 10th CASP experiment; and (iii) outperform two state-of-the-art approaches and a baseline counterpart of UniCon3D that performs traditional random sampling for protein modeling aided by predicted residue-residue contacts on 45 targets from the 10th edition of CASP.

AVAILABILITY AND IMPLEMENTATION

Source code, executable versions, manuals and example data of UniCon3D for Linux and OSX are freely available to non-commercial users at http://sysbio.rnet.missouri.edu/UniCon3D/

CONTACT

chengji@missouri.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

220

Aurell E. The Maximum Entropy Fallacy Redux? PLoS Comput Biol 2016;12:e1004777. [PMID: 27171259 PMCID: PMC4865147 DOI: 10.1371/journal.pcbi.1004777] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Indexed: 11/19/2022] Open

221

Neuwald AF. Gleaning structural and functional information from correlations in protein multiple sequence alignments. Curr Opin Struct Biol 2016;38:1-8. [PMID: 27179293 DOI: 10.1016/j.sbi.2016.04.006] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 12/18/2015] [Revised: 04/28/2016] [Accepted: 04/29/2016] [Indexed: 10/24/2022]

222

Wang S, Li W, Zhang R, Liu S, Xu J. CoinFold: a web server for protein contact prediction and contact-assisted protein folding. Nucleic Acids Res 2016;44:W361-6. [PMID: 27112569 PMCID: PMC4987891 DOI: 10.1093/nar/gkw307] [Citation(s) in RCA: 43] [Impact Index Per Article: 4.8] [Received: 03/06/2016] [Accepted: 04/12/2016] [Indexed: 12/14/2022] Open

223

Wang S, Li W, Liu S, Xu J. RaptorX-Property: a web server for protein structure property prediction. Nucleic Acids Res 2016;44:W430-5. [PMID: 27112573 PMCID: PMC4987890 DOI: 10.1093/nar/gkw306] [Citation(s) in RCA: 367] [Impact Index Per Article: 40.8] [Received: 02/19/2016] [Accepted: 04/12/2016] [Indexed: 11/14/2022] Open

224

Kurczynska M, Kania E, Konopka BM, Kotulska M. Applying PyRosetta molecular energies to separate properly oriented protein models from mirror models, obtained from contact maps. J Mol Model 2016;22:111. [PMID: 27107578 PMCID: PMC4842210 DOI: 10.1007/s00894-016-2975-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 10/12/2015] [Accepted: 04/05/2016] [Indexed: 11/30/2022]

225

Yang J, Jin QY, Zhang B, Shen HB. R2C: improving ab initio residue contact map prediction using dynamic fusion strategy and Gaussian noise filter. Bioinformatics 2016;32:2435-43. [PMID: 27153618 DOI: 10.1093/bioinformatics/btw181] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Received: 11/10/2015] [Accepted: 04/03/2016] [Indexed: 11/12/2022]

Abstract

MOTIVATION

Inter-residue contacts in proteins dictate the topology of protein structures. They are crucial for protein folding and structural stability. Accurate prediction of residue contacts, especially long-range contacts, is important to the quality of ab initio structure modeling since they can enforce strong restraints on structure assembly.

RESULTS

In this paper, we present a new Residue-Residue Contact predictor called R2C that combines machine learning-based and correlated mutation analysis-based methods with a two-dimensional Gaussian noise filter to enhance long-range residue contact prediction. Our results show that the outputs of the machine learning-based method are concentrated and perform better on short-range contacts, while the predictions of the correlated mutation analysis-based approach are widespread and more accurate on long-range contacts. The query-driven dynamic fusion strategy proposed here takes full advantage of the two methods, resulting in an impressive overall accuracy improvement. We also show that the contact map taken directly from the prediction model contains Gaussian noise, which has not been reported before. Unlike recent studies that tried to further enhance the quality of the contact map by removing its transitive noise, we designed a new two-dimensional Gaussian noise filter, which was especially helpful for reinforcing long-range residue contact prediction. Tested on the recent CASP10/11 datasets, the overall top L/5 accuracy of our final R2C predictor is 17.6%/15.5% higher than that of the pure machine learning-based method and 7.8%/8.3% higher than that of the correlated mutation analysis-based approach for long-range residue contact prediction.

AVAILABILITY AND IMPLEMENTATION

http://www.csbio.sjtu.edu.cn/bioinf/R2C/ Contact: hbshen@sjtu.edu.cn

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
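
The "top L/5 accuracy" for long-range contacts quoted in the Results above can be illustrated with a minimal sketch. It is not the R2C authors' evaluation code: the function name, the list-of-lists score layout, and the default sequence separation of 24 residues for "long-range" (a common convention in contact prediction benchmarks) are assumptions made for illustration.

```python
def top_l_over_k_precision(scores, true_contacts, k=5, min_sep=24):
    """Precision of the top L/k scored residue pairs with j - i >= min_sep.

    scores: square list-of-lists, scores[i][j] = predicted contact score.
    true_contacts: set of (i, j) pairs with i < j that are real contacts.
    L is the sequence length, taken as len(scores).
    """
    length = len(scores)
    # Candidate pairs from the upper triangle, far enough apart in sequence.
    candidates = [
        (scores[i][j], i, j)
        for i in range(length)
        for j in range(i + min_sep, length)
    ]
    # Highest score first; evaluate at least one pair.
    candidates.sort(key=lambda t: t[0], reverse=True)
    top = candidates[: max(1, length // k)]
    if not top:
        return 0.0  # sequence too short for any pair at this separation
    hits = sum(1 for _, i, j in top if (i, j) in true_contacts)
    return hits / len(top)
```

For a toy 10-residue map with `min_sep=3`, the two best-scored pairs are evaluated (10 // 5 = 2), and the returned value is the fraction of those pairs found in the set of true contacts.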

226

Zhang H, Gao Y, Deng M, Wang C, Zhu J, Li SC, Zheng WM, Bu D. Improving residue-residue contact prediction via low-rank and sparse decomposition of residue correlation matrix. Biochem Biophys Res Commun 2016;472:217-22. [PMID: 26920058 DOI: 10.1016/j.bbrc.2016.01.188] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/14/2016] [Accepted: 01/30/2016] [Indexed: 10/22/2022]

Abstract

Strategies for correlation analysis in protein contact prediction often encounter two challenges, namely, the indirect coupling among residues, and the background correlations mainly caused by phylogenetic biases. While various studies have been conducted on how to disentangle indirect coupling, the removal of background correlations remains unresolved. Here, we present an approach for removing background correlations via low-rank and sparse decomposition (LRS) of a residue correlation matrix. The correlation matrix can be constructed using either local inference strategies (e.g., mutual information, or MI) or global inference strategies (e.g., direct coupling analysis, or DCA). In our approach, a correlation matrix was decomposed into two components, i.e., a low-rank component representing background correlations, and a sparse component representing true correlations. Finally, the residue contacts were inferred from the sparse component of the correlation matrix. We trained our LRS-based method on the PSICOV dataset, and tested it on both GREMLIN and CASP11 datasets. Our experimental results suggested that LRS significantly improves the contact prediction precision. For example, when equipped with the LRS technique, the prediction precision of MI and mfDCA increased from 0.25 to 0.67 and from 0.58 to 0.70, respectively (Top L/10 predicted contacts, sequence separation: 5 AA, dataset: GREMLIN). In addition, our LRS technique also consistently outperforms the popular denoising technique APC (average product correction), on both local (MI_LRS: 0.67 vs MI_APC: 0.34) and global measures (mfDCA_LRS: 0.70 vs mfDCA_APC: 0.67). Interestingly, we found that when equipped with our LRS technique, local inference strategies performed comparably to global inference strategies, implying that the LRS technique narrowed the performance gap between local and global inference strategies.
Overall, our LRS technique greatly facilitates protein contact prediction by removing background correlations. An implementation of the approach called COLORS (improving COntact prediction using LOw-Rank and Sparse matrix decomposition) is available from http://protein.ict.ac.cn/COLORS/.
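
The abstract above compares LRS against average product correction (APC) without stating the APC formula. As a minimal sketch of that baseline only (not of the LRS decomposition itself), the commonly used definition subtracts, for each pair (i, j), the product of the mean scores of positions i and j divided by the overall mean score. The function name and the list-of-lists layout are assumptions made for illustration.

```python
def apply_apc(matrix):
    """Return a copy of `matrix` with the average product correction subtracted.

    matrix: symmetric square list-of-lists of raw coupling scores (e.g. MI).
    Diagonal entries are ignored and set to 0.0 in the result.
    """
    n = len(matrix)
    if n < 2:
        return [row[:] for row in matrix]
    # Mean score of each position over all other positions.
    pos_mean = [
        sum(matrix[i][j] for j in range(n) if j != i) / (n - 1)
        for i in range(n)
    ]
    # Mean score over all off-diagonal pairs.
    overall = sum(pos_mean) / n
    corrected = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Background term; skipped when the overall mean is zero.
            background = pos_mean[i] * pos_mean[j] / overall if overall else 0.0
            corrected[i][j] = matrix[i][j] - background
    return corrected
```

A matrix whose off-diagonal scores are all equal carries only background signal under this definition, so every corrected entry comes out as zero.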

227

Hönigschmid P, Frishman D. Accurate prediction of helix interactions and residue contacts in membrane proteins. J Struct Biol 2016;194:112-23. [PMID: 26851352 DOI: 10.1016/j.jsb.2016.02.005] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/12/2015] [Revised: 02/01/2016] [Accepted: 02/02/2016] [Indexed: 11/16/2022]

228

Zhang H, Huang Q, Bei Z, Wei Y, Floudas CA. COMSAT: Residue contact prediction of transmembrane proteins based on support vector machines and mixed integer linear programming. Proteins 2016;84:332-48. [PMID: 26756402 DOI: 10.1002/prot.24979] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/25/2015] [Revised: 11/19/2015] [Accepted: 12/10/2015] [Indexed: 12/28/2022]

229

Sfriso P, Duran-Frigola M, Mosca R, Emperador A, Aloy P, Orozco M. Residues Coevolution Guides the Systematic Identification of Alternative Functional Conformations in Proteins. Structure 2016;24:116-126. [DOI: 10.1016/j.str.2015.10.025] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.9] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/24/2015] [Revised: 10/13/2015] [Accepted: 10/17/2015] [Indexed: 12/12/2022]

230

Adhikari B, Cheng J. Protein Residue Contacts and Prediction Methods. Methods Mol Biol 2016;1415:463-76. [PMID: 27115648 DOI: 10.1007/978-1-4939-3572-7_24] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/21/2022]

231

Mabrouk M, Werner T, Schneider M, Putz I, Brock O. Analysis of free modeling predictions by RBO aleph in CASP11. Proteins 2015;84 Suppl 1:87-104. [PMID: 26492194 DOI: 10.1002/prot.24950] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/01/2015] [Revised: 09/28/2015] [Accepted: 10/19/2015] [Indexed: 12/15/2022]

232

Kryshtafovych A, Moult J, Baslé A, Burgin A, Craig TK, Edwards RA, Fass D, Hartmann MD, Korycinski M, Lewis RJ, Lorimer D, Lupas AN, Newman J, Peat TS, Piepenbrink KH, Prahlad J, van Raaij MJ, Rohwer F, Segall AM, Seguritan V, Sundberg EJ, Singh AK, Wilson MA, Schwede T. Some of the most interesting CASP11 targets through the eyes of their authors. Proteins 2015;84 Suppl 1:34-50. [PMID: 26473983 PMCID: PMC4834066 DOI: 10.1002/prot.24942] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/23/2015] [Revised: 09/17/2015] [Accepted: 10/11/2015] [Indexed: 11/17/2022]

Affiliation(s)

Andriy Kryshtafovych Genome Center, University of California, Davis, California, 95616
John Moult Department of Cell Biology and Molecular Genetics, Institute for Bioscience and Biotechnology Research, University of Maryland, Rockville, Maryland, 20850
Arnaud Baslé Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, NE2 4HH, United Kingdom
Alex Burgin Broad Institute, Cambridge, Massachusetts, 02142
Timothy K Craig TimPharma, Santa Clarita, California, 91350
Robert A Edwards Department of Biology, San Diego State University, San Diego, California, 92182; Department of Computer Science, San Diego State University, San Diego, California, 92182
Deborah Fass Department of Structural Biology, Weizmann Institute of Science, Rehovot, 76100, Israel
Marcus D Hartmann Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
Mateusz Korycinski Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
Richard J Lewis Institute for Cell and Molecular Biosciences, University of Newcastle, Newcastle upon Tyne, NE2 4HH, United Kingdom
Donald Lorimer Beryllium, Bainbridge Island, Washington, 98110
Andrei N Lupas Department of Protein Evolution, Max Planck Institute for Developmental Biology, Tübingen, 72076, Germany
Janet Newman Biomedical Manufacturing Program, CSIRO, Parkville, VIC, Australia
Thomas S Peat Biomedical Manufacturing Program, CSIRO, Parkville, VIC, Australia
Kurt H Piepenbrink Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, 21201
Janani Prahlad Department of Biochemistry and Redox Biology Center, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588
Mark J van Raaij Centro Nacional de Biotecnología (CNB-CSIC), Madrid, E-28049, Spain
Forest Rohwer Department of Biology and Viral Information Institute, San Diego State University, San Diego, California, 92182
Anca M Segall Department of Biology and Viral Information Institute, San Diego State University, San Diego, California, 92182
Victor Seguritan Human Longevity Inc, La Jolla, California, 92121
Eric J Sundberg Institute of Human Virology, University of Maryland School of Medicine, Baltimore, Maryland, 21201; Department of Medicine, University of Maryland School of Medicine, Baltimore, Maryland, 21201; Department of Microbiology and Immunology, University of Maryland School of Medicine, Baltimore, Maryland, 21201
Abhimanyu K Singh School of Biosciences, University of Kent, Canterbury, Kent, United Kingdom
Mark A Wilson Department of Biochemistry and Redox Biology Center, University of Nebraska-Lincoln, Lincoln, Nebraska, 68588
Torsten Schwede Biozentrum, University of Basel, Basel, 4056, Switzerland; SIB Swiss Institute of Bioinformatics, Basel, 4056, Switzerland

233

De Leonardis E, Lutz B, Ratz S, Cocco S, Monasson R, Schug A, Weigt M. Direct-Coupling Analysis of nucleotide coevolution facilitates RNA secondary and tertiary structure prediction. Nucleic Acids Res 2015;43:10444-55. [PMID: 26420827 PMCID: PMC4666395 DOI: 10.1093/nar/gkv932] [Citation(s) in RCA: 44] [Impact Index Per Article: 4.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/02/2015] [Accepted: 09/07/2015] [Indexed: 12/16/2022] Open

234

Kosciolek T, Jones DT. Accurate contact predictions using covariation techniques and machine learning. Proteins 2015. [PMID: 26205532 PMCID: PMC5042084 DOI: 10.1002/prot.24863] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022]

235

Ma J, Wang S, Wang Z, Xu J. Protein contact prediction by integrating joint evolutionary coupling analysis and supervised learning. Bioinformatics 2015;31:3506-13. [PMID: 26275894 DOI: 10.1093/bioinformatics/btv472] [Citation(s) in RCA: 80] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2014] [Accepted: 08/08/2015] [Indexed: 02/07/2023] Open

236

Haliloglu T, Bahar I. Adaptability of protein structures to enable functional interactions and evolutionary implications. Curr Opin Struct Biol 2015;35:17-23. [PMID: 26254902 DOI: 10.1016/j.sbi.2015.07.007] [Citation(s) in RCA: 75] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 06/08/2015] [Revised: 07/15/2015] [Accepted: 07/20/2015] [Indexed: 12/21/2022]

237

Adhikari B, Bhattacharya D, Cao R, Cheng J. CONFOLD: Residue-residue contact-guided ab initio protein folding. Proteins 2015;83:1436-49. [PMID: 25974172 PMCID: PMC4509844 DOI: 10.1002/prot.24829] [Citation(s) in RCA: 101] [Impact Index Per Article: 10.1] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/29/2015] [Revised: 04/11/2015] [Accepted: 05/02/2015] [Indexed: 12/20/2022]

238

Espada R, Parra RG, Mora T, Walczak AM, Ferreiro DU. Capturing coevolutionary signals in repeat proteins. BMC Bioinformatics 2015;16:207. [PMID: 26134293 PMCID: PMC4489039 DOI: 10.1186/s12859-015-0648-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/17/2015] [Accepted: 06/16/2015] [Indexed: 11/10/2022] Open