Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Jones DT, Singh T, Kosciolek T, Tetchner S. MetaPSICOV: combining coevolution methods for accurate prediction of contacts and long range hydrogen bonding in proteins. Bioinformatics 2014;31:999-1006. [PMID: 25431331 PMCID: PMC4382908 DOI: 10.1093/bioinformatics/btu791] [Citation(s) in RCA: 237] [Impact Index Per Article: 21.5] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2014] [Accepted: 11/22/2014] [Indexed: 12/13/2022]

Cited by Other Article(s)

Hou M, Peng C, Zhou X, Zhang B, Zhang G. Multi contact-based folding method for de novo protein structure prediction. Brief Bioinform 2021;23:6445108. [PMID: 34849573 DOI: 10.1093/bib/bbab463] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/14/2021] [Revised: 09/21/2021] [Accepted: 10/10/2021] [Indexed: 11/12/2022] Open

Alshammari M, He J. Combining Cryo-EM Density Map and Residue Contact for Protein Secondary Structure Topologies. Molecules 2021;26:7049. [PMID: 34834140 PMCID: PMC8624718 DOI: 10.3390/molecules26227049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/16/2021] [Revised: 11/01/2021] [Accepted: 11/15/2021] [Indexed: 11/23/2022] Open

Wei H, Zhao Z, Luo R. Machine-Learned Molecular Surface and Its Application to Implicit Solvent Simulations. J Chem Theory Comput 2021;17:6214-6224. [PMID: 34516109 DOI: 10.1021/acs.jctc.1c00492] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Abstract

Implicit solvent models, such as Poisson-Boltzmann models, play important roles in computational studies of biomolecules. A vital step in almost all implicit solvent models is to determine the solvent-solute interface, and the solvent excluded surface (SES) is the most widely used interface definition in these models. However, classical algorithms used for computing SES are geometry-based, so that they are neither suitable for parallel implementations nor convenient for obtaining surface derivatives. To address the limitations, we explored a machine learning strategy to obtain a level set formulation for the SES. The training process was conducted in three steps, eventually leading to a model with over 95% agreement with the classical SES. Visualization of tested molecular surfaces shows that the machine-learned SES overlaps with the classical SES in almost all situations. Further analyses show that the machine-learned SES is incredibly stable in terms of rotational variation of tested molecules. Our timing analysis shows that the machine-learned SES is roughly 2.5 times as efficient as the classical SES routine implemented in Amber/PBSA on a tested central processing unit (CPU) platform. We expect further performance gain on massively parallel platforms such as graphics processing units (GPUs) given the ease in converting the machine-learned SES to a parallel procedure. We also implemented the machine-learned SES into the Amber/PBSA program to study its performance on reaction field energy calculation. The analysis shows that the two sets of reaction field energies are highly consistent with a 1% deviation on average. Given its level set formulation, we expect the machine-learned SES to be applied in molecular simulations that require either surface derivatives or high efficiency on parallel computing platforms.

Basu S, Bahadur RP. Conservation and coevolution determine evolvability of different classes of disordered residues in human intrinsically disordered proteins. Proteins 2021;90:632-644. [PMID: 34626492 DOI: 10.1002/prot.26261] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/06/2021] [Revised: 10/07/2021] [Accepted: 10/07/2021] [Indexed: 12/19/2022]

Laine E, Eismann S, Elofsson A, Grudinin S. Protein sequence-to-structure learning: Is this the end(-to-end revolution)? Proteins 2021;89:1770-1786. [PMID: 34519095 DOI: 10.1002/prot.26235] [Citation(s) in RCA: 23] [Impact Index Per Article: 5.8] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Revised: 08/16/2021] [Accepted: 09/03/2021] [Indexed: 01/08/2023]

Geethu S, Vimina ER. Improved 3-D Protein Structure Predictions using Deep ResNet Model. Protein J 2021;40:669-681. [PMID: 34510309 DOI: 10.1007/s10930-021-10016-7] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 08/09/2021] [Indexed: 10/20/2022]

Adiyaman R, McGuffin LJ. ReFOLD3: refinement of 3D protein models with gradual restraints based on predicted local quality and residue contacts. Nucleic Acids Res 2021;49:W589-W596. [PMID: 34009387 PMCID: PMC8218204 DOI: 10.1093/nar/gkab300] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/03/2021] [Revised: 03/23/2021] [Accepted: 04/16/2021] [Indexed: 12/16/2022] Open

McGuffin LJ, Aldowsari FMF, Alharbi SMA, Adiyaman R. ModFOLD8: accurate global and local quality estimates for 3D protein models. Nucleic Acids Res 2021;49:W425-W430. [PMID: 33963867 PMCID: PMC8218196 DOI: 10.1093/nar/gkab321] [Citation(s) in RCA: 50] [Impact Index Per Article: 12.5] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2021] [Revised: 04/01/2021] [Accepted: 04/21/2021] [Indexed: 11/26/2022] Open

Mulnaes D, Golchin P, Koenig F, Gohlke H. TopDomain: Exhaustive Protein Domain Boundary Metaprediction Combining Multisource Information and Deep Learning. J Chem Theory Comput 2021;17:4599-4613. [PMID: 34161735 DOI: 10.1021/acs.jctc.1c00129] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/14/2022]

Pearce R, Zhang Y. Toward the solution of the protein structure prediction problem. J Biol Chem 2021;297:100870. [PMID: 34119522 PMCID: PMC8254035 DOI: 10.1016/j.jbc.2021.100870] [Citation(s) in RCA: 63] [Impact Index Per Article: 15.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/20/2021] [Revised: 06/07/2021] [Accepted: 06/09/2021] [Indexed: 11/20/2022] Open

Reza MS, Zhang H, Hossain MT, Jin L, Feng S, Wei Y. COMTOP: Protein Residue-Residue Contact Prediction through Mixed Integer Linear Optimization. MEMBRANES 2021;11:membranes11070503. [PMID: 34209399 PMCID: PMC8305966 DOI: 10.3390/membranes11070503] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 04/25/2021] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]

Abstract

Protein contact prediction helps reconstruct the tertiary structure that greatly determines a protein’s function; therefore, contact prediction from the sequence is an important problem. Recently there has been exciting progress on this problem, but many of the existing methods still have low prediction accuracy. In this paper, we present a new mixed integer linear programming (MILP)-based consensus method: a Consensus scheme based On a Mixed integer linear opTimization method for prOtein contact Prediction (COMTOP). The MILP-based consensus method combines the strengths of seven selected protein contact prediction methods, including CCMpred, EVfold, DeepCov, NNcon, PconsC4, plmDCA, and PSICOV, by optimizing the number of correctly predicted contacts and achieving a better prediction accuracy. The proposed hybrid protein residue–residue contact prediction scheme was tested in four independent test sets. For 239 highly non-redundant proteins, the method showed a prediction accuracy of 59.68%, 70.79%, 78.86%, 89.04%, 94.51%, and 97.35% for top-5L, top-3L, top-2L, top-L, top-L/2, and top-L/5 contacts, respectively. When tested on the CASP13 and CASP14 test sets, the proposed method obtained accuracies of 75.91% and 77.49% for top-L/5 predictions, respectively. COMTOP was further tested on 57 non-redundant α-helical transmembrane proteins and achieved prediction accuracies of 64.34% and 73.91% for top-L/2 and top-L/5 predictions, respectively. For all test datasets, the improvement of COMTOP in accuracy over the seven individual methods increased with the increasing number of predicted contacts. For example, COMTOP performed much better for large numbers of contact predictions (such as top-5L and top-3L) than for small numbers of contact predictions (such as top-L/2 and top-L/5). The results and analysis demonstrate that COMTOP can significantly improve the performance of the individual methods; therefore, COMTOP is more robust against different types of test sets. COMTOP also showed better/comparable predictions when compared with the state-of-the-art predictors.
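The top-5L to top-L/5 figures quoted in this abstract follow a common convention in contact prediction: rank the predicted residue pairs by score, keep the best m × L of them (L being the sequence length), and report the fraction that are true contacts. The sketch below illustrates that convention only and is not taken from COMTOP. The function name `top_k_precision`, the `multiplier` parameter, and the toy scores are all assumptions made for illustration.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[int, int]  # (i, j) residue indices with i < j


def top_k_precision(
    scores: Dict[Pair, float],
    true_contacts: Set[Pair],
    seq_len: int,
    multiplier: float = 0.2,
) -> float:
    """Fraction of the top (multiplier * L) scored pairs that are true contacts.

    multiplier=0.2 corresponds to top-L/5, 0.5 to top-L/2, 5.0 to top-5L.
    Hypothetical helper: evaluation details differ between papers.
    """
    n_top = max(1, int(seq_len * multiplier))
    # Sort pairs from the highest to the lowest predicted score.
    ranked = sorted(scores, key=scores.get, reverse=True)[:n_top]
    if not ranked:
        return 0.0
    hits = sum(1 for pair in ranked if pair in true_contacts)
    # Assumption: divide by the number of pairs actually evaluated,
    # which can be smaller than n_top when few pairs were scored.
    return hits / len(ranked)


# Toy data (made up): a 10-residue protein with four scored pairs.
toy_scores = {(0, 5): 0.9, (1, 7): 0.8, (2, 9): 0.3, (3, 8): 0.1}
toy_truth = {(0, 5), (2, 9)}
top_l5 = top_k_precision(toy_scores, toy_truth, seq_len=10, multiplier=0.2)
```

With these toy inputs, top-L/5 keeps the two best-scoring pairs, one of which is a true contact.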

Affiliation(s)

Md. Selim Reza School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Huiling Zhang School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Md. Tofazzal Hossain School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Langxi Jin Department of Computer Science and Technology, School of Computer Science and Technology, Harbin University of Science and Technology, 52 Xuefu Road, Nangang District, Harbin 150080, China;
Shengzhong Feng Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China;
Yanjie Wei School of Computer Science and Technology, University of Chinese Academy of Sciences, Beijing 100049, China; (M.S.R.); (H.Z.); (M.T.H.) Centre for High Performance Computing, Joint Engineering Research Center for Health Big Data Intelligent Analysis Technology, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen 518055, China; Correspondence:

Wu T, Liu J, Guo Z, Hou J, Cheng J. MULTICOM2 open-source protein structure prediction system powered by deep learning and distance prediction. Sci Rep 2021;11:13155. [PMID: 34162922 PMCID: PMC8222248 DOI: 10.1038/s41598-021-92395-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/18/2021] [Accepted: 06/09/2021] [Indexed: 11/09/2022] Open

Abstract

Protein structure prediction is an important problem in bioinformatics and has been studied for decades. However, there are still few open-source comprehensive protein structure prediction packages publicly available in the field. In this paper, we present our latest open-source protein tertiary structure prediction system, MULTICOM2, an integration of template-based modeling (TBM) and template-free modeling (FM) methods. The template-based modeling uses sequence alignment tools with deep multiple sequence alignments to search for structural templates, which are much faster and more accurate than MULTICOM1. The template-free (ab initio or de novo) modeling uses the inter-residue distances predicted by DeepDist to reconstruct tertiary structure models without using any known structure as template. In the blind CASP14 experiment, the average TM-score of the models predicted by our server predictor based on the MULTICOM2 system is 0.720 for 58 TBM (regular) domains and 0.514 for 38 FM and FM/TBM (hard) domains, indicating that MULTICOM2 is capable of predicting good tertiary structures across the board. It can predict the correct fold for 76 CASP14 domains (95% regular domains and 55% hard domains) if only one prediction is made for a domain. The success rate is increased to 97% for regular domains and 71% for hard domains if five predictions are made per domain. Moreover, the prediction accuracy of the pure template-free structure modeling method on both TBM and FM targets is very close to the combination of template-based and template-free modeling methods. This demonstrates that the distance-based template-free modeling method powered by deep learning can largely replace the traditional template-based modeling method even on TBM targets that TBM methods used to dominate and therefore provides a uniform structure modeling approach to any protein. Finally, on the 38 CASP14 FM and FM/TBM hard domains, MULTICOM2 server predictors (MULTICOM-HYBRID, MULTICOM-DEEP, MULTICOM-DIST) were ranked among the top 20 automated server predictors in the CASP14 experiment. After combining multiple predictors from the same research group as one entry, MULTICOM-HYBRID was ranked no. 5. The source code of MULTICOM2 is freely available at https://github.com/multicom-toolbox/multicom/tree/multicom_v2.0.

Di Lena P, Baldi P. Fold recognition by scoring protein maps using the congruence coefficient. Bioinformatics 2021;37:506-513. [PMID: 32976564 DOI: 10.1093/bioinformatics/btaa833] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2020] [Revised: 09/07/2020] [Accepted: 09/10/2020] [Indexed: 11/14/2022] Open

Suh D, Lee JW, Choi S, Lee Y. Recent Applications of Deep Learning Methods on Evolution- and Contact-Based Protein Structure Prediction. Int J Mol Sci 2021;22:6032. [PMID: 34199677 PMCID: PMC8199773 DOI: 10.3390/ijms22116032] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2021] [Revised: 05/29/2021] [Accepted: 05/29/2021] [Indexed: 01/23/2023] Open

Protein Structure Prediction: Conventional and Deep Learning Perspectives. Protein J 2021;40:522-544. [PMID: 34050498 DOI: 10.1007/s10930-021-10003-y] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Accepted: 05/21/2021] [Indexed: 10/21/2022]

Pakhrin SC, Shrestha B, Adhikari B, KC DB. Deep Learning-Based Advances in Protein Structure Prediction. Int J Mol Sci 2021;22:5553. [PMID: 34074028 PMCID: PMC8197379 DOI: 10.3390/ijms22115553] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/02/2021] [Revised: 05/12/2021] [Accepted: 05/18/2021] [Indexed: 12/29/2022] Open

Zhang H, Bei Z, Xi W, Hao M, Ju Z, Saravanan KM, Zhang H, Guo N, Wei Y. Evaluation of residue-residue contact prediction methods: From retrospective to prospective. PLoS Comput Biol 2021;17:e1009027. [PMID: 34029314 PMCID: PMC8177648 DOI: 10.1371/journal.pcbi.1009027] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/30/2020] [Revised: 06/04/2021] [Accepted: 04/28/2021] [Indexed: 12/31/2022] Open

Abstract

Sequence-based residue contact prediction plays a crucial role in protein structure reconstruction. In recent years, the combination of evolutionary coupling analysis (ECA) and deep learning (DL) techniques has made tremendous progress for residue contact prediction, thus a comprehensive assessment of current methods based on a large-scale benchmark data set is much needed. In this study, we evaluate 18 contact predictors on 610 non-redundant proteins and 32 CASP13 targets according to a wide range of perspectives. The results show that different methods have different application scenarios: (1) DL methods based on multi-categories of inputs and large training sets are the best choices for low-contact-density proteins such as the intrinsically disordered ones and proteins with shallow multi-sequence alignments (MSAs). (2) With at least 5L (L is sequence length) effective sequences in the MSA, all the methods show the best performance, and methods that rely only on MSA as input can reach comparable achievements as methods that adopt multi-source inputs. (3) For top L/5 and L/2 predictions, DL methods can predict more hydrophobic interactions while ECA methods predict more salt bridges and disulfide bonds. (4) ECA methods can detect more secondary structure interactions, while DL methods can accurately excavate more contact patterns and prune isolated false positives. In general, multi-input DL methods with large training sets dominate current approaches with the best overall performance. Despite the great success of current DL methods, there is still much room left for further improvement: (1) With shallow MSAs, the performance will be greatly affected. (2) Current methods show lower precisions for inter-domain compared with intra-domain contact predictions, as well as very high imbalances in precisions between intra-domains. (3) Strong prediction similarities between DL methods indicate that more feature types and diversified models need to be developed. (4) The runtime of most methods can be further optimized.
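The "5L effective sequences" criterion in this abstract refers to the depth of the multiple sequence alignment after down-weighting near-duplicate sequences. The abstract does not say how the effective count is computed. The sketch below uses one widely used weighting scheme, in which each sequence is weighted by the inverse of the number of sequences at least 80% identical to it. That scheme, the 80% threshold, and the names `effective_sequence_count` and `has_sufficient_depth` are assumptions, not the evaluated paper's definition.

```python
from typing import List


def effective_sequence_count(msa: List[str], identity_threshold: float = 0.8) -> float:
    """Sum of per-sequence weights 1 / (number of sequences >= threshold identical).

    Assumes every row of the alignment has the same, non-zero length.
    Quadratic in the number of sequences, so suitable for small examples only.
    """
    if not msa or any(len(seq) != len(msa[0]) or not seq for seq in msa):
        raise ValueError("MSA must contain equal-length, non-empty sequences")
    total = 0.0
    for seq_i in msa:
        neighbours = 0
        for seq_j in msa:
            matches = sum(a == b for a, b in zip(seq_i, seq_j))
            if matches / len(seq_i) >= identity_threshold:
                neighbours += 1  # includes seq_i itself, so neighbours >= 1
        total += 1.0 / neighbours
    return total


def has_sufficient_depth(neff: float, seq_len: int, factor: float = 5.0) -> bool:
    """Check the 'at least 5L effective sequences' criterion from the abstract."""
    return neff >= factor * seq_len


# Toy alignment (made up): two identical rows and one unrelated row.
toy_msa = ["ACDEF", "ACDEF", "GHIKL"]
toy_neff = effective_sequence_count(toy_msa)
```

In the toy alignment the two identical rows share a single unit of weight, so three sequences count as two effective sequences, well below the 5L threshold for a five-residue alignment.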

The amino acid sequence of a protein ultimately determines its tertiary structure, and the tertiary structure determines its function(s) and plays a key role in understanding biological processes and disease pathogenesis. Protein tertiary structure can be determined using experimental techniques such as cryo-electron microscopy, nuclear magnetic resonance and X-ray crystallography, which are very expensive and time-consuming. As an alternative, researchers are trying to use in silico methods to predict the 3D structures. Residue contact-assisted protein folding paves an avenue for sequence-based protein structure prediction and therefore has become one of the most challenging and promising problems in structural bioinformatics. Over the past years, contact prediction has undergone continuous evolution in techniques. Through a retrospective analysis of traditional machine learning, evolutionary coupling analysis, and consensus machine learning methods and a multi-perspective study on recently developed deep learning methods, we explore the most advanced contact predictors, pursue application scenarios for different methods, and seek prospective directions for further improvement. We anticipate that our study will serve as a practical and useful guide for the development of future approaches to contact prediction.

Zhang T, Singh J, Litfin T, Zhan J, Paliwal K, Zhou Y. RNAcmap: A Fully Automatic Pipeline for Predicting Contact Maps of RNAs by Evolutionary Coupling Analysis. Bioinformatics 2021;37:3494-3500. [PMID: 34021744 DOI: 10.1093/bioinformatics/btab391] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/17/2020] [Revised: 03/27/2021] [Accepted: 05/18/2021] [Indexed: 11/13/2022] Open

Abstract

MOTIVATION

The accuracy of RNA secondary and tertiary structure prediction can be significantly improved by using structural restraints derived from evolutionary coupling or direct coupling analysis. Currently, these coupling analyses rely on manually curated multiple sequence alignments collected in the Rfam database, which contains 3016 families. By comparison, millions of non-coding RNA sequences are known. Here, we established RNAcmap, a fully automatic pipeline that enables evolutionary coupling analysis for any RNA sequence. The homology search was based on the covariance model built by INFERNAL according to two secondary structure predictors: a folding-based algorithm RNAfold and the latest deep-learning method SPOT-RNA.

RESULTS

We showed that the performance of RNAcmap is less dependent on the specific evolutionary coupling tool but is more dependent on the accuracy of secondary structure predictor with the best performance given by RNAcmap (SPOT-RNA). The performance of RNAcmap (SPOT-RNA) is comparable to that based on Rfam-supplied alignment and consistent for those sequences that are not in Rfam collections. Further improvement can be made with a simple meta predictor RNAcmap (SPOT-RNA/RNAfold) depending on which secondary structure predictor can find more homologous sequences. Reliable base-pairing information generated from RNAcmap, for RNAs with high effective homologous sequences, in particular, will be useful for aiding RNA structure prediction.

AVAILABILITY

RNAcmap is available as a web server at https://sparks-lab.org/server/rnacmap/ and as a standalone application along with the datasets at https://github.com/sparks-lab-org/RNAcmap_standalone. A platform independent and fully configured docker image of RNAcmap is also provided at https://hub.docker.com/r/jaswindersingh2/rnacmap.

Xu J, Mcpartlon M, Li J. Improved protein structure prediction by deep learning irrespective of co-evolution information. NAT MACH INTELL 2021;3:601-609. [PMID: 34368623 PMCID: PMC8340610 DOI: 10.1038/s42256-021-00348-5] [Citation(s) in RCA: 124] [Impact Index Per Article: 31.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/05/2023]

Schmidt M, Hamacher K. Identification of biophysical interaction patterns in direct coupling analysis. Phys Rev E 2021;103:042418. [PMID: 34005861 DOI: 10.1103/physreve.103.042418] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/13/2020] [Accepted: 03/27/2021] [Indexed: 11/07/2022]

Machine learning in protein structure prediction. Curr Opin Chem Biol 2021;65:1-8. [PMID: 34015749 DOI: 10.1016/j.cbpa.2021.04.005] [Citation(s) in RCA: 121] [Impact Index Per Article: 30.3] [Reference Citation Analysis] [Abstract] [Key Words] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/31/2021] [Accepted: 04/10/2021] [Indexed: 12/31/2022]

Bhattacharya S, Roche R, Shuvo MH, Bhattacharya D. Recent Advances in Protein Homology Detection Propelled by Inter-Residue Interaction Map Threading. Front Mol Biosci 2021;8:643752. [PMID: 34046429 PMCID: PMC8148041 DOI: 10.3389/fmolb.2021.643752] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/18/2020] [Accepted: 04/21/2021] [Indexed: 11/13/2022] Open

Li J, Xu J. Study of Real-Valued Distance Prediction for Protein Structure Prediction with Deep Learning. Bioinformatics 2021;37:3197-3203. [PMID: 33961022 PMCID: PMC8504618 DOI: 10.1093/bioinformatics/btab333] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/16/2021] [Revised: 03/07/2021] [Accepted: 04/28/2021] [Indexed: 11/14/2022] Open

A Peptides Prediction Methodology for Tertiary Structure Based on Simulated Annealing. MATHEMATICAL AND COMPUTATIONAL APPLICATIONS 2021. [DOI: 10.3390/mca26020039] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Abstract] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 11/16/2022]

Toth JM, DePietro PJ, Haas J, McLaughlin WA. ResiRole: residue-level functional site predictions to gauge the accuracies of protein structure prediction techniques. Bioinformatics 2021;37:351-359. [PMID: 32780798 PMCID: PMC8058773 DOI: 10.1093/bioinformatics/btaa712] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 12/12/2019] [Revised: 07/31/2020] [Accepted: 08/05/2020] [Indexed: 11/25/2022] Open

Villegas-Morcillo A, Makrodimitris S, van Ham RCHJ, Gomez AM, Sanchez V, Reinders MJT. Unsupervised protein embeddings outperform hand-crafted sequence and structure features at predicting molecular function. Bioinformatics 2021;37:162-170. [PMID: 32797179 PMCID: PMC8055213 DOI: 10.1093/bioinformatics/btaa701] [Citation(s) in RCA: 57] [Impact Index Per Article: 14.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/08/2020] [Revised: 07/10/2020] [Accepted: 08/12/2020] [Indexed: 12/19/2022] Open

Flower TG, Hurley JH. Crystallographic molecular replacement using an in silico-generated search model of SARS-CoV-2 ORF8. Protein Sci 2021;30:728-734. [PMID: 33625752 PMCID: PMC7980513 DOI: 10.1002/pro.4050] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/12/2021] [Revised: 02/21/2021] [Accepted: 02/22/2021] [Indexed: 12/01/2022]

Maddhuri Venkata Subramaniya SR, Terashi G, Jain A, Kagaya Y, Kihara D. Protein Contact Map Refinement for Improving Structure Prediction Using Generative Adversarial Networks. Bioinformatics 2021;37:3168-3174. [PMID: 33787852 PMCID: PMC8504630 DOI: 10.1093/bioinformatics/btab220] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/04/2020] [Revised: 02/28/2021] [Accepted: 03/30/2021] [Indexed: 11/13/2022] Open

Deducing high-accuracy protein contact-maps from a triplet of coevolutionary matrices through deep residual convolutional networks. PLoS Comput Biol 2021;17:e1008865. [PMID: 33770072 PMCID: PMC8026059 DOI: 10.1371/journal.pcbi.1008865] [Citation(s) in RCA: 55] [Impact Index Per Article: 13.8] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/21/2020] [Revised: 04/07/2021] [Accepted: 03/10/2021] [Indexed: 12/24/2022] Open

Abstract

The topology of protein folds can be specified by the inter-residue contact-maps and accurate contact-map prediction can help ab initio structure folding. We developed TripletRes to deduce protein contact-maps from discretized distance profiles by end-to-end training of deep residual neural-networks. Compared to previous approaches, the major advantage of TripletRes is in its ability to learn and directly fuse a triplet of coevolutionary matrices extracted from the whole-genome and metagenome databases and therefore minimize the information loss during the course of contact model training. TripletRes was tested on a large set of 245 non-homologous proteins from CASP 11&12 and CAMEO experiments and outperformed other top methods from CASP12 by at least 58.4% for the CASP 11&12 targets and 44.4% for the CAMEO targets in the top-L long-range contact precision. On the 31 FM targets from the latest CASP13 challenge, TripletRes achieved the highest precision (71.6%) for the top-L/5 long-range contact predictions. It was also shown that a simple re-training of the TripletRes model with more proteins can lead to further improvement with precisions comparable to state-of-the-art methods developed after CASP13. These results demonstrate a novel efficient approach to extend the power of deep convolutional networks for high-accuracy medium- and long-range protein contact-map predictions starting from primary sequences, which are critical for constructing 3D structure of proteins that lack homologous templates in the PDB library.

Ab initio protein folding has been a major unsolved problem in computational biology for more than half a century. Recent community-wide Critical Assessment of Structure Prediction (CASP) experiments have witnessed exciting progress on ab initio structure prediction, which was mainly powered by the boosting of contact-map prediction as the latter can be used as constraints to guide ab initio folding simulations. In this work, we proposed a new open-source deep-learning architecture, TripletRes, built on the residual convolutional neural networks for high-accuracy contact prediction. The large-scale benchmark and blind test results demonstrate competitive performance of the proposed methods to other top approaches in predicting medium- and long-range contact-maps that are critical for guiding protein folding simulations. Detailed data analyses showed that the major advantage of TripletRes lies in the unique protocol to fuse multiple evolutionary feature matrices which are directly extracted from whole-genome and metagenome databases and therefore minimize the information loss during the contact model training.

Flower TG, Hurley JH. Crystallographic molecular replacement using an in silico-generated search model of SARS-CoV-2 ORF8. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2021:2021.01.05.425441. [PMID: 33442695 PMCID: PMC7805452 DOI: 10.1101/2021.01.05.425441] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 11/24/2022]

Seffernick JT, Lindert S. Hybrid methods for combined experimental and computational determination of protein structure. J Chem Phys 2020;153:240901. [PMID: 33380110 PMCID: PMC7773420 DOI: 10.1063/5.0026025] [Citation(s) in RCA: 33] [Impact Index Per Article: 6.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/20/2020] [Accepted: 11/10/2020] [Indexed: 02/04/2023] Open

Gao W, Mahajan SP, Sulam J, Gray JJ. Deep Learning in Protein Structural Modeling and Design. PATTERNS (NEW YORK, N.Y.) 2020;1:100142. [PMID: 33336200 PMCID: PMC7733882 DOI: 10.1016/j.patter.2020.100142] [Citation(s) in RCA: 100] [Impact Index Per Article: 20.0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Indexed: 12/13/2022]

Hameduh T, Haddad Y, Adam V, Heger Z. Homology modeling in the time of collective and artificial intelligence. Comput Struct Biotechnol J 2020;18:3494-3506. [PMID: 33304450 PMCID: PMC7695898 DOI: 10.1016/j.csbj.2020.11.007] [Citation(s) in RCA: 62] [Impact Index Per Article: 12.4] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/14/2020] [Revised: 11/04/2020] [Accepted: 11/04/2020] [Indexed: 12/12/2022] Open

Chasing coevolutionary signals in intrinsically disordered proteins complexes. Sci Rep 2020;10:17962. [PMID: 33087759 PMCID: PMC7578644 DOI: 10.1038/s41598-020-74791-6] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/14/2020] [Accepted: 08/27/2020] [Indexed: 11/30/2022] Open

Adhikari B. DEEPCON: protein contact prediction using dilated convolutional neural networks with dropout. Bioinformatics 2020;36:470-477. [PMID: 31359036 DOI: 10.1093/bioinformatics/btz593] [Citation(s) in RCA: 20] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/23/2019] [Revised: 07/21/2019] [Accepted: 07/24/2019] [Indexed: 12/24/2022] Open

Du Z, Pan S, Wu Q, Peng Z, Yang J. CATHER: a novel threading algorithm with predicted contacts. Bioinformatics 2020;36:2119-2125. [PMID: 31790141 DOI: 10.1093/bioinformatics/btz876] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/30/2019] [Revised: 10/31/2019] [Accepted: 11/28/2019] [Indexed: 11/14/2022] Open

Alshammari M, He J. Combine Cryo-EM Density Map and Residue Contact for Protein Structure Prediction - A Case Study. ACM-BCB ... ... : THE ... ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE. ACM CONFERENCE ON BIOINFORMATICS, COMPUTATIONAL BIOLOGY AND BIOMEDICINE 2020;2020:110. [PMID: 35838376 PMCID: PMC9279007 DOI: 10.1145/3388440.3414708] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/15/2023]

Adhikari B. A fully open-source framework for deep learning protein real-valued distances. Sci Rep 2020;10:13374. [PMID: 32770096 PMCID: PMC7414848 DOI: 10.1038/s41598-020-70181-0] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Reference Citation Analysis] [Abstract] [Key Words] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/01/2020] [Accepted: 07/23/2020] [Indexed: 11/12/2022] Open

Sun J, Frishman D. DeepHelicon: Accurate prediction of inter-helical residue contacts in transmembrane proteins by residual neural networks. J Struct Biol 2020;212:107574. [PMID: 32663598 DOI: 10.1016/j.jsb.2020.107574] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/10/2020] [Revised: 07/03/2020] [Accepted: 07/07/2020] [Indexed: 01/16/2023]

Fernández A. Artificial Intelligence Teaches Drugs to Target Proteins by Tackling the Induced Folding Problem. Mol Pharm 2020;17:2761-2767. [PMID: 32551659 DOI: 10.1021/acs.molpharmaceut.0c00470] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Li Y, Hu J, Zhang C, Yu DJ, Zhang Y. ResPRE: high-accuracy protein contact prediction by coupling precision matrix with deep residual neural networks. Bioinformatics 2020;35:4647-4655. [PMID: 31070716 DOI: 10.1093/bioinformatics/btz291] [Citation(s) in RCA: 109] [Impact Index Per Article: 21.8] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2018] [Revised: 03/18/2019] [Accepted: 04/17/2019] [Indexed: 12/20/2022] Open

Hadarovich AY, Kalinouski AA, Tuzikov AV. Protein homodimers structure prediction based on deep neural network. INFORMATICS 2020. [DOI: 10.37661/1816-0301-2020-17-2-44-53] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/24/2022] Open

Hong SH, Joo K, Lee J. ConDo: protein domain boundary prediction using coevolutionary information. Bioinformatics 2020;35:2411-2417. [PMID: 30500873 DOI: 10.1093/bioinformatics/bty973] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/16/2018] [Revised: 11/15/2018] [Accepted: 11/29/2018] [Indexed: 11/13/2022] Open

Gress A, Kalinina OV. SphereCon-a method for precise estimation of residue relative solvent accessible area from limited structural information. Bioinformatics 2020;36:3372-3378. [PMID: 32154837 DOI: 10.1093/bioinformatics/btaa159] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/15/2019] [Revised: 02/28/2020] [Accepted: 03/04/2020] [Indexed: 11/13/2022] Open

Jiang M, Li Z, Zhang S, Wang S, Wang X, Yuan Q, Wei Z. Drug-target affinity prediction using graph neural network and contact maps. RSC Adv 2020;10:20701-20712. [PMID: 35517730 PMCID: PMC9054320 DOI: 10.1039/d0ra02297g] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Reference Citation Analysis] [Abstract] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/11/2020] [Accepted: 05/07/2020] [Indexed: 02/01/2023] Open

Hopf TA, Green AG, Schubert B, Mersmann S, Schärfe CPI, Ingraham JB, Toth-Petroczy A, Brock K, Riesselman AJ, Palmedo P, Kang C, Sheridan R, Draizen EJ, Dallago C, Sander C, Marks DS. The EVcouplings Python framework for coevolutionary sequence analysis. Bioinformatics 2020;35:1582-1584. [PMID: 30304492 PMCID: PMC6499242 DOI: 10.1093/bioinformatics/bty862] [Citation(s) in RCA: 166] [Impact Index Per Article: 33.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/22/2018] [Revised: 09/06/2018] [Accepted: 10/08/2018] [Indexed: 01/03/2023] Open

Affiliation(s)

Thomas A Hopf: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Cell Biology, Harvard Medical School, Boston, MA, USA
Anna G Green: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Benjamin Schubert: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
Sophia Mersmann: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Charlotta P I Schärfe: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Center for Bioinformatics, University of Tübingen, Tübingen, Germany; Applied Bioinformatics, Department of Computer Science, Tübingen, Germany
John B Ingraham: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Agnes Toth-Petroczy: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Kelly Brock: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Adam J Riesselman: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Perry Palmedo: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
Chan Kang: Department of Systems Biology, Harvard Medical School, Boston, MA, USA
Robert Sheridan: Computational Biology Center, Memorial Sloan Kettering Cancer Center, New York, NY, USA
Eli J Draizen: Department of Biomedical Engineering, University of Virginia, Charlottesville, VA, USA
Christian Dallago: Department of Systems Biology, Harvard Medical School, Boston, MA, USA; Department of Cell Biology, Harvard Medical School, Boston, MA, USA; Department of Informatics, Technische Universität München, Garching, Germany
Chris Sander: Department of Cell Biology, Harvard Medical School, Boston, MA, USA; cBio Center, Department of Biostatistics and Computational Biology, Dana-Farber Cancer Institute, Boston, MA, USA
Debora S Marks: Department of Systems Biology, Harvard Medical School, Boston, MA, USA

Buchan DWA, Jones DT. The PSIPRED Protein Analysis Workbench: 20 years on. Nucleic Acids Res 2020;47:W402-W407. [PMID: 31251384 PMCID: PMC6602445 DOI: 10.1093/nar/gkz297] [Citation(s) in RCA: 923] [Impact Index Per Article: 184.6] [Reference Citation Analysis] [Abstract] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 02/01/2019] [Revised: 04/02/2019] [Accepted: 04/15/2019] [Indexed: 02/07/2023] Open

Getting to Know Your Neighbor: Protein Structure Prediction Comes of Age with Contextual Machine Learning. J Comput Biol 2020;27:796-814. [DOI: 10.1089/cmb.2019.0193] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/12/2022] Open

Feng J, Shukla D. FingerprintContacts: Predicting Alternative Conformations of Proteins from Coevolution. J Phys Chem B 2020;124:3605-3615. [PMID: 32283936 DOI: 10.1021/acs.jpcb.9b11869] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

Protein Contact Map Prediction Based on ResNet and DenseNet. BIOMED RESEARCH INTERNATIONAL 2020;2020:7584968. [PMID: 32337273 PMCID: PMC7165324 DOI: 10.1155/2020/7584968] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 02/04/2020] [Accepted: 03/05/2020] [Indexed: 11/18/2022]