Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Gao X, Bu D, Xu J, Li M. Improving consensus contact prediction via server correlation reduction. BMC Struct Biol 2009;9:28. [PMID: 19419562 PMCID: PMC2689239 DOI: 10.1186/1472-6807-9-28] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/25/2008] [Accepted: 05/06/2009] [Indexed: 11/10/2022]

For:	Gao X, Bu D, Xu J, Li M. Improving consensus contact prediction via server correlation reduction. BMC Struct Biol 2009;9:28. [PMID: 19419562 PMCID: PMC2689239 DOI: 10.1186/1472-6807-9-28] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.4] [Reference Citation Analysis] [What about the content of this article? (0)] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 11/25/2008] [Accepted: 05/06/2009] [Indexed: 11/10/2022]

Number

Cited by Other Article(s)

Zohra Smaili F, Tian S, Roy A, Alazmi M, Arold ST, Mukherjee S, Scott Hefty P, Chen W, Gao X. QAUST: Protein Function Prediction Using Structure Similarity, Protein Interaction, and Functional Motifs. GENOMICS PROTEOMICS & BIOINFORMATICS 2021;19:998-1011. [PMID: 33631427 PMCID: PMC9403031 DOI: 10.1016/j.gpb.2021.02.001] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 07/11/2018] [Revised: 04/03/2019] [Accepted: 05/17/2019] [Indexed: 11/25/2022]

Holland J, Pan Q, Grigoryan G. Contact prediction is hardest for the most informative contacts, but improves with the incorporation of contact potentials. PLoS One 2018;13:e0199585. [PMID: 29953468 PMCID: PMC6023208 DOI: 10.1371/journal.pone.0199585] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/21/2017] [Accepted: 06/11/2018] [Indexed: 11/18/2022] Open

He B, Mortuza SM, Wang Y, Shen HB, Zhang Y. NeBcon: protein contact map prediction using neural network training coupled with naïve Bayes classifiers. Bioinformatics 2018;33:2296-2306. [PMID: 28369334 DOI: 10.1093/bioinformatics/btx164] [Citation(s) in RCA: 53] [Impact Index Per Article: 7.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/14/2016] [Accepted: 03/21/2017] [Indexed: 12/12/2022] Open

Abstract

Motivation

Recent CASP experiments have witnessed exciting progress on folding large-size non-humongous proteins with the assistance of co-evolution based contact predictions. The success is however anecdotal due to the requirement of the contact prediction methods for the high volume of sequence homologs that are not available to most of the non-humongous protein targets. Development of efficient methods that can generate balanced and reliable contact maps for different type of protein targets is essential to enhance the success rate of the ab initio protein structure prediction.

Results

We developed a new pipeline, NeBcon, which uses the naïve Bayes classifier (NBC) theorem to combine eight state of the art contact methods that are built from co-evolution and machine learning approaches. The posterior probabilities of the NBC model are then trained with intrinsic structural features through neural network learning for the final contact map prediction. NeBcon was tested on 98 non-redundant proteins, which improves the accuracy of the best co-evolution based meta-server predictor by 22%; the magnitude of the improvement increases to 45% for the hard targets that lack sequence and structural homologs in the databases. Detailed data analysis showed that the major contribution to the improvement is due to the optimized NBC combination of the complementary information from both co-evolution and machine learning predictions. The neural network training also helps to improve the coupling of the NBC posterior probability and the intrinsic structural features, which were found particularly important for the proteins that do not have sufficient number of homologous sequences to derive reliable co-evolution profiles.

Availiablity and Implementation

On-line server and standalone package of the program are available at http://zhanglab.ccmb.med.umich.edu/NeBcon/ .

Contact

zhng@umich.edu.

Supplementary information

Supplementary data are available at Bioinformatics online.

Collapse

Mandalaparthy V, Sanaboyana VR, Rafalia H, Gosavi S. Exploring the effects of sparse restraints on protein structure prediction. Proteins 2017;86:248-262. [DOI: 10.1002/prot.25438] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/24/2017] [Revised: 11/20/2017] [Accepted: 11/29/2017] [Indexed: 01/06/2023]

K-nearest uphill clustering in the protein structure space. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.04.065] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/23/2022]

Fujii C, Kuwahara H, Yu G, Guo L, Gao X. Learning gene regulatory networks from gene expression data using weighted consensus. Neurocomputing 2017. [DOI: 10.1016/j.neucom.2016.02.087] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/21/2022]

Chen P, Hu S, Zhang J, Gao X, Li J, Xia J, Wang B. A Sequence-Based Dynamic Ensemble Learning System for Protein Ligand-Binding Site Prediction. IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS 2016;13:901-912. [PMID: 26661785 DOI: 10.1109/tcbb.2015.2505286] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 06/05/2023]

Cui X, Lu Z, Wang S, Jing-Yan Wang J, Gao X. CMsearch: simultaneous exploration of protein sequence space and structure space improves not only protein homology detection but also protein structure prediction. Bioinformatics 2016;32:i332-i340. [PMID: 27307635 PMCID: PMC4908355 DOI: 10.1093/bioinformatics/btw271] [Citation(s) in RCA: 36] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/16/2022] Open

Abstract

MOTIVATION

Protein homology detection, a fundamental problem in computational biology, is an indispensable step toward predicting protein structures and understanding protein functions. Despite the advances in recent decades on sequence alignment, threading and alignment-free methods, protein homology detection remains a challenging open problem. Recently, network methods that try to find transitive paths in the protein structure space demonstrate the importance of incorporating network information of the structure space. Yet, current methods merge the sequence space and the structure space into a single space, and thus introduce inconsistency in combining different sources of information.

METHOD

We present a novel network-based protein homology detection method, CMsearch, based on cross-modal learning. Instead of exploring a single network built from the mixture of sequence and structure space information, CMsearch builds two separate networks to represent the sequence space and the structure space. It then learns sequence-structure correlation by simultaneously taking sequence information, structure information, sequence space information and structure space information into consideration.

RESULTS

We tested CMsearch on two challenging tasks, protein homology detection and protein structure prediction, by querying all 8332 PDB40 proteins. Our results demonstrate that CMsearch is insensitive to the similarity metrics used to define the sequence and the structure spaces. By using HMM-HMM alignment as the sequence similarity metric, CMsearch clearly outperforms state-of-the-art homology detection methods and the CASP-winning template-based protein structure prediction methods.

AVAILABILITY AND IMPLEMENTATION

Our program is freely available for download from http://sfb.kaust.edu.sa/Pages/Software.aspx

CONTACT

: xin.gao@kaust.edu.sa

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Collapse

Márquez-Chamorro AE, Asencio-Cortés G, Santiesteban-Toca CE, Aguilar-Ruiz JS. Soft computing methods for the prediction of protein tertiary structures: A survey. Appl Soft Comput 2015. [DOI: 10.1016/j.asoc.2015.06.024] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/01/2023]

Chen P, Huang JZ, Gao X. LigandRFs: random forest ensemble to identify ligand-binding residues from sequence information alone. BMC Bioinformatics 2014;15 Suppl 15:S4. [PMID: 25474163 PMCID: PMC4271564 DOI: 10.1186/1471-2105-15-s15-s4] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.0] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/28/2022] Open

He Z, Alazmi M, Zhang J, Xu D. Protein structural model selection by combining consensus and single scoring methods. PLoS One 2013;8:e74006. [PMID: 24023923 PMCID: PMC3759460 DOI: 10.1371/journal.pone.0074006] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2013] [Accepted: 07/26/2013] [Indexed: 01/28/2023] Open

Chen P, Li J, Wong L, Kuwahara H, Huang JZ, Gao X. Accurate prediction of hot spot residues through physicochemical characteristics of amino acid sequences. Proteins 2013;81:1351-62. [DOI: 10.1002/prot.24278] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/07/2012] [Revised: 02/07/2013] [Accepted: 02/23/2013] [Indexed: 11/11/2022]

Abusamra H. A Comparative Study of Feature Selection and Classification Methods for Gene Expression Data of Glioma. ACTA ACUST UNITED AC 2013. [DOI: 10.1016/j.procs.2013.10.003] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/30/2022]

eThread: a highly optimized machine learning-based approach to meta-threading and the modeling of protein tertiary structures. PLoS One 2012. [PMID: 23185577 PMCID: PMC3503980 DOI: 10.1371/journal.pone.0050200] [Citation(s) in RCA: 30] [Impact Index Per Article: 2.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 01/17/2023] Open

Evolutionary decision rules for predicting protein contact maps. Pattern Anal Appl 2012. [DOI: 10.1007/s10044-012-0297-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 12/31/2022]

Li Y, Rata I, Jakobsson E. Sampling multiple scoring functions can improve protein loop structure prediction accuracy. J Chem Inf Model 2011;51:1656-66. [PMID: 21702492 PMCID: PMC3211142 DOI: 10.1021/ci200143u] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/28/2022]

Abstract

Accurately predicting loop structures is important for understanding functions of many proteins. In order to obtain loop models with high accuracy, efficiently sampling the loop conformation space to discover reasonable structures is a critical step. In loop conformation sampling, coarse-grain energy (scoring) functions coupling with reduced protein representations are often used to reduce the number of degrees of freedom as well as sampling computational time. However, due to implicitly considering many factors by reduced representations, the coarse-grain scoring functions may have potential insensitivity and inaccuracy, which can mislead the sampling process and consequently ignore important loop conformations. In this paper, we present a new computational sampling approach to obtain reasonable loop backbone models, so-called the Pareto optimal sampling (POS) method. The rationale of the POS method is to sample the function space of multiple, carefully selected scoring functions to discover an ensemble of diversified structures yielding Pareto optimality to all sampled conformations. The POS method can efficiently tolerate insensitivity and inaccuracy in individual scoring functions and thereby lead to significant accuracy improvement in loop structure prediction. We apply the POS method to a set of 4-12-residue loop targets using a function space composed of backbone-only Rosetta and distance-scale finite ideal-gas reference (DFIRE) and a triplet backbone dihedral potential developed in our lab. Our computational results show that in 501 out of 502 targets, the model sets generated by POS contain structure models are within subangstrom resolution. Moreover, the top-ranked models have a root mean square deviation (rmsd) less than 1 A in 96.8, 84.1, and 72.2% of the short (4-6 residues), medium (7-9 residues), and long (10-12 residues) targets, respectively, when the all-atom models are generated by local optimization from the backbone models and are ranked by our recently developed Pareto optimal consensus (POC) method. Similar sampling effectiveness can also be found in a set of 13-residue loop targets.

Collapse

Ashkenazy H, Unger R, Kliger Y. Hidden conformations in protein structures. Bioinformatics 2011;27:1941-7. [DOI: 10.1093/bioinformatics/btr292] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/14/2022] Open

A Consensus Approach to Predicting Protein Contact Map via Logistic Regression. BIOINFORMATICS RESEARCH AND APPLICATIONS 2011. [DOI: 10.1007/978-3-642-21260-4_16] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Track Full Text] [Subscribe] [Scholar Register] [Indexed: 01/08/2023]

Smolarek D, Bertrand O, Czerwinski M, Colin Y, Etchebest C, de Brevern AG. Multiple interests in structural models of DARC transmembrane protein. Transfus Clin Biol 2010;17:184-96. [PMID: 20655787 DOI: 10.1016/j.tracli.2010.05.003] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/11/2010] [Accepted: 05/21/2010] [Indexed: 12/23/2022]

Li Y, Rata I, Chiu SW, Jakobsson E. Improving predicted protein loop structure ranking using a Pareto-optimality consensus method. BMC STRUCTURAL BIOLOGY 2010;10:22. [PMID: 20642859 PMCID: PMC2914074 DOI: 10.1186/1472-6807-10-22] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.9] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Subscribe] [Scholar Register] [Received: 10/27/2009] [Accepted: 07/20/2010] [Indexed: 11/10/2022]

Abstract

Background

Accurate protein loop structure models are important to understand functions of many proteins. Identifying the native or near-native models by distinguishing them from the misfolded ones is a critical step in protein loop structure prediction.

Results

We have developed a Pareto Optimal Consensus (POC) method, which is a consensus model ranking approach to integrate multiple knowledge- or physics-based scoring functions. The procedure of identifying the models of best quality in a model set includes: 1) identifying the models at the Pareto optimal front with respect to a set of scoring functions, and 2) ranking them based on the fuzzy dominance relationship to the rest of the models. We apply the POC method to a large number of decoy sets for loops of 4- to 12-residue in length using a functional space composed of several carefully-selected scoring functions: Rosetta, DOPE, DDFIRE, OPLS-AA, and a triplet backbone dihedral potential developed in our lab. Our computational results show that the sets of Pareto-optimal decoys, which are typically composed of ~20% or less of the overall decoys in a set, have a good coverage of the best or near-best decoys in more than 99% of the loop targets. Compared to the individual scoring function yielding best selection accuracy in the decoy sets, the POC method yields 23%, 37%, and 64% less false positives in distinguishing the native conformation, indentifying a near-native model (RMSD < 0.5A from the native) as top-ranked, and selecting at least one near-native model in the top-5-ranked models, respectively. Similar effectiveness of the POC method is also found in the decoy sets from membrane protein loops. Furthermore, the POC method outperforms the other popularly-used consensus strategies in model ranking, such as rank-by-number, rank-by-rank, rank-by-vote, and regression-based methods.

Conclusions

By integrating multiple knowledge- and physics-based scoring functions based on Pareto optimality and fuzzy dominance, the POC method is effective in distinguishing the best loop models from the other ones within a loop model set.

Collapse

Kokoszyńska K, Rychlewski L, Wyrwicz LS. Distant homologs of anti-apoptotic factor HAX1 encode parvalbumin-like calcium binding proteins. BMC Res Notes 2010;3:197. [PMID: 20633251 PMCID: PMC2914655 DOI: 10.1186/1756-0500-3-197] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/10/2010] [Accepted: 07/15/2010] [Indexed: 12/02/2022] Open

Abstract

Background

Apoptosis is a highly ordered and orchestrated multiphase process controlled by the numerous cellular and extra-cellular signals, which executes the programmed cell death via release of cytochrome c alterations in calcium signaling, caspase-dependent limited proteolysis and DNA fragmentation. Besides the general modifiers of apoptosis, several tissue-specific regulators of this process were identified including HAX1 (HS-1 associated protein X-1) - an anti-apoptotic factor active in myeloid cells. Although HAX1 was the subject of various experimental studies, the mechanisms of its action and a functional link connected with the regulation of apoptosis still remains highly speculative.

Findings

Here we provide the data which suggests that HAX1 may act as a regulator or as a sensor of calcium. On the basis of iterative similarity searches, we identified a set of distant homologs of HAX1 in insects. The applied fold recognition protocol gives us strong evidence that the distant insects' homologs of HAX1 are novel parvalbumin-like calcium binding proteins. Although the whole three EF-hands fold is not preserved in vertebrate our analysis suggests that there is an existence of a potential single EF-hand calcium binding site in HAX1. The molecular mechanism of its action remains to be identified, but the risen hypothesis easily translates into previously reported lines of various data on the HAX1 biology as well as, provides us a direct link to the regulation of apoptosis. Moreover, we also report that other family of myeloid specific apoptosis regulators - myeloid leukemia factors (MLF1, MLF2) share the homologous C-terminal domain and taxonomic distribution with HAX1.

Conclusions

Performed structural and active sites analyses gave new insights into mechanisms of HAX1 and MLF families in apoptosis process and suggested possible role of HAX1 in calcium-binding, still the analyses require further experimental verification.

Collapse