Reference Citation Analysis

For: Dubchak I, Holbrook SR, Kim SH. Prediction of protein folding class from amino acid composition. Proteins 1993;16:79-91. [PMID: 8497486 DOI: 10.1002/prot.340160109] [Citation(s) in RCA: 57] [Impact Index Per Article: 1.8] [Indexed: 01/31/2023]

Cited by Other Article(s)

Zhang Y, Pang D, Wang Z, Ma L, Chen Y, Yang L, Xiao W, Yuan H, Chang F, Ouyang H. An integrative analysis of genotype-phenotype correlation in Charcot Marie Tooth type 2A disease with MFN2 variants: A case and systematic review. Gene 2023;883:147684. [PMID: 37536398 DOI: 10.1016/j.gene.2023.147684] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/16/2023] [Revised: 06/24/2023] [Accepted: 07/31/2023] [Indexed: 08/05/2023]

Computer Simulation and Additive-Based Refolding Process of Cysteine-Rich Proteins: VEGF-A as a Model. Int J Pept Res Ther 2017. [DOI: 10.1007/s10989-017-9644-y] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 01/13/2023]

Fang Y, Middaugh CR, Fang J. In silico classification of proteins from acidic and neutral cytoplasms. PLoS One 2012;7:e45585. [PMID: 23049817 PMCID: PMC3458925 DOI: 10.1371/journal.pone.0045585] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 03/25/2012] [Accepted: 08/23/2012] [Indexed: 01/05/2023] Open

Lee TY, Lu CT, Chen SA, Bretaña NA, Cheng TH, Su MG, Huang KY. Investigation and identification of protein γ-glutamyl carboxylation sites. BMC Bioinformatics 2011;12 Suppl 13:S10. [PMID: 22372765 PMCID: PMC3278826 DOI: 10.1186/1471-2105-12-s13-s10] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Indexed: 01/24/2023] Open

Abstract

BACKGROUND

Carboxylation is a post-translational modification of glutamate (Glu) residues that is catalyzed by γ-glutamyl carboxylase in the lumen of the endoplasmic reticulum. Vitamin K is a critical co-factor in the conversion of Glu residues to γ-carboxyglutamate (Gla) residues. Carboxylation has been shown to be involved in the blood clotting cascade, bone growth, and extraosseous calcification. However, studies in this field have been limited by the difficulty of experimentally studying substrate site specificity in γ-glutamyl carboxylation. In silico investigations have the potential to characterize carboxylated sites before experiments are carried out.

RESULTS

Because of the importance of γ-glutamyl carboxylation in biological mechanisms, this study investigates the substrate site specificity in carboxylation sites. It considers not only the composition of amino acids that surround carboxylation sites, but also the structural characteristics of these sites, including secondary structure and solvent-accessible surface area (ASA). The explored features are used to establish a predictive model for differentiating between carboxylation sites and non-carboxylation sites. A support vector machine (SVM) is employed to establish a predictive model with various features. A five-fold cross-validation evaluation reveals that the SVM model, trained with the combined features of positional weighted matrix (PWM), amino acid composition (AAC), and ASA, yields the highest accuracy (0.892). Furthermore, an independent testing set is constructed to evaluate whether the predictive model is over-fitted to the training set.

CONCLUSIONS

Independent testing data that did not undergo the cross-validation process shows that the proposed model can differentiate between carboxylation sites and non-carboxylation sites. This investigation is the first to study carboxylation sites and to develop a system for identifying them. The proposed method is a practical means of preliminary analysis and greatly diminishes the total number of potential carboxylation sites requiring further experimental confirmation.
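The amino acid composition (AAC) feature and the five-fold split mentioned in the abstract above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the round-robin fold assignment, and the example window are assumptions made for illustration, and the SVM training step itself is omitted.

```python
from collections import Counter

# The 20 standard amino acids, in one-letter code (alphabetical order).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(window: str) -> list[float]:
    """Return the 20 residue frequencies of a sequence window.

    Characters that are not standard residues (gaps, 'X', ...) are ignored.
    """
    residues = [aa for aa in window.upper() if aa in AMINO_ACIDS]
    total = len(residues)
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    counts = Counter(residues)
    return [counts[aa] / total for aa in AMINO_ACIDS]


def k_fold_indices(n_samples: int, k: int = 5) -> list[list[int]]:
    """Assign sample indices to k folds round-robin (hypothetical scheme, no shuffling)."""
    return [list(range(fold, n_samples, k)) for fold in range(k)]


if __name__ == "__main__":
    # Hypothetical window of residues around a candidate Glu (E) site.
    features = amino_acid_composition("AEEKLEECR")
    print(dict(zip(AMINO_ACIDS, features)))
    print(k_fold_indices(12))
```

In a cross-validation run, each fold would serve once as the held-out set while a classifier is trained on the remaining four; the feature vectors from `amino_acid_composition` would be one of several inputs.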

Li Y, Middaugh CR, Fang J. A novel scoring function for discriminating hyperthermophilic and mesophilic proteins with application to predicting relative thermostability of protein mutants. BMC Bioinformatics 2010;11:62. [PMID: 20109199 PMCID: PMC3098108 DOI: 10.1186/1471-2105-11-62] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.3] [Received: 09/28/2009] [Accepted: 01/28/2010] [Indexed: 11/10/2022] Open

Abstract

Background

The ability to design thermostable proteins is theoretically important and practically useful. Robust and accurate algorithms, however, remain elusive. One critical problem is the lack of reliable methods to estimate the relative thermostability of possible mutants.

Results

We report a novel scoring function for discriminating hyperthermophilic and mesophilic proteins with application to predicting the relative thermostability of protein mutants. The scoring function was developed based on an elaborate analysis of a set of features calculated or predicted from 540 pairs of hyperthermophilic and mesophilic protein ortholog sequences. It was constructed by a linear combination of ten important features identified by a feature ranking procedure based on the random forest classification algorithm. The weights of these features in the scoring function were fitted by a hill-climbing algorithm. This scoring function has shown an excellent ability to discriminate hyperthermophilic from mesophilic sequences. The prediction accuracies reached 98.9% and 97.3% in discriminating orthologous pairs in training and the holdout testing datasets, respectively. Moreover, the scoring function can distinguish non-homologous sequences with an accuracy of 88.4%. Additional blind tests using two datasets of experimentally investigated mutations demonstrated that the scoring function can be used to predict the relative thermostability of proteins and their mutants at very high accuracies (92.9% and 94.4%). We also developed an amino acid substitution preference matrix between mesophilic and hyperthermophilic proteins, which may be useful in designing more thermostable proteins.

Conclusions

We have presented a novel scoring function that can distinguish not only hyperthermophilic/mesophilic (HP/MP) ortholog pairs but also non-homologous pairs with high accuracy. Most importantly, it can be used to accurately predict the relative stability of proteins and their mutants, as demonstrated in two blind tests. In addition, the residue substitution preference matrix assembled in this study may reflect substitution biases induced by thermal adaptation. A web server implementing the scoring function and the dataset used in this study are freely available at http://www.abl.ku.edu/thermorank/.
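The idea of a linear scoring function whose weights are fitted by hill climbing, as described in the abstract above, can be sketched as follows. This is a toy illustration under stated assumptions, not the published method: the feature vectors, the step size, the zero starting point, and the accept-only-if-better rule are all hypothetical, and the ten features selected in the paper are not reproduced here.

```python
def score(weights: list[float], features: list[float]) -> float:
    """Linear combination of feature values."""
    return sum(w * f for w, f in zip(weights, features))


def pair_accuracy(weights, pairs) -> float:
    """Fraction of (hyperthermophilic, mesophilic) pairs ranked correctly,
    i.e. the first member of the pair scores strictly higher."""
    correct = sum(1 for hp, mp in pairs if score(weights, hp) > score(weights, mp))
    return correct / len(pairs)


def hill_climb(pairs, n_features: int, step: float = 0.1, max_rounds: int = 100):
    """Greedy coordinate-wise hill climbing on pair accuracy (assumed variant)."""
    weights = [0.0] * n_features
    best = pair_accuracy(weights, pairs)
    for _ in range(max_rounds):
        improved = False
        for i in range(n_features):
            for delta in (step, -step):
                candidate = list(weights)
                candidate[i] += delta
                acc = pair_accuracy(candidate, pairs)
                if acc > best:  # accept only strict improvements
                    weights, best, improved = candidate, acc, True
        if not improved:
            break  # local optimum reached
    return weights, best


if __name__ == "__main__":
    # Hypothetical two-feature vectors for three ortholog pairs.
    toy_pairs = [
        ([1.0, 0.0], [0.0, 0.0]),
        ([2.0, 1.0], [1.0, 1.0]),
        ([0.5, 3.0], [0.2, 3.0]),
    ]
    print(hill_climb(toy_pairs, n_features=2))
```

Because accuracy is a step function of the weights, a greedy search like this can stall on plateaus; a practical fit would likely use random restarts or a smoother objective.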

Su Y, Zou Z, Feng S, Zhou P, Cao L. The acidity of protein fusion partners predominantly determines the efficacy to improve the solubility of the target proteins expressed in Escherichia coli. J Biotechnol 2007;129:373-82. [PMID: 17374413 DOI: 10.1016/j.jbiotec.2007.01.015] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Received: 10/28/2006] [Revised: 01/14/2007] [Accepted: 01/18/2007] [Indexed: 11/17/2022]

Ofran Y, Margalit H. Proteins of the same fold and unrelated sequences have similar amino acid composition. Proteins 2006;64:275-9. [PMID: 16565950 DOI: 10.1002/prot.20964] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.0] [Indexed: 11/11/2022]

Sun XD, Huang RB. Prediction of protein structural classes using support vector machines. Amino Acids 2006;30:469-75. [PMID: 16622605 DOI: 10.1007/s00726-005-0239-0] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.6] [Received: 06/01/2005] [Accepted: 07/12/2005] [Indexed: 11/24/2022]

Milac AL, Avram S, Petrescu AJ. Evaluation of a neural networks QSAR method based on ligand representation using substituent descriptors. Application to HIV-1 protease inhibitors. J Mol Graph Model 2005;25:37-45. [PMID: 16325439 DOI: 10.1016/j.jmgm.2005.09.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 04/01/2005] [Revised: 06/17/2005] [Accepted: 09/29/2005] [Indexed: 11/18/2022]

Huang Y, Cai J, Ji L, Li Y. Classifying G-protein coupled receptors with bagging classification tree. Comput Biol Chem 2004;28:275-80. [PMID: 15548454 DOI: 10.1016/j.compbiolchem.2004.08.001] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.2] [Received: 03/30/2004] [Revised: 08/05/2004] [Accepted: 08/06/2004] [Indexed: 11/17/2022]

Du Q, Wei D, Chou KC. Correlations of amino acids in proteins. Peptides 2003;24:1863-9. [PMID: 15127938 DOI: 10.1016/j.peptides.2003.10.012] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.5] [Indexed: 11/28/2022]

Jin L, Fang W, Tang H. Prediction of protein structural classes by a new measure of information discrepancy. Comput Biol Chem 2003;27:373-80. [PMID: 12927111 DOI: 10.1016/s1476-9271(02)00087-7] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.7] [Indexed: 10/27/2022]

Røgen P, Bohr H. A new family of global protein shape descriptors. Math Biosci 2003;182:167-81. [PMID: 12591623 DOI: 10.1016/s0025-5564(02)00216-x] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Indexed: 11/27/2022]

Yu CS, Wang JY, Yang JM, Lyu PC, Lin CJ, Hwang JK. Fine-grained protein fold assignment by support vector machines using generalized npeptide coding schemes and jury voting from multiple-parameter sets. Proteins 2003;50:531-6. [PMID: 12577258 DOI: 10.1002/prot.10313] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Indexed: 11/10/2022]

Cai YD, Liu XJ, Xu XB, Chou KC. Prediction of protein structural classes by support vector machines. Comput Chem 2002;26:293-6. [PMID: 11868916 DOI: 10.1016/s0097-8485(01)00113-9] [Citation(s) in RCA: 195] [Impact Index Per Article: 8.9] [Indexed: 11/24/2022]

Cai YD, Liu XJ, Xu XB, Zhou GP. Support vector machines for predicting protein structural class. BMC Bioinformatics 2001;2:3. [PMID: 11483157 PMCID: PMC35360 DOI: 10.1186/1471-2105-2-3] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.2] [Received: 05/24/2001] [Accepted: 06/29/2001] [Indexed: 11/10/2022] Open

Chou KC. Prediction of tight turns and their types in proteins. Anal Biochem 2000;286:1-16. [PMID: 11038267 DOI: 10.1006/abio.2000.4757] [Citation(s) in RCA: 212] [Impact Index Per Article: 8.8] [Indexed: 11/22/2022]

Cai Y, Zhou G. Prediction of protein structural classes by neural network. Biochimie 2000;82:783-5. [PMID: 11018296 DOI: 10.1016/s0300-9084(00)01161-5] [Citation(s) in RCA: 91] [Impact Index Per Article: 3.8] [Indexed: 11/18/2022]

Liu W, Chou KC. Prediction of protein secondary structure content. Protein Eng 1999;12:1041-50. [PMID: 10611397 DOI: 10.1093/protein/12.12.1041] [Citation(s) in RCA: 83] [Impact Index Per Article: 3.3] [Indexed: 11/15/2022]

Galat A. Variations of sequences and amino acid compositions of proteins that sustain their biological functions: An analysis of the cyclophilin family of proteins. Arch Biochem Biophys 1999;371:149-62. [PMID: 10545201 DOI: 10.1006/abbi.1999.1434] [Citation(s) in RCA: 87] [Impact Index Per Article: 3.5] [Indexed: 01/12/2023]

Chou KC. A key driving force in determination of protein structural classes. Biochem Biophys Res Commun 1999;264:216-24. [PMID: 10527868 DOI: 10.1006/bbrc.1999.1325] [Citation(s) in RCA: 169] [Impact Index Per Article: 6.8] [Indexed: 11/22/2022]

Apostol I, Szpankowski W. Indexing and mapping of proteins using a modified nonlinear Sammon projection. J Comput Chem 1999. [DOI: 10.1002/(sici)1096-987x(19990730)20:10<1049::aid-jcc7>3.0.co;2-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022]

Dubchak I, Muchnik I, Mayor C, Dralyuk I, Kim SH. Recognition of a protein fold in the context of the SCOP classification. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(19990601)35:4<401::aid-prot3>3.0.co;2-k] [Citation(s) in RCA: 142] [Impact Index Per Article: 5.7] [Indexed: 11/08/2022]

Chandonia JM, Karplus M. New methods for accurate prediction of protein secondary structure. Proteins 1999. [DOI: 10.1002/(sici)1097-0134(19990515)35:3<293::aid-prot3>3.0.co;2-l] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.9] [Indexed: 11/09/2022]

Chou KC. Using pair-coupled amino acid composition to predict protein secondary structure content. J Protein Chem 1999;18:473-80. [PMID: 10449044 DOI: 10.1023/a:1020696810938] [Citation(s) in RCA: 64] [Impact Index Per Article: 2.6] [Indexed: 11/12/2022]

Gerstein M. How representative are the known structures of the proteins in a complete genome? A comprehensive structural census. Fold Des 1999;3:497-512. [PMID: 9889159 DOI: 10.1016/s1359-0278(98)00066-2] [Citation(s) in RCA: 100] [Impact Index Per Article: 4.0] [Indexed: 01/20/2023]

Abstract

BACKGROUND

Determining how representative the known structures are of the proteins encoded by a complete genome is important for assessing to what extent our current picture of protein stability and folding is overly influenced by biases in the structure databank (PDB). It is also important for improving database-based methods of structure prediction and genome annotation.

RESULTS

The known structures are compared to the proteins encoded by eight complete microbial genomes in terms of simple statistics such as sequence length, composition and secondary structure. The known structures are represented by a collection of nonhomologous domains from the PDB and a smaller list of 'biophysical proteins' on which folding experiments have concentrated. The proteins encoded by the genomes are considered as a whole and divided into various regions, such as known-structure homologue, low complexity (nonglobular), transmembrane or linker. Various tests are performed to assess the significance of the reported differences, in both a practical and a statistical sense.

CONCLUSIONS

The proteins encoded by the genomes are significantly different from those in the PDB. Their sequence lengths, which follow an extreme value distribution, are longer than the PDB proteins and much longer than the biophysical proteins. Their composition differs from the PDB proteins in having more Lys, Ile, Asn and Gln and less Cys and Trp. This is true overall and especially for the regions corresponding to soluble proteins of as yet unknown fold. Secondary-structure prediction on these uncharacterized regions indicates that they contain on average more helical structure than the PDB; differences about this mean are small, with yeast having slightly more sheet structure and Haemophilus influenzae and Helicobacter pylori more helical structure. Further information is available through the GeneCensus system at http://bioinfo.mbb.yale.edu/genome.

Schneider G, Wrede P. Artificial neural networks for computer-based molecular design. Prog Biophys Mol Biol 1998;70:175-222. [PMID: 9830312 DOI: 10.1016/s0079-6107(98)00026-1] [Citation(s) in RCA: 135] [Impact Index Per Article: 5.2] [Indexed: 12/01/2022]

Zhou GP. An intriguing controversy over protein structural class prediction. J Protein Chem 1998;17:729-38. [PMID: 9988519 DOI: 10.1023/a:1020713915365] [Citation(s) in RCA: 290] [Impact Index Per Article: 11.2] [Indexed: 11/12/2022]

Diederichs K, Freigang J, Umhau S, Zeth K, Breed J. Prediction by a neural network of outer membrane beta-strand protein topology. Protein Sci 1998;7:2413-20. [PMID: 9828008 PMCID: PMC2143870 DOI: 10.1002/pro.5560071119] [Citation(s) in RCA: 73] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022]

Chou KC, Liu WM, Maggiora GM, Zhang CT. Prediction and classification of domain structural classes. Proteins 1998. [DOI: 10.1002/(sici)1097-0134(19980401)31:1<97::aid-prot8>3.0.co;2-e] [Citation(s) in RCA: 81] [Impact Index Per Article: 3.1] [Indexed: 11/12/2022]

Wu CH. Artificial neural networks for molecular sequence analysis. Comput Chem 1998;21:237-56. [PMID: 9415987 DOI: 10.1016/s0097-8485(96)00038-1] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.6] [Indexed: 02/05/2023]

Dandekar T, König R. Computational methods for the prediction of protein folds. Biochim Biophys Acta 1997;1343:1-15. [PMID: 9428653 DOI: 10.1016/s0167-4838(97)00132-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 02/05/2023]

Bahar I, Atilgan AR, Jernigan RL, Erman B. Understanding the recognition of protein structural classes by amino acid composition. Proteins 1997. [DOI: 10.1002/(sici)1097-0134(199710)29:2<172::aid-prot5>3.0.co;2-f] [Citation(s) in RCA: 99] [Impact Index Per Article: 3.7] [Indexed: 01/20/2023]

Ojasoo T, Doré JC. Taxonomy of nuclear receptors and SERPINS by multivariate analysis of amino-acid composition. J Steroid Biochem Mol Biol 1996;58:167-81. [PMID: 8809198 DOI: 10.1016/0960-0760(96)00029-5] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.1] [Indexed: 02/02/2023]

Zhang CT, Chou KC. An analysis of protein folding type prediction by seed-propagated sampling and jackknife test. J Protein Chem 1995;14:583-93. [PMID: 8561854 DOI: 10.1007/bf01886884] [Citation(s) in RCA: 27] [Impact Index Per Article: 0.9] [Indexed: 01/31/2023]

Dubchak I, Muchnik I, Holbrook SR, Kim SH. Prediction of protein folding class using global description of amino acid sequence. Proc Natl Acad Sci U S A 1995;92:8700-4. [PMID: 7568000 PMCID: PMC41034 DOI: 10.1073/pnas.92.19.8700] [Citation(s) in RCA: 348] [Impact Index Per Article: 12.0] [Indexed: 01/26/2023] Open

Galat A, Bouet F, Rivière S. Amino acid compositions of proteins and their identities. Electrophoresis 1995;16:1095-103. [PMID: 7498153 DOI: 10.1002/elps.11501601186] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.7] [Indexed: 01/25/2023]

Zhang CT, Chou KC. An eigenvalue-eigenvector approach to predicting protein folding types. J Protein Chem 1995;14:309-26. [PMID: 8590599 DOI: 10.1007/bf01886788] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.3] [Indexed: 01/31/2023]

Chou KC. Does the folding type of a protein depend on its amino acid composition? FEBS Lett 1995;363:127-31. [PMID: 7729532 DOI: 10.1016/0014-5793(95)00245-5] [Citation(s) in RCA: 26] [Impact Index Per Article: 0.9] [Indexed: 01/26/2023]

Chou KC. A novel approach to predicting protein structural classes in a (20-1)-D amino acid composition space. Proteins 1995;21:319-44. [PMID: 7567954 DOI: 10.1002/prot.340210406] [Citation(s) in RCA: 350] [Impact Index Per Article: 12.1] [Indexed: 01/26/2023]

Chandonia JM, Karplus M. Neural networks for secondary structure and structural class predictions. Protein Sci 1995;4:275-85. [PMID: 7757016 PMCID: PMC2143056 DOI: 10.1002/pro.5560040214] [Citation(s) in RCA: 82] [Impact Index Per Article: 2.8] [Indexed: 01/27/2023]

Landale EC, Strong DD, Mohan S, Baylink DJ. Sequence comparison and predicted structure for the four exon-encoded regions of human insulin-like growth factor binding protein 4. Growth Factors 1995;12:245-50. [PMID: 8930016 DOI: 10.3109/08977199509028963] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 02/03/2023]

Chou KC, Zhang CT. Prediction of protein structural classes. Crit Rev Biochem Mol Biol 1995;30:275-349. [PMID: 7587280 DOI: 10.3109/10409239509083488] [Citation(s) in RCA: 819] [Impact Index Per Article: 28.2] [Indexed: 01/26/2023]

Eisenhaber F, Persson B, Argos P. Protein structure prediction: recognition of primary, secondary, and tertiary structural features from amino acid sequence. Crit Rev Biochem Mol Biol 1995;30:1-94. [PMID: 7587278 DOI: 10.3109/10409239509085139] [Citation(s) in RCA: 97] [Impact Index Per Article: 3.3] [Indexed: 01/26/2023]

Chou K, Zhang C. Predicting protein folding types by distance functions that make allowances for amino acid interactions. J Biol Chem 1994. [DOI: 10.1016/s0021-9258(17)31748-9] [Citation(s) in RCA: 92] [Impact Index Per Article: 3.1] [Indexed: 10/22/2022] Open

Anthonsen HW, Baptista A, Drabløs F, Martel P, Petersen SB. The blind watchmaker and rational protein engineering. J Biotechnol 1994;36:185-220. [PMID: 7765263 PMCID: PMC7173218 DOI: 10.1016/0168-1656(94)90152-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.6] [Received: 03/07/1994] [Accepted: 04/23/1994] [Indexed: 01/27/2023]

Rost B, Sander C. Combining evolutionary information and neural networks to predict protein secondary structure. Proteins 1994;19:55-72. [PMID: 8066087 DOI: 10.1002/prot.340190108] [Citation(s) in RCA: 1157] [Impact Index Per Article: 38.6] [Indexed: 01/28/2023]

Abstract

Using evolutionary information contained in multiple sequence alignments as input to neural networks, secondary structure can be predicted at significantly increased accuracy. Here, we extend our previous three-level system of neural networks by using additional input information derived from multiple alignments. Using a position-specific conservation weight as part of the input increases performance. Using the number of insertions and deletions reduces the tendency for overprediction and increases overall accuracy. Addition of the global amino acid content yields a further improvement, mainly in predicting structural class. The final network system has sustained overall accuracy of 71.6% in a multiple cross-validation test on 126 unique protein chains. A test on a new set of 124 recently solved protein structures that have no significant sequence similarity to the learning set confirms the high level of accuracy. The average cross-validated accuracy for all 250 sequence-unique chains is above 72%. Using various data sets, the method is compared to alternative prediction methods, some of which also use multiple alignments: the performance advantage of the network system is at least 6 percentage points in three-state accuracy. In addition, the network estimates secondary structure content from multiple sequence alignments about as well as circular dichroism spectroscopy on a single protein and classifies 75% of the 250 proteins correctly into one of four protein structural classes. Of particular practical importance is the definition of a position-specific reliability index. For 40% of all residues the method has a sustained three-state accuracy of 88%, as high as the overall average for homology modelling. A further strength of the method is greatly increased accuracy in predicting the placement of secondary structure segments.
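The "three-state accuracy" quoted in the abstract above is the fraction of residues whose predicted secondary-structure state (helix, strand, or other) matches the observed one. A minimal Python sketch follows; the H/E/C one-letter encoding and the function name are assumptions made for illustration, not taken from the cited paper.

```python
def three_state_accuracy(predicted: str, observed: str) -> float:
    """Fraction of positions where predicted and observed states agree.

    Both arguments are per-residue state strings, here assumed to use
    H (helix), E (strand) and C (other).
    """
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


if __name__ == "__main__":
    # Hypothetical six-residue example with one mismatch at position 5.
    print(three_state_accuracy("HHHEEC", "HHHECC"))
```

Averaging this value over all chains in a cross-validation set gives a figure comparable in kind to the overall accuracies reported in the abstract.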

Dombi GW, Lawrence J. Analysis of protein transmembrane helical regions by a neural network. Protein Sci 1994;3:557-66. [PMID: 8003974 PMCID: PMC2142860 DOI: 10.1002/pro.5560030404] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.5] [Indexed: 01/28/2023]