1
Bhattacharya S, Roche R, Shuvo MH, Moussad B, Bhattacharya D. Contact-Assisted Threading in Low-Homology Protein Modeling. Methods Mol Biol 2023; 2627:41-59. [PMID: 36959441 DOI: 10.1007/978-1-0716-2974-1_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 03/25/2023]
Abstract
The ability to predict the three-dimensional structure of a protein from its amino acid sequence has improved considerably in the recent past. This progress is propelled by the improved accuracy of deep learning-based inter-residue contact map predictors, coupled with the rapid growth of protein sequence databases. A contact map encodes interatomic interaction information that can be exploited for highly accurate prediction of protein structures via contact map threading, even for query proteins that are not amenable to direct homology modeling. As such, contact-assisted threading has garnered considerable research effort. In this chapter, we provide an overview of existing contact-assisted threading methods while highlighting the recent advances and discussing some of the current limitations and future prospects in the application of contact-assisted threading for improving the accuracy of low-homology protein modeling.
Affiliation(s)
- Sutanu Bhattacharya
- Department of Computer Science and Software Engineering, Auburn University, Auburn, AL, USA
- Md Hossain Shuvo
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
- Bernard Moussad
- Department of Computer Science, Virginia Tech, Blacksburg, VA, USA
2
Abstract
Motivation: Predicting the native state of a protein has long been considered a gateway problem for understanding protein folding. Recent advances in structural modeling driven by deep learning have achieved unprecedented success at predicting a protein’s crystal structure, but it is not clear if these models are learning the physics of how proteins dynamically fold into their equilibrium structure or are just accurate knowledge-based predictors of the final state. Results: In this work, we compare the pathways generated by state-of-the-art protein structure prediction methods to experimental data about protein folding pathways. The methods considered were AlphaFold 2, RoseTTAFold, trRosetta, RaptorX, DMPfold, EVfold, SAINT2 and Rosetta. We find evidence that their simulated dynamics capture some information about the folding pathway, but their predictive ability is worse than a trivial classifier using sequence-agnostic features like chain length. The folding trajectories produced are also uncorrelated with experimental observables such as intermediate structures and the folding rate constant. These results suggest that recent advances in structure prediction do not yet provide an enhanced understanding of protein folding. Availability: The data underlying this article are available in GitHub at https://github.com/oxpig/structure-vs-folding/ Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Carlos Outeiral
- Department of Statistics, University of Oxford, Oxford OX1 3PB, UK
- Daniel A Nissley
- Department of Statistics, University of Oxford, Oxford OX1 3PB, UK
3
An RNA-centric historical narrative around the Protein Data Bank. J Biol Chem 2021; 296:100555. [PMID: 33744291 PMCID: PMC8080527 DOI: 10.1016/j.jbc.2021.100555] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 11/05/2020] [Revised: 02/17/2021] [Accepted: 03/16/2021] [Indexed: 01/06/2023]
Abstract
Some of the amazing contributions brought to the scientific community by the Protein Data Bank (PDB) are described. The focus is on nucleic acid structures with a bias toward RNA. The evolution and key roles in science of the PDB and other structural databases for nucleic acids illustrate how small initial ideas can become huge and indispensable resources with the unflinching willingness of scientists to cooperate globally. Progress in understanding the molecular interactions driving RNA architectures followed the rapid increase in RNA structures in the PDB. That increase resulted from improvements in chemical synthesis and purification of RNA molecules, as well as in biophysical methods for structure determination and computer technology. The RNA modeling efforts from their early beginnings are also described, together with their links to the state of structural knowledge and technological development. Structures of RNA and of its assemblies are physical objects, which, together with genomic data, allow us to integrate present-day biological functions and the historical evolution in all living species on earth.
4
Meyer P, Siwo G, Zeevi D, Sharon E, Norel R, Segal E, Stolovitzky G. Inferring gene expression from ribosomal promoter sequences, a crowdsourcing approach. Genome Res 2013; 23:1928-37. [PMID: 23950146 PMCID: PMC3814892 DOI: 10.1101/gr.157420.113] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.6] [Indexed: 01/15/2023]
Abstract
The Gene Promoter Expression Prediction challenge consisted of predicting gene expression from promoter sequences in a previously unknown experimentally generated data set. The challenge was presented to the community in the framework of the sixth Dialogue for Reverse Engineering Assessments and Methods (DREAM6), a community effort to evaluate the status of systems biology modeling methodologies. Nucleotide-specific promoter activity was obtained by measuring fluorescence from promoter sequences fused upstream of a gene for yellow fluorescent protein and inserted in the same genomic site of the yeast Saccharomyces cerevisiae. Twenty-one teams submitted results predicting the expression levels of 53 different promoters from yeast ribosomal protein genes. Analysis of participant predictions shows that accurate values for low-expressed and mutated promoters were difficult to obtain, although in the latter case, only when the mutation induced a large change in promoter activity compared to the wild-type sequence. As in previous DREAM challenges, we found that aggregation of participant predictions provided robust results, but did not fare better than the three best algorithms. Finally, this study not only provides a benchmark for the assessment of methods predicting activity of a specific set of promoters from their sequence, but it also shows that the top performing algorithm, which used machine-learning approaches, can be improved by the addition of biological features such as transcription factor binding sites.
Affiliation(s)
- Pablo Meyer
- IBM T.J. Watson Research Center, Yorktown Heights, New York 10598, USA
5
Bilal E, Dutkowski J, Guinney J, Jang IS, Logsdon BA, Pandey G, Sauerwine BA, Shimoni Y, Moen Vollan HK, Mecham BH, Rueda OM, Tost J, Curtis C, Alvarez MJ, Kristensen VN, Aparicio S, Børresen-Dale AL, Caldas C, Califano A, Friend SH, Ideker T, Schadt EE, Stolovitzky GA, Margolin AA. Improving breast cancer survival analysis through competition-based multidimensional modeling. PLoS Comput Biol 2013; 9:e1003047. [PMID: 23671412 PMCID: PMC3649990 DOI: 10.1371/journal.pcbi.1003047] [Citation(s) in RCA: 66] [Impact Index Per Article: 6.0] [Received: 10/24/2012] [Accepted: 03/18/2013] [Indexed: 01/09/2023]
Abstract
Breast cancer is the most common malignancy in women and is responsible for hundreds of thousands of deaths annually. As with most cancers, it is a heterogeneous disease and different breast cancer subtypes are treated differently. Understanding the difference in prognosis for breast cancer based on its molecular and phenotypic features is one avenue for improving treatment by matching the proper treatment with molecular subtypes of the disease. In this work, we employed a competition-based approach to modeling breast cancer prognosis using large datasets containing genomic and clinical information and an online real-time leaderboard program used to speed feedback to the modeling team and to encourage each modeler to work towards achieving a higher ranked submission. We find that machine learning methods combined with molecular features selected based on expert prior knowledge can improve survival predictions compared to current best-in-class methodologies and that ensemble models trained across multiple user submissions systematically outperform individual models within the ensemble. We also find that model scores are highly consistent across multiple independent evaluations. This study serves as the pilot phase of a much larger competition open to the whole research community, with the goal of understanding general strategies for model optimization using clinical and molecular profiling data and providing an objective, transparent system for assessing prognostic models.
Affiliation(s)
- Erhan Bilal
- IBM TJ Watson Research, Yorktown Heights, New York, United States of America
- Janusz Dutkowski
- Departments of Medicine and Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Justin Guinney
- Sage Bionetworks, Seattle, Washington, United States of America
- In Sock Jang
- Sage Bionetworks, Seattle, Washington, United States of America
- Benjamin A. Logsdon
- Sage Bionetworks, Seattle, Washington, United States of America
- Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Gaurav Pandey
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Icahn Institute for Genomics and Multiscale Biology, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
- Yishai Shimoni
- Columbia Initiative in Systems Biology, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Hans Kristian Moen Vollan
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- The K. G. Jebsen Center for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Cambridge Research Institute, Cancer Research UK, Cambridge, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, United Kingdom
- Department of Oncology, Division of Cancer Medicine, Surgery and Transplantation, Oslo University Hospital, Oslo, Norway
- Oscar M. Rueda
- Cambridge Research Institute, Cancer Research UK, Cambridge, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, United Kingdom
- Jorg Tost
- Laboratory for Epigenetics and Environment, Centre National de Génotypage, CEA, Institut de Génomique, Evry, France
- Christina Curtis
- Department of Preventive Medicine, Keck School of Medicine, University of Southern California, Los Angeles, California, United States of America
- Mariano J. Alvarez
- Columbia Initiative in Systems Biology, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Vessela N. Kristensen
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- The K. G. Jebsen Center for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Department of Clinical Molecular Biology, Division of Medicine, Akershus University Hospital, Ahus, Norway
- Samuel Aparicio
- Department of Pathology and Laboratory Medicine, University of British Columbia, Vancouver, British Columbia, Canada
- Molecular Oncology, British Columbia Cancer Research Center, Vancouver, British Columbia, Canada
- Anne-Lise Børresen-Dale
- Department of Genetics, Institute for Cancer Research, Oslo University Hospital, Oslo, Norway
- The K. G. Jebsen Center for Breast Cancer Research, Institute for Clinical Medicine, Faculty of Medicine, University of Oslo, Oslo, Norway
- Carlos Caldas
- Cambridge Research Institute, Cancer Research UK, Cambridge, United Kingdom
- Department of Oncology, University of Cambridge, Cambridge, United Kingdom
- Cambridge Experimental Cancer Medicine Centre, Cambridge, United Kingdom
- Cambridge Breast Unit, Cambridge University Hospital NHS Foundation Trust and NIHR Cambridge Biomedical Research Centre, Addenbrooke's Hospital, Cambridge, United Kingdom
- Andrea Califano
- Columbia Initiative in Systems Biology, Columbia University, New York, New York, United States of America
- Center for Computational Biology and Bioinformatics, Columbia University, New York, New York, United States of America
- Department of Biomedical Informatics, Columbia University, New York, New York, United States of America
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York, United States of America
- Institute for Cancer Genetics, Columbia University, New York, New York, United States of America
- Herbert Irving Comprehensive Cancer Center, Columbia University, New York, New York, United States of America
- Trey Ideker
- Departments of Medicine and Bioengineering, University of California San Diego, La Jolla, California, United States of America
- Eric E. Schadt
- Department of Genetics and Genomic Sciences, Icahn School of Medicine at Mount Sinai, New York, New York, United States of America
6
Meyer P, Hoeng J, Rice JJ, Norel R, Sprengel J, Stolle K, Bonk T, Corthesy S, Royyuru A, Peitsch MC, Stolovitzky G. Industrial methodology for process verification in research (IMPROVER): toward systems biology verification. Bioinformatics 2012; 28:1193-201. [PMID: 22423044 PMCID: PMC3338013 DOI: 10.1093/bioinformatics/bts116] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.8] [Indexed: 01/26/2023]
Abstract
Motivation: Analyses and algorithmic predictions based on high-throughput data are essential for the success of systems biology in academic and industrial settings. Organizations, such as companies and academic consortia, conduct large multi-year scientific studies that entail the collection and analysis of thousands of individual experiments, often over many physical sites and with internal and outsourced components. To extract maximum value, the interested parties need to verify the accuracy and reproducibility of data and methods before the initiation of such large multi-year studies. However, systematic and well-established verification procedures do not exist for automated collection and analysis workflows in systems biology, which could lead to inaccurate conclusions. Results: We present here a review of the current state of systems biology verification and a detailed methodology to address its shortcomings. This methodology, named ‘Industrial Methodology for Process Verification in Research’ or IMPROVER, consists of evaluating a research program by dividing a workflow into smaller building blocks that are individually verified. The verification of each building block can be done internally by members of the research program or externally by ‘crowd-sourcing’ to an interested community. www.sbvimprover.com Implementation: This methodology could become the preferred choice to verify systems biology research workflows that are becoming increasingly complex and sophisticated in industrial and academic settings. Contact: gustavo@us.ibm.com
Affiliation(s)
- Pablo Meyer
- IBM Computational Biology Center, Yorktown Heights, NY 10598, USA
7
Chambers SJ, Wyatt GM, Garrett SD, Morgan MRA. Alteration of the Binding Characteristics of a Recombinant scFv Anti-parathion Antibody - 2. Computer Modelling of Hapten Docking and Correlation with ELISA Binding. FOOD AGR IMMUNOL 2010. [DOI: 10.1080/09540109999744] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/17/2022]
8
Schön JC, Jansen M. Determination, prediction, and understanding of structures, using the energy landscapes of chemical systems – Part II. Z Kristallogr 2009. [DOI: 10.1524/zkri.216.7.361.20362] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.1] [Indexed: 11/24/2022]
Abstract
In the past decade, new theoretical approaches have been developed to determine, predict and understand the structure of chemical compounds. The central element of these methods has been the investigation of the energy landscape of chemical systems. Applications range from extended crystalline and amorphous compounds over clusters and molecular crystals to proteins. In this review, we are going to give an introduction to energy landscapes and methods for their investigation, together with a number of examples. These include structure prediction of extended and molecular crystals, structure prediction and folding of proteins, structure analysis of zeolites, and structure determination of crystals from powder diffraction data.
9
Takamoto K, Chance MR. Radiolytic protein footprinting with mass spectrometry to probe the structure of macromolecular complexes. Annu Rev Biophys Biomol Struct 2006; 35:251-76. [PMID: 16689636 DOI: 10.1146/annurev.biophys.35.040405.102050] [Citation(s) in RCA: 197] [Impact Index Per Article: 10.9] [Indexed: 11/09/2022]
Abstract
Structural proteomics approaches using mass spectrometry are increasingly used in biology to examine the composition and structure of macromolecules. Hydroxyl radical-mediated protein footprinting using mass spectrometry has recently been developed to define structure, assembly, and conformational changes of macromolecules in solution based on measurements of reactivity of amino acid side chain groups with covalent modification reagents. Accurate measurements of side chain reactivity are achieved using quantitative liquid-chromatography-coupled mass spectrometry, whereas the side chain modification sites are identified using tandem mass spectrometry. In addition, the use of footprinting data in conjunction with computational modeling approaches is a powerful new method for testing and refining structural models of macromolecules and their complexes. In this review, we discuss the basic chemistry of hydroxyl radical reactions with peptides and proteins, highlight various approaches to map protein structure using radical oxidation methods, and describe state-of-the-art approaches to combine computational and footprinting data.
Affiliation(s)
- Keiji Takamoto
- Case Center for Proteomics, Case Western Reserve University, Cleveland, Ohio 44106, USA
10
Uchôa HB, Jorge GE, Freitas Da Silveira NJ, Camera JC, Canduri F, De Azevedo WF. Parmodel: a web server for automated comparative modeling of proteins. Biochem Biophys Res Commun 2004; 325:1481-6. [PMID: 15555595 DOI: 10.1016/j.bbrc.2004.10.192] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.1] [Received: 10/29/2004] [Indexed: 11/25/2022]
Abstract
Parmodel is a web server for automated comparative modeling and evaluation of protein structures. The aim of this tool is to help inexperienced users perform modeling, assessment, visualization, and optimization of protein models, and to help crystallographers evaluate experimentally solved structures. It is subdivided into four modules: Parmodel Modeling, Parmodel Assessment, Parmodel Visualization, and Parmodel Optimization. The main module is Parmodel Modeling, which allows several models to be built for the same protein in a reduced time by distributing the modeling processes on a Beowulf cluster. Parmodel automates and integrates the main software packages used in comparative modeling, such as MODELLER, Whatcheck, Procheck, Raster3D, Molscript, and Gromacs. This web server is freely accessible at .
Affiliation(s)
- Hugo Brandão Uchôa
- Departamento de Física UNESP, São José do Rio Preto, SP 15054-000, Brazil
11
12
Abstract
Understanding the molecular function of proteins is greatly enhanced by insights gained from their three-dimensional structures. Since experimental structures are only available for a small fraction of proteins, computational methods for protein structure modeling play an increasingly important role. Comparative protein structure modeling is currently the most accurate method, yielding models suitable for a wide spectrum of applications, such as structure-guided drug development or virtual screening. Stable and reliable automated prediction pipelines have been developed to apply large-scale comparative modeling to whole genomes or entire sequence databases. Model repositories give access to these annotated and evaluated models. In this review, we will discuss recent developments in automated comparative modeling and provide selected examples illustrating the use of homology models.
Affiliation(s)
- Jurgen Kopp
- Biozentrum der Universitat Basel and Swiss Institute of Bioinformatics, Klingelbergstr. 50-70, CH 4056, Basel, Switzerland
13
Comparative Protein Structure Modeling and its Applications to Drug Discovery. ANNUAL REPORTS IN MEDICINAL CHEMISTRY 2004. [DOI: 10.1016/s0065-7743(04)39020-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2023]
14
John B, Sali A. Comparative protein structure modeling by iterative alignment, model building and model assessment. Nucleic Acids Res 2003; 31:3982-92. [PMID: 12853614 PMCID: PMC165975 DOI: 10.1093/nar/gkg460] [Citation(s) in RCA: 264] [Impact Index Per Article: 12.6] [Indexed: 11/12/2022]
Abstract
Comparative or homology protein structure modeling is severely limited by errors in the alignment of a modeled sequence with related proteins of known three-dimensional structure. To ameliorate this problem, we have developed an automated method that optimizes both the alignment and the model implied by it. This task is achieved by a genetic algorithm protocol that starts with a set of initial alignments and then iterates through re-alignment, model building and model assessment to optimize a model assessment score. During this iterative process: (i) new alignments are constructed by application of a number of operators, such as alignment mutations and cross-overs; (ii) comparative models corresponding to these alignments are built by satisfaction of spatial restraints, as implemented in our program MODELLER; (iii) the models are assessed by a variety of criteria, partly depending on an atomic statistical potential. When testing the procedure on a very difficult set of 19 modeling targets sharing only 4-27% sequence identity with their template structures, the average final alignment accuracy increased from 37 to 45% relative to the initial alignment (the alignment accuracy was measured as the percentage of positions in the tested alignment that were identical to the reference structure-based alignment). Correspondingly, the average model accuracy increased from 43 to 54% (the model accuracy was measured as the percentage of the Cα atoms of the model that were within 5 Å of the corresponding Cα atoms in the superposed native structure). The present method also compares favorably with two of the most successful previously described methods, PSI-BLAST and SAM. The accuracy of the final models would be increased further if a better method for ranking of the models were available.
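The two accuracy measures defined in this abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the representation of an alignment as (query position, template position) pairs, the choice of the tested alignment's length as the denominator, and the inclusive 5 Å cutoff are all assumptions made for illustration. The coordinates are assumed to be already superposed and listed in matching residue order.

```python
import math


def model_accuracy(model_ca, native_ca, cutoff=5.0):
    """Percentage of model C-alpha atoms within `cutoff` angstroms of the
    corresponding native C-alpha atoms.

    Assumes both lists hold (x, y, z) tuples, are already superposed,
    and are in the same residue order (hypothetical input format).
    """
    if len(model_ca) != len(native_ca):
        raise ValueError("coordinate lists must have equal length")
    if not model_ca:
        return 0.0
    close = sum(
        1 for m, n in zip(model_ca, native_ca) if math.dist(m, n) <= cutoff
    )
    return 100.0 * close / len(model_ca)


def alignment_accuracy(test_pairs, reference_pairs):
    """Percentage of aligned pairs in the tested alignment that also occur
    in the reference alignment.

    Each alignment is an iterable of (query_pos, template_pos) pairs;
    dividing by the size of the tested alignment is an assumption.
    """
    test = set(test_pairs)
    if not test:
        return 0.0
    return 100.0 * len(test & set(reference_pairs)) / len(test)


# Toy data: the paired distances are 0, 4, 10 and 6 angstroms.
model = [(0, 0, 0), (1, 0, 0), (10, 0, 0), (0, 0, 3)]
native = [(0, 0, 0), (1, 0, 4), (0, 0, 0), (0, 0, 9)]
print(model_accuracy(model, native))  # two of four atoms lie within 5 angstroms

tested = [(1, 1), (2, 2), (3, 4), (4, 5)]
reference = [(1, 1), (2, 2), (3, 3), (4, 5)]
print(alignment_accuracy(tested, reference))  # three of four pairs match
```

In practice the superposition step itself matters, since the fraction of atoms within the cutoff depends on how the model is fitted onto the native structure; the sketch leaves that step to an external tool.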
Affiliation(s)
- Bino John
- Laboratory of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, New York, NY 10021, USA
15
Schwede T, Kopp J, Guex N, Peitsch MC. SWISS-MODEL: An automated protein homology-modeling server. Nucleic Acids Res 2003; 31:3381-5. [PMID: 12824332 PMCID: PMC168927 DOI: 10.1093/nar/gkg520] [Citation(s) in RCA: 4062] [Impact Index Per Article: 193.4] [Indexed: 11/13/2022]
Abstract
SWISS-MODEL (http://swissmodel.expasy.org) is a server for automated comparative modeling of three-dimensional (3D) protein structures. It pioneered the field of automated modeling starting in 1993 and is the most widely-used free web-based automated modeling facility today. In 2002 the server computed 120 000 user requests for 3D protein models. SWISS-MODEL provides several levels of user interaction through its World Wide Web interface: in the 'first approach mode' only an amino acid sequence of a protein is submitted to build a 3D model. Template selection, alignment and model building are carried out fully automatically by the server. In the 'alignment mode', the modeling process is based on a user-defined target-template alignment. Complex modeling tasks can be handled with the 'project mode' using DeepView (Swiss-PdbViewer), an integrated sequence-to-structure workbench. All models are sent back via email with a detailed modeling report. WhatCheck analyses and ANOLEA evaluations are provided optionally. The reliability of SWISS-MODEL is continuously evaluated in the EVA-CM project. The SWISS-MODEL server is under constant development to improve the successful implementation of expert knowledge into an easy-to-use server.
Affiliation(s)
- Torsten Schwede
- Biozentrum der Universität Basel, Klingelbergstr. 50-70, CH 4056 Basel, Switzerland.
16
Hari Krishna S, Karanth NG. Lipases and lipase-catalyzed esterification reactions in nonaqueous media. CATALYSIS REVIEWS-SCIENCE AND ENGINEERING 2002. [DOI: 10.1081/cr-120015481] [Citation(s) in RCA: 178] [Impact Index Per Article: 8.1] [Indexed: 11/03/2022]
17
Abstract
The conventional notion that enzymes are only active in aqueous media has long been discarded, thanks to the numerous studies documenting enzyme activities in nonaqueous media, including pure organic solvents and supercritical fluids. Enzymatic reactions in nonaqueous solvents offer new possibilities for producing useful chemicals (emulsifiers, surfactants, wax esters, chiral drug molecules, biopolymers, peptides and proteins, modified fats and oils, structured lipids and flavor esters). The use of enzymes in both macro- and microaqueous systems has been investigated especially intensively in the last two decades. Although enzymes exhibit considerable activity in nonaqueous media, the activity is low compared to that in water. This observation has led to numerous studies to modify enzymes for specific purposes by various means including protein engineering. This review covers the historical developments, major technological advances and recent trends of enzyme catalysis in nonconventional media. A brief description of different classes of enzymes and their use in industry is provided with representative examples. Recent trends including use of novel solvent systems, role of water activity, stability issues, medium and biocatalyst engineering aspects have been discussed with examples. Special attention is given to protein engineering and directed evolution.
Affiliation(s)
- Sajja Hari Krishna
- AK-Technische Chemie und Biotechnologie, Institut für Chemie und Biochemie, Universität Greifswald, Soldmannstrasse 16, D-17487 Greifswald, Germany.
18
Flohil JA, Vriend G, Berendsen HJC. Completion and refinement of 3-D homology models with restricted molecular dynamics: application to targets 47, 58, and 111 in the CASP modeling competition and posterior analysis. Proteins 2002; 48:593-604. [PMID: 12211026 DOI: 10.1002/prot.10105] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.8] [Indexed: 11/10/2022]
Abstract
A method is presented to refine models built by homology by the use of restricted molecular dynamics (MD) techniques. The basic idea behind this method is the use of structure validation software to determine for each residue the likelihood that it is modeled correctly. This information is used to determine constraints and restraints in an MD simulation including explicit solvent molecules, which is used for model refinement. The procedure is based on the idea that residues that the validation software identifies as correctly positioned should be strongly constrained or restrained in the MD simulations, whereas residues that are likely to be positioned wrongly should move freely. Two different protocols are compared: one (applied to CASP3 target T58) using full structural constraints with separate optimization of each short fragment and the other (applied to T47) allowing some freedom using harmonic restraining potentials, with automatic optimization of the whole molecule. Structures along the MD trajectory that scored best in structural checks were selected for the construction of models that appeared to be successful in the CASP3 competition. Model refinement with MD in general leads to a model that is less like the experimental structure (Levitt et al. Nature Struct Biol 1999;6:108-111). In fact, the refined T47 model was slightly improved compared to the starting model; changes in model T58 did not lead to further enhancement. After the X-ray structure of the modeled proteins became known, the procedure was evaluated for two targets (T47 and the CASP4 target T111) by comparing a long simulation in water with the experimental target structures. It was found that structural improvements could be obtained on a nanosecond time scale by allowing appropriate freedom in the simulation. Structural checks applied to fast fluctuations do not appear to be informative for the correctness of the structure. However, both a simple hydrogen bond count and a simple compactness measure, if averaged over times of typically 300 ps, correlate well with structural correctness and we suggest that criteria based on these properties may be used in computational folding strategies.
Affiliation(s)
- J A Flohil
- Groningen Biomolecular Sciences and Biotechnology Institute (GBB), Department of Biophysical Chemistry, University of Groningen, Groningen, The Netherlands
19
Abstract
The prediction of the three-dimensional structures of the native states of proteins from the sequences of their amino acids is one of the most important challenges in molecular biology. An essential task for solving this problem within coarse-grained models is the deduction of effective interaction potentials between the amino acids. Over the years, several techniques have been developed to extract potentials that are able to discriminate satisfactorily between the native and nonnative folds of a preassigned protein sequence. In general, when these potentials are used in actual dynamical folding simulations, they lead to a drift of the native structure outside the quasinative basin. In this article, we present and validate an approach to overcome this difficulty. By exploiting several numerical and analytical tools, we set up a rigorous iterative scheme to extract potentials satisfying a prerequisite of any viable potential: the stabilization of proteins within their native basin (less than 3-4 Å RMSD). The scheme is flexible and is demonstrated to be applicable to a variety of parameterizations of the energy function, and it provides in each case the optimal potentials.
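The stability criterion mentioned in this abstract (remaining within 3-4 Å RMSD of the native structure) can be illustrated with a short check. This is a sketch under assumptions, not the authors' scheme: it assumes the frames are already superimposed on the native structure (no optimal rotation is computed), and the 4.0 Å cutoff, the function names, and the toy coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rmsd(a: Sequence[Vec3], b: Sequence[Vec3]) -> float:
    """Root-mean-square deviation between two pre-aligned coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and equally long")
    total = sum(
        (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
        for (x1, y1, z1), (x2, y2, z2) in zip(a, b)
    )
    return math.sqrt(total / len(a))


def stays_in_native_basin(
    frames: Sequence[Sequence[Vec3]], native: Sequence[Vec3], cutoff: float = 4.0
) -> bool:
    """True if every frame lies within `cutoff` (in Å) of the native structure."""
    return all(rmsd(frame, native) < cutoff for frame in frames)


# Toy data (made up): two positions, one frame shifted by 1 Å, one by 5 Å.
native = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
near = [(1.0, 0.0, 0.0), (4.8, 0.0, 0.0)]
far = [(5.0, 0.0, 0.0), (8.8, 0.0, 0.0)]

near_rmsd = rmsd(near, native)
stable = stays_in_native_basin([near], native)
drifted = not stays_in_native_basin([near, far], native)
```

A potential that passes such a test keeps every simulated frame close to the native structure. The drift described in the abstract corresponds to frames like `far` appearing in the trajectory.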
Affiliation(s)
- C Micheletti
- International School for Advanced Studies and INFM, Trieste, Italy.
|
20
|
Koonin EV, Wolf YI, Aravind L. Protein fold recognition using sequence profiles and its application in structural genomics. ADVANCES IN PROTEIN CHEMISTRY 2000; 54:245-75. [PMID: 10829230 DOI: 10.1016/s0065-3233(00)54008-x] [Citation(s) in RCA: 67] [Impact Index Per Article: 2.8] [Indexed: 12/04/2022]
Affiliation(s)
- E V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20894, USA
|
21
|
Abstract
The current state of the art in modeling protein structure has been assessed, based on the results of the CASP (Critical Assessment of protein Structure Prediction) experiments. In comparative modeling, improvements have been made in sequence alignment, sidechain orientation and loop building. Refinement of the models remains a serious challenge. Improved sequence profile methods have had a large impact in fold recognition. Although there has been some progress in alignment quality, this factor still limits model usefulness. In ab initio structure prediction, there has been notable progress in building approximately correct structures of 40-60 residue-long protein fragments. There is still a long way to go before the general ab initio prediction problem is solved. Overall, the field is maturing into a practical technology, able to deliver useful models for a large number of sequences.
Affiliation(s)
- J Moult
- Center for Advanced Research in Biotechnology, University of Maryland Biotechnology Institute, Rockville, MD 20850, USA.
|
22
|
|
23
|
Morea V, Leplae R, Tramontano A. Protein structure prediction and design. BIOTECHNOLOGY ANNUAL REVIEW 1999; 4:177-214. [PMID: 9890141 DOI: 10.1016/s1387-2656(08)70070-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 01/26/2023]
Abstract
Proteins have a unique native conformation, which can be proven in many instances to be determined by the amino acid sequence alone. The folding problem, that is the understanding of how the amino acid sequence directs folding, is still unsolved, despite more than 30 years of effort. However, many new methods have appeared in the past few years. This chapter describes the different principles underlying them and tries to give an overview of their successes and pitfalls.
Affiliation(s)
- V Morea
- IRBM P. Angeletti, Pomezia, Rome, Italy
|
24
|
Vetriani C, Maeder DL, Tolliday N, Yip KS, Stillman TJ, Britton KL, Rice DW, Klump HH, Robb FT. Protein thermostability above 100 °C: a key role for ionic interactions. Proc Natl Acad Sci U S A 1998; 95:12300-5. [PMID: 9770481 PMCID: PMC22826 DOI: 10.1073/pnas.95.21.12300] [Citation(s) in RCA: 198] [Impact Index Per Article: 7.6] [Indexed: 11/18/2022] Open
Abstract
The discovery of hyperthermophilic microorganisms and the analysis of hyperthermostable enzymes has established the fact that multisubunit enzymes can survive for prolonged periods at temperatures above 100 °C. We have carried out homology-based modeling and direct structure comparison on the hexameric glutamate dehydrogenases from the hyperthermophiles Pyrococcus furiosus and Thermococcus litoralis whose optimal growth temperatures are 100 °C and 88 °C, respectively, to determine key stabilizing features. These enzymes, which are 87% homologous, differ 16-fold in thermal stability at 104 °C. We observed that an intersubunit ion-pair network was substantially reduced in the less stable enzyme from T. litoralis, and two residues were then altered to restore these interactions. The single mutations both had adverse effects on the thermostability of the protein. However, with both mutations in place, we observed a fourfold improvement of stability at 104 °C over the wild-type enzyme. The catalytic properties of the enzymes were unaffected by the mutations. These results suggest that extensive ion-pair networks may provide a general strategy for manipulating enzyme thermostability of multisubunit enzymes. However, this study emphasizes the importance of the exact local environment of a residue in determining its effects on stability.
Affiliation(s)
- C Vetriani
- Center of Marine Biotechnology, University of Maryland Biotechnology Institute, 701 E. Pratt Street, Baltimore, MD 21202, USA
|
25
|
Benner SA, Trabesinger N, Schreiber D. Post-genomic science: converting primary structure into physiological function. ADVANCES IN ENZYME REGULATION 1998; 38:155-80. [PMID: 9762352 DOI: 10.1016/s0065-2571(97)00019-8] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022]
Affiliation(s)
- S A Benner
- Department of Chemistry, University of Florida, Gainesville 32611, USA
|
26
|
Rychlewski L, Zhang B, Godzik A. Fold and function predictions for Mycoplasma genitalium proteins. FOLDING & DESIGN 1998; 3:229-38. [PMID: 9710568 DOI: 10.1016/s1359-0278(98)00034-0] [Citation(s) in RCA: 79] [Impact Index Per Article: 3.0] [Indexed: 02/08/2023]
Abstract
BACKGROUND Uncharacterized proteins from newly sequenced genomes provide perfect targets for fold and function prediction. RESULTS For 38% of the entire genome of Mycoplasma genitalium, sequence similarity to a protein with a known structure can be recognized using a new sequence alignment algorithm. When comparing genomes of M. genitalium and Escherichia coli, >80% of M. genitalium proteins have a significant sequence similarity to a protein in E. coli and there are >40 examples that have not been recognized before. For all cases of proteins with significant profile similarities, there are strong analogies in their functions, if the functions of both proteins are known. The results presented here and other recent results strongly support the argument that such proteins are actually homologous. Assuming this homology allows one to make tentative functional assignments for >50 previously uncharacterized proteins, including such intriguing cases as the putative beta-lactam antibiotic resistance protein in M. genitalium. CONCLUSIONS Using a new profile-to-profile alignment algorithm, the three-dimensional fold can be predicted for almost 40% of proteins from a genome of the small bacterium M. genitalium, and tentative function can be assigned to almost 80% of the entire genome. Some predictions lead to new insights about known functions or point to hitherto unexpected features of M. genitalium.
Affiliation(s)
- L Rychlewski
- Department of Molecular Biology, Scripps Research Institute, La Jolla, CA 92037, USA
|
27
|
Grigoriev IV, Rakhmaninova AB, Mironov AA. Simulated annealing for alpha-helical protein folding: searches in vicinity of the "molten globule" state. J Biomol Struct Dyn 1998; 16:115-22. [PMID: 9745900 DOI: 10.1080/07391102.1998.10508232] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 02/08/2023]
Abstract
A new model for simulation of protein folding of alpha-helical proteins with known secondary structure is proposed. We are dealing here with the analysis of alpha-helix packings rather than with a detailed atom structure of a whole protein. Starting from a random compact packing of the helices the search is focused on a vicinity of "molten globule" states of a protein. In contrast to the majority of the known approaches for estimation of a protein free energy we introduce a simplified potential of interactions with solvent and consider conformational energy of the loops in addition to mean-force potential. The model was applied to several globular alpha-helical proteins and demonstrated high prediction accuracy in comparison with other known models.
Affiliation(s)
- I V Grigoriev
- Research Institute for Genetics of Industrial Microorganisms, Moscow, Russia.
|
28
|
Abstract
Genome sequencing projects continue to provide a flood of new protein sequences, and prediction methods remain an important means of adding structural information. Recently, there have been advances in secondary structure prediction, which feed, in turn, into improved fold recognition algorithms. Finally, there have been technical improvements in comparative modelling, and studies of the expected accuracy of three-dimensional structural models built by this method.
Affiliation(s)
- D R Westhead
- The European Bioinformatics Institute, EMBL Outstation, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
|
29
|
Abstract
We outline a general strategy for determining the effective coarse-grained interactions between the amino acids of a protein from the experimentally derived native-state structures. The method is, in principle, free from any adjustable or empirically determined parameters, and it is tested on simple models and compared with other existing approaches.
Affiliation(s)
- F Seno
- Istituto Nazionale per la Fisica della Materia, Dipartimento di Fisica G. Galilei, Università di Padova, Italy.
|
30
|
Sudarsanam S. Structural diversity of sequentially identical subsequences of proteins: identical octapeptides can have different conformations. Proteins 1998; 30:228-31. [PMID: 9517538 DOI: 10.1002/(sici)1097-0134(19980215)30:3<228::aid-prot2>3.0.co;2-g] [Citation(s) in RCA: 44] [Impact Index Per Article: 1.7] [Indexed: 11/06/2022]
Abstract
One of the most important questions in the protein folding problem is whether secondary structures are formed entirely by local interactions. One way to answer this question is to compare identical subsequences of proteins to see if they have identical structures. Such an exercise would also reveal a lower limit on the number of amino acids needed to form unique secondary structures. In this context, we have searched the April 1996 release of the Protein Data Bank for sequentially identical subsequences of proteins and compared their structures. We find that identical octamers can have different conformations. In addition, there are several examples of identical heptamers with different conformations, and the number of identical hexamers with different conformations has increased since the previous PDB releases. These observations imply that secondary structure can be formed entirely by non-local interactions and that an identical match of up to eight amino acids may not imply structural similarity. In addition to the larger context of the protein folding problem, these observations have implications for protein structure prediction methods.
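The search described in this abstract (find identical subsequences, then compare their structures) can be sketched as follows. This is a simplified illustration, not the author's procedure: it compares per-residue secondary-structure strings instead of three-dimensional coordinates, and the chain names, sequences, and structure strings are made-up toy data rather than Protein Data Bank entries.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def kmers_with_different_structure(
    chains: Dict[str, Tuple[str, str]], k: int = 8
) -> Dict[str, List[str]]:
    """Return k-mers that occur with more than one secondary-structure fragment.

    `chains` maps a chain id to (sequence, secondary-structure string), where
    both strings have the same length (e.g. H = helix, E = strand, C = coil).
    A k-mer repeated within one chain in two conformations is reported as well.
    """
    fragments = defaultdict(set)
    for chain_id, (sequence, structure) in chains.items():
        if len(sequence) != len(structure):
            raise ValueError(f"length mismatch in chain {chain_id}")
        for start in range(len(sequence) - k + 1):
            kmer = sequence[start:start + k]
            fragments[kmer].add(structure[start:start + k])
    return {kmer: sorted(found) for kmer, found in fragments.items() if len(found) > 1}


# Toy data (made up): the octamer VLAAGIVG is helical in one chain
# and a strand in the other.
chains = {
    "toyA": ("MKVLAAGIVGLE", "C" + "H" * 10 + "C"),
    "toyB": ("TTVLAAGIVGSS", "C" + "E" * 9 + "CC"),
}

result = kmers_with_different_structure(chains, k=8)
```

Here `result` holds the single shared octamer together with its two distinct conformations. Lowering `k` to 7 or 6 corresponds to the heptamer and hexamer searches mentioned in the abstract.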
Affiliation(s)
- S Sudarsanam
- Department of Protein Chemistry, Immunex Corporation, Seattle, Washington, USA
|
31
|
Searls DB. Grand challenges in computational biology. COMPUTATIONAL METHODS IN MOLECULAR BIOLOGY 1998. [DOI: 10.1016/s0167-7306(08)60458-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 02/10/2023]
|
32
|
Benner SA, Cannarozzi G, Gerloff D, Turcotte M, Chelvanayagam G. Bona Fide Predictions of Protein Secondary Structure Using Transparent Analyses of Multiple Sequence Alignments. Chem Rev 1997; 97:2725-2844. [PMID: 11851479 DOI: 10.1021/cr940469a] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.5] [Indexed: 11/29/2022]
Affiliation(s)
- Steven A. Benner
- Department of Chemistry, University of Florida, Gainesville, Florida 32611-7200
|
33
|
Lichtarge O, Yamamoto KR, Cohen FE. Identification of functional surfaces of the zinc binding domains of intracellular receptors. J Mol Biol 1997; 274:325-37. [PMID: 9405143 DOI: 10.1006/jmbi.1997.1395] [Citation(s) in RCA: 86] [Impact Index Per Article: 3.2] [Indexed: 02/05/2023]
Abstract
Transcriptional regulatory factor complexes assemble on genomic response elements to control gene expression. To gain insights on the surfaces that determine this assembly in the zinc binding domains from intracellular receptors, we systematically analyzed the variations in sequence and function of those domains in the context of their invariant fold. Taking the intracellular receptor superfamily as a whole revealed a hierarchy of amino acid residues along the DNA interface that correlated with response element binding specificity. When only steroid receptors were considered, two additional sites appeared: the known dimer interface, and a novel putative interface suitably located to contact regulatory factors bound to the free face of palindromic response elements commonly used by steroid receptors. Surprisingly, retinoic acid receptors, not known to bind palindromic response elements, contain both of these surfaces, implying that they may dimerize at palindromic elements under some circumstances. This work extends Evolutionary Trace analysis of functional surfaces to protein-DNA interactions, suggests how coordinated exchange of trace residues may predictably switch binding specificity, and demonstrates how to detect functional surfaces that are not apparent from sequence comparison alone.
Affiliation(s)
- O Lichtarge
- Department of Cellular and Molecular Pharmacology, University of California, San Francisco, CA 94143-0450, USA
|
34
|
Abstract
The Caenorhabditis elegans genome sequencing project has completed over half of this nematode's 100-Mb genome. Proteins predicted in the finished sequence have been compiled and released in the database Wormpep. Presented here is a comprehensive analysis of protein domain families in Wormpep 11, which comprises 7299 proteins. The relative abundance of common protein domain families was counted by comparing all Wormpep proteins to the Pfam collection of protein families, which is based on recognition by hidden Markov models. This analysis also identified a number of previously unannotated domains. To investigate new apparently nematode-specific protein families, Wormpep was clustered into domain families on the basis of sequence similarity using the Domainer program. The largest clusters that lacked clear homology to proteins outside Nematoda were analyzed in further detail, after which some could be assigned a putative function. We compared all proteins in Wormpep 11 to proteins in the human, Saccharomyces cerevisiae, and Haemophilus influenzae genomes. Among the results are the estimate that over two-thirds of the currently known human proteins are likely to have a homologue in the whole C. elegans genome, and the finding that a significant number of proteins are well conserved between C. elegans and H. influenzae but are not found in S. cerevisiae.
|
35
|
Penkett CJ, Redfield C, Dodd I, Hubbard J, McBay DL, Mossakowska DE, Smith RA, Dobson CM, Smith LJ. NMR analysis of main-chain conformational preferences in an unfolded fibronectin-binding protein. J Mol Biol 1997; 274:152-9. [PMID: 9398523 DOI: 10.1006/jmbi.1997.1369] [Citation(s) in RCA: 112] [Impact Index Per Article: 4.1] [Indexed: 02/05/2023]
Abstract
A 130-residue fragment of the Staphylococcus aureus fibronectin-binding protein has been found to exist in a highly unfolded conformation at neutral pH. Measurement of experimental NMR ³JHNα coupling constants provides evidence for individual residues having distinct main-chain conformational preferences that are dependent both on the amino acid concerned and on neighbouring residues in the sequence. Analysis shows that these variations in the populations of individual residues can be explained in detail in terms of statistical distributions of conformational states derived from the protein data base. In particular, when the preceding residue has a beta-branched or aromatic side-chain, a significant increase occurs in the population of the less sterically restricted β region of φ,ψ space. The results indicate that the local structure of the fibronectin binding protein in solution, under conditions where it displays full activity, approximates very closely to a statistical random coil structure. This may be an important feature in the biological role of this and other polypeptides involved in protein-protein interactions.
Affiliation(s)
- C J Penkett
- Oxford Centre for Molecular Sciences and New Chemistry Laboratory, University of Oxford, South Parks Road, Oxford, OX1 3QT, UK
|
36
|
Dandekar T, König R. Computational methods for the prediction of protein folds. BIOCHIMICA ET BIOPHYSICA ACTA 1997; 1343:1-15. [PMID: 9428653 DOI: 10.1016/s0167-4838(97)00132-5] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.2] [Indexed: 02/05/2023]
|
37
|
Fischer D, Eisenberg D. Assigning folds to the proteins encoded by the genome of Mycoplasma genitalium. Proc Natl Acad Sci U S A 1997; 94:11929-34. [PMID: 9342339 PMCID: PMC23659 DOI: 10.1073/pnas.94.22.11929] [Citation(s) in RCA: 93] [Impact Index Per Article: 3.4] [Indexed: 02/05/2023] Open
Abstract
A crucial step in exploiting the information inherent in genome sequences is to assign to each protein sequence its three-dimensional fold and biological function. Here we describe fold assignment for the proteins encoded by the small genome of Mycoplasma genitalium. The assignment was carried out by our computer server (http://www.doe-mbi.ucla.edu/people/frsvr/frsvr.html), which assigns folds to amino acid sequences by comparing sequence-derived predictions with known structures. Of the total of 468 protein ORFs, 103 (22%) can be assigned a known protein fold with high confidence, as cross-validated with tests on known structures. Of these sequences, 75 (16%) show enough sequence similarity to proteins of known structure that they can also be detected by traditional sequence-sequence comparison methods. That is, the remaining 28 sequences (6%) are assignable by the sequence-structure method of the server but not by current sequence-sequence methods. Of the remaining 78% of sequences in the genome, 18% belong to membrane proteins and the remaining 60% cannot be assigned either because these sequences correspond to no presently known fold or because of insensitivity of the method. At the current rate of determination of new folds by x-ray and NMR methods, extrapolation suggests that folds will be assigned to most soluble proteins in the next decade.
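The percentages quoted in this abstract follow directly from the ORF counts, as the short check below shows. The counts (468, 103, 75) and the 18% membrane-protein share are taken from the abstract. The variable names and the rounding to whole percentages are choices made here for illustration.

```python
TOTAL_ORFS = 468                   # protein ORFs in the M. genitalium genome
assigned_with_confidence = 103     # folds assigned with high confidence
found_by_sequence_comparison = 75  # also detectable by sequence-sequence methods

# Assignable only by the sequence-structure method of the server.
only_by_sequence_structure = assigned_with_confidence - found_by_sequence_comparison


def percent_of_genome(count: int) -> int:
    """Share of all ORFs, rounded to a whole percentage."""
    return round(100 * count / TOTAL_ORFS)


assigned_pct = percent_of_genome(assigned_with_confidence)
sequence_pct = percent_of_genome(found_by_sequence_comparison)
structure_only_pct = percent_of_genome(only_by_sequence_structure)

# The remainder splits into membrane proteins (18%, as stated in the abstract)
# and sequences that could not be assigned.
remaining_pct = 100 - assigned_pct
membrane_pct = 18
unassigned_pct = remaining_pct - membrane_pct
```

The computed values reproduce the figures in the abstract: 22% assigned, 16% by sequence comparison, 6% by the sequence-structure method alone, and 60% unassigned.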
Affiliation(s)
- D Fischer
- University of California, Los Angeles-Department of Energy Laboratory of Structural Biology and Molecular Medicine, Molecular Biology Institute, University of California, Los Angeles, Box 951570, Los Angeles, CA 90095-1570, USA
|
38
|
Anderson LE, Li D, Muslin EH, Stevens FJ, Schiffer M. Predicting redox-sensitive cysteines in plant enzymes by homology modeling. ACTA ACUST UNITED AC 1997. [DOI: 10.1016/s0764-4469(97)85012-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 11/15/2022]
|
39
|
Abstract
Work with proteins, particularly enzymes, is a rapidly growing segment of the biotechnology industry. Directed evolution promises to become an increasingly important strategy in their development as it allows one to sidestep some of the difficult questions relating the structural and functional properties of such proteins to their industrial utility. It is also clear, however, that greater understanding of how to engineer certain basic enzyme properties, such as stability, activity, and surface properties, is beginning to emerge, and this understanding will make rational design more efficient. To engineer a commercially useful protein many properties need to be changed, and frequently these changes are interdependent. Recent protein engineering studies on protease, amylase, lipase and cellulase illustrate some of the progress in this area.
|
40
|
Abstract
Prediction of protein structure by fold recognition, or threading, was recently put to the test in a 'blind' structure prediction experiment, CASP2. Thirty-two teams from around the world participated, preparing predictions for 22 different 'target' proteins whose structures were soon to be determined. As experimental structures became available, we, as organizers of the threading competition, computed objective measures of fold-recognition specificity and model accuracy, to identify and characterize successful predictions. Here, we present a brief summary of these prediction evaluations, a tally of 'correct' predictions and a discussion of factors associated with correct predictions. We find that threading produced specific recognition and accurate models whenever the structural database contained a template spanning a large fraction of target sequence. Presence of conserved sequence motifs was helpful, but not required, and it would appear that threading can succeed whenever similarity to a known structure is sufficiently extensive.
Affiliation(s)
- A Marchler-Bauer
- Computational Biology Branch, National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA
|
41
|
|
42
|
|
43
|
|