51
DePristo MA, De Bakker PIW, Shetty RP, Blundell TL. Discrete restraint-based protein modeling and the Calpha-trace problem. Protein Sci 2003; 12:2032-46. [PMID: 12931001 PMCID: PMC2323999 DOI: 10.1110/ps.0386903] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 10/27/2022]
Abstract
We present a novel de novo method to generate protein models from sparse, discretized restraints on the conformation of the main chain and side chain atoms. We focus on Cα-trace generation, the problem of constructing an accurate and complete model from approximate knowledge of the positions of the Cα atoms and, in some cases, the side chain centroids. Spatial restraints on the Cα atoms and side chain centroids are supplemented by constraints on main chain geometry, phi/psi angles, rotameric side chain conformations, and inter-atomic separations derived from analyses of known protein structures. A novel conformational search algorithm, combining features of tree-search and genetic algorithms, generates models consistent with these restraints by propensity-weighted dihedral angle sampling. Models with ideal geometry, good phi/psi angles, and no inter-atomic overlaps are produced with 0.8 Å main chain and, with side chain centroid restraints, 1.0 Å all-atom root-mean-square deviation (RMSD) from the crystal structure over a diverse set of target proteins. The mean model derived from 50 independently generated models is closer to the crystal structure than any individual model, with 0.5 Å main chain RMSD under only Cα restraints and 0.7 Å all-atom RMSD under both Cα and centroid restraints. The method is insensitive to randomly distributed errors of up to 4 Å in the Cα restraints. The conformational search algorithm is efficient, with computational cost increasing linearly with protein size. Issues relating to decoy set generation, experimental structure determination, efficiency of conformational sampling, and homology modeling are discussed.
Affiliation(s)
- Mark A DePristo
- Department of Biochemistry, University of Cambridge, Cambridge CB2 1GA, England.
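The RMSD figures and the mean-model comparison reported in the abstract above can be illustrated with a minimal Python sketch. It assumes that all models are already superposed on the reference structure and list the same atoms in the same order; the function names `rmsd` and `mean_model` are hypothetical and are not taken from the paper, which does not describe its implementation here.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumes the two coordinate sets are already superposed (no fitting is done here).
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def mean_model(models):
    """Average the coordinates of several models atom by atom.

    `models` is a list of coordinate lists; every model must list the same atoms
    in the same order (an assumption of this sketch).
    """
    n_models = len(models)
    n_atoms = len(models[0])
    return [
        tuple(sum(model[i][axis] for model in models) / n_models for axis in range(3))
        for i in range(n_atoms)
    ]
```

In a toy case where two models are displaced from the reference in opposite directions, each has a non-zero RMSD while their mean coincides with the reference. This is one way independent errors can partly cancel when models are averaged.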
52
Back JW, de Jong L, Muijsers AO, de Koster CG. Chemical cross-linking and mass spectrometry for protein structural modeling. J Mol Biol 2003; 331:303-13. [PMID: 12888339 DOI: 10.1016/s0022-2836(03)00721-6] [Citation(s) in RCA: 175] [Impact Index Per Article: 8.0] [Indexed: 01/28/2023]
Abstract
The growth of gene and protein sequence information is currently so rapid that three-dimensional structural information is lacking for the overwhelming majority of known proteins. In this review, efforts towards rapid and sensitive methods for protein structural characterization are described, complementing existing technologies. Based on chemical cross-linking and offering the analytical speed and sensitivity of mass spectrometry these methodologies are thought to contribute valuable tools towards future high throughput protein structure elucidation.
Affiliation(s)
- Jaap Willem Back
- Swammerdam Institute for Life Sciences (SILS), Mass Spectrometry group, University of Amsterdam, Nieuwe Achtergracht 166, 1018 WV, Amsterdam, The Netherlands.
53
Kumaran D, Eswaramoorthy S, Gerchman SE, Kycia H, Studier FW, Swaminathan S. Crystal structure of a putative CN hydrolase from yeast. Proteins 2003; 52:283-91. [PMID: 12833551 DOI: 10.1002/prot.10417] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.9] [Indexed: 11/11/2022]
Abstract
The crystal structure of a yeast hypothetical protein with sequence similarity to CN hydrolases has been determined to 2.4 Å resolution by the multiwavelength anomalous dispersion (MAD) method. The protein folds as a four-layer αββα sandwich and exists as a dimer in the crystal and in solution. It was selected in a structural genomics project as representative of CN hydrolases at a time when no structures had been determined for members of this family. Structures for two other members of the family have since been reported and the three proteins have similar topology and dimerization modes, which are distinct from those of other αββα proteins whose structures are known. The dimers form an unusual eight-layer αββα:αββα structure. Although the precise enzymatic reactions catalyzed by the yeast protein are not known, considerable information about the active site may be deduced from conserved sequence motifs, comparative biochemical information, and comparison with known structures of hydrolase active sites. As with serine hydrolases, the active-site nucleophile (cysteine in this case) is positioned on a nucleophile elbow.
Affiliation(s)
- Desigan Kumaran
- Biology Department, Brookhaven National Laboratory, Upton, New York 11973, USA
54
Ivanciuc O, Mathura V, Midoro-Horiuti T, Braun W, Goldblum RM, Schein CH. Detecting potential IgE-reactive sites on food proteins using a sequence and structure database, SDAP-food. J Agric Food Chem 2003; 51:4830-4837. [PMID: 14705920 DOI: 10.1021/jf034218r] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 05/24/2023]
Abstract
The high incidence of food allergies, including oral allergy syndrome, represents a major consideration when introducing new crops and foods. A new structural database of allergenic proteins, SDAP-Food, http://fermi.utmb.edu/SDAP/, has been developed to aid in predicting the IgE-binding potential of novel food proteins and cross-reactivities among known allergens. The site is designed to facilitate the first steps of a decision tree approach to determine the allergenicity of a given protein, based on the sequence and structural similarity to known allergens and their IgE binding sites. Immunological tests can then be used to confirm the predictions. A hierarchical procedure for identifying potential allergens, using a physical property-based sequence similarity index, has been designed to identify regions that resemble known IgE binding sites. As an example, SDAP tools were used to find food allergen sequences similar to an IgE binding site of the Jun a 3 allergen from mountain cedar pollen. The SDAP sequence similarity search matched the Jun a 3 epitope to regions in several food allergens, including cherry (Pru av 2), apple (Mal d 2) and pepper (Cap a 1), which are, like Jun a 3, members of the plant pathogenesis-related (PR-5) protein family. Homology modeling, using our EXDIS/DIAMOD/FANTOM program suite, indicated a similar surface location and structure for the potential epitope region on all of these allergens. The quantitative approach presented here can be used as part of a screening process for potential allergenicity of recombinant food products.
Affiliation(s)
- Ovidiu Ivanciuc
- Sealy Center for Structural Biology, Department of Human Biological Chemistry and Genetics, University of Texas Medical Branch, 301 University Blvd., Galveston, TX 77555-1157, USA
55
John B, Sali A. Comparative protein structure modeling by iterative alignment, model building and model assessment. Nucleic Acids Res 2003; 31:3982-92. [PMID: 12853614 PMCID: PMC165975 DOI: 10.1093/nar/gkg460] [Citation(s) in RCA: 242] [Impact Index Per Article: 11.0] [Indexed: 11/12/2022] Open
Abstract
Comparative or homology protein structure modeling is severely limited by errors in the alignment of a modeled sequence with related proteins of known three-dimensional structure. To ameliorate this problem, we have developed an automated method that optimizes both the alignment and the model implied by it. This task is achieved by a genetic algorithm protocol that starts with a set of initial alignments and then iterates through re-alignment, model building and model assessment to optimize a model assessment score. During this iterative process: (i) new alignments are constructed by application of a number of operators, such as alignment mutations and cross-overs; (ii) comparative models corresponding to these alignments are built by satisfaction of spatial restraints, as implemented in our program MODELLER; (iii) the models are assessed by a variety of criteria, partly depending on an atomic statistical potential. When testing the procedure on a very difficult set of 19 modeling targets sharing only 4-27% sequence identity with their template structures, the average final alignment accuracy increased from 37 to 45% relative to the initial alignment (the alignment accuracy was measured as the percentage of positions in the tested alignment that were identical to the reference structure-based alignment). Correspondingly, the average model accuracy increased from 43 to 54% (the model accuracy was measured as the percentage of the Cα atoms of the model that were within 5 Å of the corresponding Cα atoms in the superposed native structure). The present method also compares favorably with two of the most successful previously described methods, PSI-BLAST and SAM. The accuracy of the final models would be increased further if a better method for ranking of the models were available.
Affiliation(s)
- Bino John
- Laboratory of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, New York, NY 10021, USA
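The two accuracy measures defined in the abstract above (John and Sali) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the model is taken to be already superposed on the native structure, an alignment is represented as a list of hypothetical `(target_index, template_index)` pairs, "within 5 Å" is read as a distance less than or equal to the cutoff, and the function names are invented for this example.

```python
import math


def model_accuracy(model_ca, native_ca, cutoff=5.0):
    """Percentage of model C-alpha atoms within `cutoff` angstroms of the
    corresponding native C-alpha atoms.

    Assumes both lists hold (x, y, z) tuples for equivalent residues in the same
    order and that the structures are already superposed.
    """
    if len(model_ca) != len(native_ca) or not model_ca:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    within = sum(1 for p, q in zip(model_ca, native_ca) if math.dist(p, q) <= cutoff)
    return 100.0 * within / len(model_ca)


def alignment_accuracy(tested_pairs, reference_pairs):
    """Percentage of aligned positions in the tested alignment that also occur
    in the reference (structure-based) alignment.

    Each alignment is a list of (target_index, template_index) pairs; this
    representation is an assumption of the sketch.
    """
    if not tested_pairs:
        raise ValueError("tested alignment must contain at least one aligned pair")
    reference = set(reference_pairs)
    identical = sum(1 for pair in tested_pairs if pair in reference)
    return 100.0 * identical / len(tested_pairs)
```

Both functions return percentages, so their outputs are on the same scale as the 37 to 45% and 43 to 54% figures quoted in the abstract.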
56
McGovern SL, Shoichet BK. Information decay in molecular docking screens against holo, apo, and modeled conformations of enzymes. J Med Chem 2003; 46:2895-907. [PMID: 12825931 DOI: 10.1021/jm0300330] [Citation(s) in RCA: 201] [Impact Index Per Article: 9.1] [Indexed: 11/30/2022]
Abstract
Molecular docking uses the three-dimensional structure of a receptor to screen a small molecule database for potential ligands. The dependence of docking screens on the conformation of the binding site remains an open question. To evaluate the information loss that occurs as the active site conformation becomes less defined, a small molecule database was docked against the holo (ligand bound), apo, and homology modeled structures of 10 different enzyme binding sites. The holo and apo representations were crystallographic structures taken from the Protein Data Bank (PDB), and the homology-modeled structures were taken from the publicly available resource ModBase. The database docked was the MDL Drug Data Report (MDDR), a functionally annotated database of 95000 small molecules that contained at least 35 ligands for each of the 10 systems. In all sites, at least 99% of the molecules in the MDDR were treated as nonbinding decoys. For each system, the holo, apo, and modeled structures were used to screen the MDDR, and the ability of each structure to enrich the known ligands for that system over random selection was evaluated. The best overall enrichment was produced by the holo structure in seven systems, the apo structure in two systems, and the modeled structure in one system. These results suggest that the performance of the docking calculation is affected by the particular representation of the receptor used in the screen, and that the holo structure is the one most likely to yield the best discrimination between known ligands and decoy molecules, but important exceptions to this rule also emerge from this study. Although each of the holo, apo, and modeled conformations led to enrichment of known ligands in all systems, the enrichment did not always rise to a level judged to be sufficient to justify the effort of a docking screen. 
Using a 20-fold enrichment of known ligands over random selection as a rough guideline for what might be enough to justify a docking screen, the holo conformation of the enzyme met this criterion in eight of 10 sites, whereas the apo conformation met this criterion in only two sites and the modeled conformation in three.
Affiliation(s)
- Susan L McGovern
- Department of Molecular Pharmacology and Biological Chemistry, Northwestern University, 303 East Chicago Avenue, Chicago, Illinois 60611, USA
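The notion of enrichment of known ligands over random selection, used in the abstract above, can be sketched as follows. The abstract does not state the exact formula or the fraction of the ranked database that was examined, so the definition below (ligand rate in the top-ranked fraction divided by the ligand rate in the whole database) and the name `enrichment_factor` are assumptions made for illustration.

```python
def enrichment_factor(ranked_is_ligand, top_fraction):
    """Enrichment of known ligands in the top-ranked part of a docking hit list.

    `ranked_is_ligand` is a list of booleans ordered from best to worst docking
    score, True where the molecule is a known ligand. `top_fraction` is the
    fraction of the database inspected (for example 0.01 for the top 1%).
    A value of 1.0 corresponds to random selection.
    """
    n_total = len(ranked_is_ligand)
    n_ligands = sum(ranked_is_ligand)
    if n_total == 0 or n_ligands == 0:
        raise ValueError("need a non-empty list containing at least one ligand")
    if not 0.0 < top_fraction <= 1.0:
        raise ValueError("top_fraction must be in (0, 1]")
    n_top = max(1, round(n_total * top_fraction))
    hits_in_top = sum(ranked_is_ligand[:n_top])
    # (hits_in_top / n_top) / (n_ligands / n_total), written with one division
    return (hits_in_top * n_total) / (n_top * n_ligands)
```

Under this definition, the 20-fold guideline mentioned in the abstract would correspond to a return value of at least 20 for the chosen fraction.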
57
Eswar N, John B, Mirkovic N, Fiser A, Ilyin VA, Pieper U, Stuart AC, Marti-Renom MA, Madhusudhan MS, Yerkovich B, Sali A. Tools for comparative protein structure modeling and analysis. Nucleic Acids Res 2003; 31:3375-80. [PMID: 12824331 PMCID: PMC168950 DOI: 10.1093/nar/gkg543] [Citation(s) in RCA: 364] [Impact Index Per Article: 16.5] [Indexed: 11/13/2022] Open
Abstract
The following resources for comparative protein structure modeling and analysis are described (http://salilab.org): MODELLER, a program for comparative modeling by satisfaction of spatial restraints; MODWEB, a web server for automated comparative modeling that relies on PSI-BLAST, IMPALA and MODELLER; MODLOOP, a web server for automated loop modeling that relies on MODELLER; MOULDER, a CPU intensive protocol of MODWEB for building comparative models based on distant known structures; MODBASE, a comprehensive database of annotated comparative models for all sequences detectably related to a known structure; MODVIEW, a Netscape plugin for Linux that integrates viewing of multiple sequences and structures; and SNPWEB, a web server for structure-based prediction of the functional impact of a single amino acid substitution.
Affiliation(s)
- Narayanan Eswar
- Department of Biopharmaceutical Sciences and California Institute for Quantitative Biomedical Research, University of California, San Francisco, CA 94143-2240, USA
58
Bader GD, Heilbut A, Andrews B, Tyers M, Hughes T, Boone C. Functional genomics and proteomics: charting a multidimensional map of the yeast cell. Trends Cell Biol 2003; 13:344-56. [PMID: 12837605 DOI: 10.1016/s0962-8924(03)00127-2] [Citation(s) in RCA: 67] [Impact Index Per Article: 3.0] [Indexed: 02/02/2023]
Abstract
The challenge of large-scale functional genomics projects is to build a comprehensive map of the cell including genome sequence and gene expression data, information on protein localization, structure, function and expression, post-translational modifications, molecular and genetic interactions and phenotypic descriptions. Some of this broad set of functional genomics data has been already assembled for the budding yeast. Even though molecular cartography of the yeast cell is still far from comprehensive, functional genomics has begun to forge connections between disparate cellular events and to foster numerous hypotheses. Here we review several different genomics and proteomics technologies and describe bioinformatics methods for exploring these data to make new discoveries.
Affiliation(s)
- Gary D Bader
- Computational Biology Center, Memorial Sloan-Kettering Cancer Center, 1275 York Avenue, Box 460, 10021, New York, NY, USA
59
Gao H, Sengupta J, Valle M, Korostelev A, Eswar N, Stagg SM, Van Roey P, Agrawal RK, Harvey SC, Sali A, Chapman MS, Frank J. Study of the structural dynamics of the E. coli 70S ribosome using real-space refinement. Cell 2003; 113:789-801. [PMID: 12809609 DOI: 10.1016/s0092-8674(03)00427-6] [Citation(s) in RCA: 225] [Impact Index Per Article: 10.2] [Indexed: 11/17/2022]
Abstract
Cryo-EM density maps showing the 70S ribosome of E. coli in two different functional states related by a ratchet-like motion were analyzed using real-space refinement. Comparison of the two resulting atomic models shows that the ribosome changes from a compact structure to a looser one, coupled with the rearrangement of many of the proteins. Furthermore, in contrast to the unchanged inter-subunit bridges formed wholly by RNA, the bridges involving proteins undergo large conformational changes following the ratchet-like motion, suggesting an important role of ribosomal proteins in facilitating the dynamics of translation.
Affiliation(s)
- Haixiao Gao
- Howard Hughes Medical Institute, Health Research, Inc, Empire State Plaza, Albany, NY 12201, USA
60
Abstract
The study of structural genomics and structural proteomics has determined the tertiary structures of many hypothetical proteins, whose molecular functions could not be understood using conventional methods. In order to infer the geometrical location of the functional site, the biochemical function and the biological function of the hypothetical protein, much effort has been made in protein informatics. The importance of heterogeneous databases and various descriptors of amino acid sequences, tertiary structures and pathways on the proteome scale has been emphasised.
Affiliation(s)
- Kengo Kinoshita
- Graduate School of Integrated Science, Yokohama City University, 1-7-29 Suehiro-cho, Turumi-ku, 230-0045, Yokohama, Japan.
61
Abstract
Technical advances on several frontiers have expanded the applicability of existing methods in structural biology and helped close the resolution gaps between them. As a result, we are now poised to integrate structural information gathered at multiple levels of the biological hierarchy - from atoms to cells - into a common framework. The goal is a comprehensive description of the multitude of interactions between molecular entities, which in turn is a prerequisite for the discovery of general structural principles that underlie all cellular processes.
Affiliation(s)
- Andrej Sali
- Department of Biopharmaceutical Sciences, and California Institute for Quantitative Biomedical Research, University of California, San Francisco, California 94143, USA
62
Zaim J, Kierzek AM. The structure of full-length LysR-type transcriptional regulators. Modeling of the full-length OxyR transcription factor dimer. Nucleic Acids Res 2003; 31:1444-54. [PMID: 12595552 PMCID: PMC149827 DOI: 10.1093/nar/gkg234] [Citation(s) in RCA: 51] [Impact Index Per Article: 2.3] [Indexed: 11/12/2022] Open
Abstract
The LysR-type transcriptional regulators (LTTRs) comprise the largest family of prokaryotic transcription factors. These proteins are composed of an N-terminal DNA binding domain (DBD) and a C-terminal cofactor binding domain. To date, no structure of the DBD has been solved. According to the SUPERFAMILY and MODBASE databases, a reliable homology model of LTTR DBDs may be built using the structure of the Escherichia coli ModE transcription factor, containing a winged helix-turn-helix (HTH) motif, as a template. The remote, but statistically significant, sequence similarity between ModE and LTTR DBDs and an alignment generated using SUPERFAMILY and MODBASE methods were independently confirmed by alignment of sequence profiles representing ModE and LTTR family DBDs. Using the crystal structure of the E. coli OxyR C-terminal domain and the DBD alignments we constructed a structural model of the full-length dimer of this LTTR family member and used it to investigate the mode of protein-DNA interaction. We also applied the model to interpret, in a structural context, the results of numerous biochemical studies of mutated LTTRs. A comparison of the LTTR DBD model with the structures of other HTH proteins also provides insights into the interaction of LTTRs with the C-terminal domain of the RNA polymerase alpha subunit.
Affiliation(s)
- Jolanta Zaim
- Institute of Biochemistry and Biophysics, Polish Academy of Sciences, Pawinskiego 5a, 02-106 Warsaw, Poland
63
Schafferhans A, Meyer JEW, O'Donoghue SI. The PSSH database of alignments between protein sequences and tertiary structures. Nucleic Acids Res 2003; 31:494-8. [PMID: 12520061 PMCID: PMC165557 DOI: 10.1093/nar/gkg110] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.4] [Received: 09/13/2002] [Revised: 10/30/2002] [Accepted: 10/30/2002] [Indexed: 11/14/2022] Open
Abstract
We introduce the PSSH ('Protein Sequence-to-Structure Homologies') database derived from HSSP2, an improved version of the HSSP ('Homology-derived Secondary Structure of Proteins') database [Dodge et al. (1998) Nucleic Acids Res., 26, 313-315]. Whereas each HSSP entry lists all protein sequences related to a given 3D structure, PSSH is the 'inverse', with each entry listing all structures related to a given sequence. In addition, we introduce two other derived databases: HSSPchain, in which each entry lists all sequences related to a given PDB chain, and HSSPalign, in which each entry gives details of one sequence aligned onto one PDB chain. This re-organization makes it easier to navigate from sequence to structure, and to map sequence features onto 3D structures. Currently (September 2002), PSSH provides structural information for over 400 000 protein sequences, covering 48% of SWALL and 61% of SWISS-PROT sequences; HSSPchain provides sequence information for over 25 000 PDB chains, and HSSPalign gives over 14 million sequence-to-structure alignments. The databases can be accessed via SRS 3D, an extension to the SRS system, at http://srs3d.ebi.ac.uk/.
64
Bonneau R, Strauss CEM, Rohl CA, Chivian D, Bradley P, Malmström L, Robertson T, Baker D. De novo prediction of three-dimensional structures for major protein families. J Mol Biol 2002; 322:65-78. [PMID: 12215415 DOI: 10.1016/s0022-2836(02)00698-8] [Citation(s) in RCA: 180] [Impact Index Per Article: 7.8] [Indexed: 11/20/2022]
Abstract
We use the Rosetta de novo structure prediction method to produce three-dimensional structure models for all Pfam-A sequence families with average length under 150 residues and no link to any protein of known structure. To estimate the reliability of the predictions, the method was calibrated on 131 proteins of known structure. For approximately 60% of the proteins one of the top five models was correctly predicted for 50 or more residues, and for approximately 35%, the correct SCOP superfamily was identified in a structure-based search of the Protein Data Bank using one of the models. This performance is consistent with results from the fourth critical assessment of structure prediction (CASP4). Correct and incorrect predictions could be partially distinguished using a confidence function based on a combination of simulation convergence, protein length and the similarity of a given structure prediction to known protein structures. While the limited accuracy and reliability of the method precludes definitive conclusions, the Pfam models provide the only tertiary structure information available for the 12% of publicly available sequences represented by these large protein families.
Affiliation(s)
- Richard Bonneau
- Department of Biochemistry, University of Washington, Seattle, WA 98195-7350, USA
65
Abstract
Chemical genomics represents a convergence of biology and chemistry in the era of global approaches to target identification and intervention. The success of genomics has led to a bottleneck in target validation that could be overcome by using small diverse organic compounds to interfere with biological processes. Because of the limitations of existing compound collections, this diversity can only fully be exploited using in silico design techniques to guide the selection of molecules with optimal binding properties. Structure-based design is used to create structures de novo that can be synthesized for use as chemical probes and drug leads.
Affiliation(s)
- Edward D Zanders
- De Novo Pharmaceuticals, Compass House, Vision Park, Histon, Cambridge, UK CB4 9ZR.
66
Fiser A, Feig M, Brooks CL, Sali A. Evolution and physics in comparative protein structure modeling. Acc Chem Res 2002; 35:413-21. [PMID: 12069626 DOI: 10.1021/ar010061h] [Citation(s) in RCA: 71] [Impact Index Per Article: 3.1] [Indexed: 11/29/2022]
Abstract
From a physical perspective, the native structure of a protein is a consequence of physical forces acting on the protein and solvent atoms during the folding process. From a biological perspective, the native structure of proteins is a result of evolution over millions of years. Correspondingly, there are two types of protein structure prediction methods, de novo prediction and comparative modeling. We review comparative protein structure modeling and discuss the incorporation of physical considerations into the modeling process. A good starting point for achieving this aim is provided by comparative modeling by satisfaction of spatial restraints. Incorporation of physical considerations is illustrated by an inclusion of solvation effects into the modeling of loops.
Affiliation(s)
- András Fiser
- Laboratory of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, 1230 York Avenue, New York, New York 10021, USA
67
Romanowski MJ, Soccio RE, Breslow JL, Burley SK. Crystal structure of the Mus musculus cholesterol-regulated START protein 4 (StarD4) containing a StAR-related lipid transfer domain. Proc Natl Acad Sci U S A 2002; 99:6949-54. [PMID: 12011453 PMCID: PMC124509 DOI: 10.1073/pnas.052140699] [Citation(s) in RCA: 125] [Impact Index Per Article: 5.4] [Indexed: 11/18/2022] Open
Abstract
The x-ray structure of the mouse cholesterol-regulated START protein 4 (StarD4) has been determined at 2.2-Å resolution, revealing a compact alpha/beta structure related to the START domain present in the cytoplasmic C-terminal portion of human MLN64. The volume of the putative lipid-binding tunnel was estimated at 847 Å³, which is consistent with the binding of one cholesterol-size lipid molecule. Comparison of the tunnel-lining residues in StarD4 and MLN64-START permitted identification of possible lipid specificity determinants in both molecular tunnels. Homology modeling of related proteins, and comparison of the StarD4 and MLN64-START structures, showed that StarD4 is a member of a large START domain superfamily characterized by the helix-grip fold. Additional mechanistic and evolutionary studies should be facilitated by the availability of a second START domain structure from a distant relative of MLN64.
Affiliation(s)
- Michael J Romanowski
- Laboratories of Molecular Biophysics, The Rockefeller University, 1230 York Avenue, New York, NY 10021, USA.
68
Chance MR, Bresnick AR, Burley SK, Jiang JS, Lima CD, Sali A, Almo SC, Bonanno JB, Buglino JA, Boulton S, Chen H, Eswar N, He G, Huang R, Ilyin V, McMahan L, Pieper U, Ray S, Vidal M, Wang LK. Structural genomics: a pipeline for providing structures for the biologist. Protein Sci 2002; 11:723-38. [PMID: 11910018 PMCID: PMC2373525 DOI: 10.1110/ps.4570102] [Citation(s) in RCA: 119] [Impact Index Per Article: 5.2] [Indexed: 10/16/2022]
Affiliation(s)
- Mark R Chance
- Center for Synchrotron Biosciences, Albert Einstein College of Medicine, Bronx, New York 10461, USA.
69
Abstract
A protein structure model generally needs to be evaluated to assess whether or not it has the correct fold. To improve fold assessment, four types of a residue-level statistical potential were optimized, including distance-dependent, contact, Phi/Psi dihedral angle, and accessible surface statistical potentials. Approximately 10,000 test models with the correct and incorrect folds were built by automated comparative modeling of protein sequences of known structure. The criterion used to discriminate between the correct and incorrect models was the Z-score of the model energy. The performance of a Z-score was determined as a function of many variables in the derivation and use of the corresponding statistical potential. The performance was measured by the fractions of the correctly and incorrectly assessed test models. The most discriminating combination of any one of the four tested potentials is the sum of the normalized distance-dependent and accessible surface potentials. The distance-dependent potential that is optimal for assessing models of all sizes uses both Cα and Cβ atoms as interaction centers, distinguishes between all 20 standard residue types, has the distance range of 30 Å, and is derived and used by taking into account the sequence separation of the interacting atom pairs. The terms for the sequentially local interactions are significantly less informative than those for the sequentially nonlocal interactions. The accessible surface potential that is optimal for assessing models of all sizes uses Cβ atoms as interaction centers and distinguishes between all 20 standard residue types. The performance of the tested statistical potentials is not likely to improve significantly with an increase in the number of known protein structures used in their derivation. 
The parameters of fold assessment whose optimal values vary significantly with model size include the size of the known protein structures used to derive the potential and the distance range of the accessible surface potential. Fold assessment by statistical potentials is most difficult for the very small models. This difficulty presents a challenge to fold assessment in large-scale comparative modeling, which produces many small and incomplete models. The results described in this study provide a basis for an optimal use of statistical potentials in fold assessment.
Affiliation(s)
- Francisco Melo
- Laboratories of Molecular Biophysics, Pels Family Center for Biochemistry and Structural Biology, The Rockefeller University, New York, New York 10021, USA
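The Z-score criterion mentioned in the abstract above can be illustrated with a short Python sketch. The abstract does not say how the reference energy distribution is obtained or which cutoff separates correct from incorrect folds, so the `reference_energies` argument, the use of the population standard deviation, the default `threshold` of -2.0, and both function names are assumptions introduced only for this example.

```python
import statistics


def z_score(model_energy, reference_energies):
    """Z-score of a model's statistical-potential energy relative to a
    reference distribution of energies.

    How the reference energies are generated is left to the caller (an
    assumption of this sketch). More negative values mean the model energy is
    unusually low compared with the reference set.
    """
    if len(reference_energies) < 2:
        raise ValueError("need at least two reference energies")
    mean = statistics.mean(reference_energies)
    spread = statistics.pstdev(reference_energies)
    if spread == 0:
        raise ValueError("reference energies have zero spread")
    return (model_energy - mean) / spread


def looks_like_correct_fold(model_energy, reference_energies, threshold=-2.0):
    """Classify a model as having the correct fold when its Z-score is at or
    below `threshold` (a hypothetical cutoff, not a value from the paper)."""
    return z_score(model_energy, reference_energies) <= threshold
```

Sweeping the hypothetical `threshold` over a set of models with known fold status would give the fractions of correctly and incorrectly assessed models that the abstract uses as its performance measure.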