101
Ashby M, Houmard J. Cyanobacterial two-component proteins: structure, diversity, distribution, and evolution. Microbiol Mol Biol Rev 2006; 70:472-509. [PMID: 16760311 PMCID: PMC1489541 DOI: 10.1128/mmbr.00046-05] [Citation(s) in RCA: 104] [Impact Index Per Article: 5.5] [Indexed: 12/22/2022]
Abstract
A survey of the already characterized and potential two-component protein sequences that exist in the nine complete and seven partially annotated cyanobacterial genome sequences available (as of May 2005) showed that the cyanobacteria possess a much larger repertoire of such proteins than most other bacteria. By analysis of the domain structure of the 1,171 potential histidine kinases, response regulators, and hybrid kinases, many various arrangements of about thirty different modules could be distinguished. The number of two-component proteins is related in part to genome size but also to the variety of physiological properties and ecophysiologies of the different strains. Groups of orthologues were defined, only a few of which have representatives with known physiological functions. Based on comparisons with the proposed phylogenetic relationships between the strains, the orthology groups show that (i) a few genes, some of them clustered on the genome, have been conserved by all species, suggesting their very ancient origin and an essential role for the corresponding proteins, and (ii) duplications, fusions, gene losses, insertions, and deletions, as well as domain shuffling, occurred during evolution, leading to the extant repertoire. These mechanisms are put in perspective with the different genetic properties that cyanobacteria have to achieve genome plasticity. This review is designed to serve as a basis for orienting further research aimed at defining the most ancient regulatory mechanisms and understanding how evolution worked to select and keep the most appropriate systems for cyanobacteria to develop in the quite different environments that they have successfully colonized.
Affiliation(s)
- Mark K. Ashby
- Department of Basic Medical Sciences, Biochemistry Section, University of the West Indies, Mona Campus, Kingston 7, Jamaica, Ecole Normale Supérieure, CNRS UMR 8541, Génétique Moléculaire, 46 rue d'Ulm, 75230 Paris Cedex 05, France
- Jean Houmard
- Department of Basic Medical Sciences, Biochemistry Section, University of the West Indies, Mona Campus, Kingston 7, Jamaica, Ecole Normale Supérieure, CNRS UMR 8541, Génétique Moléculaire, 46 rue d'Ulm, 75230 Paris Cedex 05, France
- Corresponding author. Mailing address: Ecole Normale Supérieure, CNRS UMR 8541, Génétique Moléculaire, 46 rue d'Ulm, 75230 Paris Cedex 05, France. Phone: 33 1 44 32 35 19. Fax: 33 1 44 96 53 60. E-mail:
102
Chi PH, Shyu CR, Xu D. A fast SCOP fold classification system using content-based E-Predict algorithm. BMC Bioinformatics 2006; 7:362. [PMID: 16872501 PMCID: PMC1579235 DOI: 10.1186/1471-2105-7-362] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Received: 12/29/2005] [Accepted: 07/26/2006] [Indexed: 11/10/2022]
Abstract
Background: Domain experts manually construct the Structural Classification of Protein (SCOP) database to categorize and compare protein structures. Even though using the SCOP database is believed to be more reliable than classification results from other methods, it is labor intensive. To mimic human classification processes, we develop an automatic SCOP fold classification system to assign possible known SCOP folds and recognize novel folds for newly-discovered proteins. Results: With a sufficient amount of ground truth data, our system is able to assign the known folds for newly-discovered proteins in the latest SCOP v1.69 release with 92.17% accuracy. Our system also recognizes the novel folds with 89.27% accuracy using 10 fold cross validation. The average response time for proteins with 500 and 1409 amino acids to complete the classification process is 4.1 and 17.4 seconds, respectively. By comparison with several structural alignment algorithms, our approach outperforms previous methods on both the classification accuracy and efficiency. Conclusion: In this paper, we build an advanced, non-parametric classifier to accelerate the manual classification processes of SCOP. With satisfactory ground truth data from the SCOP database, our approach identifies relevant domain knowledge and yields reasonably accurate classifications. Our system is publicly accessible at .
Affiliation(s)
- Pin-Hao Chi
- Medical and Biological Digital Library Research Lab, Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Chi-Ren Shyu
- Medical and Biological Digital Library Research Lab, Department of Computer Science, University of Missouri, Columbia, MO 65211, USA
- Dong Xu
- Digital Biology Laboratory, Department of Computer Science and Life Sciences Center, University of Missouri, Columbia, MO 65211, USA
103
Benita Y, Wise MJ, Lok MC, Humphery-Smith I, Oosting RS. Analysis of high throughput protein expression in Escherichia coli. Mol Cell Proteomics 2006; 5:1567-80. [PMID: 16822774 DOI: 10.1074/mcp.m600140-mcp200] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.7] [Indexed: 11/06/2022]
Abstract
The ability to efficiently produce hundreds of proteins in parallel is the most basic requirement of many aspects of proteomics. Overcoming the technical and financial barriers associated with high throughput protein production is essential for the development of an experimental platform to query and browse the protein content of a cell (e.g. protein and antibody arrays). Proteins are inherently different one from another in their physicochemical properties; therefore, no single protocol can be expected to successfully express most of the proteins. Instead of optimizing a protocol to express a specific protein, we used sequence analysis tools to estimate the probability of a specific protein to be expressed successfully using a given protocol, thereby avoiding a priori proteins with a low success probability. A set of 547 proteins, to be used for antibody production and selection, was expressed in Escherichia coli using a high throughput protein production pipeline. Protein properties derived from sequence alone were correlated to successful expression, and general guidelines are given to increase the efficiency of similar pipelines. A second set of 68 proteins was expressed to investigate the link between successful protein expression and inclusion body formation. More proteins were expressed in inclusion bodies; however, the formation of inclusion bodies was not a requirement for successful expression.
Affiliation(s)
- Yair Benita
- Department of Psychopharmacology, Utrecht Institute of Pharmaceutical Sciences (UIPS), Utrecht University, Sorbonnelaan 16, 3584 CA Utrecht, The Netherlands
104
Shvartsburg AA, Li F, Tang K, Smith RD. Characterizing the Structures and Folding of Free Proteins Using 2-D Gas-Phase Separations: Observation of Multiple Unfolded Conformers. Anal Chem 2006; 78:3304-15. [PMID: 16689531 DOI: 10.1021/ac060283z] [Citation(s) in RCA: 79] [Impact Index Per Article: 4.2] [Indexed: 11/29/2022]
Abstract
Understanding the 3-D structure and dynamics of proteins and other biological macromolecules in various environments is among the central challenges of chemistry. Electrospray ionization can often transfer ions from solution to gas phase with only limited structural distortion, allowing their profiling using mass spectrometry and other gas-phase approaches. Ion mobility spectrometry (IMS) can separate and characterize macroion conformations with high sensitivity and speed. However, IMS separation power is generally insufficient for full resolution of major structural variants of protein ions and elucidation of their interconversion dynamics. Here we report characterization of macromolecular conformations using field asymmetric waveform IMS (FAIMS) coupled to conventional IMS in conjunction with mass spectrometry. The collisional heating of ions in the electrodynamic funnel trap between FAIMS and IMS stages enables investigating the structural evolution of particular isomeric precursors as a function of the intensity and duration of activation that can be varied over large ranges. These new capabilities are demonstrated for ubiquitin and cytochrome c, two common model proteins for structure and folding studies. For nearly all charge states, two-dimensional FAIMS/IMS separations distinguish many more conformations than either FAIMS or IMS alone, including some with very low abundance. For cytochrome c in high charge states, we find several abundant "unfolded" isomer series not distinguishable by IMS, possibly corresponding to different "string of beads" geometries. The unfolding of specific ubiquitin conformers selected by FAIMS has been studied by employing their heating in the FAIMS/IMS interface.
Affiliation(s)
- Alexandre A Shvartsburg
- Biological Sciences Division, Pacific Northwest National Laboratory, P.O. Box 999, Richland, WA 99352, USA
105
Gao W, Liu Y, Giometti CS, Tollaksen SL, Khare T, Wu L, Klingeman DM, Fields MW, Zhou J. Knock-out of SO1377 gene, which encodes the member of a conserved hypothetical bacterial protein family COG2268, results in alteration of iron metabolism, increased spontaneous mutation and hydrogen peroxide sensitivity in Shewanella oneidensis MR-1. BMC Genomics 2006; 7:76. [PMID: 16600046 PMCID: PMC1468410 DOI: 10.1186/1471-2164-7-76] [Citation(s) in RCA: 44] [Impact Index Per Article: 2.3] [Received: 01/03/2006] [Accepted: 04/06/2006] [Indexed: 01/28/2023]
Abstract
Background: Shewanella oneidensis MR-1 is a facultative, gram-negative bacterium capable of coupling the oxidation of organic carbon to a wide range of electron acceptors such as oxygen, nitrate and metals, and has potential for bioremediation of heavy metal contaminated sites. The complete 5-Mb genome of S. oneidensis MR-1 was sequenced and standard sequence-comparison methods revealed approximately 42% of the MR-1 genome encodes proteins of unknown function. Defining the functions of hypothetical proteins is a great challenge and may need a systems approach. In this study, by using integrated approaches including whole genomic microarray and proteomics, we examined knockout effects of the gene encoding SO1377 (gi24372955), a member of the conserved, hypothetical, bacterial protein family COG2268 (Clusters of Orthologous Group) in bacterium Shewanella oneidensis MR-1, under various physiological conditions. Results: Compared with the wild-type strain, growth assays showed that the deletion mutant had a decreased growth rate when cultured aerobically, but not affected under anaerobic conditions. Whole-genome expression (RNA and protein) profiles revealed numerous gene and protein expression changes relative to the wild-type control, including some involved in iron metabolism, oxidative damage protection and respiratory electron transfer, e.g. complex IV of the respiration chain. Although total intracellular iron levels remained unchanged, whole-cell electron paramagnetic resonance (EPR) demonstrated that the level of free iron in mutant cells was 3 times less than that of the wild-type strain. Siderophore excretion in the mutant also decreased in iron-depleted medium. The mutant was more sensitive to hydrogen peroxide and gave rise to 100 times more colonies resistant to gentamicin or kanamycin. Conclusion: Our results showed that the knock-out of SO1377 gene had pleiotropic effects and suggested that SO1377 may play a role in iron homeostasis and oxidative damage protection in S. oneidensis MR-1.
Affiliation(s)
- Weimin Gao
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Yongqing Liu
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Carol S Giometti
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Sandra L Tollaksen
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Tripti Khare
- Biosciences Division, Argonne National Laboratory, Argonne, IL 60439, USA
- Liyou Wu
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Dawn M Klingeman
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Matthew W Fields
- Department of Microbiology, Miami University, Oxford, OH 45056, USA
- Jizhong Zhou
- Environmental Sciences Division, Oak Ridge National Laboratory, Oak Ridge, TN 37831, USA
- Institute for Environmental Genomics and Department of Botany and Microbiology, University of Oklahoma, Norman, OK 73019, USA
106
Qiu Y, Tereshko V, Kim Y, Zhang R, Collart F, Yousef M, Kossiakoff A, Joachimiak A. The crystal structure of Aq_328 from the hyperthermophilic bacteria Aquifex aeolicus shows an ancestral histone fold. Proteins 2006; 62:8-16. [PMID: 16287087 PMCID: PMC2792020 DOI: 10.1002/prot.20590] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 01/07/2023]
Abstract
The structure of Aq_328, an uncharacterized protein from hyperthermophilic bacteria Aquifex aeolicus, has been determined to 1.9 Å by using multi-wavelength anomalous diffraction (MAD) phasing. Although the amino acid sequence analysis shows that Aq_328 has no significant similarity to proteins with a known structure and function, the structure comparison by using the Dali server reveals that it: (1) assumes a histone-like fold, and (2) is similar to an ancestral nuclear histone protein (PDB code 1F1E) with z-score 8.1 and RMSD 3.6 Å over 124 residues. A sedimentation equilibrium experiment indicates that Aq_328 is a monomer in solution, with an average sedimentation coefficient of 2.4 and an apparent molecular weight of about 20 kDa. The overall architecture of Aq_328 consists of two noncanonical histone domains in tandem repeat within a single chain, and is similar to eukaryotic heterodimer (H2A/H2B and H3/H4) and an archaeal histone heterodimer (HMfA/HMfB). The sequence comparisons between the two histone domains of Aq_328 and six eukaryotic/archaeal histones demonstrate that most of the conserved residues that underlie the Aq_328 architecture are used to build and stabilize the two cross-shaped antiparallel histone domains. The high percentage of salt bridges in the structure could be a factor in the protein's thermostability. The structural similarities to other histone-like proteins, molecular properties, and potential function of Aq_328 are discussed in this paper.
Affiliation(s)
- Yang Qiu
- The University of Chicago, Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois
- Valentina Tereshko
- The University of Chicago, Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois
- Youngchang Kim
- Structural Biology Center and Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, Illinois
- Rongguang Zhang
- Structural Biology Center and Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, Illinois
- Frank Collart
- Structural Biology Center and Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, Illinois
- Mohammed Yousef
- The University of Chicago, Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois
- Anthony Kossiakoff
- The University of Chicago, Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois
- Andrzej Joachimiak
- The University of Chicago, Department of Biochemistry and Molecular Biology, University of Chicago, Chicago, Illinois
- Structural Biology Center and Midwest Center for Structural Genomics, Biosciences Division, Argonne National Laboratory, Argonne, Illinois
107
Arcus VL, Lott JS, Johnston JM, Baker EN. The potential impact of structural genomics on tuberculosis drug discovery. Drug Discov Today 2006; 11:28-34. [PMID: 16478688 DOI: 10.1016/s1359-6446(05)03667-6] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Indexed: 11/16/2022]
Abstract
Mycobacterium tuberculosis, the causative agent of tuberculosis (TB) in humans, is a devastating infectious organism that kills approximately two million people annually. The current suite of antibiotics used to treat TB faces two main difficulties: (i) the emergence of multidrug-resistant (MDR) strains of M. tuberculosis, and (ii) the persistent state of the bacterium, which is less susceptible to antibiotics and causes very long antibiotic treatment regimes. The complete genome sequences of a laboratory strain (H37Rv) and a clinical strain (CDC1551) of M. tuberculosis and the concurrent identification of all the open reading frames that encode proteins within this organism, present structural biologists with a wide array of protein targets for structure determination. Comparative genomics of the species that make up the M. tuberculosis complex has also added an array of genomic information to our understanding of these organisms. In response to this, structural genomics consortia have been established for targeting proteins from M. tuberculosis. This review looks at the progress of these major initiatives and the potential impact of large scale structure determination efforts on the development of inhibitors to many proteins. Increasing sophistication in structure-based drug design approaches, in combination with increasing numbers of protein structures and inhibitors for TB proteins, will have a significant impact on the downstream development of TB antibiotics.
Affiliation(s)
- Vickery L Arcus
- AgResearch Structural Biology Laboratory, School of Biological Sciences, University of Auckland, Private Bag 92-019, Auckland, New Zealand.
108
109
Riboldi-Tunnicliffe A, Isaacs NW, Mitchell TJ. 1.2 Angstroms crystal structure of the S. pneumoniae PhtA histidine triad domain a novel zinc binding fold. FEBS Lett 2005; 579:5353-60. [PMID: 16194532 DOI: 10.1016/j.febslet.2005.08.066] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.1] [Received: 06/17/2005] [Revised: 08/30/2005] [Accepted: 08/31/2005] [Indexed: 11/29/2022]
Abstract
The recently described pneumococcal histidine triad protein family has been shown to be highly conserved within the pneumococcus. As part of our structural genomics effort on proteins from Streptococcus pneumoniae, we have expressed, crystallised and solved the structure of PhtA-166-220 at 1.2 Angstroms using remote SAD with zinc. The structure of PhtA-166-220 shows no similarity to any protein structure. The overall fold contains 3 beta-strands and a single short alpha-helix. The structure appears to contain a novel zinc binding motif. The remaining 4 histidine triad repeats from PhtA have been modelled based on the crystal structure of the PhtA histidine triad repeat 2. From this modelling work, we speculate that only three of the five histidine triad repeats contain the residues in the correct geometry to allow the binding of a zinc ion.
Affiliation(s)
- A Riboldi-Tunnicliffe
- University of Glasgow, Division of Infection and Immunity, IBLS Joseph Black Building, UK
110
Nachin L, Nannmark U, Nyström T. Differential roles of the universal stress proteins of Escherichia coli in oxidative stress resistance, adhesion, and motility. J Bacteriol 2005; 187:6265-72. [PMID: 16159758 PMCID: PMC1236625 DOI: 10.1128/jb.187.18.6265-6272.2005] [Citation(s) in RCA: 258] [Impact Index Per Article: 12.9] [Indexed: 11/20/2022]
Abstract
The universal stress protein (UspA) superfamily encompasses a conserved group of proteins that are found in bacteria, archaea, and eukaryotes. Escherichia coli harbors six usp genes--uspA, -C, -D, -E, -F, and -G--the expression of which is triggered by a large variety of environmental insults. The uspA gene is important for survival during cellular growth arrest, but the exact physiological role of the Usp proteins is not known. In this work we have performed phenotypic characterization of mutants with deletions of the six different usp genes. We report on hitherto unknown functions of these genes linked to motility, adhesion, and oxidative stress resistance, and we show that usp functions are both overlapping and distinct. Both UspA and UspD are required in the defense against superoxide-generating agents, and UspD appears also important in controlling intracellular levels of iron. In contrast, UspC is not involved in stress resistance or iron metabolism but is essential, like UspE, for cellular motility. Electron microscopy demonstrates that uspC and uspE mutants are devoid of flagella. In addition, the function of the uspC and uspE genes is linked to cell adhesion, measured as FimH-mediated agglutination of yeast cells. While the UspC and UspE proteins promote motility at the expense of adhesion, the UspF and UspG proteins exhibit the exact opposite effects. We suggest that the Usp proteins have evolved different physiological functions that reprogram the cell towards defense and escape during cellular stress.
Affiliation(s)
- Laurence Nachin
- Department of Cell and Molecular Biology, Göteborg University, Medicinaregatan 9C, 413 90 Göteborg, Sweden
111
Affiliation(s)
- Deborah A Siegele
- Department of Biology, Texas A&M University, 3258 TAMU, College Station, TX 77843-3258, USA.
112
Chandonia JM, Kim SH, Brenner SE. Target selection and deselection at the Berkeley Structural Genomics Center. Proteins 2005; 62:356-70. [PMID: 16276528 DOI: 10.1002/prot.20674] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25% of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1% of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57% (391 of 687) at present to around 80% (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37% (254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72% (348 of 486) to around 76% (371 of 486), while the percentage of accurately modeled proteins would rise from 50% (243 of 486) to 58% (283 of 486). Sequences and data on experimental progress on our targets are available in the public databases TargetDB and PEPCdb.
Affiliation(s)
- John-Marc Chandonia
- Berkeley Structural Genomics Center, Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA
113
Bayley MJ, Gardiner EJ, Willett P, Artymiuk PJ. A fourier fingerprint-based method for protein surface representation. J Chem Inf Model 2005; 45:696-707. [PMID: 15921459 DOI: 10.1021/ci049647j] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/29/2022]
Abstract
A crucial enabling technology for structural genomics is the development of algorithms that can predict the putative function of novel protein structures: the proposed functions can subsequently be experimentally tested by functional studies. Testable assignments of function can be made if it is possible to attribute a putative, or indeed probable, function on the basis of the shapes of the binding sites on the surface of a protein structure. However the comparison of the surfaces of 3D protein structures is a computationally demanding task. Here we present four surface representations that can be used locally to describe the global shape of specifically bounded local region models. The most successful of these representations is obtained by a Fourier analysis of the distribution of surface curvature on concentric spheres around a surface point and summarizes a 24 Å diameter spherically clipped region of protein surface by a fingerprint of 18 Fourier amplitude values. Searching experiments using these fingerprints on a set of 366 proteins demonstrate that this provides an effective and an efficient technique for the matching of protein surfaces.
Affiliation(s)
- Martin J Bayley
- Krebs Institute for Biomolecular Research, Department of Information Studies, University of Sheffield, Western Bank, Sheffield S10 2TN, United Kingdom
114
Rajasekaran S, Thapar V, Dave H, Huang CH. Randomized and parallel algorithms for distance matrix calculations in multiple sequence alignment. J Clin Monit Comput 2005; 19:351-9. [PMID: 16328949 DOI: 10.1007/s10877-005-0680-3] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/30/2005] [Accepted: 06/30/2005] [Indexed: 10/25/2022]
Abstract
Multiple sequence alignment (MSA) is a vital problem in biology. Optimal alignment of multiple sequences becomes impractical even for a modest number of sequences since the general version of the problem is NP-hard. Because of the high time complexity of traditional MSA algorithms, even today's fast computers are not able to solve the problem for large number of sequences. In this paper we present a randomized algorithm to calculate distance matrices, which is a major step in many multiple sequence alignment algorithms. The basic idea employed is sampling (along the lines of). We also illustrate how to parallelize this algorithm. In Section we introduce the problem of multiple sequence alignments. In Section we provide a discussion on various methods that have been employed in the literature for Multiple Sequence Alignment. In this section we also introduce our new sampling approach. We extend our randomized algorithm to the case of non-uniform length sequences as well. We show that our algorithms are amenable to parallelism in Section. In Section we back up our claim of speedup and accuracy with empirical data and examples. In Section we provide some concluding remarks.
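As a rough illustration of the sampling idea mentioned in this abstract, the sketch below estimates a pairwise distance matrix from a random subset of positions rather than from full-length comparisons. It is not the authors' algorithm: the mismatch-fraction distance, the restriction to equal-length sequences, and the names `sampled_distance` and `sampled_distance_matrix` are all assumptions made for this example.

```python
import random
from itertools import combinations


def sampled_distance(a, b, positions):
    """Fraction of the sampled positions at which two sequences differ."""
    mismatches = sum(1 for i in positions if a[i] != b[i])
    return mismatches / len(positions)


def sampled_distance_matrix(seqs, sample_size, seed=0):
    """Estimate all pairwise distances from one shared random sample of positions.

    Assumption: all sequences have the same length (the non-uniform-length
    case described in the abstract is not handled here).
    """
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("this sketch assumes equal-length sequences")
    rng = random.Random(seed)
    # Draw the sampled positions once so every pair is compared on the same columns.
    positions = rng.sample(range(length), min(sample_size, length))
    n = len(seqs)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d = sampled_distance(seqs[i], seqs[j], positions)
        dist[i][j] = dist[j][i] = d  # the matrix is symmetric
    return dist
```

Each pair costs time proportional to `sample_size` instead of the full sequence length, and the pairs are independent of one another, which is what would make a scheme of this kind straightforward to parallelize.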
115
Li Q, Li L, Rejtar T, Karger BL, Ferry JG. Proteome of Methanosarcina acetivorans Part II: comparison of protein levels in acetate- and methanol-grown cells. J Proteome Res 2005; 4:129-35. [PMID: 15707367 DOI: 10.1021/pr049831k] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.9] [Indexed: 11/29/2022]
Abstract
Methanosarcina acetivorans is an archaeon isolated from marine sediments which utilizes a diversity of substrates for growth and methanogenesis. Part I of a two-part investigation has profiled proteins of this microorganism cultured with both methanol and acetate as growth substrates, utilizing two-dimensional gel electrophoresis and MALDI-TOF-TOF mass spectrometry. In this report, Part II, the analyses were extended to identify 34 proteins found to be present in different amounts between methanol- and acetate-grown M. acetivorans. Among these proteins are enzymes which function in pathways for methanogenesis from either acetate or methanol. Several of the 34 proteins were determined to have redundant functions based on annotations of the genomic sequence. Enzymes which function in ATP synthesis and steps common to both methanogenic pathways were elevated in acetate- versus methanol-grown cells, whereas enzymes that have a more general function in protein synthesis were in greater amounts in methanol- compared to acetate-grown cells. Several group I chaperonins were present in greater amounts in methanol- versus acetate-grown cells, whereas lower amounts of several stress related proteins were found in methanol- versus acetate-grown cells. The potential physiological basis for these novel patterns of protein synthesis are discussed.
Affiliation(s)
- Qingbo Li
- Center for Microbial Structural Biology, Department of Biochemistry and Molecular Biology, 205 South Frear Laboratory, Penn State University, University Park, Pennsylvania 16802, USA
116
Hamady M, Cheung THT, Resing K, Cios KJ, Knight R. Key challenges in proteomics and proteoinformatics. Progress in proteins. IEEE Eng Med Biol Mag 2005; 24:34-40. [PMID: 15971839 DOI: 10.1109/memb.2005.1436456] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/07/2022]
Affiliation(s)
- Micah Hamady
- Department of Computer Science, University of Colorado at Boulder, USA
117
Siew N, Saini HK, Fischer D. A putative novel alpha/beta hydrolase ORFan family in Bacillus. FEBS Lett 2005; 579:3175-82. [PMID: 15922334 DOI: 10.1016/j.febslet.2005.04.030] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 12/30/2004] [Revised: 03/25/2005] [Accepted: 04/11/2005] [Indexed: 10/25/2022]
Abstract
A large number of sequences in each newly sequenced genome correspond to lineage and species-specific proteins, also known as ORFans. Amongst these ORFans, a large number are sequences with unknown structures and functions. We have identified a family of sequences, annotated as hypothetical proteins, which are specific to Bacillus and have carried out a computational study aimed at characterizing this family. Fold-recognition methods predict that these sequences belong to the alpha/beta hydrolase fold. We suggest possible catalytic triads for the ORFans and propose a hypothesis regarding the possible families within the alpha/beta hydrolase superfamily to which they may belong.
Affiliation(s)
- Naomi Siew
- Department of Chemistry, Ben Gurion University, Beer-Sheva 84105, Israel
|
118
|
Bennetts B, Rychkov GY, Ng HL, Morton CJ, Stapleton D, Parker MW, Cromer BA. Cytoplasmic ATP-sensing domains regulate gating of skeletal muscle ClC-1 chloride channels. J Biol Chem 2005; 280:32452-8. [PMID: 16027167 DOI: 10.1074/jbc.m502890200] [Citation(s) in RCA: 100] [Impact Index Per Article: 5.0] [Indexed: 11/06/2022] Open
Abstract
ClC proteins are a family of chloride channels and transporters that are found in a wide variety of prokaryotic and eukaryotic cell types. The mammalian voltage-gated chloride channel ClC-1 is important for controlling the electrical excitability of skeletal muscle. Reduced excitability of muscle cells during metabolic stress can protect cells from metabolic exhaustion and is thought to be a major factor in fatigue. Here we identify a novel mechanism linking excitability to metabolic state by showing that ClC-1 channels are modulated by ATP. The high concentration of ATP in resting muscle effectively inhibits ClC-1 activity by shifting the voltage gating to more positive potentials. ADP and AMP had similar effects to ATP, but IMP had no effect, indicating that the inhibition of ClC-1 would only be relieved under anaerobic conditions such as intense muscle activity or ischemia, when depleted ATP accumulates as IMP. The resulting increase in ClC-1 activity under these conditions would reduce muscle excitability, thus contributing to fatigue. We show further that the modulation by ATP is mediated by cystathionine beta-synthase-related domains in the cytoplasmic C terminus of ClC-1. This defines a function for these domains as gating-modulatory domains sensitive to intracellular ligands, such as nucleotides, a function that is likely to be conserved in other ClC proteins.
|
119
|
Chin KH, Huang ZW, Wei KC, Chou CC, Lee CC, Shr HL, Gao FP, Lyu PC, Wang AHJ, Chou SH. Preparation, crystallization and preliminary X-ray characterization of a conserved hypothetical protein XC1692 from Xanthomonas campestris. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005; 61:691-3. [PMID: 16511130 PMCID: PMC1952469 DOI: 10.1107/s1744309105018798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2005] [Accepted: 06/13/2005] [Indexed: 11/10/2022]
Abstract
Xanthomonas campestris pv. campestris strain 17 is a Gram-negative yellow-pigmented pathogenic bacterium that causes black rot, one of the major worldwide diseases of cruciferous crops. Its genome contains approximately 4500 genes, one third of which have no known structure and/or function yet are highly conserved among several different bacterial genera. One of these gene products is XC1692 protein, containing 141 amino acids. It was overexpressed in Escherichia coli, purified and crystallized in a variety of forms using the hanging-drop vapour-diffusion method. The crystals diffract to at least 1.45 Å resolution. They are hexagonal and belong to space group P6(3), with unit-cell parameters a = b = 56.9, c = 71.0 Å. They contain one molecule per asymmetric unit.
Affiliation(s)
- Ko-Hsin Chin
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
- Zhao-Wei Huang
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
- Kun-Chou Wei
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
- Chia-Cheng Chou
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Cheng-Chung Lee
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Hui-Lin Shr
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Fei Philip Gao
- National High Magnetic Field Laboratory, Florida State University, Tallahassee, FL 32310, USA
- Ping-Chiang Lyu
- Department of Life Science, National Tsing Hua University, Hsin-Chu, Taiwan
- Andrew H.-J. Wang
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Shan-Ho Chou
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
|
120
|
Chin KH, Kuo WT, Chou CC, Shr HL, Lyu PC, Wang AHJ, Chou SH. Cloning, purification, crystallization and preliminary X-ray analysis of XC229, a conserved hypothetical protein from Xanthomonas campestris. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005; 61:694-6. [PMID: 16511131 PMCID: PMC1952452 DOI: 10.1107/s1744309105018944] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/08/2005] [Accepted: 06/14/2005] [Indexed: 11/10/2022]
Abstract
Xanthomonas campestris pv. campestris is a Gram-negative yellow-pigmented pathogenic bacterium that causes black rot, one of the major worldwide diseases of cruciferous crops. Its genome contains approximately 4500 genes, roughly one third of which have no known structure and/or function. However, some of these unknown genes are highly conserved among several different bacterial genera. XC229 is one such protein containing 134 amino acids. It was overexpressed in Escherichia coli, purified and crystallized using the hanging-drop vapour-diffusion method. The crystal diffracted to a resolution of at least 1.80 Å. It is cubic and belongs to space group I2(x)3, with unit-cell parameters a = b = c = 106.8 Å. It contains one or two molecules per asymmetric unit.
Affiliation(s)
- Ko-Hsin Chin
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
- Wei-Tien Kuo
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
- Chia-Cheng Chou
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Hui-Lin Shr
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Ping-Chiang Lyu
- Department of Life Science, National Tsing Hua University, Hsin-Chu, Taiwan
- Andrew H.-J. Wang
- Institute of Biological Chemistry, Academia Sinica, Nankang, Taipei, Taiwan
- Core Facility for Protein Crystallography, Academia Sinica, Nankang, Taipei, Taiwan
- Shan-Ho Chou
- Institute of Biochemistry, National Chung-Hsing University, Taichung 40227, Taiwan
|
121
|
Zhou CZ, Meyer P, Quevillon-Cheruel S, Li De La Sierra-Gallay I, Collinet B, Graille M, Blondeau K, François JM, Leulliot N, Sorel I, Poupon A, Janin J, Van Tilbeurgh H. Crystal structure of the YML079w protein from Saccharomyces cerevisiae reveals a new sequence family of the jelly-roll fold. Protein Sci 2005; 14:209-15. [PMID: 15608122 PMCID: PMC2253319 DOI: 10.1110/ps.041121305] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Indexed: 10/26/2022]
Abstract
We determined the three-dimensional crystal structure of the protein YML079wp, encoded by a hypothetical open reading frame from Saccharomyces cerevisiae, to a resolution of 1.75 Å. The protein has no close homologs and its molecular and cellular functions are unknown. The structure of the protein is a jelly-roll fold consisting of ten beta-strands organized in two parallel packed beta-sheets. The protein has strong structural resemblance to the plant storage and ligand binding proteins (canavalin, glycinin, auxin binding protein) but also to some plant and bacterial enzymes (epimerase, germin). The protein forms homodimers in the crystal, confirming measurements of its molecular mass in solution. Two monomers have their beta-sheet packed together to form the dimer. The presence of a hydrophobic ligand in a well conserved pocket inside the barrel and local sequence similarity with bacterial epimerases may suggest a biochemical function for this protein.
Affiliation(s)
- Cong-Zhao Zhou
- Institut de Biochimie et de Biophysique Moléculaire et Cellulaire, Centre National de la Recherche Scientifique-Unité Mixte de Recherche 8619, Université Paris-Sud, 91405 Orsay, France
|
122
|
Guzzo CR, Nagem RAP, Galvão-Botton LMP, Guimarães BG, Medrano FJ, Barbosa JARG, Farah CS. Expression, purification, crystallization and preliminary X-ray analysis of YaeQ (XAC2396) from Xanthomonas axonopodis pv. citri. Acta Crystallogr Sect F Struct Biol Cryst Commun 2005; 61:493-5. [PMID: 16511077 PMCID: PMC1952311 DOI: 10.1107/s1744309105010985] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Received: 02/25/2005] [Accepted: 04/07/2005] [Indexed: 11/10/2022]
Abstract
Xanthomonas axonopodis pv. citri YaeQ (XAC2396) is a member of a family of bacterial proteins conserved in several Gram-negative pathogens. Here, the cloning, expression, purification and crystallization of the 182-residue (20.6 kDa) YaeQ protein are described. Recombinant YaeQ containing selenomethionine was crystallized in space group P2(1) and crystals diffracted to 1.9 Å resolution at a synchrotron source. The unit-cell parameters are a = 39.75, b = 91.88, c = 48.03 Å, β = 108.37°. The calculated Matthews coefficient suggests the presence of two YaeQ molecules in the asymmetric unit. Initial experimental phases were calculated by the multiple-wavelength anomalous dispersion technique and an interpretable electron-density map was obtained.
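The Matthews-coefficient argument in this abstract can be checked with a short calculation. The sketch below is illustrative and is not taken from the paper: it uses the unit-cell parameters and the 20.6 kDa mass quoted above, assumes two symmetry-equivalent positions for space group P2(1), and uses the standard approximation that solvent fraction is about 1 - 1.23/V_M. The function names are hypothetical.

```python
import math


def monoclinic_cell_volume(a, b, c, beta_deg):
    """Volume (cubic angstroms) of a monoclinic cell: V = a * b * c * sin(beta)."""
    return a * b * c * math.sin(math.radians(beta_deg))


def matthews_coefficient(cell_volume, n_sym_ops, n_mol_asu, mw_da):
    """V_M in cubic angstroms per dalton: cell volume / total protein mass in the cell."""
    return cell_volume / (n_sym_ops * n_mol_asu * mw_da)


# Values quoted in the abstract; P2(1) has 2 general positions per cell.
volume = monoclinic_cell_volume(39.75, 91.88, 48.03, 108.37)

for n_mol in (1, 2, 3):
    vm = matthews_coefficient(volume, n_sym_ops=2, n_mol_asu=n_mol, mw_da=20600.0)
    solvent = 1.0 - 1.23 / vm  # common approximation for the solvent fraction
    print(f"{n_mol} molecule(s)/ASU: V_M = {vm:.2f} A^3/Da, solvent ~ {solvent:.0%}")
```

Under these assumptions, two molecules per asymmetric unit give a V_M close to 2.0 Å³/Da, which lies in the range usually seen for protein crystals, whereas one or three molecules give values that are unusually high or low. This is consistent with the abstract's conclusion.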
Affiliation(s)
- Cristiane R. Guzzo
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, CEP 05508-000, São Paulo, SP, Brazil
- Ronaldo A. P. Nagem
- Instituto de Física de São Carlos, Universidade de São Paulo, CEP 13560-970, São Carlos, SP, Brazil
- Leonor M. P. Galvão-Botton
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, CEP 05508-000, São Paulo, SP, Brazil
- Beatriz G. Guimarães
- Centro de Biologia Molecular Estrutural, Laboratório Nacional de Luz Síncrotron, CEP 13084-971, Campinas, SP, Brazil
- Francisco J. Medrano
- Centro de Biologia Molecular Estrutural, Laboratório Nacional de Luz Síncrotron, CEP 13084-971, Campinas, SP, Brazil
- João A. R. G. Barbosa
- Centro de Biologia Molecular Estrutural, Laboratório Nacional de Luz Síncrotron, CEP 13084-971, Campinas, SP, Brazil
- Chuck S. Farah
- Departamento de Bioquímica, Instituto de Química, Universidade de São Paulo, CEP 05508-000, São Paulo, SP, Brazil
|
123
|
Todd AE, Marsden RL, Thornton JM, Orengo CA. Progress of Structural Genomics Initiatives: An Analysis of Solved Target Structures. J Mol Biol 2005; 348:1235-60. [PMID: 15854658 DOI: 10.1016/j.jmb.2005.03.037] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.2] [Received: 12/23/2004] [Revised: 02/28/2005] [Accepted: 03/15/2005] [Indexed: 11/27/2022]
Abstract
The explosion in gene sequence data and technological breakthroughs in protein structure determination inspired the launch of structural genomics (SG) initiatives. An often stated goal of structural genomics is the high-throughput structural characterisation of all protein sequence families, with the long-term hope of significantly impacting on the life sciences, biotechnology and drug discovery. Here, we present a comprehensive analysis of solved SG targets to assess progress of these initiatives. Eleven consortia have contributed 316 non-redundant entries and 323 protein chains to the Protein Data Bank (PDB), and 459 and 393 domains to the CATH and SCOP structure classifications, respectively. The quality and size of these proteins are comparable to those solved in traditional structural biology and, despite huge scope for duplicated efforts, only 14% of targets have a close homologue (≥30% sequence identity) solved by another consortium. Analysis of CATH and SCOP revealed the significant contribution that structural genomics is making to the coverage of superfamilies and folds. A total of 67% of SG domains in CATH are unique, lacking an already characterised close homologue in the PDB, whereas only 21% of non-SG domains are unique. For 29% of domains, structure determination revealed a remote evolutionary relationship not apparent from sequence, and 19% and 11% contributed new superfamilies and folds. The secondary structure class, fold and superfamily distributions of this dataset reflect those of the genomes. The domains fall into 172 different folds and 259 superfamilies in CATH but the distribution is highly skewed. The most populous of these are those that recur most frequently in the genomes. Whilst 11% of superfamilies are bacteria-specific, most are common to all three superkingdoms of life and together the 316 PDB entries have provided new and reliable homology models for 9287 non-redundant gene sequences in 206 completely sequenced genomes.
From the perspective of this analysis, it appears that structural genomics is on track to be a success, and it is hoped that this work will inform future directions of the field.
Affiliation(s)
- Annabel E Todd
- Department of Biochemistry and Molecular Biology, University College London, Gower Street, London, WC1E 6BT, UK.
|
124
|
Quevillon-Cheruel S, Leulliot N, Graille M, Hervouet N, Coste F, Bénédetti H, Zelwer C, Janin J, Van Tilbeurgh H. Crystal structure of yeast YHR049W/FSH1, a member of the serine hydrolase family. Protein Sci 2005; 14:1350-6. [PMID: 15802654 PMCID: PMC2253265 DOI: 10.1110/ps.051415905] [Citation(s) in RCA: 18] [Impact Index Per Article: 0.9] [Indexed: 10/25/2022]
Abstract
Yhr049w/FSH1 was recently identified in a combined computational and experimental proteomics analysis for the detection of active serine hydrolases in yeast. This analysis suggested that FSH1 might be a serine-type hydrolase belonging to the broad functional alpha/beta-hydrolase superfamily. In order to get insight into the molecular function of this gene, it was targeted in our yeast structural genomics project. The crystal structure of the protein confirms that it contains a Ser/His/Asp catalytic triad that is part of a minimal alpha/beta-hydrolase fold. The architecture of the putative active site and analogies with other protein structures suggest that FSH1 may be an esterase. This finding was further strengthened by the unexpected presence of a compound covalently bound to the catalytic serine in the active site. Apparently, the enzyme was trapped with a reactive compound during the purification process.
Affiliation(s)
- Sophie Quevillon-Cheruel
- Institut de Biochimie et de Biophysique Moléculaire et Cellulaire (CNRS-UMR 8619), Université Paris-Sud, Bâtiment 430, 91405 Orsay, France
|
125
|
Abstract
The success of genomic sequencing projects in recent years has presented protein scientists with a formidable challenge in characterizing the vast number of gene products that have subsequently been identified. NMR has proven to be a valuable tool in the elucidation of various properties for many of these proteins, allowing versatile studies of structure, dynamics, and interactions in the solution state. But the characteristics needed for proteins amenable to this kind of study, such as folding capability, long-term stability, and high solubility, require robust and expeditious methods for the identification and optimization of target protein domains. Here we present a variety of computational and experimental methods developed for these purposes and show that great care must often be taken in the design of constructs intended for NMR-based investigations.
Affiliation(s)
- Paul B Card
- Department of Biochemistry, University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
|
126
|
Kikugawa S, Takehara H, Kuhara S, Kimura M. A Novel Model for Prediction of RNA binding Proteins. CHEM-BIO INFORMATICS JOURNAL 2005. [DOI: 10.1273/cbij.5.1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/22/2022]
Affiliation(s)
- Shingo Kikugawa
- Laboratory of Biochemistry, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University
- Hideki Takehara
- Laboratory of Molecular Gene Technics, Faculty of Agriculture, Graduate School, Kyushu University
- Satoru Kuhara
- Laboratory of Molecular Gene Technics, Faculty of Agriculture, Graduate School, Kyushu University
- Makoto Kimura
- Laboratory of Biochemistry, Department of Bioscience and Biotechnology, Faculty of Agriculture, Graduate School, Kyushu University
|
127
|
Robertson MP, Igel H, Baertsch R, Haussler D, Ares M, Scott WG. The structure of a rigorously conserved RNA element within the SARS virus genome. PLoS Biol 2004; 3:e5. [PMID: 15630477 PMCID: PMC539059 DOI: 10.1371/journal.pbio.0030005] [Citation(s) in RCA: 106] [Impact Index Per Article: 5.0] [Received: 08/11/2004] [Accepted: 10/13/2004] [Indexed: 11/19/2022] Open
Abstract
We have solved the three-dimensional crystal structure of the stem-loop II motif (s2m) RNA element of the SARS virus genome to 2.7-Å resolution. SARS and related coronaviruses and astroviruses all possess a motif at the 3′ end of their RNA genomes, called the s2m, whose pathogenic importance is inferred from its rigorous sequence conservation in an otherwise rapidly mutable RNA genome. We find that this extreme conservation is clearly explained by the requirement to form a highly structured RNA whose unique tertiary structure includes a sharp 90° kink of the helix axis and several novel longer-range tertiary interactions. The tertiary base interactions create a tunnel that runs perpendicular to the main helical axis whose interior is negatively charged and binds two magnesium ions. These unusual features likely form interaction surfaces with conserved host cell components or other reactive sites required for virus function. Based on its conservation in viral pathogen genomes and its absence in the human genome, we suggest that these unusual structural features in the s2m RNA element are attractive targets for the design of anti-viral therapeutic agents. Structural genomics has sought to deduce protein function based on three-dimensional homology. Here we have extended this approach to RNA by proposing potential functions for a rigorously conserved set of RNA tertiary structural interactions that occur within the SARS RNA genome itself. Based on tertiary structural comparisons, we propose the s2m RNA binds one or more proteins possessing an oligomer-binding-like fold, and we suggest a possible mechanism for SARS viral RNA hijacking of host protein synthesis, both based upon observed s2m RNA macromolecular mimicry of a relevant ribosomal RNA fold. The SARS RNA genome contains a unique structure that resembles a portion of ribosomal RNA; this may allow the virus to hijack its host's protein synthesis machinery.
Affiliation(s)
- Michael P Robertson
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Department of Chemistry and Biochemistry, University of California, Santa Cruz, California, United States of America
- Haller Igel
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, California, United States of America
- Robert Baertsch
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute and Department of Biomolecular Engineering, University of California, Santa Cruz, California, United States of America
- David Haussler
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Howard Hughes Medical Institute and Department of Biomolecular Engineering, University of California, Santa Cruz, California, United States of America
- Manuel Ares
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Department of Molecular, Cell and Developmental Biology, University of California, Santa Cruz, California, United States of America
- William G Scott
- The Center for the Molecular Biology of RNA, University of California, Santa Cruz, California, United States of America
- Department of Chemistry and Biochemistry, University of California, Santa Cruz, California, United States of America
|
128
|
Skolnick J, Kihara D, Zhang Y. Development and large scale benchmark testing of the PROSPECTOR_3 threading algorithm. Proteins 2004; 56:502-18. [PMID: 15229883 DOI: 10.1002/prot.20106] [Citation(s) in RCA: 119] [Impact Index Per Article: 5.7] [Indexed: 11/10/2022]
Abstract
This article describes the PROSPECTOR_3 threading algorithm, which combines various scoring functions designed to match structurally related target/template pairs. Each variant described was found to have a Z-score above which most identified templates have good structural (threading) alignments, Z(struct) (Z(good)). 'Easy' targets with accurate threading alignments are identified as single templates with Z > Z(good) or two templates, each with Z > Z(struct), having a good consensus structure in mutually aligned regions. 'Medium' targets have a pair of templates lacking a consensus structure, or a single template for which Z(struct) < Z < Z(good). PROSPECTOR_3 was applied to a comprehensive Protein Data Bank (PDB) benchmark composed of 1491 single domain proteins, 41-200 residues long and no more than 30% identical to any threading template. Of the proteins, 878 were found to be easy targets, with 761 having a root mean square deviation (RMSD) from native of less than 6.5 Å. The average contact prediction accuracy was 46%, and on average 17.6 residue continuous fragments were predicted with RMSD values of 2.0 Å. There were 606 medium targets identified, 87% (31%) of which had good structural (threading) alignments. On average, 9.1 residue, continuous fragments with RMSD of 2.5 Å were predicted. Combining easy and medium sets, 63% (91%) of the targets had good threading (structural) alignments compared to native; the average target/template sequence identity was 22%. Only nine targets lacked matched templates. Moreover, PROSPECTOR_3 consistently outperforms PSIBLAST. Similar results were predicted for open reading frames (ORFs) ≤200 residues in the M. genitalium, E. coli and S. cerevisiae genomes. Thus, progress has been made in identification of weakly homologous/analogous proteins, with very high alignment coverage, both in a comprehensive PDB benchmark as well as in genomes.
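The easy/medium rule stated in this abstract can be written as a small decision function. This is only a sketch of the rule as worded above, not the PROSPECTOR_3 implementation: the threshold values, the `has_consensus` flag (standing in for the consensus-structure check), the function name and the "unassigned" label for targets without a matched template are all assumptions made for illustration.

```python
from typing import Sequence


def classify_target(z_scores: Sequence[float], z_struct: float,
                    z_good: float, has_consensus: bool) -> str:
    """Label a target from the Z-scores of its candidate templates.

    z_struct < z_good are the two score thresholds named in the abstract;
    has_consensus says whether the templates agree in mutually aligned regions.
    """
    above_good = [z for z in z_scores if z > z_good]
    above_struct = [z for z in z_scores if z > z_struct]

    if above_good:
        return "easy"        # a single template with Z > Z(good)
    if len(above_struct) >= 2:
        # two templates with Z > Z(struct): easy only if they agree
        return "easy" if has_consensus else "medium"
    if len(above_struct) == 1:
        return "medium"      # single template with Z(struct) < Z < Z(good)
    return "unassigned"      # hypothetical label: no template passes Z(struct)


# Hypothetical thresholds, chosen only to exercise each branch.
print(classify_target([12.0], z_struct=5.0, z_good=10.0, has_consensus=False))
print(classify_target([6.0, 7.0], z_struct=5.0, z_good=10.0, has_consensus=False))
```

The abstract does not say how a score exactly equal to a threshold is handled, so the strict comparisons here are a choice made for the sketch.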
Affiliation(s)
- Jeffrey Skolnick
- Center of Excellence in Bioinformatics, University at Buffalo, 901 Washington St., Suite 300, Buffalo, NY 14203, USA.
|
129
|
Kifer I, Sasson O, Linial M. Predicting fold novelty based on ProtoNet hierarchical classification. Bioinformatics 2004; 21:1020-7. [PMID: 15539447 DOI: 10.1093/bioinformatics/bti135] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/13/2022] Open
Abstract
MOTIVATION: Structural genomics projects aim to solve a large number of protein structures with the ultimate objective of representing the entire protein space. The computational challenge is to identify and prioritize a small set of proteins with new, currently unknown, superfamilies or folds. RESULTS: We develop a method that assigns each protein a likelihood of it belonging to a new, yet undetermined, structural superfamily. The method relies on a variant of ProtoNet, an automatic hierarchical classification scheme of all protein sequences from SwissProt. Our results show that proteins that are remote from solved structures in the ProtoNet hierarchy are more likely to belong to new superfamilies. The results are validated against SCOP releases from recent years that account for about half of the solved structures known to date. We show that our new method and the representation of ProtoNet are superior in detecting new targets, compared to our previous method using ProtoMap classification. Furthermore, our method outperforms PSI-BLAST search in detecting potential new superfamilies.
Affiliation(s)
- Ilona Kifer
- Department of Biological Chemistry, Institute of Life Sciences Jerusalem 91904, Israel
|
130
|
Galperin MY, Koonin EV. 'Conserved hypothetical' proteins: prioritization of targets for experimental study. Nucleic Acids Res 2004; 32:5452-63. [PMID: 15479782 PMCID: PMC524295 DOI: 10.1093/nar/gkh885] [Citation(s) in RCA: 303] [Impact Index Per Article: 14.4] [Indexed: 11/14/2022] Open
Abstract
Comparative genomics shows that a substantial fraction of the genes in sequenced genomes encodes 'conserved hypothetical' proteins, i.e. those that are found in organisms from several phylogenetic lineages but have not been functionally characterized. Here, we briefly discuss recent progress in functional characterization of prokaryotic 'conserved hypothetical' proteins and the possible criteria for prioritizing targets for experimental study. Based on these criteria, the chief one being wide phyletic spread, we offer two 'top 10' lists of highly attractive targets. The first list consists of proteins for which biochemical activity could be predicted with reasonable confidence but the biological function was predicted only in general terms, if at all ('known unknowns'). The second list includes proteins for which there is no prediction of biochemical activity, even if, for some, general biological clues exist ('unknown unknowns'). The experimental characterization of these and other 'conserved hypothetical' proteins is expected to reveal new, crucial aspects of microbial biology and could also lead to better functional prediction for medically relevant human homologs.
Affiliation(s)
- Michael Y Galperin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
|
131
|
Yakunin AF, Yee AA, Savchenko A, Edwards AM, Arrowsmith CH. Structural proteomics: a tool for genome annotation. Curr Opin Chem Biol 2004; 8:42-8. [PMID: 15036155 DOI: 10.1016/j.cbpa.2003.12.003] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Indexed: 11/24/2022]
Abstract
In any newly sequenced genome, 30% to 50% of genes encode proteins with unknown molecular or cellular function. Fortunately, structural genomics is emerging as a powerful approach of functional annotation. Because of recent developments in high-throughput technologies, ongoing structural genomics projects are generating new structures at an unprecedented rate. In the past year, structural studies have identified many new structural motifs involved in enzymatic catalysis or in binding ligands or other macromolecules (DNA, RNA, protein). The efficiency by which function is deduced from structure can be further improved by the integration of structure with bioinformatics and other experimental approaches, such as screening for enzymatic activity or ligand binding.
Affiliation(s)
- Alexander F Yakunin
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
|
132
|
Quevillon-Cheruel S, Liger D, Leulliot N, Graille M, Poupon A, Li de La Sierra-Gallay I, Zhou CZ, Collinet B, Janin J, Van Tilbeurgh H. The Paris-Sud yeast structural genomics pilot-project: from structure to function. Biochimie 2004; 86:617-23. [PMID: 15556271 DOI: 10.1016/j.biochi.2004.09.013] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Received: 09/12/2004] [Accepted: 09/24/2004] [Indexed: 10/26/2022]
Abstract
We present here the outlines and results from our yeast structural genomics (YSG) pilot-project. A lab-scale platform for the systematic production and structure determination is presented. In order to validate this approach, 250 non-membrane proteins of unknown structure were targeted. Strategies and final statistics are evaluated. We finally discuss the opportunity of structural genomics programs to contribute to functional biochemical annotation.
Affiliation(s)
- Sophie Quevillon-Cheruel
- Institut de Biochimie et de Biophysique Moléculaire et Cellulaire (CNRS-UMR 8619), Université Paris-Sud, Bâtiment 430, 91405 Orsay, France
|
133
|
Kim SH, Shin DH, Choi IG, Schulze-Gahmen U, Chen S, Kim R. Structure-based functional inference in structural genomics. J Struct Funct Genomics 2004; 4:129-35. [PMID: 14649297 DOI: 10.1023/a:1026200610644] [Citation(s) in RCA: 53] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
Abstract
The dramatically increasing number of new protein sequences arising from genomics and proteomics requires the need for methods to rapidly and reliably infer the molecular and cellular functions of these proteins. One such approach, structural genomics, aims to delineate the total repertoire of protein folds in nature, thereby providing three-dimensional folding patterns for all proteins and to infer molecular functions of the proteins based on the combined information of structures and sequences. The goal of obtaining protein structures on a genomic scale has motivated the development of high throughput technologies and protocols for macromolecular structure determination that have begun to produce structures at a greater rate than previously possible. These new structures have revealed many unexpected functional inferences and evolutionary relationships that were hidden at the sequence level. Here, we present samples of structures determined at Berkeley Structural Genomics Center and collaborators' laboratories to illustrate how structural information provides and complements sequence information to deduce the functional inferences of proteins with unknown molecular functions. Two of the major premises of structural genomics are to discover a complete repertoire of protein folds in nature and to find molecular functions of the proteins whose functions are not predicted from sequence comparison alone. To achieve these objectives on a genomic scale, new methods, protocols, and technologies need to be developed by multi-institutional collaborations worldwide. As part of this effort, the Protein Structure Initiative has been launched in the United States (PSI; www.nigms.nih.gov/funding/psi.html). Although infrastructure building and technology development are still the main focus of structural genomics programs, a considerable number of protein structures have already been produced, some of them coming directly out of semiautomated structure determination pipelines. 
The Berkeley Structural Genomics Center (BSGC) has focused on the proteins of Mycoplasma or their homologues from other organisms as its structural genomics targets because of the minimal genome size of the Mycoplasmas as well as their relevance to human and animal pathogenicity (http://www.strgen.org). Here we present several protein examples encompassing a spectrum of functional inferences obtainable from their three-dimensional structures in five situations, where the inferences are new and testable, and are not predictable from protein sequence information alone.
Affiliation(s)
- Sung-Hou Kim
- Department of Chemistry, University of California, Berkeley, California 94720-5230, USA.
|
134
|
Shulman-Peleg A, Nussinov R, Wolfson HJ. Recognition of functional sites in protein structures. J Mol Biol 2004; 339:607-33. [PMID: 15147845 PMCID: PMC7126412 DOI: 10.1016/j.jmb.2004.04.012] [Citation(s) in RCA: 196] [Impact Index Per Article: 9.3] [Received: 12/10/2003] [Revised: 04/02/2004] [Accepted: 04/02/2004] [Indexed: 11/29/2022]
Abstract
Recognition of regions on the surface of one protein that are similar to a binding site of another is crucial for the prediction of molecular interactions and for functional classifications. We first describe a novel method, SiteEngine, that assumes no sequence or fold similarities and is able to recognize proteins that have similar binding sites and may perform similar functions. We achieve high efficiency and speed by introducing a low-resolution surface representation via chemically important surface points, by hashing triangles of physico-chemical properties and by application of hierarchical scoring schemes for a thorough exploration of global and local similarities. We proceed to rigorously apply this method to functional site recognition in three possible ways: first, we search a given functional site on a large set of complete protein structures. Second, a potential functional site on a protein of interest is compared with known binding sites, to recognize similar features. Third, a complete protein structure is searched for the presence of an a priori unknown functional site, similar to known sites. Our method is robust and efficient enough to allow computationally demanding applications such as the first and the third. From the biological standpoint, the first application may identify secondary binding sites of drugs that may lead to side-effects. The third application finds new potential sites on the protein that may provide targets for drug design. Each of the three applications may aid in assigning a function and in classification of binding patterns. We highlight the advantages and disadvantages of each type of search, provide examples of large-scale searches of the entire Protein Data Bank and make functional predictions.
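The "hashing triangles of physico-chemical properties" step mentioned in this abstract can be illustrated with a small, generic sketch. This is not the SiteEngine implementation: the point representation (a property label plus 3D coordinates), the bin width, and the function names `triangle_key`, `build_table` and `shared_triangles` are all assumptions made for illustration. Each triangle of labelled surface points is reduced to a key that does not change under rotation or translation, so two sites can be compared by table lookup.

```python
from collections import defaultdict
from itertools import combinations, permutations
import math

# A surface point is (property_label, (x, y, z)); the labels are hypothetical.

def triangle_key(tri, bin_width=1.0):
    """Canonical key for a triangle: vertex labels plus binned side lengths.

    Taking the minimum over all vertex orderings makes the key independent
    of the order in which the three points are listed.
    """
    candidates = []
    for a, b, c in permutations(tri):
        labels = (a[0], b[0], c[0])
        sides = tuple(
            int(math.dist(p[1], q[1]) // bin_width)
            for p, q in ((a, b), (b, c), (c, a))
        )
        candidates.append((labels, sides))
    return min(candidates)

def build_table(points, bin_width=1.0):
    """Hash every triangle of a site by its canonical key."""
    table = defaultdict(list)
    for tri in combinations(range(len(points)), 3):
        key = triangle_key([points[i] for i in tri], bin_width)
        table[key].append(tri)
    return table

def shared_triangles(site_a, site_b, bin_width=1.0):
    """Return (triangle in site_a, triangle in site_b) index pairs with equal keys."""
    table = build_table(site_a, bin_width)
    hits = []
    for tri in combinations(range(len(site_b)), 3):
        key = triangle_key([site_b[i] for i in tri], bin_width)
        for match in table.get(key, []):
            hits.append((match, tri))
    return hits

# Toy data: site_b is site_a translated and listed in a different order.
site_a = [("donor", (0.0, 0.0, 0.0)),
          ("acceptor", (3.5, 0.0, 0.0)),
          ("aromatic", (0.0, 4.5, 0.0))]
site_b = [("aromatic", (10.0, 2.5, 7.0)),
          ("donor", (10.0, -2.0, 7.0)),
          ("acceptor", (13.5, -2.0, 7.0))]
hits = shared_triangles(site_a, site_b)
```

Fixed-width binning can split near-identical distances across a bin boundary; a practical system would also probe neighbouring bins and then score the candidate matches, which this sketch leaves out.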
Affiliation(s)
- Ruth Nussinov
- Sackler Institute of Molecular Medicine, Sackler Faculty of Medicine, Tel Aviv University, Tel Aviv 69978, Israel
- Basic Research Program, SAIC, NCI-Frederick, Inc. Laboratory of Experimental and Computational Biology, Bldg 469, Rm 151, Frederick, MD 21702, USA
- Corresponding authors
- Haim J. Wolfson
- School of Computer Science, Tel Aviv University, Tel Aviv 69978, Israel
|
135
|
Chalk AJ, Worth CL, Overington JP, Chan AWE. PDBLIG: Classification of Small Molecular Protein Binding in the Protein Data Bank. J Med Chem 2004; 47:3807-16. [PMID: 15239659 DOI: 10.1021/jm040804f] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.6] [Indexed: 11/27/2022]
Abstract
It is known that proteins can adopt different folds while sharing similar features for recognition of similar substrates or ligands, for example, in the binding sites of enzyme cofactors such as ATP. On the other hand, proteins that have highly flexible binding sites or belong to large and diverse protein families can bind structurally dissimilar ligands, as, for example, in the case of the matrix metalloprotease family. We have developed a database, PDBLIG, that classifies protein domains and ligands. The information stored includes each protein's function, domain class(es), which ligand(s) it binds, and so on. The database can provide valuable knowledge for drug discovery, supporting the answering of questions such as whether the same drug molecule can bind different target protein families and whether these families are related functionally or structurally, which ligand classes (such as metabolites or organic molecules) bind to a particular protein family and whether the ligands are druglike, and which target families bind a wide variety of ligands and whether different ligands are associated with different subfamilies.
Affiliation(s)
- Andrew J Chalk
- Department of Molecular Design, Inpharmatica, 60 Charlotte Street, London W1T 2NU, U.K
|
136
|
Mao L, Wang Y, Liu Y, Hu X. Molecular determinants for ATP-binding in proteins: a data mining and quantum chemical analysis. J Mol Biol 2004; 336:787-807. [PMID: 15095988 DOI: 10.1016/j.jmb.2003.12.056] [Citation(s) in RCA: 97] [Impact Index Per Article: 4.6] [Received: 08/05/2003] [Revised: 11/29/2003] [Accepted: 12/11/2003] [Indexed: 11/26/2022]
Abstract
Adenosine 5'-triphosphate (ATP) plays an essential role in all forms of life. Molecular recognition of ATP in proteins is a subject of great importance for understanding enzymatic mechanism and for drug design. We have carried out a large-scale data mining of the Protein Data Bank (PDB) to analyze molecular determinants for recognition of the adenine moiety of ATP by proteins. Non-bonded intermolecular interactions (hydrogen bonding, pi-pi stacking interactions, and cation-pi interactions) between adenine base and surrounding residues in its binding pockets are systematically analyzed for 68 non-redundant, high-resolution crystal structures of adenylate-binding proteins. In addition to confirming the importance of the widely known hydrogen bonding, we found that cation-pi interactions between adenine base and positively charged residues (Lys and Arg) and pi-pi stacking interactions between adenine base and surrounding aromatic residues (Phe, Tyr, Trp) are also crucial for adenine binding in proteins. On average, there exist 2.7 hydrogen bonding interactions, 1.0 pi-pi stacking interactions, and 0.8 cation-pi interactions in each adenylate-binding protein complex. Furthermore, a high-level quantum chemical analysis was performed to analyze contributions of each of the three forms of intermolecular interactions (i.e. hydrogen bonding, pi-pi stacking interactions, and cation-pi interactions) to the overall binding force of the adenine moiety of ATP in proteins. Intermolecular interaction energies for representative configurations of intermolecular complexes were analyzed using the supermolecular approach at the MP2/6-311+G* level, which resulted in substantial interaction strengths for all three forms of intermolecular interactions.
This work represents a timely undertaking at a historical moment when a large number of X-ray crystallographic structures of proteins with bound ATP ligands have become available, and when high-level quantum chemical analysis of intermolecular interactions of large biomolecular systems becomes computationally feasible. The establishment of the molecular basis for recognition of the adenine moiety of ATP in proteins will directly impact molecular design of ATP-binding site targeted enzyme inhibitors such as kinase inhibitors.
Affiliation(s)
- Lisong Mao
- Department of Chemistry, University of Toledo, Toledo, OH 43606-3390, USA
|
137
|
Li de La Sierra-Gallay I, Collinet B, Graille M, Quevillon-Cheruel S, Liger D, Minard P, Blondeau K, Henckes G, Aufrère R, Leulliot N, Zhou CZ, Sorel I, Ferrer JL, Poupon A, Janin J, van Tilbeurgh H. Crystal structure of the YGR205w protein from Saccharomyces cerevisiae: close structural resemblance to E. coli pantothenate kinase. Proteins 2004; 54:776-83. [PMID: 14997573 DOI: 10.1002/prot.10596] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.6] [Indexed: 11/09/2022]
Abstract
The protein product of the YGR205w gene of Saccharomyces cerevisiae was targeted as part of our yeast structural genomics project. YGR205w codes for a small (290 amino acids) protein with unknown structure and function. The only recognizable sequence feature is the presence of a Walker A motif (P loop) indicating a possible nucleotide binding/converting function. We determined the three-dimensional crystal structure of Se-methionine substituted protein using multiple anomalous diffraction. The structure revealed a well-known mononucleotide fold and strong resemblance to the structure of small metabolite phosphorylating enzymes such as pantothenate kinase and phosphoribulokinase. Biochemical experiments show that YGR205w specifically binds ATP and, less tightly, ADP. The structure also revealed the presence of two bound sulphate ions, occupying opposite niches in a canyon that corresponds to the active site of the protein. One sulphate is bound to the P-loop in a position that corresponds to the position of the beta-phosphate in mononucleotide protein ATP complexes, suggesting the protein is indeed a kinase. The nature of the phosphate accepting substrate remains to be determined.
Affiliation(s)
- Ines Li de La Sierra-Gallay
- Laboratoire d'Enzymologie et Biochimie Structurales (CNRS-UPR 9063), Bât. 34, 1 Av. de la Terrasse, 91198 Gif sur Yvette, France
|
138
|
Frishman D. What we have learned about prokaryotes from structural genomics. OMICS-A JOURNAL OF INTEGRATIVE BIOLOGY 2004; 7:211-24. [PMID: 14506850 DOI: 10.1089/153623103322246601] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.3] [Indexed: 11/13/2022]
Abstract
Five years ago systematic determination and theoretical analysis of all protein structures encoded in model prokaryotic organisms was proposed as a powerful way to obtain new insights into protein function and the variety of protein folds. What has been the pay-off from studying structures in genomic context? Have we learned anything new about protein structure? Can we now predict protein function better? In this contribution, I summarize the status of large-scale structure determination projects on prokaryotes and provide an overview of the main results obtained from experimental and theoretical studies in this dynamic research field.
Affiliation(s)
- Dmitrij Frishman
- Department of Genome Oriented Bioinformatics, Technical University of Munich, Freising-Weihenstephan, Germany.
|
139
|
Koth CM, Edwards AM. From clone to crystal: maximizing the amount of protein samples for structure determination. ADVANCES IN PROTEIN CHEMISTRY 2004; 65:343-52. [PMID: 12964375 DOI: 10.1016/s0065-3233(03)01025-8] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.0] [Indexed: 03/04/2023]
Affiliation(s)
- Chris M Koth
- Banting and Best Department of Medical Research, Department of Medical Genetics and Microbiology, C. H. Best Institute, University of Toronto, Toronto, Ontario, Canada, M5G 1L6
|
140
|
Abstract
The discovery of biochemical and cellular functions of unannotated gene products begins with a database search of proteins with structure/sequence homologues based on known genes. Very recently, a number of frontier groups in structural biology proposed a new paradigm to predict biological functions of an unknown protein on the basis of its three-dimensional structure on a genomic scale. Structural proteomics (genomics), a research area for structure-based functional discovery, aims to complete the protein-folding universe of all gene products in a cell. It would lead us to a complete understanding of a living organism from protein structure. Two major complementary experimental techniques, X-ray crystallography and NMR spectroscopy, combined with recently developed high throughput methods have played a central role in structural proteomics research; however, an integration of these methodologies together with comparative modeling and electron microscopy would speed up the goal for completing a full dictionary of protein folding space in the near future.
Affiliation(s)
- Jin-Won Jung
- Department of Biochemistry and Protein Network Research Center, College of Science, Yonsei University, Seoul 120-749, Korea
|
141
|
Feng J, Yuan F, Gao Y, Liang C, Xu J, Zhang C, He L. A novel antimicrobial protein isolated from potato (Solanum tuberosum) shares homology with an acid phosphatase. Biochem J 2003; 376:481-7. [PMID: 12927022 PMCID: PMC1223772 DOI: 10.1042/bj20030806] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.1] [Received: 06/02/2003] [Revised: 08/15/2003] [Accepted: 08/20/2003] [Indexed: 11/17/2022]
Abstract
The nucleotide and amino acid sequences for AP(1) will appear in the GenBank(R) and NCBI databases under accession number AY297449. A novel antimicrobial protein (AP(1)) was purified from leaves of the potato (Solanum tuberosum, variety MS-42.3) with a procedure involving ammonium sulphate fractionation, molecular sieve chromatography with Sephacryl S-200 and hydrophobic chromatography with Butyl-Sepharose using an FPLC system. The inhibition spectrum investigation showed that AP(1) had good inhibition activity against five different strains of Ralstonia solanacearum from potato or other crops, and two fungal pathogens, Rhizoctonia solani and Alternaria solani from potato. The full-length cDNA encoding AP(1) has been successfully cloned by screening a cDNA expression library of potato with an anti-AP(1) antibody and RACE (rapid amplification of cDNA ends) PCR. Determination of the nucleotide sequences revealed the presence of an open reading frame encoding 343 amino acids. At the C-terminus of AP(1) there is an ATP-binding domain, and the N-terminus exhibits 58% identity with an acid phosphatase from Mesorhizobium loti. SDS/PAGE and Western blotting analysis suggested that the AP(1) gene can be successfully expressed in Escherichia coli and recognized by an antibody against AP(1). The expressed protein also showed the same inhibition activity as the original AP(1) protein isolated from potato. We suggest that AP(1) most likely belongs to a new group of proteins with antimicrobial characteristics in vitro and functions in relation to phosphorylation and energy metabolism of plants.
Affiliation(s)
- Jie Feng
- State Key Laboratory for Biology of Plant Diseases and Insect Pests, Institute of Plant Protection, Chinese Academy of Agricultural Sciences, No. 2 West Yuanmingyuan Road, Beijing, 100094, People's Republic of China
|
142
|
Marti‐Renom MA, Madhusudhan M, Eswar N, Pieper U, Shen M, Sali A, Fiser A, Mirkovic N, John B, Stuart A. Modeling Protein Structure from its Sequence. Curr Protoc Bioinformatics 2003. [DOI: 10.1002/0471250953.bi0501s03] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/10/2022]
Affiliation(s)
- Marc A. Marti‐Renom
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- M.S. Madhusudhan
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Narayanan Eswar
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Ursula Pieper
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Min‐yi Shen
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Andrej Sali
- Departments of Biopharmaceutical Sciences and Pharmaceutical Chemistry and The California Institute for Quantitative Biomedical Research, University of California at San Francisco, San Francisco, California
- Andras Fiser
- Department of Biochemistry and Seaver Foundation Center for Bioinformatics, Albert Einstein College of Medicine, Bronx, New York
- Nebojsa Mirkovic
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
- Bino John
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
- Ashley Stuart
- Laboratory of Molecular Biophysics, The Rockefeller University, New York, New York
|
143
|
Steiner-Lange S, Fischer A, Boettcher A, Rouhara I, Liedgens H, Schmelzer E, Knogge W. Differential defense reactions in leaf tissues of barley in response to infection by Rhynchosporium secalis and to treatment with a fungal avirulence gene product. MOLECULAR PLANT-MICROBE INTERACTIONS : MPMI 2003; 16:893-902. [PMID: 14558691 DOI: 10.1094/mpmi.2003.16.10.893] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.4] [Indexed: 05/21/2023]
Abstract
Expression of defense-associated genes was analyzed in leaf tissues of near-isogenic resistant and susceptible barley cultivars upon infection by Rhynchosporium secalis. The genes encoding pathogenesis-related (PR) proteins PR-1, PR-5, and PR-9 are specifically expressed in the mesophyll of resistant plants, whereas a germin-like protein (OxOLP) is synthesized in the epidermis irrespective of the resistance genotype. Restriction-mediated differential display was employed to identify additional epidermis-specific genes. This resulted in the detection of another PR gene, PR-10, along with a lipoxygenase gene, LoxA, and a gene of unknown function, pI2-4, which are specifically induced in the epidermis of resistant plants. The gene encoding a putative protease inhibitor, SD10, is preferentially but not exclusively expressed in the epidermis. The fungal avirulence gene product NIP1 triggers the induction of the four PR genes only. At least two additional elicitors, therefore, must be postulated, one for the unspecific induction of OxOLP and one for the resistance-specific induction of LoxA, pI2-4, and SD10. PR-10 expression can be assumed to be the consequence of NIP1 perception by epidermis cells. In contrast, gene expression in the mesophyll is likely to be triggered by an as yet unknown signal that appears to originate in the epidermis and that is strongly amplified in the mesophyll.
Affiliation(s)
- Sabine Steiner-Lange
- Department of Biochemistry, Max-Planck Institute for Plant Breeding Research, Carl-von-Linné-Weg 10, D-50829 Köln, Germany
|
144
|
Kawamura Y, Uemura M. Mass spectrometric approach for identifying putative plasma membrane proteins of Arabidopsis leaves associated with cold acclimation. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2003; 36:141-54. [PMID: 14535880 DOI: 10.1046/j.1365-313x.2003.01864.x] [Citation(s) in RCA: 158] [Impact Index Per Article: 7.2] [Indexed: 05/18/2023]
Abstract
Although enhancement of freezing tolerance in plants during cold acclimation is closely associated with an increase in the cryostability of plasma membrane, the molecular mechanism for the increased cryostability of plasma membrane is still to be elucidated. In Arabidopsis, enhanced freezing tolerance was detectable after cold acclimation at 2 °C for as short as 1 day, and maximum freezing tolerance was attained after 1 week. To identify the plasma membrane proteins that change in quantity in response to cold acclimation, a highly purified plasma membrane fraction was isolated from leaves before and during cold acclimation, and the proteins in the fraction were separated with gel electrophoresis. We found that there were substantial changes in the protein profiles after as short as 1 day of cold acclimation. Subsequently, using matrix-assisted laser desorption-ionization time-of-flight mass spectrometry (MALDI-TOF MS), we identified 38 proteins that changed in quantity during cold acclimation. The proteins that changed in quantity during the first day of cold acclimation include those that are associated with membrane repair by membrane fusion, protection of the membrane against osmotic stress, enhancement of CO2 fixation, and proteolysis.
Affiliation(s)
- Yukio Kawamura
- Cryobiosystem Research Center, Faculty of Agriculture, Iwate University, Morioka 020-8550, Japan
|
145
|
Abstract
The success of structural genomics initiatives requires the development and application of tools for structure analysis, prediction, and annotation. In this paper we review recent developments in these areas; specifically structure alignment, the detection of remote homologs and analogs, homology modeling and the use of structures to predict function. We also discuss various rationales for structural genomics initiatives. These include the structure-based clustering of sequence space and genome-wide function assignment. It is also argued that structural genomics can be integrated into more traditional biological research if specific biological questions are included in target selection strategies.
Affiliation(s)
- Sharon Goldsmith-Fischman
- Department of Biochemistry and Molecular Biophysics, Columbia University, New York, New York 10032, USA
|
146
|
Shin DH, Yokota H, Kim R, Kim SH. Crystal structure of a conserved hypothetical protein from Escherichia coli. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS 2003; 2:53-66. [PMID: 12836674 DOI: 10.1023/a:1014450817696] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/12/2022]
Abstract
The crystal structure of a conserved hypothetical protein from Escherichia coli has been determined using X-ray crystallography. The protein belongs to the Cluster of Orthologous Groups COG1553 (National Center for Biotechnology Information database, NLM, NIH), for which there was no structural information available until now. A structural homology search with the DALI algorithm indicated that this protein has a new fold with no obvious similarity to those of other proteins with known three-dimensional structures. The protein quaternary structure consists of a dimer of trimers, which makes a characteristic cylinder shape. There is a large closed cavity with approximate dimensions of 16 Å × 16 Å × 20 Å in the center of the hexameric structure. Six putative active sites are positioned along the equatorial surface of the hexamer. There are several highly conserved residues including two possible functional cysteines in the putative active site. The possible molecular function of the protein is discussed.
Affiliation(s)
- Dong Hae Shin
- Department of Chemistry, University of California, Berkeley, California 94720-5230, USA
|
147
|
Kinoshita K, Furui J, Nakamura H. Identification of protein functions from a molecular surface database, eF-site. JOURNAL OF STRUCTURAL AND FUNCTIONAL GENOMICS 2003; 2:9-22. [PMID: 12836670 DOI: 10.1023/a:1011318527094] [Citation(s) in RCA: 85] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022]
Abstract
A bioinformatics method was developed to identify the protein surface around the functional site and to estimate the biochemical function, using a newly constructed molecular surface database named the eF-site (electrostatic surface of Functional site). Molecular surfaces of protein molecules were computed based on the atom coordinates, and the eF-site database was prepared by adding the physical properties on the constructed molecular surfaces. The electrostatic potential on each molecular surface was individually calculated by solving the Poisson-Boltzmann equation numerically for the precise continuum model, and the hydrophobicity information of each residue was also included. The eF-site database is accessible via the internet (http://pi.protein.osaka-u.ac.jp/eF-site/). We have prepared four different databases, eF-site/antibody, eF-site/prosite, eF-site/P-site, and eF-site/ActiveSite, corresponding to the antigen binding sites of antibodies with the same orientations, the molecular surfaces for the individual motifs in the PROSITE database, the phosphate binding sites, and the active site surfaces for the representatives of the individual protein families, respectively. An algorithm using the clique detection method from applied graph theory was developed to search the eF-site database, so as to recognize and discriminate the characteristic molecular surfaces of the proteins. The method identifies active sites having functions similar to those of known proteins.
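The clique detection search mentioned in this abstract can be illustrated with a generic correspondence-graph sketch. The abstract gives no algorithmic detail, so everything here is an assumption for illustration rather than the eF-site method: surface points are reduced to a property label plus coordinates, two point pairings are called compatible when their inter-point distances agree within a tolerance, and the largest mutually compatible set is found with the basic Bron-Kerbosch algorithm. The names `correspondence_graph` and `best_match` are hypothetical.

```python
import math

def correspondence_graph(surf_a, surf_b, tol=0.5):
    """Nodes pair same-label points (i in A, j in B); edges join compatible pairs."""
    nodes = [(i, j)
             for i, a in enumerate(surf_a)
             for j, b in enumerate(surf_b)
             if a[0] == b[0]]
    adj = {n: set() for n in nodes}
    for (i, j) in nodes:
        for (k, l) in nodes:
            if i == k or j == l:
                continue  # each point may be used only once in a match
            d_a = math.dist(surf_a[i][1], surf_a[k][1])
            d_b = math.dist(surf_b[j][1], surf_b[l][1])
            if abs(d_a - d_b) <= tol:
                adj[(i, j)].add((k, l))
    return adj

def bron_kerbosch(r, p, x, adj, out):
    """Basic Bron-Kerbosch enumeration of maximal cliques (no pivoting)."""
    if not p and not x:
        out.append(r)
        return
    for v in list(p):
        bron_kerbosch(r | {v}, p & adj[v], x & adj[v], adj, out)
        p = p - {v}
        x = x | {v}

def best_match(surf_a, surf_b, tol=0.5):
    """Largest set of point pairings whose pairwise distances all agree."""
    adj = correspondence_graph(surf_a, surf_b, tol)
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    return sorted(max(cliques, key=len))

# Toy data: surf_b is surf_a shifted by (1, 2, 3) plus one unrelated point.
surf_a = [("pos", (0.0, 0.0, 0.0)), ("neg", (3.0, 0.0, 0.0)),
          ("hyd", (0.0, 4.0, 0.0)), ("pos", (0.0, 0.0, 5.0))]
surf_b = [("pos", (1.0, 2.0, 3.0)), ("neg", (4.0, 2.0, 3.0)),
          ("hyd", (1.0, 6.0, 3.0)), ("pos", (1.0, 2.0, 8.0)),
          ("neg", (20.0, 20.0, 20.0))]
match = best_match(surf_a, surf_b)
```

Distance agreement alone cannot distinguish a surface from its mirror image, and maximal-clique enumeration is exponential in the worst case, so a realistic search would add a chirality check and prune the graph first.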
Affiliation(s)
- Kengo Kinoshita
- Institute for Protein Research, Osaka University, 3-2 Yamadaoka, Suita, Osaka 565-0871, Japan
|
148
|
Grigoriev IV, Choi IG. Target selection for structural genomics: a single genome approach. OMICS : A JOURNAL OF INTEGRATIVE BIOLOGY 2003; 6:349-62. [PMID: 12626094 DOI: 10.1089/153623102321112773] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 11/13/2022]
Abstract
We describe our strategy for selecting targets for protein structure determination in the context of structural genomics of a single genome. In the course of target selection, we have studied two of the smallest microbial genomes, Mycoplasma genitalium and Mycoplasma pneumoniae. To our surprise, we found that only 71 Mycoplasma genes or their orthologues can be considered easy targets for high-throughput structural studies, far fewer than expected. We discuss the methods and criteria used for target selection and the reasons for the rarity of easy targets. First, despite the common opinion that protein folds can be predicted for only 30-50% of genes, the number of "truly unknown" structures is less than one-third. Second, due to the different codon usage, two-thirds of Mycoplasma proteins cannot be directly expressed in E. coli in a high-throughput manner and require substitution by their homologues from other organisms. Third, membrane or large multi-domain proteins are difficult targets because of solubility and size issues and often require identification and structure determination of protein domains. Finally, we propose different approaches to address the difficult targets.
Affiliation(s)
- Igor V Grigoriev
- Department of Chemistry and E.O. Lawrence Berkeley National Laboratory, University of California, Berkeley, CA, USA.
|
149
|
Makarova KS, Koonin EV. Comparative genomics of Archaea: how much have we learned in six years, and what's next? Genome Biol 2003; 4:115. [PMID: 12914651 PMCID: PMC193635 DOI: 10.1186/gb-2003-4-8-115] [Citation(s) in RCA: 74] [Impact Index Per Article: 3.4] [Indexed: 12/31/2022]
Abstract
Archaea comprise one of the three distinct domains of life (with bacteria and eukaryotes). With 16 complete archaeal genomes sequenced to date, comparative genomics has revealed a conserved core of 313 genes that are represented in all sequenced archaeal genomes, plus a variable 'shell' that is prone to lineage-specific gene loss and horizontal gene exchange. The majority of archaeal genes have not been experimentally characterized, but novel functional pathways have been predicted.
Affiliation(s)
- Kira S Makarova
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA.
|
150
|
Sanishvili R, Yakunin AF, Laskowski RA, Skarina T, Evdokimova E, Doherty-Kirby A, Lajoie GA, Thornton JM, Arrowsmith CH, Savchenko A, Joachimiak A, Edwards AM. Integrating structure, bioinformatics, and enzymology to discover function: BioH, a new carboxylesterase from Escherichia coli. J Biol Chem 2003; 278:26039-45. [PMID: 12732651 PMCID: PMC2792009 DOI: 10.1074/jbc.m303867200] [Citation(s) in RCA: 96] [Impact Index Per Article: 4.4] [Indexed: 11/06/2022]
Abstract
Structural proteomics projects are generating three-dimensional structures of novel, uncharacterized proteins at an increasing rate. However, structure alone is often insufficient to deduce the specific biochemical function of a protein. Here we determined the function for a protein using a strategy that integrates structural and bioinformatics data with parallel experimental screening for enzymatic activity. BioH is involved in biotin biosynthesis in Escherichia coli and had no previously known biochemical function. The crystal structure of BioH was determined at 1.7 Å resolution. An automated procedure was used to compare the structure of BioH with structural templates from a variety of different enzyme active sites. This screen identified a catalytic triad (Ser82, His235, and Asp207) with a configuration similar to that of the catalytic triad of hydrolases. Analysis of BioH with a panel of hydrolase assays revealed a carboxylesterase activity with a preference for short acyl chain substrates. The combined use of structural bioinformatics with experimental screens for detecting enzyme activity could greatly enhance the rate at which function is determined from structure.
Affiliation(s)
- Ruslan Sanishvili
- Biosciences Division, Argonne National Laboratory, Argonne, Illinois 60439
- Alexander F. Yakunin
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Roman A. Laskowski
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Tatiana Skarina
- Clinical Genomics Centre/Proteomics, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Elena Evdokimova
- Clinical Genomics Centre/Proteomics, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Amanda Doherty-Kirby
- Department of Biochemistry, University of Western Ontario, London, Ontario N6A 5C1, Canada
- Gilles A. Lajoie
- Department of Biochemistry, University of Western Ontario, London, Ontario N6A 5C1, Canada
- Janet M. Thornton
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, United Kingdom
- Cheryl H. Arrowsmith
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Clinical Genomics Centre/Proteomics, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Alexei Savchenko
- Clinical Genomics Centre/Proteomics, University Health Network, Toronto, Ontario M5G 1L7, Canada
- Andrzej Joachimiak
- Biosciences Division, Argonne National Laboratory, Argonne, Illinois 60439
- Aled M. Edwards
- Banting and Best Department of Medical Research, University of Toronto, Toronto, Ontario M5G 1L6, Canada
- Clinical Genomics Centre/Proteomics, University Health Network, Toronto, Ontario M5G 1L7, Canada
|