1
Kuang D, Issakova D, Kim J. Learning Proteome Domain Folding Using LSTMs in an Empirical Kernel Space. J Mol Biol 2022; 434:167686. [PMID: 35716781 DOI: 10.1016/j.jmb.2022.167686] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/19/2022] [Revised: 06/08/2022] [Accepted: 06/10/2022] [Indexed: 11/30/2022]
Abstract
The recognition of protein structural folds is the starting point for protein function inference and for many structural prediction tools. We previously introduced the idea of using empirical comparisons to create a data-augmented feature space called PESS (Protein Empirical Structure Space)1 as a novel approach for protein structure prediction. Here, we extend the previous approach by generating the PESS feature space over fixed-length subsequences of query peptides, and applying a sequential neural network model, with one long short-term memory cell layer followed by a fully connected layer. Using this approach, we show that only a small group of domains as a training set is needed to achieve near state-of-the-art accuracy on fold recognition. Our method improves on the previous approach by reducing the training set required and improving the model's ability to generalize across species, which will help fold prediction for newly discovered proteins.
Affiliation(s)
- Da Kuang
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA.
- Dina Issakova
  - Department of Biology, University of Pennsylvania, Philadelphia, USA.
- Junhyong Kim
  - Department of Computer and Information Science, University of Pennsylvania, Philadelphia, USA; Department of Biology, University of Pennsylvania, Philadelphia, USA.
2
Middleton SA, Illuminati J, Kim J. Complete fold annotation of the human proteome using a novel structural feature space. Sci Rep 2017; 7:46321. [PMID: 28406174 PMCID: PMC5390313 DOI: 10.1038/srep46321] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 01/04/2017] [Accepted: 03/14/2017] [Indexed: 11/11/2022] Open
Abstract
Recognition of protein structural fold is the starting point for many structure prediction tools and protein function inference. Fold prediction is computationally demanding and recognizing novel folds is difficult such that the majority of proteins have not been annotated for fold classification. Here we describe a new machine learning approach using a novel feature space that can be used for accurate recognition of all 1,221 currently known folds and inference of unknown novel folds. We show that our method achieves better than 94% accuracy even when many folds have only one training example. We demonstrate the utility of this method by predicting the folds of 34,330 human protein domains and showing that these predictions can yield useful insights into potential biological function, such as prediction of RNA-binding ability. Our method can be applied to de novo fold prediction of entire proteomes and identify candidate novel fold families.
Affiliation(s)
- Sarah A Middleton
  - Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA
- Joseph Illuminati
  - Department of Computer Science, University of Pennsylvania, Philadelphia, PA 19104, USA
- Junhyong Kim
  - Genomics and Computational Biology Program, University of Pennsylvania, Philadelphia, PA 19104, USA; Department of Biology, University of Pennsylvania, Philadelphia, PA 19104, USA
3
General assay for enzymes in the heptose biosynthesis pathways using electrospray ionization mass spectrometry. Appl Microbiol Biotechnol 2017; 101:4521-4532. [DOI: 10.1007/s00253-017-8148-1] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 10/15/2016] [Revised: 01/12/2017] [Accepted: 01/25/2017] [Indexed: 10/20/2022]
4
Park J, Kim H, Kim S, Lee D, Shin DH. Expression and crystallographic studies of D-glycero-β-D-manno-heptose-1-phosphate adenylyltransferase from Burkholderia pseudomallei. Acta Crystallogr F Struct Biol Commun 2017; 73:90-94. [PMID: 28177319 PMCID: PMC5297929 DOI: 10.1107/s2053230x16020537] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/21/2016] [Accepted: 12/27/2016] [Indexed: 11/10/2022] Open
Abstract
The Gram-negative bacterium Burkholderia pseudomallei is the causative agent of melioidosis. D-glycero-β-D-manno-Heptose-1-phosphate adenylyltransferase (HldC) is the fourth enzyme of the ADP-L-glycero-β-D-manno-heptose biosynthesis pathway, which produces an essential carbohydrate comprising the inner core of lipopolysaccharide. Therefore, HldC is a potential target of antibiotics against melioidosis. In this study, HldC from B. pseudomallei has been cloned, expressed, purified and crystallized. Synchrotron X-ray data from a selenomethionine-substituted HldC crystal were also collected to 2.8 Å resolution. The crystal belonged to the primitive triclinic space group P1, with unit-cell parameters a = 74.0, b = 74.0, c = 74.9 Å, α = 108.4, β = 108.4, γ = 108.0°. Eight protomers are present in the unit cell and three out of five selenomethionines were found in each protomer using the PHENIX software suite. A full structural determination is in progress to elucidate the structure-function relationship of the protein.
Affiliation(s)
- Jimin Park
  - College of Pharmacy, Ewha W. University, 52 Ewhayeodae-gil, Seoul 03760, Republic of Korea
- Hyojin Kim
  - College of Pharmacy, Ewha W. University, 52 Ewhayeodae-gil, Seoul 03760, Republic of Korea
- Suwon Kim
  - College of Pharmacy, Ewha W. University, 52 Ewhayeodae-gil, Seoul 03760, Republic of Korea
- Daeun Lee
  - College of Pharmacy, Ewha W. University, 52 Ewhayeodae-gil, Seoul 03760, Republic of Korea
- Dong Hae Shin
  - College of Pharmacy, Ewha W. University, 52 Ewhayeodae-gil, Seoul 03760, Republic of Korea
5
Park J, Lee D, Kim MS, Kim DY, Shin DH. A preliminary X-ray study of 3-deoxy-D-manno-oct-2-ulosonic acid 8-phosphate phosphatase (YrbI) from Burkholderia pseudomallei. Acta Crystallogr F Struct Biol Commun 2015; 71:790-3. [PMID: 26057814 PMCID: PMC4461349 DOI: 10.1107/s2053230x15006135] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/08/2015] [Accepted: 03/25/2015] [Indexed: 11/10/2022] Open
Abstract
3-Deoxy-D-manno-oct-2-ulosonic acid 8-phosphate phosphatase (YrbI), the third enzyme in the pathway for the biosynthesis of 3-deoxy-D-manno-oct-2-ulosonic acid (KDO), hydrolyzes KDO 8-phosphate to KDO and inorganic phosphate. YrbI belongs to the haloacid dehalogenase (HAD) superfamily, which is a large family of magnesium-dependent phosphatase/phosphotransferase enzymes. In this study, YrbI from Burkholderia pseudomallei, the causative agent of melioidosis, has been cloned, expressed, purified and crystallized. Synchrotron X-ray data were also collected to 2.25 Å resolution. The crystal belonged to the primitive orthorhombic space group P2(1)2(1)2(1), with unit-cell parameters a = 63.7, b = 97.5, c = 98.0 Å. A full structural determination is in progress to elucidate the structure-function relationship of this protein.
Affiliation(s)
- Jimin Park
  - Global Top 5 Research Program, College of Pharmacy, Ewha Womans University, Seoul 120-750, Republic of Korea
- Daeun Lee
  - Global Top 5 Research Program, College of Pharmacy, Ewha Womans University, Seoul 120-750, Republic of Korea
- Mi-Sun Kim
  - Global Top 5 Research Program, College of Pharmacy, Ewha Womans University, Seoul 120-750, Republic of Korea
- Dae Yong Kim
  - 402-803 Technopark, 65 Pyeongchon-ro, Wonmi-gu, Bucheon-si, Gyeonggi-do 420-734, Republic of Korea
- Dong Hae Shin
  - Global Top 5 Research Program, College of Pharmacy, Ewha Womans University, Seoul 120-750, Republic of Korea
6
Vorontsova MA, Maes D, Vekilov PG. Recent advances in the understanding of two-step nucleation of protein crystals. Faraday Discuss 2015; 179:27-40. [DOI: 10.1039/c4fd00217b] [Citation(s) in RCA: 65] [Impact Index Per Article: 6.5] [Indexed: 11/21/2022]
Abstract
The two-step mechanism of nucleation of crystals in solutions posits that the formation of crystal nuclei occurs within structures of extended lifetimes, in which the nucleating solute is at high concentration. The validity of this mechanism has been demonstrated for proteins, small-molecule organic and inorganic materials, colloids, and polymers. Due to large molecule sizes, proteins are an ideal system to study the details of this nucleation pathway, in particular the formation mechanisms of the nucleation precursors and the associated physico-chemical rules. The precursors of protein crystal nuclei are protein-rich clusters of sizes ∼100 nm that contain 10 000–100 000 molecules and occupy less than 10⁻³ of the total solution volume. Here we demonstrate, using oblique illumination microscopy, the liquid nature of the clusters of the protein lysozyme and reveal their inhomogeneous structure. We test a hypothesis put forth by theory that clusters primarily consist of transient protein oligomers. For this, we explore how varying the strength of the Coulomb interaction affects the cluster characteristics. We find that the cluster’s size is insensitive to variations of pH and ionic strength. In contrast, the addition of urea, a chaotropic agent that leads to protein unfolding, strongly decreases the cluster size. Shear stress, a known protein denaturant, induced by bubbling of the solutions with an inert gas, elicits a similar response. These observations support partial protein unfolding, followed by dimerization, as the mechanism of cluster formation. The amide hydrogen–deuterium exchange, monitored by nuclear magnetic resonance, highlights that lysozyme conformational flexibility is a condition for the formation of the protein-rich clusters and facilitates the nucleation of protein crystals.
Affiliation(s)
- Maria A. Vorontsova
  - Department of Chemical and Biomolecular Engineering, University of Houston, Houston, USA
- Dominique Maes
  - Structural Biology Brussels, Vrije Universiteit Brussel, B-1050 Brussel, Belgium
- Peter G. Vekilov
  - Department of Chemical and Biomolecular Engineering, University of Houston, Houston, USA; Department of Chemistry
7
A preliminary X-ray study of murine Tnfaip8/Oxi-α. Int J Mol Sci 2014; 15:4523-30. [PMID: 24637935 PMCID: PMC3975411 DOI: 10.3390/ijms15034523] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/28/2014] [Revised: 03/06/2014] [Accepted: 03/07/2014] [Indexed: 11/17/2022] Open
Abstract
Tnfaip8/oxidative stress regulated gene-α (Oxi-α) is a novel protein expressed specifically in brain dopaminergic neurons and its over-expression has been reported to protect dopaminergic cells against oxidative stress (OS)-induced cell death. In this study, murine C165S mutant Tnfaip8/Oxi-α has been crystallized and X-ray data have been collected to 1.8 Å using synchrotron radiation. The crystal belonged to the primitive orthorhombic space group P2(1)2(1)2, with unit-cell parameters a = 66.9, b = 72.3, c = 93.5 Å. A full structural determination is under way in order to provide insights into the structure-function relationships of this protein.
8
Kim MS, Lee H, Heo L, Lim A, Seok C, Shin DH. New molecular interaction of IIA(Ntr) and HPr from Burkholderia pseudomallei identified by X-ray crystallography and docking studies. Proteins 2013; 81:1499-508. [DOI: 10.1002/prot.24275] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/29/2012] [Revised: 01/24/2013] [Accepted: 02/18/2013] [Indexed: 11/11/2022]
Affiliation(s)
- Mi-Sun Kim
  - Division of Life & Pharmaceutical Sciences; The Center for Cell Signaling & Drug Discovery Research; College of Pharmacy, Ewha Womans University; Seoul 120-750 Republic of Korea
- Hasup Lee
  - Department of Chemistry; College of Natural Sciences; Seoul National University; Seoul 151-747 Republic of Korea
- Lim Heo
  - Department of Chemistry; College of Natural Sciences; Seoul National University; Seoul 151-747 Republic of Korea
- Areum Lim
  - Division of Life & Pharmaceutical Sciences; The Center for Cell Signaling & Drug Discovery Research; College of Pharmacy, Ewha Womans University; Seoul 120-750 Republic of Korea
- Chaok Seok
  - Department of Chemistry; College of Natural Sciences; Seoul National University; Seoul 151-747 Republic of Korea
- Dong Hae Shin
  - Division of Life & Pharmaceutical Sciences; The Center for Cell Signaling & Drug Discovery Research; College of Pharmacy, Ewha Womans University; Seoul 120-750 Republic of Korea
9
Kim MS, Lim A, Yang SW, Park J, Lee D, Shin DH. Structure and in silico substrate-binding mode of ADP-L-glycero-D-manno-heptose 6-epimerase from Burkholderia thailandensis. Acta Crystallogr D Biol Crystallogr 2013; 69:658-68. [DOI: 10.1107/s0907444913001030] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/06/2012] [Accepted: 01/11/2013] [Indexed: 11/11/2022]
10
Kim MS, Lim A, Yang SW, Lee D, Park J, Shin DH. A preliminary X-ray study of transketolase from Burkholderia pseudomallei. Acta Crystallogr Sect F Struct Biol Cryst Commun 2012; 68:1554-6. [PMID: 23192046 PMCID: PMC3509987 DOI: 10.1107/s1744309112044375] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 09/21/2012] [Accepted: 10/25/2012] [Indexed: 11/10/2022]
Abstract
TktA is the most critical enzyme in the nonoxidative pentose phosphate pathway. It catalyzes the conversion of xylulose 5-phosphate and ribose 5-phosphate into sedoheptulose 7-phosphate and glyceraldehyde 3-phosphate, and its products are used in the biosynthesis of acetyl-CoA, aromatic amino acids, nucleic acids and ADP-L-glycero-β-D-manno-heptose. TktA also has an unexpected role in chromosome structure that is independent of its metabolic responsibilities. Therefore, it is a new potent antibiotic target. In this study, TktA from Burkholderia pseudomallei has been cloned, expressed, purified and crystallized. Synchrotron X-ray data were also collected to 2.0 Å resolution. The crystal belonged to the monoclinic space group C2, with unit-cell parameters a=146.2, b=74.6, c=61.6 Å, β=113.0°. A full structural determination is under way in order to provide insight into the structure-function relationship of this protein.
Affiliation(s)
- Mi-Sun Kim
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
- Areum Lim
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
- Seung Won Yang
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
- Daeun Lee
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
- Jimin Park
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
- Dong Hae Shin
  - The Center for Cell Signaling and Drug Discovery Research, College of Pharmacy, Division of Life and Pharmaceutical Sciences, Ewha Womans University, Seoul 120-750, Republic of Korea
11
Kim MS, Shin DH. A preliminary X-ray study of D,D-heptose-1,7-bisphosphate phosphatase from Burkholderia thailandensis E264. Acta Crystallogr Sect F Struct Biol Cryst Commun 2010; 66:160-2. [PMID: 20124712 DOI: 10.1107/s1744309109042614] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 07/01/2009] [Accepted: 10/16/2009] [Indexed: 11/10/2022]
Abstract
D,D-Heptose-1,7-bisphosphate phosphatase (GmhB), which is involved in the third step of the NDP-heptose biosynthesis pathway, converts D,D-heptose-1,7-bisphosphate to D,D-heptose-1-phosphate. This biosynthesis pathway is a target for new antibiotics or antibiotic adjuvants for Gram-negative pathogens. Burkholderia thailandensis is a useful surrogate organism for studying the pathogenicity of melioidosis owing to its extensive genomic similarity to B. pseudomallei. Melioidosis caused by B. pseudomallei is a serious invasive disease of animals and humans in tropical and subtropical areas. In this study, GmhB has been cloned, expressed, purified and crystallized. X-ray data have also been collected to 2.50 Å resolution using synchrotron radiation. The crystal belonged to space group P6, with unit-cell parameters a = 243.2, b = 243.2, c = 41.1 Å.
Affiliation(s)
- Mi Sun Kim
  - College of Pharmacy, Division of Life and Pharmaceutical Sciences and Center for Cell Signaling and Drug Discovery Research, Ewha Womans University, Seoul 120-750, Republic of Korea
12
Kühner S, van Noort V, Betts MJ, Leo-Macias A, Batisse C, Rode M, Yamada T, Maier T, Bader S, Beltran-Alvarez P, Castaño-Diez D, Chen WH, Devos D, Güell M, Norambuena T, Racke I, Rybin V, Schmidt A, Yus E, Aebersold R, Herrmann R, Böttcher B, Frangakis AS, Russell RB, Serrano L, Bork P, Gavin AC. Proteome organization in a genome-reduced bacterium. Science 2009; 326:1235-40. [PMID: 19965468 DOI: 10.1126/science.1176343] [Citation(s) in RCA: 361] [Impact Index Per Article: 22.6] [Indexed: 12/21/2022]
Abstract
The genome of Mycoplasma pneumoniae is among the smallest found in self-replicating organisms. To study the basic principles of bacterial proteome organization, we used tandem affinity purification-mass spectrometry (TAP-MS) in a proteome-wide screen. The analysis revealed 62 homomultimeric and 116 heteromultimeric soluble protein complexes, of which the majority are novel. About a third of the heteromultimeric complexes show higher levels of proteome organization, including assembly into larger, multiprotein complex entities, suggesting sequential steps in biological processes, and extensive sharing of components, implying protein multifunctionality. Incorporation of structural models for 484 proteins, single-particle electron microscopy, and cellular electron tomograms provided supporting structural details for this proteome organization. The data set provides a blueprint of the minimal cellular machinery required for life.
Affiliation(s)
- Sebastian Kühner
  - European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany
13
Kim MS, Shin DH. A preliminary X-ray study of sedoheptulose-7-phosphate isomerase from Burkholderia pseudomallei. Acta Crystallogr Sect F Struct Biol Cryst Commun 2009; 65:1110-2. [PMID: 19923728 DOI: 10.1107/s174430910903259x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 03/14/2009] [Accepted: 08/17/2009] [Indexed: 11/11/2022]
Abstract
Sedoheptulose-7-phosphate isomerase (GmhA) converts d-sedoheptulose 7-phosphate to d,d-heptose 7-phosphate. This is the first step in the biosynthesis pathway of NDP-heptose, which is responsible for the pleiotropic phenotype. This biosynthesis pathway is the target of inhibitors to increase the membrane permeability of Gram-negative pathogens or of adjuvants working synergistically with known antibiotics. Burkholderia pseudomallei is the causative agent of melioidosis, a seriously invasive disease in animals and humans in tropical and subtropical areas. GmhA from B. pseudomallei is one of the targets of antibiotic adjuvants for melioidosis. In this study, GmhA has been cloned, expressed, purified and crystallized. Synchrotron X-ray data were also collected to 1.9 Å resolution. The crystal belonged to the primitive orthorhombic space group P2(1)2(1)2(1), with unit-cell parameters a = 61.3, b = 84.2, c = 142.3 Å. A full structural determination is under way in order to provide insights into the structure-function relationships of this protein.
Affiliation(s)
- Mi Sun Kim
  - Ewha Womans University, Seoul, Republic of Korea
14
Abstract
ORFan genes can constitute a large fraction of a bacterial genome, but due to their lack of homologs, their functions have remained largely unexplored. To determine if particular features of ORFan-encoded proteins promote their presence in a genome, we analyzed properties of ORFans that originated over a broad evolutionary timescale. We also compared ORFan genes to another class of acquired genes, heterogeneous occurrence in prokaryotes (HOPs), which have homologs in other bacteria. A total of 54 ORFan and HOP genes selected from different phylogenetic depths in the Escherichia coli lineage were cloned, expressed, purified, and subjected to circular dichroism (CD) spectroscopy. A majority of genes could be expressed, but only 18 yielded sufficient soluble protein for spectral analysis. Of these, half were significantly alpha-helical, three were predominantly beta-sheet, and six were of intermediate/indeterminate structure. Although a higher proportion of HOPs yielded soluble proteins with resolvable secondary structures, ORFans resembled HOPs with regard to most of the other features tested. Overall, we found that those ORFan and HOP genes that have persisted in the E. coli lineage were more likely to encode soluble and folded proteins, more likely to display environmental modulation of their gene expression, and by extrapolation, are more likely to be functional.
Affiliation(s)
- Hema Prasad Narra
  - Department of Biochemistry & Molecular Biophysics, University of Arizona, Tucson, AZ, USA
- Matthew H. J. Cordes
  - Department of Biochemistry & Molecular Biophysics, University of Arizona, Tucson, AZ, USA
- Howard Ochman
  - Department of Biochemistry & Molecular Biophysics, University of Arizona, Tucson, AZ, USA
15
Abstract
The initial objective of the Berkeley Structural Genomics Center was to obtain near-complete three-dimensional (3D) structural information for all soluble proteins of two minimal organisms, the closely related pathogens Mycoplasma genitalium and M. pneumoniae. The former has fewer than 500 genes and the latter has fewer than 700 genes. A semiautomated structural genomics pipeline was set up, covering target selection, cloning, expression, purification, and ultimately structural determination. At the time of this writing, structural information for more than 93% of all soluble proteins of M. genitalium is available. This chapter summarizes the approaches taken by the authors' center.
16
Busso D, Thierry JC, Moras D. The structural biology and genomics platform in Strasbourg: an overview. Methods Mol Biol 2008; 426:523-36. [PMID: 18542888 DOI: 10.1007/978-1-60327-058-8_35] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/24/2022]
Abstract
This chapter describes the modules and facilities of the Structural Biology and Genomics Platform (SBGP), Strasbourg, France. The platform consists of three modules (cloning, mini-expression screening; optimization-large scale protein production; characterization, crystallization) with dedicated scientists, and other facilities for purifying recombinant proteins and solving three-dimensional (3D) structures. Strong collaborations have been established with the Integrative Bioinformatics and Genomics group, located in the same institution, for target selection and domain definition. To handle large numbers of samples, classical and new protocols were adapted to automation, increasing reproducibility and reducing the risk of error. Using the platform and its facilities, over 2,000 expression vectors have been constructed and more than 40 novel structures, of mostly human proteins, have been solved.
Affiliation(s)
- Didier Busso
  - Structural Biology and Genomics Platform, IGBMC, CNRS/INSERM/Université Louis Pasteur, Illkirch, France
17
Shin DH, Proudfoot M, Lim HJ, Choi IK, Yokota H, Yakunin AF, Kim R, Kim SH. Structural and enzymatic characterization of DR1281: A calcineurin-like phosphoesterase from Deinococcus radiodurans. Proteins 2008; 70:1000-9. [PMID: 17847097 DOI: 10.1002/prot.21584] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.8] [Indexed: 11/06/2022]
Abstract
We have determined the crystal structure of DR1281 from Deinococcus radiodurans. DR1281 is a protein of unknown function with over 170 homologs found in prokaryotes and eukaryotes. To elucidate the molecular function of DR1281, its crystal structure at 2.3 Å resolution was determined and a series of biochemical screens for catalytic activity was performed. The crystal structure shows that DR1281 has two domains, a small alpha domain and a putative catalytic domain formed by a four-layered structure of two beta-sheets flanked by five alpha-helices on both sides. The small alpha domain interacts with other molecules in the asymmetric unit and contributes to the formation of oligomers. The structural comparison of the putative catalytic domain with known structures suggested its biochemical function to be a phosphatase, phosphodiesterase, nuclease, or nucleotidase. Structural analyses with its homologues also indicated that there is a dinuclear center at the interface of two domains formed by Asp8, Glu37, Asn38, Asn65, His148, His173, and His175. An absolute requirement of metal ions for activity has been proved by enzymatic assay with various divalent metal ions. A panel of general enzymatic assays of DR1281 revealed metal-dependent catalytic activity toward model substrates for phosphatases (p-nitrophenyl phosphate) and phosphodiesterases (bis-p-nitrophenyl phosphate). Subsequent secondary enzymatic screens with natural substrates demonstrated significant phosphatase activity toward phosphoenolpyruvate and phosphodiesterase activity toward 2',3'-cAMP. Thus, our structural and enzymatic studies have identified the biochemical function of DR1281 as a novel phosphatase/phosphodiesterase and disclosed key conserved residues involved in metal binding and catalytic activity.
Affiliation(s)
- Dong Hae Shin
  - College of Pharmacy, Ewha Womans University, Seoul 120-750, Republic of Korea.
18
Structural Genomics Consortium, China Structural Genomics Consortium, Northeast Structural Genomics Consortium, Gräslund S, Nordlund P, Weigelt J, Hallberg BM, Bray J, Gileadi O, Knapp S, Oppermann U, Arrowsmith C, Hui R, Ming J, dhe-Paganon S, Park HW, Savchenko A, Yee A, Edwards A, Vincentelli R, Cambillau C, Kim R, Kim SH, Rao Z, Shi Y, Terwilliger TC, Kim CY, Hung LW, Waldo GS, Peleg Y, Albeck S, Unger T, Dym O, Prilusky J, Sussman JL, Stevens RC, Lesley SA, Wilson IA, Joachimiak A, Collart F, Dementieva I, Donnelly MI, Eschenfeldt WH, Kim Y, Stols L, Wu R, Zhou M, Burley SK, Emtage JS, Sauder JM, Thompson D, Bain K, Luz J, Gheyi T, Zhang F, Atwell S, Almo SC, Bonanno JB, Fiser A, Swaminathan S, Studier FW, Chance MR, Sali A, Acton TB, Xiao R, Zhao L, Ma LC, Hunt JF, Tong L, Cunningham K, Inouye M, Anderson S, Janjua H, Shastry R, Ho CK, Wang D, Wang H, Jiang M, Montelione GT, Stuart DI, Owens RJ, Daenke S, Schütz A, Heinemann U, Yokoyama S, Büssow K, Gunsalus KC. Protein production and purification. Nat Methods 2008; 5:135-46. [PMID: 18235434 PMCID: PMC3178102 DOI: 10.1038/nmeth.f.202] [Citation(s) in RCA: 631] [Impact Index Per Article: 37.1] [Indexed: 12/29/2022]
Abstract
In selecting a method to produce a recombinant protein, a researcher is faced with a bewildering array of choices as to where to start. To facilitate decision-making, we describe a consensus 'what to try first' strategy based on our collective analysis of the expression and purification of over 10,000 different proteins. This review presents methods that could be applied at the outset of any project, a prioritized list of alternate strategies and a list of pitfalls that trip many new investigators.
19
Taylor WR. Evolutionary transitions in protein fold space. Curr Opin Struct Biol 2007; 17:354-61. [PMID: 17580115 DOI: 10.1016/j.sbi.2007.06.002] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.4] [Received: 02/09/2007] [Revised: 04/11/2007] [Accepted: 06/06/2007] [Indexed: 10/23/2022]
Abstract
With the number of known protein folds potentially approaching completion, the problems associated with their systematic classification are evaluated. It is argued that it will be difficult, if not impossible, to find a general metric based on pairwise comparison that will provide a satisfactory classification. It is suggested that some progress may be made through comparison against a library of idealised 'template' folds, but a proper solution can only be attained if this includes a model of the underlying evolutionary processes. These processes are considered with examples of some unexpected relationships among folds, including circular permutations. The problem is finally set in the wider context of the genetic environment, introducing complications relating to introns, gene fixation and population size.
Affiliation(s)
- William R Taylor
  - Division of Mathematical Biology, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK.
20
Puri M, Robin G, Cowieson N, Forwood JK, Listwan P, Hu SH, Guncar G, Huber T, Kellie S, Hume DA, Kobe B, Martin JL. Focusing in on structural genomics: The University of Queensland structural biology pipeline. Biomol Eng 2006; 23:281-9. [PMID: 17097918 DOI: 10.1016/j.bioeng.2006.09.002] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 07/14/2006] [Revised: 09/22/2006] [Accepted: 09/25/2006] [Indexed: 10/24/2022]
Abstract
The flood of new genomic sequence information together with technological innovations in protein structure determination have led to worldwide structural genomics (SG) initiatives. The goals of SG initiatives are to accelerate the process of protein structure determination, to fill in protein fold space and to provide information about the function of uncharacterized proteins. In the long-term, these outcomes are likely to impact on medical biotechnology and drug discovery, leading to a better understanding of disease as well as the development of new therapeutics. Here we describe the high throughput pipeline established at the University of Queensland in Australia. In this focused pipeline, the targets for structure determination are proteins that are expressed in mouse macrophage cells and that are inferred to have a role in innate immunity. The aim is to characterize the molecular structure and the biochemical and cellular function of these targets by using a parallel processing pipeline. The pipeline is designed to work with tens to hundreds of target gene products and comprises target selection, cloning, expression, purification, crystallization and structure determination. The structures from this pipeline will provide insights into the function of previously uncharacterized macrophage proteins and could lead to the validation of new drug targets for chronic obstructive pulmonary disease and arthritis.
Affiliation(s)
- Munish Puri
- Institute for Molecular Bioscience, University of Queensland, Brisbane, Queensland, Australia.
21
Chandonia JM, Kim SH. Structural proteomics of minimal organisms: conservation of protein fold usage and evolutionary implications. BMC Struct Biol 2006; 6:7. [PMID: 16566839 PMCID: PMC1488858 DOI: 10.1186/1472-6807-6-7] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.5] [Received: 10/20/2005] [Accepted: 03/28/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND: Determining the complete repertoire of protein structures for all soluble, globular proteins in a single organism has been one of the major goals of several structural genomics projects in recent years. RESULTS: We report that this goal has nearly been reached for several "minimal organisms" (parasites or symbionts with reduced genomes), for which over 95% of the soluble, globular proteins may now be assigned folds, overall 3-D backbone structures. We analyze the structures of these proteins as they relate to cellular functions, and compare conservation of fold usage between functional categories. We also compare patterns in the conservation of folds among minimal organisms and those observed between minimal organisms and other bacteria. CONCLUSION: We find that proteins performing essential cellular functions closely related to transcription and translation exhibit a higher degree of conservation in fold usage than proteins in other functional categories. Folds related to transcription and translation functional categories were also overrepresented in minimal organisms compared to other bacteria.
Affiliation(s)
- John-Marc Chandonia
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Sung-Hou Kim
- Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA
- Department of Chemistry, University of California, Berkeley, CA 94720, USA
22
Chandonia JM, Kim SH, Brenner SE. Target selection and deselection at the Berkeley Structural Genomics Center. Proteins 2005; 62:356-70. [PMID: 16276528 DOI: 10.1002/prot.20674] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
At the Berkeley Structural Genomics Center (BSGC), our goal is to obtain a near-complete structural complement of proteins in the minimal organisms Mycoplasma genitalium and M. pneumoniae, two closely related pathogens. Current targets for structure determination have been selected in six major stages, starting with those predicted to be most tractable to high throughput study and likely to yield new structural information. We report on the process used to select these proteins, as well as our target deselection procedure. Target deselection reduces experimental effort by eliminating targets similar to those recently solved by the structural biology community or other centers. We measure the impact of the 69 structures solved at the BSGC as of July 2004 on structure prediction coverage of the M. pneumoniae and M. genitalium proteomes. The number of Mycoplasma proteins for which the fold could first be reliably assigned based on structures solved at the BSGC (24 M. pneumoniae and 21 M. genitalium) is approximately 25% of the total resulting from work at all structural genomics centers and the worldwide structural biology community (94 M. pneumoniae and 86 M. genitalium) during the same period. As the number of structures contributed by the BSGC during that period is less than 1% of the total worldwide output, the benefits of a focused target selection strategy are apparent. If the structures of all current targets were solved, the percentage of M. pneumoniae proteins for which folds could be reliably assigned would increase from approximately 57% (391 of 687) at present to around 80% (550 of 687), and the percentage of the proteome that could be accurately modeled would increase from around 37% (254 of 687) to about 64% (438 of 687). In M. genitalium, the percentage of the proteome that could be structurally annotated based on structures of our remaining targets would rise from 72% (348 of 486) to around 76% (371 of 486), and the percentage of accurately modeled proteins would rise from 50% (243 of 486) to 58% (283 of 486). Sequences and data on experimental progress on our targets are available in the public databases TargetDB and PEPCdb.
Affiliation(s)
- John-Marc Chandonia
- Berkeley Structural Genomics Center, Physical Biosciences Division, Lawrence Berkeley National Laboratory, Berkeley, California, USA