51
52
Kinch LN, Cheek S, Grishin NV. EDD, a novel phosphotransferase domain common to mannose transporter EIIA, dihydroxyacetone kinase, and DegV. Protein Sci 2005; 14:360-7. [PMID: 15632288 PMCID: PMC2253402 DOI: 10.1110/ps.041114805] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.6] [Indexed: 10/26/2022]
Abstract
Using a recently developed program (SCOPmap) designed to automatically assign new protein structures to existing evolutionary-based classification schemes, we identify an evolutionarily conserved domain (EDD) common to three different folds: mannose transporter EIIA domain (EIIA-man), dihydroxyacetone kinase (Dak), and DegV. Several lines of evidence support unification of these three folds into a single superfamily: statistically significant sequence similarity detected by PSI-BLAST; "closed structural grouping" using DALI Z-scores (each protein inside a group finds all other group members with scores higher than those to proteins outside the group) that includes only these proteins sharing a unique alpha-helical hairpin at the C-terminus and excludes all other proteins with similar topology; similar domain fusions connect Dak and DegV, and genomic neighborhood organizations connect Dak and EIIA-man. Finally, both Dak and EIIA-man perform similar phosphotransfer reactions, suggesting a phosphotransferase activity for the DegV-like family of proteins, whose function other than lipid binding revealed in the crystal structure remains unknown.
Affiliation(s)
- Lisa N Kinch
- Howard Hughes Medical Institute, University of Texas Southwestern Medical Center, 5323 Harry Hines Blvd., Dallas, TX 75390-9050, USA
53
Stroh JG, Loulakis P, Lanzetti AJ, Xie J. LC-mass spectrometry analysis of N- and C-terminal boundary sequences of polypeptide fragments by limited proteolysis. JOURNAL OF THE AMERICAN SOCIETY FOR MASS SPECTROMETRY 2005; 16:38-45. [PMID: 15653362 DOI: 10.1016/j.jasms.2004.08.018] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 06/18/2004] [Revised: 08/31/2004] [Accepted: 08/31/2004] [Indexed: 05/24/2023]
Abstract
Limited proteolysis is an important and widely used method for analyzing the tertiary structure and determining the domain boundaries of proteins. Here we describe a novel method for determining the N- and C-terminal boundary amino acid sequences of products derived from limited proteolysis using semi-specific and/or non-specific enzymes, with mass spectrometry as the only analytical tool. The core of this method is founded on the recognition that cleavage of proteins with non-specific proteases is not random, but patterned. Based on this recognition, we have the ability to determine the sequence of each proteolytic fragment by extracting a common association between data sets containing multiple potential sequences derived from two or more different mass spectral molecular weight measurements. Proteolytic product sequences derived from specific and non-specific enzymes can be accurately determined without resorting to the conventional time-consuming and laborious methods of SDS-PAGE and N-terminal sequencing analysis. Because of the sensitivity of mass spectrometry, multiple transient proteolysis intermediates can also be identified and analyzed by this method, which allows the ability to monitor the progression of proteolysis and thereby gain insight into protein structures.
Affiliation(s)
- Justin G Stroh
- PGRD-Groton Laboratories, Pfizer Inc., Groton, Connecticut 06340, USA.
54
Koch MA, Waldmann H. Protein domain fold similarity and natural product structure as guiding principles for compound library design. ERNST SCHERING RESEARCH FOUNDATION WORKSHOP 2005:1-18. [PMID: 15645714 DOI: 10.1007/3-540-27055-8_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/01/2023]
Affiliation(s)
- M A Koch
- Max Planck Institute of Molecular Physiology, Department of Chemical Biology and Fachbereich Organische Chemie, University of Dortmund, Germany
55
Abstract
An ability to assign protein function from protein structure is important for structural genomics consortia. The complex relationship between protein fold and function highlights the necessity of looking beyond the global fold of a protein to specific functional sites. Many computational methods have been developed that address this issue. These include evolutionary trace methods, methods that involve the calculation and assessment of maximal superpositions, methods based on graph theory, and methods that apply machine learning techniques. Such function prediction techniques have been applied to the identification of enzyme catalytic triads and DNA-binding motifs.
Affiliation(s)
- Susan Jones
- European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge, CB10 1SD, UK.
56
Mans BJ, Neitz AWH. Exon-intron structure of outlier tick lipocalins indicate a monophyletic origin within the larger lipocalin family. INSECT BIOCHEMISTRY AND MOLECULAR BIOLOGY 2004; 34:585-594. [PMID: 15147759 DOI: 10.1016/j.ibmb.2004.03.006] [Citation(s) in RCA: 17] [Impact Index Per Article: 0.8] [Received: 01/20/2004] [Revised: 03/23/2004] [Accepted: 03/26/2004] [Indexed: 05/24/2023]
Abstract
All tick proteins assigned to the lipocalin family lack the structurally conserved regions (SCRs) that are characteristic of the kernel lipocalins and can thus be classified as outliers. These tick proteins have been assigned to the tick lipocalin family based on database searches that indicated homology between tick sequences and the fact that the histamine binding protein (HBP2) from the hard tick Rhipicephalus appendiculatus (Ixodidae) shows structural similarity to the lipocalin fold. Sequence identity between kernel and outlier lipocalins falls below 20% and the question raised is whether the outlier and kernel lipocalins are truly homologous. More specifically, in the case of the tick lipocalins, the question is whether their structural fold is derived from the lipocalin fold or whether convergent evolution resulted in the generation of the basic lipocalin-like fold, which consists of an eight-stranded continuous anti-parallel beta-barrel terminated by a C-terminal alpha-helix that lies parallel to the barrel. The current study determined the gene structure for HBP2 and TSGP1, TSGP2 and TSGP4, lipocalins identified from the soft tick Ornithodoros savignyi (Argasidae). All tick lipocalins have four introns (A-D) with conserved positions and phases within the tick lipocalin sequence alignment. The positions and phase information are also conserved with regard to the rest of the lipocalin family. Phylogenetic analysis using this information shows conclusively that tick lipocalins are evolutionarily related to the rest of the lipocalin family. Tick lipocalins are grouped within a monophyletic clade that indicates a monophyletic origin within the tick lineage and also group with the other arthropod lipocalins in a larger clade. Phylogenetic analysis of sequence alignments based on conserved secondary structure of the lipocalin fold supports the conclusions from the gene structure trees.
These results indicate that exon-intron arrangement can be useful for the inclusion of outlier lipocalins within the larger lipocalin family.
Affiliation(s)
- Ben J Mans
- Department of Biochemistry, University of Pretoria, Pretoria 0002, South Africa.
57
Vogel C, Bashton M, Kerrison ND, Chothia C, Teichmann SA. Structure, function and evolution of multidomain proteins. Curr Opin Struct Biol 2004; 14:208-16. [PMID: 15093836 DOI: 10.1016/j.sbi.2004.03.011] [Citation(s) in RCA: 292] [Impact Index Per Article: 13.9] [Indexed: 10/26/2022]
Abstract
Proteins are composed of evolutionary units called domains; the majority of proteins consist of at least two domains. These domains and nature of their interactions determine the function of the protein. The roles that combinations of domains play in the formation of the protein repertoire have been found by analysis of domain assignments to genome sequences. Additional findings on the geometry of domains have been gained from examination of three-dimensional protein structures. Future work will require a domain-centric functional classification scheme and efforts to determine structures of domain combinations.
Affiliation(s)
- Christine Vogel
- MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK.
58
Gunasekaran K, Ma B, Nussinov R. Triggering loops and enzyme function: identification of loops that trigger and modulate movements. J Mol Biol 2003; 332:143-59. [PMID: 12946353 DOI: 10.1016/s0022-2836(03)00893-3] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 11/30/2022]
Abstract
Enzyme function often involves a conformational change. There is a general agreement that loops play a vital role in correctly positioning the catalytically important residues. Nevertheless, predicting the functional loops and most importantly their role in enzyme function remains a difficult task. A major reason for this difficulty is that loops that undergo conformational change are frequently not well conserved in their primary sequence. beta1,4-Galactosyltransferase is one such enzyme. There, the amino acid sequence of a long loop that undergoes a large conformational change upon substrate binding is not well conserved. Our molecular dynamics simulations show that the large conformational change in the long loop is brought about by a second, interacting loop. Interestingly, while the structural change of the second loop is much smaller than that of the long loop, its sequence (particularly glycine residues) is highly conserved. We further examine the generality of the proposition that there are loops that trigger movements but nevertheless show little or no structural changes in crystals. We focus on two other enzymes, enolase and lipase. We chose these enzymes, since they too undergo conformational change upon ligand binding, however, they have different folds and different functions. Through multiple sets of simulations we show that the conformational change of the functional loop(s) is brought about through communication of flexibility by triggering loops that have several glycine residues. We further propose that similar to the conservation of common favorable fold types and structural motifs, evolution has also conserved common "skillful" mechanisms. Mechanisms may be conserved across different folds, sequences and functions, with adaptation to specific enzymatic roles.
Affiliation(s)
- K Gunasekaran
- Basic Research Program, SAIC-Frederick Inc., Laboratory of Experimental and Computational Biology, NCI-Frederick, Bldg. 469 Rm. 151, Frederick, MD 21702, USA
59
Riveros-Rosas H, Julián-Sánchez A, Villalobos-Molina R, Pardo JP, Piña E. Diversity, taxonomy and evolution of medium-chain dehydrogenase/reductase superfamily. EUROPEAN JOURNAL OF BIOCHEMISTRY 2003; 270:3309-34. [PMID: 12899689 DOI: 10.1046/j.1432-1033.2003.03704.x] [Citation(s) in RCA: 79] [Impact Index Per Article: 3.6] [Indexed: 11/20/2022]
Abstract
A comprehensive, structural and functional, in silico analysis of the medium-chain dehydrogenase/reductase (MDR) superfamily, including 583 proteins, was carried out by use of extensive database mining and the blastp program in an iterative manner to identify all known members of the superfamily. Based on phylogenetic, sequence, and functional similarities, the protein members of the MDR superfamily were classified into three different taxonomic categories: (a) subfamilies, consisting of a closed group containing a set of ideally orthologous proteins that perform the same function; (b) families, each comprising a cluster of monophyletic subfamilies that possess significant sequence identity among them and might share or not common substrates or mechanisms of reaction; and (c) macrofamilies, each comprising a cluster of monophyletic protein families with protein members from the three domains of life, which includes at least one subfamily member that displays activity related to a very ancient metabolic pathway. In this context, a superfamily is a group of homologous protein families (and/or macrofamilies) with monophyletic origin that shares at least a barely detectable sequence similarity, but showing the same 3D fold. The MDR superfamily encloses three macrofamilies, with eight families and 49 subfamilies. These subfamilies exhibit great functional diversity including noncatalytic members with different subcellular, phylogenetic, and species distributions. This results from constant enzymogenesis and proteinogenesis within each kingdom, and highlights the huge plasticity that MDR superfamily members possess. Thus, through evolution a great number of taxa-specific new functions were acquired by MDRs. The generation of new functions fulfilled by proteins, can be considered as the essence of protein evolution. The mechanisms of protein evolution inside MDR are not constrained to conserve substrate specificity and/or chemistry of catalysis. 
In consequence, MDR functional diversity is more complex than sequence diversity. MDR is a very ancient protein superfamily that existed in the last universal common ancestor. It had at least two (and probably three) different ancestral activities related to formaldehyde metabolism and alcoholic fermentation. Eukaryotic members of this superfamily are more related to bacterial than to archaeal members; horizontal gene transfer among the domains of life appears to be a rare event in modern organisms.
Affiliation(s)
- Héctor Riveros-Rosas
- Depto. Bioquímica, Fac. Medicina, UNAM, Cd. Universitaria, México D.F., México; Depto. Farmacobiología, CINVESTAV-Sede Sur, México D.F., México
60
Breinbauer R, Vetter IR, Waldmann H. From protein domains to drug candidates--natural products as guiding principles in compound library design and synthesis. ERNST SCHERING RESEARCH FOUNDATION WORKSHOP 2003:167-88. [PMID: 12664541 DOI: 10.1007/978-3-662-05314-0_11] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Indexed: 03/01/2023]
Affiliation(s)
- R Breinbauer
- Max-Planck-Institut für Molekulare Physiologie, Department of Chemical Biology, Otto-Hahn-Str. 11, 44227 Dortmund, Germany.
61
Koch MA, Breinbauer R, Waldmann H. Protein Structure Similarity as Guiding Principle for Combinatorial Library Design. Biol Chem 2003; 384:1265-72. [PMID: 14515987 DOI: 10.1515/bc.2003.140] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.0] [Indexed: 11/15/2022]
Abstract
Proteins are modularly built from a limited set of approximately 1000 structural domains. The evolutionary relationship within a domain family suggests that the knowledge about a common fold structure can be exploited for the design of small molecule libraries in the development of inhibitors and ligands. This principle has been used for the synthesis of inhibitors for kinases sharing the same fold. It can also be applied for proteins which share the same fold architecture yet belong to different functional classes. Bestatin--originally known as an aminopeptidase inhibitor--was employed as guiding structure for the development of leukotriene A4 hydrolase inhibitors. A combinatorial approach helped to identify inhibitors for sulfotransferases which share structural similarity with nucleotide kinases using a kinase inhibitor core structure as guiding principle.
Affiliation(s)
- Marcus A Koch
- Max-Planck-Institut für molekulare Physiologie, Abteilung Chemische Biologie, and Fachbereich III, Organische Chemie, Universität Dortmund, D-44227 Dortmund, Germany
62
Watson JD, Todd AE, Bray J, Laskowski RA, Edwards A, Joachimiak A, Orengo CA, Thornton JM. Target selection and determination of function in structural genomics. IUBMB Life 2003; 55:249-55. [PMID: 12880206 PMCID: PMC3366504 DOI: 10.1080/1521654031000123385] [Citation(s) in RCA: 20] [Impact Index Per Article: 0.9] [Indexed: 10/26/2022]
Abstract
The first crucial step in any structural genomics project is the selection and prioritization of target proteins for structure determination. There may be a number of selection criteria to be satisfied, including that the proteins have novel folds, that they be representatives of large families for which no structure is known, and so on. The better the selection at this stage, the greater is the value of the structures obtained at the end of the experimental process. This value can be further enhanced once the protein structures have been solved if the functions of the given proteins can also be determined. Here we describe the methods used at either end of the experimental process: firstly, sensitive sequence comparison techniques for selecting a high-quality list of target proteins, and secondly the various computational methods that can be applied to the eventual 3D structures to determine the most likely biochemical function of the proteins in question.
Affiliation(s)
- James D Watson
- EMBL-European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
63
Kim GJ, Lee DE, Kim HS. Characterization and evaluation of a distinct fusion ability in the functionally related cyclic amidohydrolase family enzymes. BIOTECHNOL BIOPROC E 2002. [DOI: 10.1007/bf02932913] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022]
64
Pearl FMG, Lee D, Bray JE, Buchan DWA, Shepherd AJ, Orengo CA. The CATH extended protein-family database: providing structural annotations for genome sequences. Protein Sci 2002; 11:233-44. [PMID: 11790833 PMCID: PMC2373435 DOI: 10.1110/ps.16802] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.6] [Indexed: 10/17/2022]
Abstract
An automatic sequence search and analysis protocol (DomainFinder) based on PSI-BLAST and IMPALA, and using conservative thresholds, has been developed for reliably integrating gene sequences from GenBank into their respective structural families within the CATH domain database (http://www.biochem.ucl.ac.uk/bsm/cath_new). DomainFinder assigns a new gene sequence to a CATH homologous superfamily provided that PSI-BLAST identifies a clear relationship to at least one other Protein Data Bank sequence within that superfamily. This has resulted in an expansion of the CATH protein family database (CATH-PFDB v1.6) from 19,563 domain structures to 176,597 domain sequences. A further 50,000 putative homologous relationships can be identified using less stringent cut-offs and these relationships are maintained within neighbour tables in the CATH Oracle database, pending further evidence of their suggested evolutionary relationship. Analysis of the CATH-PFDB has shown that only 15% of the sequence families are close enough to a known structure for reliable homology modeling. IMPALA/PSI-BLAST profiles have been generated for each of the sequence families in the expanded CATH-PFDB and a web server has been provided so that new sequences may be scanned against the profile library and be assigned to a structure and homologous superfamily.
Affiliation(s)
- Frances M G Pearl
- Department of Biochemistry and Molecular Biology, University College London, University of London, London WC1E 6BT, UK.
65
Almeida MS, Cabral KMS, Kurtenbach E, Almeida FCL, Valente AP. Solution structure of Pisum sativum defensin 1 by high resolution NMR: plant defensins, identical backbone with different mechanisms of action. J Mol Biol 2002; 315:749-57. [PMID: 11812144 DOI: 10.1006/jmbi.2001.5252] [Citation(s) in RCA: 90] [Impact Index Per Article: 3.9] [Indexed: 11/22/2022]
Abstract
Pisum sativum defensin 1 (Psd1) is a 46 amino acid residue plant defensin isolated from seeds of pea. The three-dimensional structure in solution of Psd1 was determined by two-dimensional NMR data recorded at 600 MHz. Experimental restraints were used for structure calculation using CNS and torsion-angle molecular dynamics. The 20 lowest energy structures were selected and further subjected to minimization, giving a root-mean-square deviation of 0.78 (± 0.22) Å in the backbone and 1.91 (± 0.60) Å over all atoms of the molecule. The protein has a globular fold with a triple-stranded antiparallel beta-sheet and an alpha-helix (from residue Asn17 to Leu27). Psd1 presents the so-called "cysteine stabilized alpha/beta motif" and presents identical three-dimensional topology in the backbone with other defensins and neurotoxins. Comparison of the electrostatic surface potential among proteins with high three-dimensional (selected using the software TOP and DALI) topology gave insights into the mode of action of Psd1. The surface topologies between proteins that present antifungal activity or sodium channel inhibiting activity are different. On the other hand, the surface topology presents several common features with potassium channel inhibitors, suggesting that Psd1 presents this activity. Other common features with potassium channel inhibitors were found, including the presence of a lysine residue essential for inhibitory activity. The identity of Psd1 in primary sequence is not enough to infer a mechanism of action, in contrast with the strategy proposed here.
Affiliation(s)
- Marcius S Almeida
- Departamento de Bioquímica Médica, ICB/CCS/UFRJ. CEP., Rio de Janeiro, 21941-590, Brazil
66
Weir M, Swindells M, Overington J. Insights into protein function through large-scale computational analysis of sequence and structure. Trends Biotechnol 2001; 19:S61-6. [PMID: 11780973 DOI: 10.1016/s0167-7799(01)01794-2] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.5] [Indexed: 11/28/2022]
Abstract
Functional genomic and proteomic technologies are producing biological data relating to hundreds, or even thousands of proteins per experiment. Rapid and accurate computational analysis of the molecular function of these proteins is therefore crucial in order to interpret these data and prioritize further experiments.
Affiliation(s)
- M Weir
- Inpharmatica, London, UK.
67
Weir M, Swindells M, Overington J. Insights into protein function through large-scale computational analysis of sequence and structure. Trends Biotechnol 2001. [DOI: 10.1016/s0167-7799(01)00011-7] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Indexed: 11/27/2022]
68
Park HS, Kim HS. Genetic and structural organization of the aminophenol catabolic operon and its implication for evolutionary process. J Bacteriol 2001; 183:5074-81. [PMID: 11489860 PMCID: PMC95383 DOI: 10.1128/jb.183.17.5074-5081.2001] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.9] [Indexed: 11/20/2022]
Abstract
The aminophenol (AP) catabolic operon in Pseudomonas putida HS12 mineralizing nitrobenzene was found to contain all the enzymes responsible for the conversion of AP to pyruvate and acetyl coenzyme A via extradiol meta cleavage of 2-aminophenol. The sequence and functional analyses of the corresponding genes of the operon revealed that the AP catabolic operon consists of one regulatory gene, nbzR, and the following nine structural genes, nbzJCaCbDGFEIH, which encode catabolic enzymes. The NbzR protein, which is divergently transcribed with respect to the structural genes, possesses a leucine zipper motif and a MarR homologous domain. It was also found that NbzR functions as a repressor for the AP catabolic operon through binding to the promoter region of the gene cluster in its dimeric form. A comparative study of the AP catabolic operon with other meta cleavage operons led us to suggest that the regulatory unit (nbzR) was derived from the MarR family and that the structural unit (nbzJCaCbDGFEIH) has evolved from the ancestral meta cleavage gene cluster. It is also proposed that these two functional units assembled through a modular type gene transfer and then have evolved divergently to acquire specialized substrate specificities (NbzCaCb and NbzD) and catalytic function (NbzE), resulting in the creation of the AP catabolic operon. The evolutionary process of the AP operon suggests how bacteria have efficiently acquired genetic diversity and expanded their metabolic capabilities by modular type gene transfer.
Affiliation(s)
- H S Park
- Department of Biological Sciences, Korea Advanced Institute of Science and Technology, 373-1, Kusong-dong, Yusong-gu, Taejon, 305-701, Korea
69
Kim GJ, Cheon YH, Park MS, Park HS, Kim HS. Generation of protein lineages with new sequence spaces by functional salvage screen. PROTEIN ENGINEERING 2001; 14:647-54. [PMID: 11707610 DOI: 10.1093/protein/14.9.647] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.2] [Indexed: 11/12/2022]
Abstract
A variety of different methods to generate diverse proteins, including random mutagenesis and recombination, are currently available and most of them accumulate the mutations on the target gene of a protein, whose sequence space remains unchanged. On the other hand, a pool of diverse genes, which is generated by random insertions, deletions and exchange of the homologous domains with different lengths in the target gene, would present the protein lineages resulting in new fitness landscapes. Here we report a method to generate a pool of protein variants with different sequence spaces by employing green fluorescent protein (GFP) as a model protein. This process, designated functional salvage screen (FSS), comprises the following procedures: a defective GFP template expressing no fluorescence is first constructed by genetically disrupting a predetermined region(s) of the protein and a library of GFP variants is generated from the defective template by incorporating the randomly fragmented genomic DNA from Escherichia coli into the defined region(s) of the target gene, followed by screening of the functionally salvaged, fluorescence-emitting GFPs. Two approaches, sequence-directed and PCR-coupled methods, were attempted to generate the library of GFP variants with new sequences derived from the genomic segments of E.coli. The functionally salvaged GFPs were selected and analyzed in terms of the sequence space and functional properties. The results demonstrate that the functional salvage process not only can be a simple and effective method to create protein lineages with new sequence spaces, but also can be useful in elucidating the involvement of a specific region(s) or domain(s) in the structure and function of protein.
Affiliation(s)
- G J Kim
- Department of Molecular Science and Technology, Ajou University, San5, Woncheon-dong, Paldal-gu, Suwon, 442-749, Korea
70
Teichmann SA, Rison SC, Thornton JM, Riley M, Gough J, Chothia C. The evolution and structural anatomy of the small molecule metabolic pathways in Escherichia coli. J Mol Biol 2001; 311:693-708. [PMID: 11518524 DOI: 10.1006/jmbi.2001.4912] [Citation(s) in RCA: 69] [Impact Index Per Article: 2.9] [Indexed: 11/22/2022]
Abstract
The 106 small molecule metabolic (SMM) pathways in Escherichia coli are formed by the protein products of 581 genes. We can define 722 domains, nearly all of which are homologous to proteins of known structure, that form all or part of 510 of these proteins. This information allows us to answer general questions on the structural anatomy of the SMM pathway proteins and to trace family relationships and recruitment events within and across pathways. Half the gene products contain a single domain and half are formed by combinations of between two and six domains. The 722 domains belong to one of 213 families that have between one and 51 members. Family members usually conserve their catalytic or cofactor binding properties; substrate recognition is rarely conserved. Of the 213 families, members of only a quarter occur in isolation, i.e. they form single-domain proteins. Most members of the other families combine with domains from just one or two other families and a few more versatile families can combine with several different partners. Excluding isoenzymes, more than twice as many homologues are distributed across pathways as within pathways. However, serial recruitment, with two consecutive enzymes both being recruited to another pathway, is rare and recruitment of three consecutive enzymes is not observed. Only eight of the 106 pathways have a high number of homologues. Homology between consecutive pairs of enzymes with conservation of the main substrate-binding site but change in catalytic mechanism (which would support a simple model of retrograde pathway evolution) occurs only six times in the whole set of enzymes. Most of the domains that form SMM pathways have homologues in non-SMM pathways. Taken together, these results imply a pervasive "mosaic" model for the formation of protein repertoires and pathways.
Affiliation(s)
- S A Teichmann
- Department of Biochemistry and Molecular Biology, University College London, Darwin Building, Gower Street, London, WC1E 6BT, UK.
71
Orengo CA, Sillitoe I, Reeves G, Pearl FM. Review: what can structural classifications reveal about protein evolution? J Struct Biol 2001; 134:145-65. [PMID: 11551176 DOI: 10.1006/jsbi.2001.4398] [Citation(s) in RCA: 42] [Impact Index Per Article: 1.8] [Indexed: 11/22/2022]
Abstract
In this article we present a review of the methods used for comparing and classifying protein structures. We discuss the hierarchies and populations of fold groups and evolutionary families in some of the major classifications and we consider some of the problems confronting any general analyses of structural evolution in protein families. We also review some more recent analyses that have expanded these classifications by identifying sequence relatives in the genomes and thereby reveal interesting trends in fold usage and recurrence.
Affiliation(s)
- C A Orengo
- Department of Biochemistry and Molecular Biology, University College, Gower Street, London, WC1E 6BT, United Kingdom
|
72
|
Pariza MW, Johnson EA. Evaluating the Safety of Microbial Enzyme Preparations Used in Food Processing: Update for a New Century. Regul Toxicol Pharmacol 2001; 33:173-86. [PMID: 11350200 DOI: 10.1006/rtph.2001.1466] [Citation(s) in RCA: 134] [Impact Index Per Article: 5.6] [Indexed: 11/22/2022]
Abstract
Microbial enzymes used in food processing are typically sold as enzyme preparations that contain not only a desired enzyme activity but also other metabolites of the production strain, as well as added materials such as preservatives and stabilizers. The added materials must be food grade and meet applicable regulatory standards. The purpose of this report is to present guidelines that can be used to evaluate the safety of the metabolites of the production strain that are also present in the enzyme preparation, including of course, but not limited to, the desired enzyme activity itself. This discussion builds on previously published decision tree mechanisms and includes consideration of new genetic modification technologies, for example, modifying the primary structure of enzymes to enhance specific properties that are commercially useful. The safety of the production strain remains the primary consideration in evaluating enzyme safety, in particular, the toxigenic potential of the production strain. Thoroughly characterized nonpathogenic, nontoxigenic microbial strains, particularly those with a history of safe use in food enzyme manufacture, are logical candidates for generating a safe strain lineage, through which improved strains may be derived via genetic modification by using either traditional/classical or rDNA strain improvement strategies. The elements needed to establish a safe strain lineage include thoroughly characterizing the host organism, determining the safety of all new DNA that has been introduced into the host organism, and ensuring that the procedure(s) that have been used to modify the host organism are appropriate for food use. Enzyme function may be changed by intentionally altering the amino acid sequence (e.g., protein engineering). It may be asked if such modifications might also affect the safety of an otherwise safe enzyme. 
We consider this question in light of what is known about the natural variation in enzyme structure and function and conclude that it is unlikely that changes which improve upon desired enzyme function will result in the creation of a toxic protein. It is prudent to assess such very small theoretical risks by conducting limited toxicological tests on engineered enzymes. The centerpiece of this report is a decision tree mechanism that updates previous enzyme safety evaluation mechanisms to accommodate advances in enzymology. We have concluded that separate mutagenicity testing is not needed if this decision tree is used to evaluate enzyme safety. Under the criteria of the decision tree, no new food enzyme can enter the market without critical evaluation of its safety.
Affiliation(s)
- M W Pariza
- Food Research Institute, Department of Food Microbiology and Toxicology, University of Wisconsin-Madison, Madison, WI 53706, USA.
|
73
|
Abstract
Establishing the linkage between an individual biochemical activity and the gene(s) specifying that activity has been facilitated by advances in mass spectrometry and affinity purification methods. In addition, a genomic protein array has been produced in yeast by fusing each yeast open reading frame to glutathione-S-transferase, thus linking each protein with its cognate gene. Purification and biochemical assay of pools of glutathione-S-transferase-open-reading-frame proteins allows analysis of the entire proteome for biochemical activities, followed by simple deconvolution to identify the responsible open reading frame. An alternative method to analyze large sets of proteins is the use of protein microarrays in which over 10,000 individual proteins can be immobilized and assayed on a single slide.
Affiliation(s)
- E J Grayhack
- Department of Biochemistry and Biophysics, University of Rochester, School of Medicine and Dentistry, 601 Elmwood Avenue, Box 712, Rochester, NY 14642, USA.
|
74
|
Thornton JM. The Hans Neurath Award lecture of The Protein Society: proteins-- a testament to physics, chemistry, and evolution. Protein Sci 2001; 10:3-11. [PMID: 11266588 PMCID: PMC2249842 DOI: 10.1110/ps.90001] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.2] [Received: 11/08/2000] [Accepted: 11/08/2000] [Indexed: 10/19/2022]
Affiliation(s)
- J M Thornton
- Biochemistry and Molecular Biology Department, University College, WC1E 6BT London, UK.
|
75
|
Abstract
Site-directed mutagenesis is still a very efficient strategy to elaborate improved enzymes. Recently, advances have been made in developing rational strategies aimed at reshaping enzyme specificities and mechanisms, and at engineering biocatalysts through molecular assembling. These knowledge-based studies greatly benefit from the most recent computational analyses of enzyme structures and functions. The combination of rational and combinatorial methods opens up new vistas in the design of stable and efficient enzymes.
Affiliation(s)
- F Cedrone
- CEA, Département d'Ingénierie et d'Etudes des Protéines, Gif-sur-Yvette, France
|
76
|
Abstract
In the past few years, a variety of methods have been developed to allow the in vitro evolution of a range of biomolecules including novel and improved biocatalysts (enzymes). These methods for directed evolution differ in the size and characteristics of the gene repertoire, in the way of linking genotype and phenotype, and in the selection approach. Selections for enzymes can be performed indirectly (for binding of a transition-state analogue or mechanism-based inhibitor), and directly using either intramolecular single-turnover selections (e.g. with SELEX) or the normal (intermolecular, multiple turnover) mode of enzymatic reactions. Each of these methods has distinct strengths and weaknesses. The best system (or combinations of systems) to use depends on the specific target for evolution and the evolutionary distance that needs to be crossed.
Affiliation(s)
- A D Griffiths
- The MRC Laboratory of Molecular Biology, Cambridge, UK.
|
77
|
Abstract
Is structure, rather than sequence, the key to the successful generation of truly novel proteins? While protein evolution by homologous recombination has become an established tool to explore confined regions in sequence space, the generation of functional hybrid proteins by homology-independent methods further expands the scope of protein engineering.
Affiliation(s)
- S Lutz
- Department of Chemistry, The Pennsylvania State University, University Park 16802-6300, USA
|
78
|
Bray JE, Todd AE, Pearl FM, Thornton JM, Orengo CA. The CATH Dictionary of Homologous Superfamilies (DHS): a consensus approach for identifying distant structural homologues. Protein Engineering 2000; 13:153-65. [PMID: 10775657 DOI: 10.1093/protein/13.3.153] [Citation(s) in RCA: 41] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022]
Abstract
A consensus approach has been developed for identifying distant structural homologues. This is based on the CATH Dictionary of Homologous Superfamilies (DHS), a database of validated multiple structural alignments annotated with consensus functional information for evolutionary protein superfamilies (URL: http://www.biochem.ucl.ac.uk/bsm/dhs). Multiple structural alignments have been generated for 362 well-populated superfamilies in the CATH structural domain database and annotated with secondary structure, physicochemical properties, functional sequence patterns and protein-ligand interaction data. Consensus functional information for each superfamily includes descriptions and keywords extracted from SWISS-PROT and the ENZYME database. The Dictionary provides a powerful resource to validate, examine and visualize key structural and functional features of each homologous superfamily. The value of the DHS, for assessing functional variability and identifying distant evolutionary relationships, is illustrated using the pyridoxal-5'-phosphate (PLP) binding aspartate aminotransferase superfamily. The DHS also provides a tool for examining sequence-structure relationships for proteins within each fold group.
Affiliation(s)
- J E Bray
- Biomolecular Structure and Modelling Unit, Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.
|
79
|
Pearl FM, Lee D, Bray JE, Sillitoe I, Todd AE, Harrison AP, Thornton JM, Orengo CA. Assigning genomic sequences to CATH. Nucleic Acids Res 2000; 28:277-82. [PMID: 10592246 PMCID: PMC102424 DOI: 10.1093/nar/28.1.277] [Citation(s) in RCA: 125] [Impact Index Per Article: 5.0] [Received: 10/04/1999] [Accepted: 10/06/1999] [Indexed: 11/12/2022]
Abstract
We report the latest release (version 1.6) of the CATH protein domains database (http://www.biochem.ucl.ac.uk/bsm/cath). This is a hierarchical classification of 18 577 domains into evolutionary families and structural groupings. We have identified 1028 homologous superfamilies in which the proteins have both structural, and sequence or functional similarity. These can be further clustered into 672 fold groups and 35 distinct architectures. Recent developments of the database include the generation of 3D templates for recognising structural relatives in each fold group, which has led to significant improvements in the speed and accuracy of updating the database and also means that less manual validation is required. We also report the establishment of the CATH-PFDB (Protein Family Database), which associates 1D sequences with the 3D homologous superfamilies. Sequences showing identifiable homology to entries in CATH have been extracted from GenBank using PSI-BLAST. A CATH-PSIBLAST server has been established, which allows you to scan a new sequence against the database. The CATH Dictionary of Homologous Superfamilies (DHS), which contains validated multiple structural alignments annotated with consensus functional information for evolutionary protein superfamilies, has been updated to include annotations associated with sequence relatives identified in GenBank. The DHS is a powerful tool for considering the variation of functional properties within a given CATH superfamily and in deciding what functional properties may be reliably inherited by a newly identified relative.
Affiliation(s)
- F M Pearl
- Department of Biochemistry, University College London, University of London, Gower Street, London WC1E 6BT, UK.
|