1
Properties affecting transfer and expression of degradative plasmids for the purpose of bioremediation. Biodegradation 2021; 32:361-375. [PMID: 34046775 DOI: 10.1007/s10532-021-09950-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/16/2021] [Accepted: 05/15/2021] [Indexed: 10/21/2022]
Abstract
Plasmids, circular DNA molecules that exist and replicate outside of the host chromosome, have been important in the spread of non-essential genes as well as the rapid evolution of prokaryotes. Recent advances in environmental engineering have aimed to utilize the mobility of plasmids carrying degradative genes to disseminate them into the environment for cost-effective and environmentally friendly remediation of harmful contaminants. Here, we review the knowledge surrounding plasmid transfer and the conditions needed for successful transfer and expression of degradative plasmids. Both abiotic and biotic factors have a great impact on the success of degradative plasmid transfer and expression of the degradative genes of interest. Properties such as ecological growth strategies of bacteria may also contribute to plasmid transfer and may be an important consideration for bioremediation applications. Finally, the methods for detection of conjugation events have greatly improved, and the application of these tools can help improve our understanding of conjugation in complex communities. However, it remains clear that more methods for in situ detection of plasmid transfer are needed to help disentangle the complexities of conjugation in natural environments and to better support a framework for precision bioremediation.
2
Liu Q, Liu M, Wu W. Strong/Weak Feature Recognition of Promoters Based on Position Weight Matrix and Ensemble Set-Valued Models. J Comput Biol 2018; 25:1152-1160. [PMID: 29993261 DOI: 10.1089/cmb.2018.0067] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/13/2022]
Abstract
In this article, we propose a method to recognize the strong/weak property of promoters based on the nucleotide sequence. To the best of our knowledge, this is the first attempt to predict the strong/weak property of promoters. First, a position weight matrix (PWM) is used to evaluate the contributions of the nucleotides to promoter strength. Then, a set-valued model is used to describe the relation between the nucleotide sequence and the strength. Considering the small-sample and imbalance features of the promoter data, we propose an ensemble approach to predict the strong/weak property of the promoters. The proposed method is used to recognize 60 [Formula: see text] promoters of Escherichia coli. The results show the effectiveness of the proposed method. This article provides a simple way for a biologist to evaluate the strong/weak feature of promoters from the nucleotide sequence.
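The PWM step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the set-valued model and the ensemble are not reproduced, and the toy hexamers, the pseudocount, and the uniform background frequency are all assumptions made for illustration.

```python
import math


def build_pwm(sequences, pseudocount=1.0, background=0.25):
    """Build a log-odds position weight matrix from equal-length DNA strings."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases stay finite.
        counts = {base: pseudocount for base in "ACGT"}
        for seq in sequences:
            counts[seq[pos]] += 1
        total = sum(counts.values())
        pwm.append({b: math.log2((counts[b] / total) / background) for b in "ACGT"})
    return pwm


def score(pwm, seq):
    """Sum the per-position log-odds weights for a sequence."""
    return sum(column[base] for column, base in zip(pwm, seq))


# Hypothetical -10-like hexamers, invented for this example only.
training = ["TATAAT", "TATAAT", "TATGAT", "TACAAT"]
pwm = build_pwm(training)
consensus_score = score(pwm, "TATAAT")
weak_score = score(pwm, "GCGCGC")
print(consensus_score > weak_score)
```

A sequence close to the training hexamers receives a higher summed log-odds score than an unrelated one; a classifier for strong versus weak promoters would then be built on top of such scores.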
Affiliation(s)
- Qie Liu
- Department of Automation, Tsinghua University, Beijing, China
- Min Liu
- Department of Automation, Tsinghua University, Beijing, China
- Wenfa Wu
- Department of Automation, Tsinghua University, Beijing, China
3
Lam KN, Charles TC. Strong spurious transcription likely contributes to DNA insert bias in typical metagenomic clone libraries. Microbiome 2015; 3:22. [PMID: 26056565 PMCID: PMC4459075 DOI: 10.1186/s40168-015-0086-5] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 12/09/2014] [Accepted: 05/01/2015] [Indexed: 05/24/2023]
Abstract
BACKGROUND Clone libraries provide researchers with a powerful resource to study nucleic acid from diverse sources. Metagenomic clone libraries in particular have aided in studies of microbial biodiversity and function, and allowed the mining of novel enzymes. Libraries are often constructed by cloning large inserts into cosmid or fosmid vectors. Recently, there have been reports of GC bias in fosmid metagenomic libraries, and it was speculated to be a result of fragmentation and loss of AT-rich sequences during cloning. However, evidence in the literature suggests that transcriptional activity or gene product toxicity may play a role. RESULTS To explore possible mechanisms responsible for sequence bias in clone libraries, we constructed a cosmid library from a human microbiome sample and sequenced DNA from different steps during library construction: crude extract DNA, size-selected DNA, and cosmid library DNA. We confirmed a GC bias in the final cosmid library, and we provide evidence that the bias is not due to fragmentation and loss of AT-rich sequences but is likely occurring after DNA is introduced into Escherichia coli. To investigate the influence of strong constitutive transcription, we searched the sequence data for promoters and found that rpoD/σ70 promoter sequences were underrepresented in the cosmid library. Furthermore, when we examined the genomes of taxa that were differentially abundant in the cosmid library relative to the original sample, we found the bias to be more correlated with the number of rpoD/σ70 consensus sequences in the genome than with simple GC content. CONCLUSIONS The GC bias of metagenomic libraries does not appear to be due to DNA fragmentation. Rather, analysis of promoter sequences provides support for the hypothesis that strong constitutive transcription from sequences recognized as rpoD/σ70 consensus-like in E. coli may lead to instability, causing loss of the plasmid or loss of the insert DNA that gives rise to the transcription. Despite widespread use of E. coli to propagate foreign DNA in metagenomic libraries, the effects of in vivo transcriptional activity on clone stability are not well understood. Further work is required to tease apart the effects of transcription from those of gene product toxicity.
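The comparison described in this abstract, counts of σ70 consensus-like sequences versus simple GC content, can be sketched roughly as follows. This is not the authors' pipeline: the hexamers are the textbook E. coli σ70 consensus (TTGACA and TATAAT), while the 16 to 18 bp spacer window, the mismatch tolerance, the forward-strand-only scan, and the toy sequence are assumptions made for this example.

```python
# Textbook E. coli sigma70 consensus hexamers. The spacer window and the
# mismatch tolerance below are assumptions chosen for this sketch.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"


def mismatches(a, b):
    """Count positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def gc_content(seq):
    """Fraction of G and C bases in seq (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def count_consensus_like(seq, max_mismatch=2, spacers=(16, 17, 18)):
    """Count forward-strand -35 start positions that pair with a -10 hexamer.

    A site is counted once if, for any allowed spacer, the combined number of
    mismatches to both consensus hexamers is at most max_mismatch.
    """
    seq = seq.upper()
    hits = 0
    for i in range(len(seq) - 5):
        m35 = mismatches(seq[i:i + 6], MINUS_35)
        if m35 > max_mismatch:
            continue
        for spacer in spacers:
            j = i + 6 + spacer
            if j + 6 > len(seq):
                continue
            if m35 + mismatches(seq[j:j + 6], MINUS_10) <= max_mismatch:
                hits += 1
                break
    return hits


# Hypothetical toy sequence: exact consensus hexamers separated by 17 bp.
toy = "GG" + MINUS_35 + "C" * 17 + MINUS_10 + "GG"
print(count_consensus_like(toy), round(gc_content(toy), 2))
```

Running both functions over each genome of interest would give the two quantities that the abstract compares; a real analysis would also scan the reverse strand and would likely use a calibrated scoring scheme rather than a fixed mismatch cutoff.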
Affiliation(s)
- Kathy N. Lam
- Department of Biology, University of Waterloo, Waterloo, ON Canada
4
Stevens DC, Conway KR, Pearce N, Villegas-Peñaranda LR, Garza AG, Boddy CN. Alternative sigma factor over-expression enables heterologous expression of a type II polyketide biosynthetic pathway in Escherichia coli. PLoS One 2013; 8:e64858. [PMID: 23724102 PMCID: PMC3665592 DOI: 10.1371/journal.pone.0064858] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.6] [Received: 03/29/2011] [Accepted: 04/22/2013] [Indexed: 02/03/2023]
Abstract
Background Heterologous expression of bacterial biosynthetic gene clusters is currently an indispensable tool for characterizing biosynthetic pathways. Development of an effective, general heterologous expression system that can be applied to bioprospecting from metagenomic DNA will enable the discovery of a wealth of new natural products. Methodology We have developed a new Escherichia coli-based heterologous expression system for polyketide biosynthetic gene clusters. We have demonstrated that over-expression of the alternative sigma factor σ54 directly and positively regulates heterologous expression of the oxytetracycline biosynthetic gene cluster in E. coli. Bioinformatics analysis indicates that σ54 promoters are present in nearly 70% of polyketide and non-ribosomal peptide biosynthetic pathways. Conclusions We have demonstrated a new mechanism for heterologous expression of the oxytetracycline polyketide biosynthetic pathway, where high-level pleiotropic sigma factors from the heterologous host directly and positively regulate transcription of the non-native biosynthetic gene cluster. Our bioinformatics analysis is consistent with the hypothesis that heterologous expression mediated by the alternative sigma factor σ54 may be a viable method for the production of additional polyketide products.
Affiliation(s)
- Kyle R. Conway
- Department of Chemistry, University of Ottawa, Ottawa, Ontario, Canada
- Nelson Pearce
- Department of Chemistry, University of Ottawa, Ottawa, Ontario, Canada
- Anthony G. Garza
- Department of Biology, Syracuse University, Syracuse, New York, United States of America
- Christopher N. Boddy
- Department of Chemistry, University of Ottawa, Ottawa, Ontario, Canada
- Department of Biology, University of Ottawa, Ottawa, Ontario, Canada
5
Ikuma K, Gunsch CK. Functionality of the TOL plasmid under varying environmental conditions following conjugal transfer. Appl Microbiol Biotechnol 2012; 97:395-408. [PMID: 22367613 DOI: 10.1007/s00253-012-3949-8] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.1] [Received: 11/14/2011] [Revised: 01/19/2012] [Accepted: 02/06/2012] [Indexed: 10/28/2022]
Abstract
Conjugation of catabolic plasmids in contaminated environments is a naturally occurring horizontal gene transfer phenomenon, which could be utilized in genetic bioaugmentation. The potentially important parameters for genetic bioaugmentation include gene regulation of transferred catabolic plasmids that may be controlled by the genetic characteristics of transconjugants as well as environmental conditions that may alter the expression of the contaminant-degrading phenotype. This study showed that both genomic guanine-cytosine contents and phylogenetic characteristics of transconjugants were important in controlling the phenotype functionality of the TOL plasmid. These genetic characteristics had no apparent impact on the stability of the TOL plasmid, which was observed to be highly variable among strains. Within the environmental conditions tested, the addition of glucose resulted in the largest enhancement of the activities of enzymes encoded by the TOL plasmid in all transconjugant strains. Glucose (1 g/L) enhanced the phenotype functionality by up to 16.4 (±2.22), 30.8 (±7.03), and 90.8 (±4.56)-fold in toluene degradation rates, catechol 2,3-dioxygenase enzymatic activities, and xylE gene expression, respectively. These results suggest that genetic limitations of the expression of horizontally acquired genes may be overcome by the presence of alternate carbon substrates. Such observations may be utilized in improving the effectiveness of genetic bioaugmentation.
Affiliation(s)
- Kaoru Ikuma
- Department of Civil and Environmental Engineering, Duke University, 121 Hudson Hall, Box 90287, Durham, NC 27708-0287, USA
6
Vedel V, Scotti I. Promoting the promoter. Plant Sci 2011; 180:182-189. [PMID: 21421359 DOI: 10.1016/j.plantsci.2010.09.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 05/01/2010] [Revised: 09/23/2010] [Accepted: 09/27/2010] [Indexed: 05/28/2023]
Abstract
Recent evolutionary studies clearly indicate that evolution is mainly driven by changes in the complex mechanisms of gene regulation and not solely by polymorphism in protein-encoding genes themselves. After a short description of the cis-regulatory mechanism, we intend in this review to argue that by applying newly available technologies and by merging research areas such as evolutionary and developmental biology, population genetics, ecology and molecular cell biology it is now possible to study evolution in an integrative way. We contend that, by analysing the effects of promoter sequence variation on phenotypic diversity in natural populations, we will soon be able to break the barrier between the study of extant genetic variability and the study of major developmental changes. This will lead to an integrative view of evolution at different scales. Because of their sessile nature and their continuous development, plants must permanently regulate their gene expression to react to their environment, and can, therefore, be considered as a remarkable model for these types of studies.
Affiliation(s)
- Vincent Vedel
- UMR ECOFOG, INRA, Ecological genetic, Campus Agronomique de Kourou, BP 709, 97387 Kourou, French Guiana.
7
Ikuma K, Gunsch C. Effect of carbon source addition on toluene biodegradation by an Escherichia coli DH5α transconjugant harboring the TOL plasmid. Biotechnol Bioeng 2010; 107:269-77. [PMID: 20506384 DOI: 10.1002/bit.22808] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.2] [Indexed: 11/07/2022]
Abstract
Horizontal gene transfer (HGT) of plasmids is a naturally occurring phenomenon which could be manipulated for bioremediation applications. Specifically, HGT may prove useful to enhance bioremediation through genetic bioaugmentation. However, because the transfer of a plasmid between donor and recipient cells does not always result in useful functional phenotypes, the conditions under which HGT events result in enhanced degradative capabilities must first be elucidated. The objective of this study was to determine if the addition of alternate carbon substrates could improve toluene degradation in Escherichia coli DH5α transconjugants. The addition of glucose (0.5-5 g/L) and Luria-Bertani (LB) broth (10-100%) resulted in enhanced toluene degradation. On average, the toluene degradation rate increased 14.1 (±2.1)-fold in the presence of glucose while the maximum increase was 18.4 (±1.7)-fold in the presence of 25% LB broth. Gene expression of xyl genes was upregulated in the presence of glucose but not LB broth, which implies different inducing mechanisms by the two types of alternate carbon source. The increased toluene degradation by the addition of glucose or LB broth was persistent over the short-term, suggesting the pulse amendment of an alternative carbon source may be helpful in bioremediation. While the effects of recipient genome GC content and other conditions must still be examined, our results suggest that changes in environmental conditions such as alternate substrate availability may significantly improve the functionality of the transferred phenotypes in HGT and therefore may be an important parameter for genetic bioaugmentation optimization.
Affiliation(s)
- Kaoru Ikuma
- Department of Civil and Environmental Engineering, Duke University, Durham, NC, USA
8
Perez-Bello A, Munteanu CR, Ubeira FM, De Magalhães AL, Uriarte E, González-Díaz H. Alignment-free prediction of mycobacterial DNA promoters based on pseudo-folding lattice network or star-graph topological indices. J Theor Biol 2008; 256:458-66. [PMID: 18992259 PMCID: PMC7126577 DOI: 10.1016/j.jtbi.2008.09.035] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 08/06/2008] [Revised: 09/23/2008] [Accepted: 09/25/2008] [Indexed: 12/01/2022]
Abstract
The importance of the promoter sequences in the function regulation of several important mycobacterial pathogens creates the necessity to design simple and fast theoretical models that can predict them. This work proposes two DNA promoter QSAR models based on pseudo-folding lattice network (LN) and star-graphs (SG) topological indices. In addition, a comparative study with the previous RNA electrostatic parameters of thermodynamically-driven secondary structure folding representations has been carried out. The best model of this work was obtained with only two LN stochastic electrostatic potentials and it is characterized by accuracy, selectivity and specificity of 90.87%, 82.96% and 92.95%, respectively. In addition, we pointed out the SG result dependence on the DNA sequence codification and we proposed a QSAR model based on codons and only three SG spectral moments.
Affiliation(s)
- Alcides Perez-Bello
- Department of Microbiology and Parasitology, University of Santiago de Compostela, Santiago de Compostela 15782, Spain.
9
Sorek R, Zhu Y, Creevey CJ, Francino MP, Bork P, Rubin EM. Genome-Wide Experimental Determination of Barriers to Horizontal Gene Transfer. Science 2007; 318:1449-52. [DOI: 10.1126/science.1147112] [Citation(s) in RCA: 321] [Impact Index Per Article: 17.8] [Indexed: 11/03/2022]
10
González-Díaz H, Pérez-Bello A, Uriarte E, González-Díaz Y. QSAR study for mycobacterial promoters with low sequence homology. Bioorg Med Chem Lett 2006; 16:547-53. [PMID: 16275068 DOI: 10.1016/j.bmcl.2005.10.057] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.3] [Received: 06/28/2005] [Revised: 10/13/2005] [Accepted: 10/18/2005] [Indexed: 11/27/2022]
Abstract
The general belief is that quantitative structure-activity relationship (QSAR) techniques work only for small molecules and protein sequences or, more recently, DNA sequences. However, with non-branched graphs for proteins and DNA sequences, QSAR models often have to be based on powerful non-linear techniques such as support vector machines. In our opinion, linear QSAR models based on RNA could be useful to assign biological activity when alignment techniques fail due to low sequence homology. The idea is based on the high level of branching of the RNA graph. This work introduces the so-called Markov electrostatic potentials (k)ξ(M) as a new class of RNA 2D-structure descriptors. Subsequently, we validate these molecular descriptors by solving a QSAR classification problem for mycobacterial promoter sequences (mps), which constitute a very low sequence homology problem. The model developed (mps = -4.664·(0)ξ(M) + 0.991·(1)ξ(M) - 2.432) was intended to predict whether a naturally occurring sequence is an mps or not on the basis of the calculated (k)ξ(M) value for the corresponding RNA secondary structure. The RNA-QSAR approach recognises 115/135 mps (85.2%) and 100% of control sequences. Average predictability and robustness were greater than 95%. A previous non-linear model predicts mps with a slightly higher accuracy (97%) but uses a very large parameter space for DNA sequences. Conversely, the (k)ξ(M)-based RNA-QSAR encodes more structural information and needs only two variables.
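The two-variable linear model quoted in this abstract can be evaluated with a few lines of code. The sketch below only plugs numbers into the published coefficients; the decision threshold of zero and the descriptor values in the example call are assumptions, since the abstract states neither a cutoff nor how the descriptors are computed.

```python
def mps_score(xi0, xi1):
    """Linear score from the abstract: -4.664*xi0 + 0.991*xi1 - 2.432.

    xi0 and xi1 stand for the order-0 and order-1 Markov electrostatic
    potentials of the RNA secondary structure; computing them is out of scope.
    """
    return -4.664 * xi0 + 0.991 * xi1 - 2.432


def is_predicted_promoter(xi0, xi1, threshold=0.0):
    """Classify as promoter when the score exceeds an assumed threshold."""
    return mps_score(xi0, xi1) > threshold


# Hypothetical descriptor values, chosen only to show the call pattern.
example_score = mps_score(1.0, 2.0)

# Recognition rate reported in the abstract: 115 of 135 promoters.
recognition_rate = 100 * 115 / 135
print(round(recognition_rate, 1))
```

The last two lines reproduce the arithmetic behind the reported 85.2% recognition of promoter sequences.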
11
Kawano M, Storz G, Rao BS, Rosner JL, Martin RG. Detection of low-level promoter activity within open reading frame sequences of Escherichia coli. Nucleic Acids Res 2005; 33:6268-76. [PMID: 16260475 PMCID: PMC1275588 DOI: 10.1093/nar/gki928] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.8] [Indexed: 11/14/2022]
Abstract
The search for promoters has largely been confined to sequences upstream of open reading frames (ORFs) or stable RNA genes. Here we used a cloning approach to discover other potential promoters in Escherichia coli. Chromosomal fragments of approximately 160 bp were fused to a promoterless lacZ reporter gene on a multi-copy plasmid. Eight clones were deliberately selected for high activity and 105 clones were selected at random. All eight of the high-activity clones carried promoters that were located upstream of an ORF. Among the randomly-selected clones, 56 had significantly elevated activity. Of these, 7 had inserts which also mapped upstream of an ORF, while 49 mapped within or downstream of ORFs. Surprisingly, the eight promoters selected for high activity matched the canonical σ70 -35 and -10 sequences no better than sequences from the randomly-selected clones. For six of the nine most active sequences with orientations opposite to that of the ORF, chromosomal expression was detected by RT-PCR, but defined transcripts were not detected by northern analysis. Our results indicate that the E. coli chromosome carries numerous -35 and -10 sequences with weak promoter activity but that most are not productively expressed because other features needed to enhance promoter activity and transcript stability are absent.
Affiliation(s)
- B. Sridhar Rao
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20892-0560, USA
- Judah L. Rosner
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Building 5, Room 333, Bethesda, MD 20892-0560, USA
- Robert G. Martin
- Laboratory of Molecular Biology, National Institute of Diabetes and Digestive and Kidney Diseases, Building 5, Room 333, Bethesda, MD 20892-0560, USA
- To whom correspondence should be addressed. Tel: +1 301 496 5466; Fax: +1 301 496 0201
12
Michalowski CB, Short MD, Little JW. Sequence tolerance of the phage lambda PRM promoter: implications for evolution of gene regulatory circuitry. J Bacteriol 2004; 186:7988-99. [PMID: 15547271 PMCID: PMC529058 DOI: 10.1128/jb.186.23.7988-7999.2004] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.5] [Received: 02/18/2004] [Accepted: 09/01/2004] [Indexed: 11/20/2022]
Abstract
Much of the gene regulatory circuitry of phage lambda centers on a complex region called the O(R) region. This approximately 100-bp region is densely packed with regulatory sites, including two promoters and three repressor-binding sites. The dense packing of this region is likely to impose severe constraints on its ability to change during evolution, raising the question of how the specific arrangement of sites and their exact sequences could evolve to their present form. Here we ask whether the sequence of a cis-acting site can be widely varied while retaining its function; if it can, evolution could proceed by a larger number of paths. To help address this question, we developed a lambda cloning vector that allowed us to clone fragments spanning the O(R) region. By using this vector, we carried out intensive mutagenesis of the P(RM) promoter, which drives expression of CI repressor and is activated by CI itself. We made a pool of fragments in which 8 of the 12 positions in the -35 and -10 regions were randomized and cloned this pool into the vector, making a pool of P(RM) variant phage. About 10% of the P(RM) variants were able to lysogenize, suggesting that the lambda regulatory circuitry is compatible with a wide range of P(RM) sequences. Analysis of several of these phages indicated a range of behaviors in prophage induction. Several isolates had induction properties similar to those of the wild type, and their promoters resembled the wild type in their responses to CI. We term this property of different sequences allowing roughly equivalent function "sequence tolerance " and discuss its role in the evolution of gene regulatory circuitry.
Affiliation(s)
- Christine B Michalowski
- Department of Biochemistry and Molecular Biophysics, University of Arizona, Tucson, AZ 85721, USA
13
Kiryu H, Oshima T, Asai K. Extracting relations between promoter sequences and their strengths from microarray data. Bioinformatics 2004; 21:1062-8. [PMID: 15513998 DOI: 10.1093/bioinformatics/bti094] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.2] [Indexed: 11/12/2022]
Abstract
MOTIVATION The relations between promoter sequences and their strengths were extensively studied in the 1980s. Although these studies uncovered strong sequence-strength correlations, the cost of their elaborate experimental methods has been too high for them to be applied to a large number of promoters. In contrast, a recent increase in microarray data allows us to compare thousands of gene expressions with their DNA sequences. RESULTS We studied the relations between the promoter sequences and their strengths using Escherichia coli microarray data. We modeled those relations using a simple weight matrix, which was optimized with a novel support vector regression method. It was observed that several non-consensus bases in the '-35' and '-10' regions of promoter sequences act positively on the promoter strength and that certain consensus bases have a minor effect on the strength. We analyzed outliers for which the observed gene expressions deviate from the promoter strength predictions, and identified several genes with enhanced expressions due to multiple promoters and genes under strong regulation by transcription factors. Our method is applicable to other prokaryotes for which both the promoter sequences and the microarray data are available.
Affiliation(s)
- Hisanori Kiryu
- Graduate School of Information Sciences, Nara Institute of Science and Technology 8916-5 Takayama-cho, Ikoma, Nara 630-0192, Japan.
14
Kalate RN, Tambe SS, Kulkarni BD. Artificial neural networks for prediction of mycobacterial promoter sequences. Comput Biol Chem 2004; 27:555-64. [PMID: 14667783 DOI: 10.1016/j.compbiolchem.2003.09.004] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.4] [Indexed: 11/26/2022]
Abstract
A multilayered feed-forward ANN architecture trained using the error-back-propagation (EBP) algorithm has been developed for predicting whether a given nucleotide sequence is a mycobacterial promoter sequence. Owing to the high prediction capability (≅97%) of the developed network model, it has been further used in conjunction with the caliper randomization (CR) approach for determining the structurally/functionally important regions in the promoter sequences. The results obtained thereby indicate that (i) the upstream region of the -35 box, (ii) the -35 region, (iii) the spacer region and (iv) the -10 box are important for mycobacterial promoters. The CR approach also suggests that the -38 to -29 region plays a significant role in determining whether a given sequence is a mycobacterial promoter. In essence, the present study establishes ANNs as a tool for predicting mycobacterial promoter sequences and determining structurally/functionally important sub-regions therein.
Affiliation(s)
- Rupali N Kalate
- Department of Biosciences and Informatics, Keio University, 3-14-1 Hiyoshi, Kohoku-ku, Yokohama 223-8522, Japan.
15
Majdalani N, Chen S, Murrow J, St John K, Gottesman S. Regulation of RpoS by a novel small RNA: the characterization of RprA. Mol Microbiol 2001; 39:1382-94. [PMID: 11251852 DOI: 10.1111/j.1365-2958.2001.02329.x] [Citation(s) in RCA: 226] [Impact Index Per Article: 9.4] [Indexed: 11/28/2022]
Abstract
Translational regulation of the stationary phase sigma factor RpoS is mediated by the formation of a double-stranded RNA stem-loop structure in the upstream region of the rpoS messenger RNA, occluding the translation initiation site. The interaction of the rpoS mRNA with a small RNA, DsrA, disrupts the double-strand pairing and allows high levels of translation initiation. We screened a multicopy library of Escherichia coli DNA fragments for novel activators of RpoS translation when DsrA is absent. Clones carrying rprA (RpoS regulator RNA) increased the translation of RpoS. The rprA gene encodes a 106 nucleotide regulatory RNA. As with DsrA, RprA is predicted to form three stem-loops and is highly conserved in Salmonella and Klebsiella species. Thus, at least two small RNAs, DsrA and RprA, participate in the positive regulation of RpoS translation. Unlike DsrA, RprA does not have an extensive region of complementarity to the RpoS leader, leaving its mechanism of action unclear. RprA is non-essential. Mutations in the gene interfere with the induction of RpoS after osmotic shock when DsrA is absent, demonstrating a physiological role for RprA. The existence of two very different small RNA regulators of RpoS translation suggests that such additional regulatory RNAs are likely to exist, both for regulation of RpoS and for regulation of other important cellular components.
Affiliation(s)
- N Majdalani
- Laboratory of Molecular Biology, National Cancer Institute, National Institutes of Health, Bldg. 37 Room 2E 18, Bethesda, MD 20892-4255, USA
16
King RA, Madsen PL, Weisberg RA. Constitutive expression of a transcription termination factor by a repressed prophage: promoters for transcribing the phage HK022 nun gene. J Bacteriol 2000; 182:456-62. [PMID: 10629193 PMCID: PMC94296 DOI: 10.1128/jb.182.2.456-462.2000] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.4] [Indexed: 11/20/2022]
Abstract
Lysogens of phage HK022 are resistant to infection by phage lambda. Lambda resistance is caused by the action of the HK022 Nun protein, which prematurely terminates early lambda transcripts. We report here that transcription of the nun gene initiates at a constitutive prophage promoter, P(Nun), located just upstream of the protein coding sequence. The 5' end of the transcript was determined by primer extension analysis of RNA isolated from HK022 lysogens or RNA made in vitro by transcribing a template containing the promoter with purified Escherichia coli RNA polymerase. Inactivation of P(Nun) by mutation greatly reduced Nun activity and Nun antigen in an HK022 lysogen. However, a low level of residual activity was detected, suggesting that a secondary promoter also contributes to nun expression. We found one possible secondary promoter, P(Nun)', just upstream of P(Nun). Neither promoter is likely to increase the expression of other phage genes in a lysogen because their transcripts should be terminated downstream of nun. We estimate that HK022 lysogens in stationary phase contain several hundred molecules of Nun per cell and that cells in exponential phase probably contain fewer.
Affiliation(s)
- R A King
- Laboratory of Molecular Genetics, National Institute of Child Health and Human Development, National Institutes of Health, Bethesda, Maryland, USA.
17
Wang JT, Rozen S, Shapiro BA, Shasha D, Wang Z, Yin M. New techniques for DNA sequence classification. J Comput Biol 1999; 6:209-18. [PMID: 10421523 DOI: 10.1089/cmb.1999.6.209] [Citation(s) in RCA: 22] [Impact Index Per Article: 0.8] [Indexed: 11/12/2022]
Abstract
DNA sequence classification is the activity of determining whether or not an unlabeled sequence S belongs to an existing class C. This paper proposes two new techniques for DNA sequence classification. The first technique works by comparing the unlabeled sequence S with a group of active motifs discovered from the elements of C and by distinction with elements outside of C. The second technique generates and matches gapped fingerprints of S with elements of C. Experimental results obtained by running these algorithms on long and well conserved Alu sequences demonstrate the good performance of the presented methods compared with FASTA. When applied to less conserved and relatively short functional sites such as splice-junctions, a variation of the second technique combining fingerprinting with consensus sequence analysis gives better results than the current classifiers employing text compression and machine learning algorithms.
Affiliation(s)
- J T Wang
- Department of Computer and Information Science, New Jersey Institute of Technology, University Heights, Newark 07102, USA.
|
18
|
Ozoline ON, Deev AA, Arkhipova MV. Non-canonical sequence elements in the promoter structure. Cluster analysis of promoters recognized by Escherichia coli RNA polymerase. Nucleic Acids Res 1997; 25:4703-9. [PMID: 9365247 PMCID: PMC147123 DOI: 10.1093/nar/25.23.4703] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.4] [Indexed: 02/05/2023] Open
Abstract
Nucleotide sequences of 441 promoters recognized by Escherichia coli RNA polymerase were subjected to a site-specific cluster analysis based on the hierarchical method of classification. Five regions permitting promoter subgrouping were identified. They are located at -54 +/- 4, -44 +/- 3, -35 +/- 3 (-35 element), -29 +/- 2 and -11 +/- 4 (-10 element). Promoters were independently subgrouped on the basis of their sequence homology in each of these regions and typical sequence elements were determined. The putative functional significance of the revealed elements is discussed on the basis of available biochemical data. Those promoters that have a high degree of homology with the revealed sequence elements were selected as representatives of corresponding promoter groups and the presence of other sequence motifs in their structure was examined. Both positive and negative correlations in the presence of particular sequence motifs were observed; however, the degree of these interdependencies was not high in all cases, probably indicating that different combinations of the signal elements may create a promoter. The list of promoter sequences with the presence of different sequence elements is available on request by Email: ozoline@venus.iteb.serpukhov.su.
Affiliation(s)
- O N Ozoline
- Institute of Cell Biophysics, Russian Academy of Sciences (RAS), Pushchino, 142292 Moscow region, Russia.
|
19
|
Frank DE, Saecker RM, Bond JP, Capp MW, Tsodikov OV, Melcher SE, Levandoski MM, Record MT. Thermodynamics of the interactions of lac repressor with variants of the symmetric lac operator: effects of converting a consensus site to a non-specific site. J Mol Biol 1997; 267:1186-206. [PMID: 9150406 DOI: 10.1006/jmbi.1997.0920] [Citation(s) in RCA: 111] [Impact Index Per Article: 4.0] [Indexed: 02/04/2023]
Abstract
What are the thermodynamic consequences of the stepwise conversion of a highly specific (consensus) protein-DNA interface to one that is nonspecific? How do the magnitudes of key favorable contributions to complex stability (burial of hydrophobic surfaces and reduction of DNA phosphate charge density) change as the DNA sequence of the specific site is detuned? To address these questions we investigated the binding of lac repressor (LacI) to a series of 40 bp fragments carrying symmetric (consensus) and variant operator sequences over a range of temperatures and salt concentrations. Variant DNA sites contained symmetrical single and double base-pair substitutions at positions 4 and/or 5 [sequence: see text] in each 10 bp half site of the symmetric lac operator (Osym). Non-specific interactions were examined using a 40 bp non-operator DNA fragment. Disruption of the consensus interface by a single symmetrical substitution reduces the observed equilibrium association constant (K(obs)) for Osym by three to four orders of magnitude; double symmetrical substitutions approach the six orders of magnitude difference between specific and non-specific binding to a 40 bp fragment. At these adjacent positions in the consensus site, the free energy effects of multiple substitutions are non-additive: the first reduces |ΔG°(obs)| by 3 to 5 kcal/mol, approximately halfway to the non-specific level, whereas the second is less deleterious, reducing |ΔG°(obs)| by less than 3 kcal/mol. Variant-specific dependences of K(obs) on temperature and salt concentration characterize these LacI-operator interactions. In general, binding constants and standard free energies of binding both exhibit characteristic extrema near 290 K. As a consequence, both the enthalpic and entropic contributions to stability of Osym and variant complexes change from positive (i.e. entropy driven) at lower temperatures to negative (i.e. enthalpy driven) at higher temperatures, indicating that the heat capacity change upon binding, ΔC°(obs), is large and negative. In general, |ΔC°(obs)| decreases as the specificity and stability of the variant complex decreases. Stabilities of complexes of LacI with Osym and all variant operators are strongly [salt]-dependent. Binding constants for the variant complexes exhibit a power-dependence on [salt] that is larger in magnitude (i.e. more negative) than for Osym, but no obvious trend relates changes in contributions from the polyelectrolyte effect and the observed reductions in stability (ΔΔG°(obs)). These variant-specific thermodynamic signatures provide novel insights into the consequences of converting a consensus interface to a less specific one; such insights are not obtained from comparisons at the level of ΔΔG°(obs). We propose that this variant-specific behavior arises from a strong effect of operator sequence on the extent of induced conformational changes in the protein (and possibly also in the DNA site) which accompany binding.
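The link between a fold change in the association constant and a free energy difference can be made concrete with the standard relation ΔG° = -RT ln K. The sketch below is only an illustration of that textbook relation, not a calculation from the paper; the function names are hypothetical and the reference temperature of 298.15 K is an assumption, since the abstract does not state the temperature at which the quoted values apply.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def delta_g_from_k(k_obs: float, temp_k: float) -> float:
    """Standard binding free energy (kcal/mol) from an association constant."""
    return -R_KCAL * temp_k * math.log(k_obs)


def ddg_for_fold_change(orders_of_magnitude: float, temp_k: float) -> float:
    """Free energy cost (kcal/mol) of lowering K_obs by 10**orders_of_magnitude."""
    return R_KCAL * temp_k * math.log(10.0) * orders_of_magnitude


if __name__ == "__main__":
    # Assumed temperature; the abstract only says extrema occur near 290 K.
    temperature = 298.15
    for orders in (3, 4, 6):
        print(orders, round(ddg_for_fold_change(orders, temperature), 2))
```

At this assumed temperature each order of magnitude in K(obs) corresponds to roughly 1.36 kcal/mol, which puts a loss of three to four orders of magnitude on a scale comparable to the 3 to 5 kcal/mol reduction reported for the first substitution.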
Affiliation(s)
- D E Frank
- Department of Biochemistry, University of Wisconsin-Madison, 53706, USA
|
20
|
Brikun I, Suziedelis K, Stemmann O, Zhong R, Alikhanian L, Linkova E, Mironov A, Berg DE. Analysis of CRP-CytR interactions at the Escherichia coli udp promoter. J Bacteriol 1996; 178:1614-22. [PMID: 8626289 PMCID: PMC177846 DOI: 10.1128/jb.178.6.1614-1622.1996] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.7] [Indexed: 01/31/2023] Open
Abstract
Multiprotein complexes regulate the transcription of certain bacterial genes in a sensitive, physiologically responsive manner. In particular, the transcription of genes needed for utilization of nucleosides in Escherichia coli is regulated by a repressor protein, CytR, in concert with the cyclic AMP (cAMP) activated form of cAMP receptor protein (CRP). We studied this regulation by selecting and characterizing spontaneous constitutive mutations in the promoter of the udp (uridine phosphorylase) gene, one of the genes most strongly regulated by CytR. We found deletions, duplications, and point mutations that affect key regulatory sites in the udp promoter, insertion sequence element insertions that activated cryptic internal promoters or provided new promoters, and large duplications that may have increased expression by udp gene amplification. Unusual duplications and deletions that resulted in constitutive udp expression that depended on the presence of CytR were also found. Our results support the model in which repression normally involves the binding of CytR to cAMP-CRP to form a complex which binds to specific sites in the udp promoter, without direct interaction between CytR protein and a specific operator DNA sequence, and in which induction by specific inducer cytidine involves dissociation of CytR from cAMP-CRP and the RNA polymerase interaction with cAMP-CRP bound to a site upstream of the transcription start point. The stimulation of udp expression by CytR in certain mutants may reflect its stabilization of cAMP-CRP binding to target DNA and illustrates that only modest evolutionary changes could allow particular multiprotein complexes to serve as either repressors or transcriptional activators.
Affiliation(s)
- I Brikun
- Department of Molecular Microbiology, Washington University Medical School, St. Louis, Missouri 63110, USA
|
21
|
Abstract
Recognition of function of newly sequenced DNA fragments is an important area of computational molecular biology. Here we present an extensive review of methods for prediction of functional sites, tRNA, and protein-coding genes and discuss possible further directions of research in this area.
Affiliation(s)
- M S Gelfand
- Institute of Protein Research, Russian Academy of Sciences, Pushchino, Moscow region, Russia
|
22
|
Boyd DA, Cvitkovitch DG, Hamilton IR. Sequence and expression of the genes for HPr (ptsH) and enzyme I (ptsI) of the phosphoenolpyruvate-dependent phosphotransferase transport system from Streptococcus mutans. Infect Immun 1994; 62:1156-65. [PMID: 8132321 PMCID: PMC186246 DOI: 10.1128/iai.62.4.1156-1165.1994] [Citation(s) in RCA: 19] [Impact Index Per Article: 0.6] [Indexed: 01/29/2023] Open
Abstract
We report the sequencing of a 2,242-bp region of the Streptococcus mutans NG5 genome containing the genes for ptsH and ptsI, which encode HPr and enzyme I (EI), respectively, of the phosphoenolpyruvate-dependent phosphotransferase transport system. The sequence was obtained from two cloned overlapping genomic fragments; one expresses HPr and a truncated EI, while the other expresses a full-length EI in Escherichia coli, as determined by Western immunoblotting. The ptsI gene appeared to be expressed from a region located in the ptsH gene. The S. mutans NG5 pts operon does not appear to be linked to other phosphotransferase transport system proteins as has been found in other bacteria. A positive fermentation pattern on MacConkey-glucose plates by an E. coli ptsI mutant harboring the S. mutans NG5 ptsI gene on a plasmid indicated that the S. mutans NG5 EI can complement a defect in the E. coli gene. This was confirmed by protein phosphorylation experiments with 32P-labeled phosphoenolpyruvate indicating phosphotransfer from the S. mutans NG5 EI to the E. coli HPr. Two forms of the cloned EI, both truncated to varying degrees in the C-terminal region, were inefficiently phosphorylated and unable to complement fully the ptsI defect in the E. coli mutant. The deduced amino acid sequence of HPr shows a high degree of homology, particularly around the active site, to the same protein from other gram-positive bacteria, notably, S. salivarius, and to a lesser extent with those of gram-negative bacteria. The deduced amino acid sequence of S. mutans NG5 EI also shares several regions of homology with other sequenced EIs, notably, with the region around the active site, a region that contains the only conserved cysteinyl residue among the various proteins and which may be involved in substrate binding.
Affiliation(s)
- D A Boyd
- Department of Oral Biology, University of Manitoba, Winnipeg, Canada
|
23
|
Jain C, Kleckner N. IS10 mRNA stability and steady state levels in Escherichia coli: indirect effects of translation and role of rne function. Mol Microbiol 1993; 9:233-47. [PMID: 7692216 DOI: 10.1111/j.1365-2958.1993.tb01686.x] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.1] [Indexed: 01/26/2023]
Abstract
Translation of the IS10 transposase gene is known to be very infrequent. We have identified mutations whose genetic properties suggest that they act directly to increase or decrease the intrinsic level of translation initiation. Also, we have analysed in detail the effects of these mutations on IS10 mRNA using one particular IS10 derivative. In this case, increases or decreases in translation are accompanied by increases or decreases in both the steady state level and the half-life of transposase mRNA; effects on steady state levels are much more dramatic than effects on message half-life. At wild-type levels of translation initiation, the rate-limiting step in physical decay of full length IS10 message for a particular IS10 derivative is shown to be rne-dependent endonucleolytic cleavage; 3' exonucleases appear to play a secondary role, degrading primary cleavage products. Analysis of interplay between translation mutations and rne function, together with the above observations, suggests that translation stabilizes messages in a general way against rne-dependent endonucleolytic cleavage, and that significant protection may be conferred by one or a few ribosomes. However, dramatic effects of translation on steady state message levels are still observed in an rne mutant and involve the 3' end of the transcript; we propose that these additional effects reflect translation-mediated stimulation of transcript release.
Affiliation(s)
- C Jain
- Department of Biochemistry and Molecular Biology, Harvard University, Cambridge, Massachusetts 02138
|
24
|
Gaudin HM, Silverman PM. Contributions of promoter context and structure to regulated expression of the F plasmid traY promoter in Escherichia coli K-12. Mol Microbiol 1993; 8:335-42. [PMID: 8316084 DOI: 10.1111/j.1365-2958.1993.tb01577.x] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 01/29/2023]
Abstract
Expression of the F plasmid traY promoter in vivo requires both host (E. coli) and plasmid encoded proteins. As judged by transcript size and primer extension analyses, the F plasmid traY promoter was utilized in vitro by purified E. coli sigma 70 RNA polymerase in the absence of other proteins. However, in vitro transcription required supercoiled templates. Endonuclease protection experiments showed that RNA polymerase is unable to form a stable complex at the traY promoter in linear or relaxed circular templates. In vitro transcription with linear templates could be elicited by altering the traY -10 and -35 hexamers to the consensus sequences. Alterations that reduced the effect of template supercoiling on apparent promoter strength in vitro also reduced the effect of the F plasmid TraJ protein on traY expression in vivo. Apparent traY promoter strength in vitro, estimated in template competition experiments, was unaltered by deletion of tra DNA normally upstream of the promoter, a change in promoter context that elicited high levels of promoter activity in TraJ- cells. These data suggest a model for regulated traY promoter activity in which a nucleoprotein complex involving tra DNA immediately upstream locally relaxes traY promoter DNA. TraJ and perhaps other activators could disrupt the complex, allowing promoter DNA to equilibrate at the prevailing negative superhelical density and thereby eliciting transcription initiation.
Affiliation(s)
- H M Gaudin
- Program in Molecular and Cell Biology, Oklahoma Medical Research Foundation, Oklahoma 73104
|
25
|
Abstract
The Escherichia coli araFGH operon codes for proteins involved in the L-arabinose high-affinity transport system. Transcriptional regulation of the operon was studied by creating point mutations and deletions in the control region cloned into a GalK expression vector. The transcription start site was confirmed by RNA sequencing of transcripts. The sequences essential for polymerase function were localized by deletions and point mutations. Surprisingly, only a weak -10 consensus sequence, and no -35 sequence is required. Mutation of a guanosine at position -12 greatly reduced promoter activity, which suggests important polymerase interactions with DNA between the usual -10 and -35 positions. A double mutation toward the consensus in the -10 region was required to create a promoter capable of significant AraC-independent transcription. These results show that the araFGH promoter structure is similar to that of the galP1 promoter and is substantially different from that of the araBAD promoter. The effects of 11 mutations within the DNA region thought to bind the cyclic AMP receptor protein correlate well with the CRP consensus binding sequence and confirm that this region is responsible for cyclic AMP regulation. Deletion of the AraC binding site nearest the promoter, araFG1, eliminates arabinose regulation, whereas deletion of the upstream AraC binding site, araFG2, has only a slight effect on promoter activity.
Affiliation(s)
- W Hendrickson
- Department of Microbiology and Immunology, College of Medicine, University of Illinois, Chicago 60680
|
26
|
Abstract
Long-range two-body correlations in a DNA sequence should in theory approach a constant value very rapidly with increasing value of the correlation length. It is shown that for most DNA sequences, the long-range correlations exhibit oscillations superimposed on the constant background. These oscillations persist for very large correlation lengths. The oscillations are shown to be three-point cycles and are related to the coding regions in the DNA. A method for discovering the coding regions in DNA sequences is presented. The limitations of the method are discussed.
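The period-three oscillation described in this abstract can be illustrated with a very simple pair statistic. The sketch below is not the method of the paper: it assumes a hypothetical match correlation (the fraction of positions whose base equals the base a fixed distance away) and a synthetic "coding-like" sequence in which the first position of every codon is fixed, purely to make the effect visible. All function names are illustrative.

```python
import random


def match_correlation(seq: str, max_lag: int) -> list[float]:
    """Fraction of positions i where seq[i] == seq[i + lag], for lag = 0..max_lag."""
    corr = []
    for lag in range(max_lag + 1):
        pairs = len(seq) - lag
        matches = sum(1 for i in range(pairs) if seq[i] == seq[i + lag])
        corr.append(matches / pairs)
    return corr


def toy_coding_sequence(n_codons: int, seed: int = 0) -> str:
    """Hypothetical coding-like sequence: codon position 1 is always G,
    positions 2 and 3 are drawn uniformly at random."""
    rng = random.Random(seed)
    return "".join(
        "G" + rng.choice("ACGT") + rng.choice("ACGT") for _ in range(n_codons)
    )


if __name__ == "__main__":
    corr = match_correlation(toy_coding_sequence(1000), 9)
    for lag, value in enumerate(corr):
        print(lag, round(value, 3))
```

In this toy model the correlation is elevated at distances that are multiples of three and sits near the random background elsewhere, which is the kind of superimposed oscillation the abstract associates with coding regions.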
Affiliation(s)
- G S Mani
- Department of Theoretical Physics, Schuster Laboratory, University of Manchester, U.K
|
27
|
Wang L, Weiss B. dcd (dCTP deaminase) gene of Escherichia coli: mapping, cloning, sequencing, and identification as a locus of suppressors of lethal dut (dUTPase) mutations. J Bacteriol 1992; 174:5647-53. [PMID: 1324907 PMCID: PMC206511 DOI: 10.1128/jb.174.17.5647-5653.1992] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.2] [Indexed: 12/26/2022] Open
Abstract
In Escherichia coli, most of the dUMP that is used as a substrate for thymidylate synthetase is generated from dCTP through the sequential action of dCTP deaminase and dUTPase. Some mutations of the dut (dUTPase) gene are lethal even when the cells are grown in the presence of thymidine, but their lethality can be suppressed by extragenic mutations that can be produced by transposon insertion. Six suppressor mutations were tested, and all were found to belong to the same complementation group. The affected gene was cloned, it was mapped by hybridization with a library of recombinant DNA, and its nucleotide sequence was determined. The gene is at 2,149 kb on the physical map. Its product, a 21.2-kDa polypeptide, was overproduced 1,000-fold via an expression vector and identified as dCTP deaminase, the enzyme affected in previously described dcd mutants. Null mutations in dcd probably suppress the lethality of dut mutations by reducing the accumulation of dUTP, which would otherwise lead to the excessive incorporation of uracil into DNA.
Affiliation(s)
- L Wang
- Department of Pathology, University of Michigan Medical School, Ann Arbor 48109-0602
|
28
|
Cardon LR, Stormo GD. Expectation maximization algorithm for identifying protein-binding sites with variable lengths from unaligned DNA fragments. J Mol Biol 1992; 223:159-70. [PMID: 1731067 DOI: 10.1016/0022-2836(92)90723-w] [Citation(s) in RCA: 97] [Impact Index Per Article: 2.9] [Indexed: 12/28/2022]
Abstract
An Expectation Maximization algorithm for identification of DNA binding sites is presented. The approach predicts the location of binding regions while allowing variable length spacers within the sites. In addition to predicting the most likely spacer length for a set of DNA fragments, the method identifies individual sites that differ in spacer size. No alignment of DNA sequences is necessary. The method is illustrated by application to 231 Escherichia coli DNA fragments known to contain promoters with variable spacings between their consensus regions. Maximum-likelihood tests of the differences between the spacing classes indicate that the consensus regions of the spacing classes are not distinct. Further tests suggest that several positions within the spacing region may contribute to promoter specificity.
Affiliation(s)
- L R Cardon
- Institute for Behavior Genetics, University of Colorado Boulder 80309-0447
|
29
|
Gartmann CJ, Grob U. SQUIRREL: Sequence QUery, Information Retrieval and REporting Library. A program package for analyzing signals in nucleic acid sequences for the VAX. Nucleic Acids Res 1991; 19:6033-40. [PMID: 1945887 PMCID: PMC329063 DOI: 10.1093/nar/19.21.6033] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.1] [Indexed: 12/29/2022] Open
Abstract
A computer tool is described for comparison, analysis and search of genetic signals. The method is based on sequence consensus matrices. It assumes that a genetic signal (such as a promoter, enhancer or whatever) is composed of several signal blocks separated from each other by variable distances. A set of programs is presented to perform the analysis. The result of such an analysis is a description of the investigated signal including matrices for each signal block, distances between each block and distribution of the values. Programs are provided to search for a signal using results from previous analysis. The method is able to align large sets of sequences within a few minutes and to check the quality of the alignment. An analysis of E. coli promoters is provided as an example.
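The idea of a signal built from blocks separated by variable distances can be sketched with a simple two-block scan. This is not the SQUIRREL implementation: it replaces the consensus matrices with plain mismatch counting, and the block sequences (the commonly cited E. coli sigma-70 hexamers TTGACA and TATAAT), the 15 to 19 bp spacer range, and the mismatch threshold are all assumptions chosen for illustration. The function names are hypothetical.

```python
def mismatches(window: str, consensus: str) -> int:
    """Number of positions at which window differs from consensus."""
    return sum(1 for a, b in zip(window, consensus) if a != b)


def find_two_block_signal(
    seq: str,
    block1: str = "TTGACA",    # assumed -35 consensus hexamer
    block2: str = "TATAAT",    # assumed -10 consensus hexamer
    spacers=range(15, 20),     # assumed allowed spacer lengths in bp
    max_mismatches: int = 2,   # assumed total mismatch tolerance
):
    """Return (start, spacer, total_mismatches) for every placement of the two
    blocks whose combined mismatch count is within the tolerance."""
    hits = []
    n1, n2 = len(block1), len(block2)
    for start in range(len(seq) - n1 + 1):
        m1 = mismatches(seq[start:start + n1], block1)
        for spacer in spacers:
            start2 = start + n1 + spacer
            if start2 + n2 > len(seq):
                continue  # second block would run off the end of the sequence
            total = m1 + mismatches(seq[start2:start2 + n2], block2)
            if total <= max_mismatches:
                hits.append((start, spacer, total))
    return hits


if __name__ == "__main__":
    toy = "CCC" + "TTGACA" + "A" * 17 + "TATAAT" + "GGG"
    print(find_two_block_signal(toy))
```

A matrix-based tool would score each block with position-specific weights rather than a flat mismatch count; the loop over spacer lengths is the part that corresponds to the variable distances mentioned in the abstract.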
Affiliation(s)
- C J Gartmann
- Max-Planck-Institut für Immunbiologie, Freiburg, FRG
|
30
|
Dillard JP, Yother J. Analysis of Streptococcus pneumoniae sequences cloned into Escherichia coli: effect of promoter strength and transcription terminators. J Bacteriol 1991; 173:5105-9. [PMID: 1860821 PMCID: PMC208201 DOI: 10.1128/jb.173.16.5105-5109.1991] [Citation(s) in RCA: 29] [Impact Index Per Article: 0.9] [Indexed: 12/29/2022] Open
Abstract
Difficulties encountered in the cloning of DNA from Streptococcus pneumoniae and other AT-rich organisms into ColE1-type Escherichia coli vectors have been proposed to be due to the presence of a large number of strong promoter-acting sequences in the donor DNA. The use of transcription terminators has been advocated as a means of reducing instability resulting from disruption of plasmid replication caused by strong promoters. However, neither the existence of promoter-acting sequences of sufficient strength and number to explain the reported cloning difficulties nor their role as a source of instability has been proven. As a direct test of the "strong promoter" hypothesis, we cloned random fragments from S. pneumoniae into an E. coli vector containing transcription terminators, identified strong promoter-acting sequences, and subsequently removed the transcription terminators. We observed that terminator removal resulted in reduced copy numbers for the strongest promoter-acting sequences but not in reduced promoter strengths or altered plasmid stabilities. Our results indicate that promoters strong enough to require transcription terminators for plasmid stability are probably rare in S. pneumoniae DNA.
Affiliation(s)
- J P Dillard
- Department of Microbiology, University of Alabama, Birmingham 35294
|
31
|
Wu J, Weiss B. Two divergently transcribed genes, soxR and soxS, control a superoxide response regulon of Escherichia coli. J Bacteriol 1991; 173:2864-71. [PMID: 1708380 PMCID: PMC207867 DOI: 10.1128/jb.173.9.2864-2871.1991] [Citation(s) in RCA: 203] [Impact Index Per Article: 6.0] [Indexed: 12/28/2022] Open
Abstract
soxR governs a superoxide response regulon that contains the genes for endonuclease IV, Mn2+-superoxide dismutase, and glucose 6-phosphate dehydrogenase. The soxR gene encodes a 17-kDa protein; some mutations of this gene cause constitutive overexpression of the regulon. Induction by paraquat (methyl viologen) requires both soxR and a new gene, soxS. soxS is adjacent to soxR, it encodes a 13-kDa protein, and it is required for paraquat resistance. These functions were revealed by studies in which the sequence of the 1.1-kb soxR-soxS region was determined, the 5' ends of the mRNAs were mapped, and complementation tests were performed with soxRS plasmids containing deletions of known sequence. The two genes are divergently transcribed, and the transcripts overlap. The soxS promoter is within the 85-nucleotide intergenic region, whereas the soxR promoter is within soxS. soxS mRNA increases after induction. Both protein products have possible DNA-binding (helix-turn-helix) domains. SoxR contains four cysteines (CX2CXCX5C) that might be part of a sensor region. SoxS shows 17 to 31% homology to the C-terminal portions of members of the AraC family of positive regulators.
Affiliation(s)
- J Wu
- Department of Pathology, University of Michigan Medical School, Ann Arbor 48109-0602
|
32
|
Abstract
Methods for optimizing the prediction of Escherichia coli RNA polymerase promoter sequences by neural networks are presented. A neural network was trained on a set of 80 known promoter sequences combined with different numbers of random sequences. The conserved -10 region and -35 region of the promoter sequences and a combination of these regions were used in three independent training sets. The prediction accuracy of the resulting weight matrix was tested against a separate set of 30 known promoter sequences and 1500 random sequences. The effects of the network's topology, the extent of training, the number of random sequences in the training set and the effects of different data representations were examined and optimized. Accuracies of 100% on the promoter test set and 98.4% on the random test set were achieved with the optimal parameters.
Affiliation(s)
- B Demeler
- Department of Biochemistry and Biophysics, Oregon State University, Corvallis 97331-6503
|
33
|
Silverman PM, Wickersham E, Harris R. Regulation of the F plasmid traY promoter in Escherichia coli by host and plasmid factors. J Mol Biol 1991; 218:119-28. [PMID: 2002497 DOI: 10.1016/0022-2836(91)90878-a] [Citation(s) in RCA: 46] [Impact Index Per Article: 1.4] [Indexed: 12/29/2022]
Abstract
F plasmid DNA transfer (tra) gene expression in Escherichia coli is regulated by chromosome- and F-encoded gene products. To study the relationship among these regulatory factors, we constructed low-copy plasmids containing a phi(traY'-'lacZ)hyb gene that couples beta-galactosidase and Lac permease synthesis to the F plasmid traY promoter. Wild-type transformants maintained high levels of beta-galactosidase over a broad range of culture densities. Primer extension analysis of tra mRNA from F'lac and phi(traY'-'lacZ)hyb strains indicated very similar, though not identical, transcription initiation sites. Moreover, phi(traY'-'lacZ)hyb gene expression required both TraJ and SfrA, as does tra gene expression in F+ strains. beta-Galactosidase activity was reduced approximately 30-fold in the absence of TraJ, which could be supplied in cis or in trans. In a two-plasmid system in which TraJ was supplied in trans by a lac-traJ operon fusion, phi(traY'-'lacZ)hyb expression was a linear, saturable function of traJ expression. Enzyme activity was reduced approximately tenfold in sfrA mutants. That reduction could not be attributed to an effect on the TraJ level. Several other cellular or environmental variables had only a modest effect on phi(traY'-'lacZ)hyb expression. Hyperexpression was observed at high cell density (twofold) and in anaerobic cultures (1.2- to 1.5-fold). In contrast, expression was reduced twofold in integration host factor mutants.
Affiliation(s)
- P M Silverman
- Program in Molecular and Cell Biology, Oklahoma Medical Research Foundation, Oklahoma City 73104
|
34
|
Lupski JR, Zhang YH, Rieger M, Minter M, Hsu B, Ooi BG, Koeuth T, McCabe ER. Mutational analysis of the Escherichia coli glpFK region with Tn5 mutagenesis and the polymerase chain reaction. J Bacteriol 1990; 172:6129-34. [PMID: 2170343 PMCID: PMC526940 DOI: 10.1128/jb.172.10.6129-6134.1990] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.3] [Indexed: 12/30/2022] Open
Abstract
Transposon Tn5 mutagenesis of the Escherichia coli chromosome was used to isolate 21 independent insertion mutations conferring an altered colony color phenotype on MacConkey-glycerol plates. The polymerase chain reaction was used to map 16 of these Tn5 insertions within the glpFK region at 88 min. The most polar Tn5 insertion was shown by nucleotide sequencing to be in the proposed glpF open reading frame. The data suggest that the glpF and glpK genes are in an operon with a bent DNA segment (BENT-6) involved in transcriptional regulation of this operon.
Affiliation(s)
- J R Lupski
- Institute for Molecular Genetics, Baylor College of Medicine, Houston, Texas 77030
|
35
|
Goodrich JA, Schwartz ML, McClure WR. Searching for and predicting the activity of sites for DNA binding proteins: compilation and analysis of the binding sites for Escherichia coli integration host factor (IHF). Nucleic Acids Res 1990; 18:4993-5000. [PMID: 2205834 PMCID: PMC332103 DOI: 10.1093/nar/18.17.4993] [Citation(s) in RCA: 239] [Impact Index Per Article: 6.8] [Indexed: 12/30/2022] Open
Abstract
An analysis of the sequence information contained in a compilation of published binding sites for E. coli integration host factor (IHF) was performed. The sequences of twenty-seven IHF sites were aligned; the base occurrences at each position, the information content, and an extended consensus sequence were obtained for the IHF site. The base occurrences at each position of the IHF site were used with a program written for the Apple Macintosh computers in order to determine the similarity scores for published IHF sites. A linear correlation was found to exist between the logarithm of IHF binding and functional data (relative free energies) and similarity scores for two groups of IHF sites. The MacTargsearch program and its potential usefulness in searching for other sites and predicting their relative activities is discussed.
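The workflow in this abstract (tabulate base occurrences per position of aligned sites, compute information content, then score candidate sites) can be sketched as follows. This is a generic position weight matrix illustration, not the MacTargsearch program: the log-odds scoring, the pseudocount, the uniform background of 0.25, and the omission of any small-sample correction are assumptions, and the example sites used in the usage lines are made-up hexamers rather than real IHF sites.

```python
import math

BASES = "ACGT"


def count_matrix(sites):
    """Base occurrences at each position of equal-length aligned sites."""
    length = len(sites[0])
    counts = [{b: 0 for b in BASES} for _ in range(length)]
    for site in sites:
        for pos, base in enumerate(site):
            counts[pos][base] += 1
    return counts


def information_content(counts):
    """Per-position information in bits, 2 - H, with no small-sample correction."""
    info = []
    for column in counts:
        total = sum(column.values())
        entropy = -sum(
            (c / total) * math.log2(c / total) for c in column.values() if c > 0
        )
        info.append(2.0 - entropy)
    return info


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Log2 odds of each base versus an assumed uniform background."""
    matrix = []
    for column in counts:
        total = sum(column.values()) + pseudocount * len(BASES)
        matrix.append(
            {b: math.log2((column[b] + pseudocount) / total / background) for b in BASES}
        )
    return matrix


def similarity_score(site, matrix):
    """Sum of per-position log-odds weights for a candidate site."""
    return sum(matrix[pos][base] for pos, base in enumerate(site))


if __name__ == "__main__":
    toy_sites = ["TATAAT", "TATGAT", "TACAAT", "TATAAT"]  # hypothetical sites
    counts = count_matrix(toy_sites)
    weights = log_odds_matrix(counts)
    print([round(v, 2) for v in information_content(counts)])
    print(round(similarity_score("TATAAT", weights), 2))
```

Scores of this kind are what the abstract correlates with the logarithm of binding and functional data; the exact scoring function used in the paper may differ from the log-odds form assumed here.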
Affiliation(s)
- J A Goodrich
- Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213
|
36
|
Bradshaw HD, Traxler BA, Minkley EG, Nester EW, Gordon MP. Nucleotide sequence of the traI (helicase I) gene from the sex factor F. J Bacteriol 1990; 172:4127-31. [PMID: 2163400 PMCID: PMC213404 DOI: 10.1128/jb.172.7.4127-4131.1990] [Citation(s) in RCA: 24] [Impact Index Per Article: 0.7] [Indexed: 12/30/2022] Open
Abstract
A 6.9-kilobase region of the Escherichia coli F plasmid containing the 3' half of the traD gene and the entire traI gene (which encodes the TraI protein, DNA helicase I, and TraI*, a polypeptide arising from an internal in-frame translational start in traI) has been sequenced. A previously unidentified open reading frame (tentatively trbH) lies between traD and traI.
Affiliation(s)
- H D Bradshaw
- Department of Biochemistry, University of Washington, Seattle 98195
|
37
|
Morrison DA, Jaurin B. Streptococcus pneumoniae possesses canonical Escherichia coli (sigma 70) promoters. Mol Microbiol 1990; 4:1143-52. [PMID: 2233251 DOI: 10.1111/j.1365-2958.1990.tb00689.x] [Citation(s) in RCA: 38] [Impact Index Per Article: 1.1] [Indexed: 12/30/2022]
Abstract
Seventeen DNA fragments from Streptococcus pneumoniae were randomly cloned in Escherichia coli with selection for promoter activity. The fragments were sequenced and the promoter locations were determined by primer extension analysis. Examination for sites similar to the E. coli major consensus promoter sequence revealed such a site in each of the seventeen fragments, located five to eight base pairs upstream of the point at which transcription was initiated in the E. coli host. Thus, the abundance of promoter activity found in pneumococcal DNA cloned in E. coli hosts arises primarily from sigma-70-type promoter structures. Combined with the observation that such sequences are usually found just upstream of, but not within, pneumococcal genes, this implies that one class (perhaps the major class) of pneumococcal promoters closely resembles the canonical E. coli promoter consensus.
Affiliation(s)
- D A Morrison
- Department of Biological Sciences, University of Illinois, Chicago 60680
|
38
|
Bucher P. Weight matrix descriptions of four eukaryotic RNA polymerase II promoter elements derived from 502 unrelated promoter sequences. J Mol Biol 1990; 212:563-78. [PMID: 2329577 DOI: 10.1016/0022-2836(90)90223-9] [Citation(s) in RCA: 818] [Impact Index Per Article: 23.4] [Indexed: 12/31/2022]
Abstract
Optimized weight matrices defining four major eukaryotic promoter elements, the TATA-box, cap signal, CCAAT-, and GC-box, are presented; they were derived by comparative sequence analysis of 502 unrelated RNA polymerase II promoter regions. The new TATA-box and cap signal descriptions differ in several respects from the only hitherto available base frequency tables. The CCAAT-box matrix, obtained with no prior assumption but CCAAT being the core of the motif, reflects precisely the sequence specificity of the recently discovered nuclear factor NY-I/CP1 but does not include typical recognition sequences of two other purported CCAAT-binding proteins, CTF and CBP. The GC-box description is longer than the previously proposed consensus sequences but is consistent with Sp1 protein-DNA binding data. The notion of a CACCC element distinct from the GC-box seems not to be justified any longer in view of the new weight matrix. Unlike the two fixed-distance elements, neither the CCAAT- nor the GC-box occurs at significantly high frequency in the upstream regions of non-vertebrate genes. Preliminary attempts to predict promoters with the aid of the new signal descriptions were unexpectedly successful. The new TATA-box matrix locates eukaryotic transcription initiation sites as reliably as do the best currently available methods to map Escherichia coli promoters. This analysis was made possible by the recently established Eukaryotic Promoter Database (EPD) of the EMBL Nucleotide Sequence Data Library. In order to derive the weight matrices, a novel algorithm has been devised that is generally applicable to sequence motifs positionally correlated with a biologically defined position in the sequences. The signal must be sufficiently over-represented in a particular region relative to the given site, but need not be present in all members of the input sequence collection. The algorithm iteratively redefines the set of putative motif representatives from which a weight matrix is derived, so as to maximize a quantitative measure of local over-representation, an optimization criterion that naturally combines structural and positional constancy. A comprehensive description of the technique is presented in Methods and Data.
Affiliation(s)
- P Bucher
- Department of Polymer Research, Weizmann Institute of Science, Rehovot, Israel
|
39
|
Alexandrov NN, Mironov AA. Application of a new method of pattern recognition in DNA sequence analysis: a study of E. coli promoters. Nucleic Acids Res 1990; 18:1847-52. [PMID: 2186368 PMCID: PMC330605 DOI: 10.1093/nar/18.7.1847] [Citation(s) in RCA: 14] [Impact Index Per Article: 0.4] [Indexed: 12/30/2022]
Abstract
An algorithm from the pattern recognition theory 'generalized portrait' was used to find a distinguishing vector (scoring matrix) for E. coli promoters. We have attempted to solve three closely linked problems: (i) the selection of significant features of the signal; (ii) subsequent multiple alignment and (iii) calculation of the vector coordinates. Promoters with known strength have been successfully ranked in the correct order using this vector. We demonstrate the use of this method in predicting the location of promoters. A revised consensus promoter sequence is also presented.
|
40
|
Senapathy P, Shapiro MB, Harris NL. Splice junctions, branch point sites, and exons: sequence statistics, identification, and applications to genome project. Methods Enzymol 1990; 183:252-78. [PMID: 2314278 DOI: 10.1016/0076-6879(90)83018-5] [Citation(s) in RCA: 525] [Impact Index Per Article: 15.0] [Indexed: 12/31/2022]
|
41
|
Aiba H, Hanamura A, Tobe T. Semisynthetic promoters activated by cyclic AMP receptor protein of Escherichia coli. Gene 1989; 85:91-7. [PMID: 2559880 DOI: 10.1016/0378-1119(89)90468-x] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.4] [Indexed: 01/01/2023]
Abstract
Semisynthetic promoters activated by Escherichia coli cyclic AMP receptor protein (CRP) were created by combining a synthetic CRP-binding site (crb) and nucleotide sequences derived from cryptic promoter regions. A 22-bp oligodeoxyribonucleotide corresponding to an idealized crb was randomly placed into DNA regions that precede a promoterless lacZ gene on a plasmid. Several plasmid clones were obtained which allowed the expression of lacZ in crp+ cya+ cells carrying a chromosomal deletion of lac genes. The beta-galactosidase and the quantitative S1-nuclease assays of crp+ and delta crp cells harboring these plasmids indicated that the transcription from newly created promoters is dependent on CRP. Sequence analysis revealed that these promoters are divided into two types based on the location of the crb relative to the transcription start point (tsp). The distance from the center of the crb to the tsp is 70 bp in the first type and 38 bp in the second type. The sequences of all these promoters exhibit poor homology with the consensus promoter sequence.
Affiliation(s)
- H Aiba
- Department of Chemistry, University of Tsukuba, Ibaraki, Japan
|
42
|
Abstract
An Escherichia coli gene, which complements two independent hemA mutants of E. coli, has been cloned onto a multi-copy plasmid and both its strands have been sequenced. Both complemented mutants produce 5-aminolevulinic acid (ALA) and display fluorescence after 24 h. The cloned sequence appears to encode a 46-kDa protein, which when produced in the maxicell procedure is processed to a 41-kDa protein as determined by sodium dodecyl sulfate-polyacrylamide-gel electrophoresis. The amino acid sequence of the cloned gene product shows no significant homologies with any cloned ALA synthase, nor with any protein, in two E. coli databanks. A second cloned gene fragment, which has its coding region 34 bp away from the coding region of the gene that complements hemA, has been identified as part of protein release factor 1 (RF1), thus confirming the location of hemA at min 26.7 and mapping it precisely near RF1. We have shown that E. coli utilizes the intact five-carbon chain of glutamate for the synthesis of ALA [Li et al., J Bacteriol. 171 (1989b) 2547-2552].
Affiliation(s)
- J M Li
- Department of Biochemistry Biology City College, City University of New York, NY 10031
|
43
|
Rozkot F, Sázelová P, Pivec L. A novel method for promoter search enhanced by function-specific subgrouping of promoters--developed and tested on E.coli system. Nucleic Acids Res 1989; 17:4799-815. [PMID: 2664710 PMCID: PMC318033 DOI: 10.1093/nar/17.12.4799] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023]
Abstract
A new method for evaluating some complex characteristics of the primary structure of E.coli promoters is proposed. The method, of nonparametric statistical significance, selects important conserved single-base positions in combination with 2-base coupling relations of identity and complementarity. The extended consensus of promoter characteristics thus obtained was used to scan unknown sequences for similarity with E.coli promoters. In terms of this method, a complete set of 244 E.coli promoters was shown to be structurally inconsistent. The set was then broken down into functionally homogeneous subsets of promoters to enhance the selectivity of the search for E.coli-specific promoter sequences, with a high significance level being attained.
Affiliation(s)
- F Rozkot
- Institute of Molecular Genetics, Czechoslovak Academy of Sciences, Prague
|
44
|
Lukashin AV, Anshelevich VV, Amirikyan BR, Gragerov AI, Frank-Kamenetskii MD. Neural network models for promoter recognition. J Biomol Struct Dyn 1989; 6:1123-33. [PMID: 2818859 DOI: 10.1080/07391102.1989.10506540] [Citation(s) in RCA: 40] [Impact Index Per Article: 1.1] [Indexed: 01/02/2023]
Abstract
The problem of recognition of promoter sites in the DNA sequence has been treated with models of learning neural networks. The maximum network capacity admissible for this problem has been estimated on the basis of the total of experimental data available on the determined promoter sequences. The model of a block neural network has been constructed to satisfy this estimate and rules have been elaborated for its learning and testing. The learning process involves a small (of the order of 10%) part of the total set of promoter sequences. During this procedure the neural network develops a system of distinctive features (key words) to be used as a reference in identifying promoters against the background of random sequences. The learning quality is then tested with the whole set. The efficiency of promoter recognition has been found to amount to 94 to 99%. The probability of an arbitrary sequence being identified as a promoter is 2 to 6%.
Affiliation(s)
- A V Lukashin
- Institute of Molecular Genetics, USSR Academy of Sciences, Moscow
|
45
|
Rothmel RK, LeClerc JE. Mutational analysis of the lac regulatory region: second-site changes that activate mutant promoters. Nucleic Acids Res 1989; 17:3909-25. [PMID: 2660105 PMCID: PMC317869 DOI: 10.1093/nar/17.10.3909] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.2] [Indexed: 01/02/2023]
Abstract
Second-site mutations that restored activity to severe lacP1 down-promoter mutants were isolated. This was accomplished by using a bacteriophage f1 vector containing a fusion of the mutant E. coli lac promoters with the structural gene for chloramphenicol acetyltransferase (CAT), so that a system was provided for selecting phage revertants (or pseudorevertants) that conferred resistance of phage-infected cells to chloramphenicol. Among the second-site changes that relieved defects in mutant lac promoters, the only one that restored lacP1 activity was a T→G substitution at position -14, a weakly conserved site in E. coli promoters. Three other sequence changes, G→A at -2, A→T at +1, and C→A at +10, activated nascent promoters in the lac regulatory region. The nascent promoters conformed to the consensus rule, that activity is gained by sequence changes toward homology with consensus sequences at the -35 and -10 regions of the promoter. However, the relative activities of some promoters cannot be explained solely by consideration of their conserved sequence elements.
Affiliation(s)
- R K Rothmel
- Department of Biochemistry, University of Rochester School of Medicine and Dentistry, NY 14642
|
46
|
O'Neill MC. Consensus methods for finding and ranking DNA binding sites. Application to Escherichia coli promoters. J Mol Biol 1989; 207:301-10. [PMID: 2666673 DOI: 10.1016/0022-2836(89)90256-8] [Citation(s) in RCA: 45] [Impact Index Per Article: 1.3] [Indexed: 01/02/2023]
Abstract
There have been many different approaches employed to define the "consensus" sequence of various DNA binding sites and to use the definition obtained to locate and rank members of a given sequence family. The analysis presented here enlists two of these approaches, each in modified form, to develop a highly efficient search protocol for Escherichia coli promoters and to provide a relative ranking of these sites showing good agreement with in vitro measurements of promoter strength. Schneider et al. have applied Shannon's index of information content to evaluate the significance of each position within the consensus of a family of aligned sequences. In a formal sense, this index is only applicable to a group of sequences, providing at each position a negative entropy value between zero (random) and two bits (total conservation of a single base) for sequences in which all bases are equally represented. A method for evaluating how well an individual sequence conforms to the information content pattern of the consensus is described. A function is derived, by analogy to the information content of the sequence family, for application to individual sequences. Since this function is a measure of conformity, it can be used in a search protocol to identify new members of the family represented by the consensus. A protocol for locating E. coli promoters is presented. The Berg-von Hippel statistical-mechanical function is also tested in a similar application. While the information content function provides a superior search protocol, the Berg-von Hippel function, when scaled at each position by the information content, does well at ranking promoters according to their strength as measured in vitro.
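The per-position information content mentioned in this abstract, ranging from zero bits (random) to two bits (one base fully conserved) when all bases are equally represented in the background, can be illustrated with a short sketch. It assumes a uniform background and omits any small-sample correction; the function name `information_content` and the example counts are illustrative, not taken from the paper.

```python
import math

def information_content(column_counts):
    """Information in bits at one aligned position: 2 minus the Shannon entropy.

    Assumes the four bases are equally likely in the background, so the
    maximum entropy is log2(4) = 2 bits. No small-sample correction is applied.
    """
    total = sum(column_counts.values())
    entropy = 0.0
    for count in column_counts.values():
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return 2.0 - entropy

# Illustrative columns: fully conserved, evenly split, and half A / half T.
conserved = information_content({"A": 8, "C": 0, "G": 0, "T": 0})
uniform = information_content({"A": 2, "C": 2, "G": 2, "T": 2})
two_way = information_content({"A": 4, "C": 0, "G": 0, "T": 4})
```

Here a column split evenly between two bases carries one bit, halfway between the two extremes named in the abstract.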
Affiliation(s)
- M C O'Neill
- Department of Biological Sciences, University of Maryland, Baltimore, MD 21228
|
47
|
Karjalainen TK, Evans DG, So M, Lee CH. Molecular cloning and nucleotide sequence of the colonization factor antigen I gene of Escherichia coli. Infect Immun 1989; 57:1126-30. [PMID: 2564374 PMCID: PMC313240 DOI: 10.1128/iai.57.4.1126-1130.1989] [Citation(s) in RCA: 36] [Impact Index Per Article: 1.0] [Indexed: 01/01/2023]
Abstract
The colonization factor antigen I (CFA/I) gene has been isolated and sequenced. The amino acid sequence of CFA/I deduced from the nucleotide sequence is composed of 170 amino acids. The first 23 amino acids are considered to be the signal peptide of the CFA/I protein since they are not present in the protein sequence. Among the remaining amino acids, only two are different from the protein sequence: amino acid position 76 is an aspartic acid instead of an asparagine, and position 97 is a serine instead of an alanine. The CFA/I gene has a typical Shine-Dalgarno sequence located 10 base pairs (bp) upstream from the initiation codon. The sequence TACAAT located 48 bp upstream from the initiation codon was tentatively designated the -10 sequence of the CFA/I gene promoter. No sequence homologous to the consensus -35 promoter sequence was found. A pair of inverted repeat sequences followed by a stretch of eight A's is located 45 bp downstream from the termination codon of the CFA/I gene; this region may be a rho-independent transcriptional terminator.
Affiliation(s)
- T K Karjalainen
- Department of Pathology, Indiana University School of Medicine, Indianapolis 46223
|
48
|
|
49
|
Abstract
The Escherichia coli hemB gene, which encodes 5-aminolevulinic acid dehydratase and had been cloned into the multicopy plasmid pTZ18U, was sequenced. The hemB insert was double-digested with restriction enzymes and recloned back into pTZ18U and pTZ19U to allow for sequencing in two directions. In a second procedure, used to fill in gaps and to confirm the sequence derived from the first procedure, the whole insert was cloned into M13 phages. A nested set of deletions was constructed and recloned into M13. Both the double-digested fragments cloned into plasmids pTZ18U and pTZ19U and the overlapping fragments contained in M13 phages were sequenced using the dideoxy procedure with [35S]dATP. Computer software was used to identify coding regions and the correct reading frame. Two promoter regions, two Shine-Dalgarno sequences and two possible start sites were identified. Extensive homologies with yeast (36%), human liver (40%) and rat liver (40%) amino-acid (aa) sequences were observed, especially in the 16-aa Zn-binding region (75%) and the 4 aa surrounding the essential lysine at the active site (100% for rat and human proteins). Computer analysis of promoter strength and two independent analyses of codon usage indicated that the hemB gene is moderately expressed.
Affiliation(s)
- J M Li
- Department of Biochemistry, City College, City University of New York
|
50
|
Oliphant AR, Struhl K. Defining the consensus sequences of E.coli promoter elements by random selection. Nucleic Acids Res 1988; 16:7673-83. [PMID: 3045761 PMCID: PMC338434 DOI: 10.1093/nar/16.15.7673] [Citation(s) in RCA: 56] [Impact Index Per Article: 1.5] [Indexed: 01/03/2023]
Abstract
The consensus sequence of E.coli promoter elements was determined by the method of random selection. A large collection of hybrid molecules was produced in which random-sequence oligonucleotides were cloned in place of a wild-type promoter element, and functional -10 and -35 E.coli promoter elements were obtained by a genetic selection involving the expression of a structural gene. The DNA sequences and relative levels of function for -10 and -35 elements were determined. The consensus sequences determined by this approach are very similar to those determined by comparing DNA sequences of naturally occurring E.coli promoters. However, no strong correlation is observed between similarity to the consensus and relative level of function. The results are considered in terms of E.coli promoter function and of the general applicability of the random selection method.
Affiliation(s)
- A R Oliphant
- Department of Biological Chemistry, Harvard Medical School, Boston, MA 02115
|