126
Kazakov AE, Cipriano MJ, Novichkov PS, Minovitsky S, Vinogradov DV, Arkin A, Mironov AA, Gelfand MS, Dubchak I. RegTransBase--a database of regulatory sequences and interactions in a wide range of prokaryotic genomes. Nucleic Acids Res 2006; 35:D407-12. [PMID: 17142223 PMCID: PMC1669780 DOI: 10.1093/nar/gkl865] [Citation(s) in RCA: 76] [Impact Index Per Article: 4.2] [Indexed: 11/25/2022]
Abstract
RegTransBase is a manually curated database of regulatory interactions in prokaryotes that captures the knowledge in public scientific literature using a controlled vocabulary. Although several databases describing interactions between regulatory proteins and their binding sites are already being maintained, they either focus mostly on the model organisms Escherichia coli and Bacillus subtilis or are entirely computationally derived. RegTransBase describes a large number of regulatory interactions reported in many organisms and contains the following types of experimental data: the activation or repression of transcription by an identified direct regulator, determining the transcriptional regulatory function of a protein (or RNA) directly binding to DNA (RNA), mapping or prediction of a binding site for a regulatory protein and characterization of regulatory mutations. Currently, RegTransBase content is derived from about 3000 relevant articles describing over 7000 experiments in relation to 128 microbes. It contains data on the regulation of about 7500 genes and evidence for 6500 interactions with 650 regulators. RegTransBase also contains manually created position weight matrices (PWM) that can be used to identify candidate regulatory sites in over 60 species. RegTransBase is available at .
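As an illustration of how a position weight matrix can be used to flag candidate regulatory sites, here is a minimal Python sketch. It does not reproduce RegTransBase's matrices or scoring scheme: the aligned sites in `TOY_SITES`, the pseudocount, the uniform background, and the threshold are all assumptions chosen only to make the example self-contained.

```python
import math

def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length DNA sites (toy scheme)."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for pos in range(length):
        column = [site[pos] for site in sites]
        scores = {}
        for base in "ACGT":
            freq = (column.count(base) + pseudocount) / total
            scores[base] = math.log2(freq / background)
        pwm.append(scores)
    return pwm

def scan(sequence, pwm, threshold):
    """Return (offset, score) for each window scoring at or above threshold.

    Only the given strand is scanned, and the sequence must contain
    only A, C, G, T (other symbols raise KeyError in this sketch).
    """
    width = len(pwm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(pwm[i][base] for i, base in enumerate(window))
        if score >= threshold:
            hits.append((start, round(score, 3)))
    return hits

# Hypothetical aligned sites, not taken from the database.
TOY_SITES = ["TTGACA", "TTGACT", "TTTACA", "TTGACA"]
PWM = build_pwm(TOY_SITES)
HITS = scan("GGCTTGACAGG", PWM, threshold=5.0)
print(HITS)
```

In practice the threshold would be calibrated on known sites, and both strands of each upstream region would be scanned.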
127
Lyubetskaya AV, Rubanov LI, Gelfand MS. Use of the flux model of amino acid metabolism of Escherichia coli. BIOCHEMISTRY (MOSCOW) 2006; 71:1256-60. [PMID: 17140387 DOI: 10.1134/s0006297906110113] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022]
Abstract
A program implementing a flux model of Escherichia coli metabolism was used to analyze the effects of the addition of amino acids (tryptophan, tyrosine, phenylalanine, leucine, isoleucine, valine, histidine, lysine, threonine, cysteine, methionine, arginine, proline) to minimal medium or media lacking nitrogen, carbon, or both. The overall response of the metabolic system to the addition of various amino acids to the minimal medium is similar. Glycolysis and the synthesis of pyruvate with its subsequent degradation to acetate via acetyl-CoA become more efficient, whereas the fluxes through the pentose phosphate pathway and the TCA cycle decrease. If amino acids are used as the sole source of carbon, nitrogen, or both, the changes in the flux distribution are determined mainly by the carbon limitation. The phosphoenolpyruvate to glucose-6-phosphate flux increases; the flux through the pentose phosphate path is directed towards ribulose-5-phosphate. Other changes are determined by the compounds that are the primary products of catabolism of the added amino acid.
128
Rodionov DA, Gelfand MS, Todd JD, Curson ARJ, Johnston AWB. Computational reconstruction of iron- and manganese-responsive transcriptional networks in alpha-proteobacteria. PLoS Comput Biol 2006; 2:e163. [PMID: 17173478 PMCID: PMC1698941 DOI: 10.1371/journal.pcbi.0020163] [Citation(s) in RCA: 129] [Impact Index Per Article: 7.2] [Received: 06/06/2006] [Accepted: 10/18/2006] [Indexed: 01/08/2023]
Abstract
We used comparative genomics to investigate the distribution of conserved DNA-binding motifs in the regulatory regions of genes involved in iron and manganese homeostasis in alpha-proteobacteria. Combined with other computational approaches, this allowed us to reconstruct the metal regulatory network in more than three dozen species with available genome sequences. We identified several classes of cis-acting regulatory DNA motifs (Irr-boxes or ICEs, RirA-boxes, Iron-Rhodo-boxes, Fur-alpha-boxes, Mur-box or MRS, MntR-box, and IscR-boxes) in regulatory regions of various genes involved in iron and manganese uptake, Fe-S and heme biosynthesis, iron storage, and usage. Despite the different nature of the iron regulons in selected lineages of alpha-proteobacteria, the overall regulatory network is consistent with, and confirmed by, many experimental observations. This study expands the range of genes involved in iron homeostasis and demonstrates considerable interconnection between iron-responsive regulatory systems. The detailed comparative and phylogenetic analyses of the regulatory systems allowed us to propose a theory about the possible evolution of Fe and Mn regulons in alpha-proteobacteria. The main evolutionary event likely occurred in the common ancestor of the Rhizobiales and Rhodobacterales, where the Fur protein switched to regulating manganese transporters (and hence Fur had become Mur). In these lineages, the role of global iron homeostasis was taken by RirA and Irr, two transcriptional regulators that act by sensing the physiological consequence of the metal availability rather than its concentration per se, and thus provide for more flexible regulation. The availability of hundreds of complete genomes allows one to use comparative genomics to describe key metabolic processes and regulatory gene networks. Genome context analyses and comparisons of transcription factor binding sites between genomes offer a powerful approach for functional gene annotation. 
Reconstruction of transcriptional regulatory networks allows for better understanding of cellular processes, which can be substantiated by direct experimentation. Iron homeostasis in bacteria is conferred by the regulation of various iron uptake transporters, iron storage ferritins, and iron-containing enzymes. In high concentrations, iron is poisonous for the cell, so strict control of iron homeostasis is maintained, mostly at the level of transcription by iron-responsive regulators. Despite their general importance, iron regulatory networks in most bacterial species are not well-understood. In this study, Rodionov and colleagues applied comparative genomic approaches to describe the regulatory network formed by genes involved in iron homeostasis in the alpha subclass of proteobacteria, which have extremely versatile lifestyles. These networks are mediated by a set of various DNA motifs (or regulatory signals) that occur in 5′ gene regions and involve at least six different metal-responsive regulators. This study once again shows the power of comparative genomics in the analysis of complex regulatory networks and their evolution.
129
Yang C, Rodionov DA, Li X, Laikova ON, Gelfand MS, Zagnitko OP, Romine MF, Obraztsova AY, Nealson KH, Osterman AL. Comparative Genomics and Experimental Characterization of N-Acetylglucosamine Utilization Pathway of Shewanella oneidensis. J Biol Chem 2006; 281:29872-85. [PMID: 16857666 DOI: 10.1074/jbc.m605052200] [Citation(s) in RCA: 103] [Impact Index Per Article: 5.7] [Indexed: 11/06/2022]
Abstract
We used a comparative genomics approach implemented in the SEED annotation environment to reconstruct the chitin and GlcNAc utilization subsystem and regulatory network in most proteobacteria, including 11 species of Shewanella with completely sequenced genomes. Comparative analysis of candidate regulatory sites allowed us to characterize three different GlcNAc-specific regulons, NagC, NagR, and NagQ, in various proteobacteria and to tentatively assign a number of novel genes with specific functional roles, in particular new GlcNAc-related transport systems, to this subsystem. Genes SO3506 and SO3507, originally annotated as hypothetical in Shewanella oneidensis MR-1, were suggested to encode novel variants of GlcN-6-P deaminase and GlcNAc kinase, respectively. Reconstitution of the GlcNAc catabolic pathway in vitro using these purified recombinant proteins and GlcNAc-6-P deacetylase (SO3505) validated the entire pathway. Kinetic characterization of GlcN-6-P deaminase demonstrated that it is the subject of allosteric activation by GlcNAc-6-P. Consistent with genomic data, all tested Shewanella strains except S. frigidimarina, which lacked representative genes for the GlcNAc metabolism, were capable of utilizing GlcNAc as the sole source of carbon and energy. This study expands the range of carbon substrates utilized by Shewanella spp., unambiguously identifies several genes involved in chitin metabolism, and describes a novel variant of the classical three-step biochemical conversion of GlcNAc to fructose 6-phosphate first described in Escherichia coli.
130
Kalinina OV, Gelfand MS. Amino acid residues that determine functional specificity of NADP- and NAD-dependent isocitrate and isopropylmalate dehydrogenases. Proteins 2006; 64:1001-9. [PMID: 16767773 DOI: 10.1002/prot.21027] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Indexed: 11/12/2022]
Abstract
Isocitrate and isopropylmalate dehydrogenases are homologous enzymes important for cell metabolism. They oxidize their substrates using NAD or NADP as cofactors. Thus, they have two specificities, towards the substrate and the cofactor, appearing in three combinations. Although many three-dimensional (3D) structures are resolved, identification of amino acids determining these specificities remains a challenge. We present computational identification and analysis of specificity-determining positions (SDPs). Besides many experimentally proven SDPs, we predict new SDPs, for example, four substrate-specific positions (103Leu, 105Thr, 337Ala, and 341Thr in IDH from E. coli) that contact the cofactor and may play a role in the recognition process.
131
Rodionov DA, Gelfand MS. Computational identification of BioR, a transcriptional regulator of biotin metabolism in Alphaproteobacteria, and of its binding signal. FEMS Microbiol Lett 2006; 255:102-7. [PMID: 16436068 DOI: 10.1111/j.1574-6968.2005.00070.x] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.4] [Indexed: 01/23/2023]
Abstract
Comparative genomic analysis was applied to identify the biotin transcriptional regulator, BioR, in most Alphaproteobacteria, and to identify its recognition signal TTATMKATAA. BioR belongs to the GntR family of transcriptional repressors. The functional assignment is supported by three lines of evidence: (1) bioR is positionally clustered with various bio genes, both for biotin biosynthesis and transport; (2) in most cases, candidate BioR-binding sites (BIOR boxes) are observed upstream of the bioR genes, suggesting autoregulation; (3) the phyletic distribution of the BIOR boxes coincides exactly with the phyletic distribution of the bioR genes, as the genomes lacking BIOR boxes do not have orthologs of bioR. Thus, in Alphaproteobacteria, BioR seems to have assumed the role of the biotin regulator that in most other bacteria is fulfilled by the dual function biotin-protein ligase BirA having the DNA-binding helix-turn-helix domain.
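The recognition signal TTATMKATAA is written in IUPAC nucleotide codes, where M stands for A or C and K stands for G or T. The following Python sketch shows one way such a consensus could be expanded into a regular expression and matched against a sequence. It is an illustration only, not the procedure used in the study: the helper names and the example sequence are hypothetical, and real site searches typically score matches rather than require exact agreement with the consensus.

```python
import re

# Standard IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}

def consensus_to_regex(consensus):
    """Compile an IUPAC consensus into a regex; the lookahead allows overlapping hits."""
    body = "".join(IUPAC[code] for code in consensus.upper())
    return re.compile("(?=(" + body + "))")

def find_sites(sequence, consensus):
    """Return (offset, matched_text) for every exact consensus match on the given strand."""
    pattern = consensus_to_regex(consensus)
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

BIOR_BOX = "TTATMKATAA"
# Hypothetical upstream fragment, invented for this example.
EXAMPLE_HITS = find_sites("ccgTTATAGATAAgg", BIOR_BOX)
print(EXAMPLE_HITS)
```

Because the consensus TTATMKATAA is its own reverse complement, scanning one strand is enough for exact matches.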
132
Spirin V, Gelfand MS, Mironov AA, Mirny LA. A metabolic network in the evolutionary context: multiscale structure and modularity. Proc Natl Acad Sci U S A 2006; 103:8774-9. [PMID: 16731630 PMCID: PMC1482654 DOI: 10.1073/pnas.0510258103] [Citation(s) in RCA: 68] [Impact Index Per Article: 3.8] [Received: 11/28/2005] [Indexed: 01/09/2023]
Abstract
The enormous complexity of biological networks has led to the suggestion that networks are built of modules that perform particular functions and are "reused" in evolution in a manner similar to reusable domains in protein structures or modules of electronic circuits. Analysis of known biological networks has revealed several modules, many of which have transparent biological functions. However, it remains to be shown that identified structural modules constitute evolutionary building blocks, independent and easily interchangeable units. An alternative possibility is that evolutionary modules do not match structural modules. To investigate the structure of evolutionary modules and their relationship to functional ones, we integrated a metabolic network with evolutionary associations between genes inferred from comparative genomics. The resulting metabolic-genomic network places metabolic pathways into evolutionary and genomic context, thereby revealing previously unknown components and modules. We analyzed the integrated metabolic-genomic network on three levels: macro-, meso-, and microscale. The macroscale level demonstrates strong associations between neighboring enzymes and between enzymes that are distant on the network but belong to the same linear pathway. At the mesoscale level, we identified evolutionary metabolic modules and compared them with traditional metabolic pathways. Although, in some cases, there is almost exact correspondence, some pathways are split into independent modules. On the microscale level, we observed high association of enzyme subunits and weak association of isoenzymes independently catalyzing the same reaction. This study shows that evolutionary modules, rather than pathways, may be thought of as regulatory and functional units in bacterial genomes.
133
Permina EA, Kazakov AE, Kalinina OV, Gelfand MS. Comparative genomics of regulation of heavy metal resistance in Eubacteria. BMC Microbiol 2006; 6:49. [PMID: 16753059 PMCID: PMC1526738 DOI: 10.1186/1471-2180-6-49] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.6] [Received: 09/23/2005] [Accepted: 06/05/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND Heavy metal resistance (HMR) in Eubacteria is regulated by a variety of systems including transcription factors from the MerR family (COG0789). The HMR systems are characterized by a complex signal structure (strong palindrome within a 19 or 20 bp promoter spacer), and usually consist of transporter and regulator genes. Some HMR regulons also include detoxification systems. The number of sequenced bacterial genomes is constantly increasing, and even though HMR regulons of the COG0789 type usually consist of few genes per genome, computational analysis may contribute to the understanding of the cellular systems of metal detoxification. RESULTS We studied the mercury (MerR), copper (CueR and HmrR), cadmium (CadR), lead (PbrR), and zinc (ZntR) resistance systems and demonstrated that, by combining protein sequence analysis with analysis of DNA regulatory signals, it was possible to distinguish metal-dependent members of COG0789, assign specificity towards particular metals to uncharacterized loci, and find new genes involved in metal resistance, in particular, multicopper oxidase and copper chaperones, candidate cytochromes from the copper regulon, new cadmium transporters and, possibly, glutathione-S-transferases. CONCLUSION Our data indicate that the specificity of the COG0789 systems can be determined by combining phylogenetic analysis and identification of DNA regulatory sites. Taking into account signal structure, we can adequately identify genes that are activated using the DNA bending-unbending mechanism. In the case of regulon members that do not reside in single loci, analysis of potential regulatory sites could be crucial for the correct annotation and prediction of the specificity.
134
Gelfand MS. Evolution of transcriptional regulatory networks in microbial genomes. Curr Opin Struct Biol 2006; 16:420-9. [PMID: 16650982 DOI: 10.1016/j.sbi.2006.04.001] [Citation(s) in RCA: 54] [Impact Index Per Article: 3.0] [Received: 01/16/2006] [Revised: 02/08/2006] [Accepted: 04/18/2006] [Indexed: 12/23/2022]
Abstract
Advances in sequencing and generating high-throughput expression data have created a situation in which it is possible to integrate comparative analysis with genome-wide studies of the structure and function of regulatory systems in model organisms. Recent studies have focused on topological properties and the evolution of regulatory networks. This problem can be addressed on several levels: evolution of binding sites upstream of orthologous or duplicated genes; co-evolution of transcription factors and the DNA motifs that they recognize; expansion, contraction and replacement of regulatory systems; the relationship between co-regulation and co-expression; and, finally, construction of evolutionary models that generate networks with realistic properties. This should eventually lead to the creation of a theory of regulatory evolution with a similar level of detail and understanding to the theory of molecular evolution of protein and DNA sequences.
135
Ermakova EO, Nurtdinov RN, Gelfand MS. Fast rate of evolution in alternatively spliced coding regions of mammalian genes. BMC Genomics 2006; 7:84. [PMID: 16620375 PMCID: PMC1459143 DOI: 10.1186/1471-2164-7-84] [Citation(s) in RCA: 35] [Impact Index Per Article: 1.9] [Received: 12/21/2005] [Accepted: 04/18/2006] [Indexed: 11/10/2022]
Abstract
BACKGROUND At least half of mammalian genes are alternatively spliced. Alternative isoforms are often genome-specific and it has been suggested that alternative splicing is one of the major mechanisms for generating protein diversity in the course of evolution. Another way of looking at alternative splicing is to consider sequence evolution of constitutive and alternative regions of protein-coding genes. Indeed, it turns out that constitutive and alternative regions evolve in different ways. RESULTS A set of 3029 orthologous pairs of human and mouse alternatively spliced genes was considered. The rate of nonsynonymous substitutions (dN), the rate of synonymous substitutions (dS), and their ratio (omega = dN/dS) appear to be significantly higher in alternatively spliced coding regions compared to constitutive regions. When N-terminal, internal and C-terminal alternatives are analysed separately, C-terminal alternatives appear to make the main contribution to the observed difference. The effects become even more pronounced in a subset of fast evolving genes. CONCLUSION These results provide evidence of weaker purifying selection and/or stronger positive selection in alternative regions and thus one more confirmation of accelerated evolution in alternative regions. This study corroborates the theory that alternative splicing serves as a testing ground for molecular evolution.
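To make the ratio omega = dN/dS concrete, here is a small Python sketch that computes it from counts of differences and sites. The abstract does not say how dN and dS were estimated, so the Jukes-Cantor correction used here is an assumption made for illustration, as are the function names and the numbers in the example.

```python
import math

def jukes_cantor(p):
    """Correct a proportion of differing sites p for multiple substitutions.

    Assumption: Jukes-Cantor model; valid only for 0 <= p < 0.75.
    """
    if not 0 <= p < 0.75:
        raise ValueError("proportion of differences must be in [0, 0.75)")
    return -0.75 * math.log(1 - 4.0 * p / 3.0)

def omega(nonsyn_diffs, nonsyn_sites, syn_diffs, syn_sites):
    """Return (dN, dS, dN/dS) from counts of differences and sites."""
    dn = jukes_cantor(nonsyn_diffs / nonsyn_sites)
    ds = jukes_cantor(syn_diffs / syn_sites)
    if ds == 0:
        raise ZeroDivisionError("dS is zero; omega is undefined")
    return dn, ds, dn / ds

# Invented counts for a single region, for illustration only.
DN, DS, W = omega(nonsyn_diffs=10, nonsyn_sites=200, syn_diffs=20, syn_sites=100)
print(round(DN, 4), round(DS, 4), round(W, 4))
```

Values of omega below 1 are conventionally read as purifying selection, so a higher omega in alternative regions than in constitutive ones would be consistent with the weaker purifying selection and/or stronger positive selection described in the abstract.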
136
Rodionov DA, Hebbeln P, Gelfand MS, Eitinger T. Comparative and functional genomic analysis of prokaryotic nickel and cobalt uptake transporters: evidence for a novel group of ATP-binding cassette transporters. J Bacteriol 2006; 188:317-27. [PMID: 16352848 PMCID: PMC1317602 DOI: 10.1128/jb.188.1.317-327.2006] [Citation(s) in RCA: 206] [Impact Index Per Article: 11.4] [Indexed: 12/29/2022]
Abstract
The transition metals nickel and cobalt, essential components of many enzymes, are taken up by specific transport systems of several different types. We integrated in silico and in vivo methods for the analysis of various protein families containing both nickel and cobalt transport systems in prokaryotes. For functional annotation of genes, we used two comparative genomic approaches: identification of regulatory signals and analysis of the genomic positions of genes encoding candidate nickel/cobalt transporters. The nickel-responsive repressor NikR regulates many nickel uptake systems, though the NikR-binding signal is divergent in various taxonomic groups of bacteria and archaea. B(12) riboswitches regulate most of the candidate cobalt transporters in bacteria. The nickel/cobalt transporter genes are often colocalized with genes for nickel-dependent or coenzyme B(12) biosynthesis enzymes. Nickel/cobalt transporters of different families, including the previously known NiCoT, UreH, and HupE/UreJ families of secondary systems and the NikABCDE ABC-type transporters, showed a mosaic distribution in prokaryotic genomes. In silico analyses identified CbiMNQO and NikMNQO as the most widespread groups of microbial transporters for cobalt and nickel ions. These unusual uptake systems contain an ABC protein (CbiO or NikO) but lack an extracytoplasmic solute-binding protein. Experimental analysis confirmed metal transport activity for three members of this family and demonstrated significant activity for a basic module (CbiMN) of the Salmonella enterica serovar Typhimurium transporter.
137
Malko DB, Makeev VJ, Mironov AA, Gelfand MS. Evolution of exon-intron structure and alternative splicing in fruit flies and malarial mosquito genomes. Genome Res 2006; 16:505-9. [PMID: 16520458 PMCID: PMC1457027 DOI: 10.1101/gr.4236606] [Citation(s) in RCA: 40] [Impact Index Per Article: 2.2] [Indexed: 01/16/2023]
Abstract
Comparative analysis of alternative splicing of orthologous genes from fruit flies (Drosophila melanogaster and Drosophila pseudoobscura) and mosquito (Anopheles gambiae) demonstrated that both in the fruit fly genes and in fruit fly-mosquito comparisons, constitutive exons and splicing sites are more conserved than alternative ones. While >97% of constitutive D. melanogaster exons are conserved in D. pseudoobscura, only approximately 80% of alternative exons are conserved. Similarly, 77% of constitutive fruit fly exons are conserved in the mosquito genes, compared with <50% of alternative exons. Internal alternatives are more conserved than terminal ones. Retained introns are the least conserved, alternative acceptor sites are slightly more conserved than donor sites, and mutually exclusive exons are almost as conserved as constitutive exons. Cassette and mutually exclusive exons experience almost no intron insertions. We also observed cases of interconversion of various elementary alternatives, e.g., transformation of cassette exons into alternative sites. These results agree with the observations made earlier in human-mouse comparisons and demonstrate that the phenomenon of relatively low conservation of alternatively spliced regions may be universal, as it has been observed in different taxonomic groups (mammals and insects) and at various evolutionary distances.
138
Gerasimova AV, Gelfand MS. Evolution of the NadR regulon in Enterobacteriaceae. J Bioinform Comput Biol 2005; 3:1007-19. [PMID: 16078372 DOI: 10.1142/s0219720005001387] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.1] [Received: 02/05/2005] [Revised: 02/18/2005] [Accepted: 02/24/2005] [Indexed: 12/23/2022]
Abstract
The NAD biosynthetic pathway and NAD transformations in E. coli and S. typhi are well characterized. Using comparative genomics methods, we describe the NadR regulon in other Enterobacteriaceae, identify new candidate regulon members, and demonstrate that even a very simple regulon covering an essential metabolic pathway can differ between closely related genomes.
139
Neverov AD, Artamonova II, Nurtdinov RN, Frishman D, Gelfand MS, Mironov AA. Alternative splicing and protein function. BMC Bioinformatics 2005; 6:266. [PMID: 16274476 PMCID: PMC1298288 DOI: 10.1186/1471-2105-6-266] [Citation(s) in RCA: 38] [Impact Index Per Article: 2.0] [Received: 06/15/2005] [Accepted: 11/07/2005] [Indexed: 11/16/2022]
Abstract
Background Alternative splicing is a major mechanism of generating protein diversity in higher eukaryotes. Although at least half, and probably more, of mammalian genes are alternatively spliced, it was not clear whether the frequency of alternative splicing is the same in different functional categories. The problem is obscured by uneven coverage of genes by ESTs and a large number of artifacts in the EST data. Results We have developed a method that generates possible mRNA isoforms for human genes contained in the EDAS database, taking into account the effects of nonsense-mediated decay and translation initiation rules, and a procedure for offsetting the effects of uneven EST coverage. Then we computed the number of mRNA isoforms for genes from different functional categories. Genes encoding ribosomal proteins and genes in the category "Small GTPase-mediated signal transduction" tend to have fewer isoforms than the average, whereas the genes in the category "DNA replication and chromosome cycle" have more isoforms than the average. Genes encoding proteins involved in protein-protein interactions tend to be alternatively spliced more often than genes encoding non-interacting proteins, although there is no significant difference in the number of isoforms of alternatively spliced genes. Conclusion Filtering for functional isoforms satisfying biological constraints and accounting for uneven EST coverage allowed us to describe differences in alternative splicing of genes from different functional categories. The observations seem to be consistent with expectations based on current biological knowledge: fewer isoforms for ribosomal and signal transduction proteins, and more alternative splicing of interacting and cell cycle proteins.
140
Fededa JP, Petrillo E, Gelfand MS, Neverov AD, Kadener S, Nogués G, Pelisch F, Baralle FE, Muro AF, Kornblihtt AR. A polar mechanism coordinates different regions of alternative splicing within a single gene. Mol Cell 2005; 19:393-404. [PMID: 16061185 DOI: 10.1016/j.molcel.2005.06.035] [Citation(s) in RCA: 54] [Impact Index Per Article: 2.8] [Received: 12/21/2004] [Revised: 03/14/2005] [Accepted: 06/30/2005] [Indexed: 11/20/2022]
Abstract
Alternative splicing plays a key role in generating protein diversity. Transfections with minigenes revealed coordination between two distant, alternatively spliced exons in the same gene. Mutations that either inhibit or stimulate inclusion of the upstream alternative exon deeply affect inclusion of the downstream one. However, similar mutations at the downstream alternative exon have little effect on the upstream one. This polar effect is promoter specific and is enhanced by inhibition of transcriptional elongation. Consistently, cells from mutant mice with either constitutive or null inclusion of a fibronectin alternative exon revealed coordination with a second alternative splicing region, located far downstream. Using allele-specific RT-PCR, we demonstrate that this coordination occurs in cis and is also affected by transcriptional elongation rates. Bioinformatics supports the generality of these findings, indicating that 25% of human genes contain multiple alternative splicing regions and identifying several genes with nonrandom distribution of mRNA isoforms at two alternative regions.
141
Rodionov DA, Dubchak IL, Arkin AP, Alm EJ, Gelfand MS. Dissimilatory metabolism of nitrogen oxides in bacteria: comparative reconstruction of transcriptional networks. PLoS Comput Biol 2005; 1:e55. [PMID: 16261196 PMCID: PMC1274295 DOI: 10.1371/journal.pcbi.0010055] [Citation(s) in RCA: 225] [Impact Index Per Article: 11.8] [Received: 03/02/2005] [Accepted: 09/29/2005] [Indexed: 12/30/2022]
Abstract
Bacterial response to nitric oxide (NO) is of major importance since NO is an obligatory intermediate of the nitrogen cycle. Transcriptional regulation of the dissimilatory nitric oxides metabolism in bacteria is diverse and involves FNR-like transcription factors HcpR, DNR, and NnrR; two-component systems NarXL and NarQP; NO-responsive activator NorR; and nitrite-sensitive repressor NsrR. Using comparative genomics approaches, we predict DNA-binding motifs for these transcriptional factors and describe corresponding regulons in available bacterial genomes. Within the FNR family of regulators, we observed a correlation of two specificity-determining amino acids and contacting bases in corresponding DNA recognition motif. Highly conserved regulon HcpR for the hybrid cluster protein and some other redox enzymes is present in diverse anaerobic bacteria, including Clostridia, Thermotogales, and delta-proteobacteria. NnrR and DNR control denitrification in alpha- and beta-proteobacteria, respectively. Sigma-54-dependent NorR regulon found in some gamma- and beta-proteobacteria contains various enzymes involved in the NO detoxification. Repressor NsrR, which was previously known to control only nitrite reductase operon in Nitrosomonas spp., appears to be the master regulator of the nitric oxides' metabolism, not only in most gamma- and beta-proteobacteria (including well-studied species such as Escherichia coli), but also in Gram-positive Bacillus and Streptomyces species. Positional analysis and comparison of regulatory regions of NO detoxification genes allows us to propose the candidate NsrR-binding motif. The most conserved member of the predicted NsrR regulon is the NO-detoxifying flavohemoglobin Hmp. In enterobacteria, the regulon also includes two nitrite-responsive loci, nipAB (hcp-hcr) and nipC (dnrN), thus confirming the identity of the effector, i.e. nitrite. 
The proposed NsrR regulons in Neisseria and some other species are extended to include denitrification genes. As a result, we demonstrate considerable interconnection between various nitrogen-oxides-responsive regulatory systems for the denitrification and NO detoxification genes and evolutionary plasticity of this transcriptional network. Comparative genomics is the analysis and comparison of genomes from different species. More than 100 complete genomes of bacteria are now available. Comparative analysis of binding sites for transcriptional regulators is a powerful approach for functional gene annotation. Knowledge of transcriptional regulatory networks is essential for understanding cellular processes in bacteria. The global nitrogen cycle includes interconversion of nitrogen oxides between a number of redox states. Despite the importance of bacterial nitrogen oxides' metabolism for ecology and medicine, our understanding of their regulation is limited. In this study, the researchers have applied comparative genomic approaches to describe a regulatory network of genes involved in the nitrogen oxides' metabolism in bacteria. The described regulatory network involves five nitric oxide-responsive transcription factors with different DNA recognition motifs. Different combinations of these regulators appear to regulate expression of dozens of genes involved in nitric oxide detoxification and denitrification. The reconstructed network demonstrates considerable interconnection and evolutionary plasticity. Not only are genes shuffled between regulons in different genomes, but there is also considerable interaction between regulators. Overall, the system seems to be quite conserved; however, many regulatory interactions in the identified core regulatory network are taxon-specific. This study demonstrates the power of comparative genomics in the analysis of complex regulatory networks and their evolution.
|
142
|
Seliverstov AV, Putzer H, Gelfand MS, Lyubetsky VA. Comparative analysis of RNA regulatory elements of amino acid metabolism genes in Actinobacteria. BMC Microbiol 2005; 5:54. [PMID: 16202131 PMCID: PMC1262725 DOI: 10.1186/1471-2180-5-54] [Citation(s) in RCA: 34] [Impact Index Per Article: 1.8] [Received: 04/01/2005] [Accepted: 10/03/2005] [Indexed: 01/25/2023] Open
Abstract
Background: Formation of alternative structures in mRNA in response to external stimuli, either direct or mediated by proteins or other RNAs, is a major mechanism of regulation of gene expression in bacteria. This mechanism has been studied in detail using experimental and computational approaches in proteobacteria and Firmicutes, but not in other groups of bacteria.
Results: Comparative analysis of amino acid biosynthesis operons in Actinobacteria resulted in identification of conserved regions upstream of several operons. Classical attenuators were predicted upstream of trp operons in Corynebacterium spp. and Streptomyces spp., and trpS and leuS genes in some Streptomyces spp. Candidate leader peptides with terminators were observed upstream of ilvB genes in Corynebacterium spp., Mycobacterium spp. and Streptomyces spp. Candidate leader peptides without obvious terminators were found upstream of cys operons in Mycobacterium spp. and several other species. A conserved pseudoknot (named LEU element) was identified upstream of leuA operons in most Actinobacteria. Finally, T-boxes likely involved in the regulation of translation initiation were observed upstream of ileS genes from several Actinobacteria.
Conclusion: The metabolism of tryptophan, cysteine and leucine in Actinobacteria seems to be regulated on the RNA level. In some cases the mechanism is classical attenuation, but in many cases some components of attenuators are missing. The most interesting case seems to be the leuA operon preceded by the LEU element that may fold into a conserved pseudoknot or an alternative structure. A LEU element has been observed in a transposase gene from Bifidobacterium longum, but it is not conserved in genes encoding closely related transposases despite a very high level of protein similarity. One possibility is that the regulatory region of the leuA has been co-opted from some element involved in transposition. Analysis of phylogenetic patterns allowed for identification of ML1624 of M. leprae and its orthologs as the candidate regulatory proteins that may bind to the LEU element. T-boxes upstream of the ileS genes are unusual, as their regulatory mechanism seems to be inhibition of translation initiation via a hairpin sequestering the Shine-Dalgarno box.
|
143
|
Oparina NJ, Kalinina OV, Gelfand MS, Kisselev LL. Common and specific amino acid residues in the prokaryotic polypeptide release factors RF1 and RF2: possible functional implications. Nucleic Acids Res 2005; 33:5226-34. [PMID: 16162810 PMCID: PMC1214553 DOI: 10.1093/nar/gki841] [Citation(s) in RCA: 13] [Impact Index Per Article: 0.7] [Indexed: 11/12/2022] Open
Abstract
Termination of protein synthesis is promoted in ribosomes by proper stop codon discrimination by class 1 polypeptide release factors (RFs). A large set of prokaryotic RFs differing in stop codon specificity, RF1 for UAG and UAA, and RF2 for UGA and UAA, was analyzed by means of a recently developed computational method allowing identification of the specificity-determining positions (SDPs) in families composed of proteins with similar but not identical function. Fifteen SDPs were identified within the RF1/2 superdomain II/IV known to be implicated in stop codon decoding. Three of these SDPs had particularly high scores. Five residues invariant for RF1 and RF2 [invariant amino acid residues (IRs)] were spatially clustered with the highest-scoring SDPs that in turn were located in two zones within the SDP/IR area. Zone 1 (domain II) included PxT and SPF motifs identified earlier by others as ‘discriminator tripeptides’. We suggest that IRs in this zone take part in the recognition of U, the first base of all stop codons. Zone 2 (domain IV) possessed two SDPs with the highest scores not identified earlier. Presumably, they also take part in stop codon binding and discrimination. Elucidation of potential functional role(s) of the newly identified SDP/IR zones requires further experiments.
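One way to picture a specificity-determining position is as an alignment column whose residue predicts which subfamily (RF1 or RF2) a sequence belongs to. The sketch below scores each column by the mutual information between residue and group label. It is an illustration only: the toy sequences, the function name `column_mutual_information`, and the scoring rule are assumptions made here and may differ from the method used in the paper.

```python
from collections import Counter
from math import log2


def column_mutual_information(column, groups):
    """Mutual information (in bits) between residues in one alignment
    column and the group label of each sequence."""
    n = len(column)
    joint = Counter(zip(column, groups))
    residue_counts = Counter(column)
    group_counts = Counter(groups)
    mi = 0.0
    for (residue, group), count in joint.items():
        p_joint = count / n
        p_residue = residue_counts[residue] / n
        p_group = group_counts[group] / n
        mi += p_joint * log2(p_joint / (p_residue * p_group))
    return mi


# Hypothetical toy alignment: three "RF1-like" and three "RF2-like" rows.
# Columns 0-2 mimic the PxT / SPF tripeptides; column 3 is invariant.
alignment = ["PATG", "PVTG", "PETG", "SPFG", "SPFG", "SPFG"]
groups = ["RF1", "RF1", "RF1", "RF2", "RF2", "RF2"]

scores = [
    column_mutual_information([seq[i] for seq in alignment], groups)
    for i in range(len(alignment[0]))
]
for position, value in enumerate(scores):
    print(position, round(value, 3))
```

With two equally sized groups, a column that separates the groups perfectly scores close to 1 bit, while an invariant column scores 0, which matches the distinction the abstract draws between SDPs and invariant residues.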
|
144
|
Rodionov DA, Gelfand MS. Identification of a bacterial regulatory system for ribonucleotide reductases by phylogenetic profiling. Trends Genet 2005; 21:385-9. [PMID: 15949864 DOI: 10.1016/j.tig.2005.05.011] [Citation(s) in RCA: 65] [Impact Index Per Article: 3.4] [Received: 11/04/2004] [Revised: 02/16/2005] [Accepted: 05/10/2005] [Indexed: 11/29/2022]
Abstract
Using comparative genomics approaches, we analyzed the regulation of ribonucleotide reductase genes in bacterial genomes. A highly conserved palindromic signal with consensus acaCwAtATaTwGtg, named NrdR-box, was identified upstream of most operons encoding ribonucleotide reductases from three different classes. By correlating the occurrence of NrdR-boxes with phylogenetic distribution of ortholog families, we identified a transcriptional regulator containing Zn-ribbon and ATP-cone motifs (COG1327) for the predicted ribonucleotide reductase regulon. Further characterization of the regulon and metabolic reconstruction of the regulated pathways demonstrated its functional link to replication. The method of simultaneous phylogenetic profiling of genes and conserved regulatory signals introduced in this study could be used to identify transcriptional factors regulating orphan regulons.
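The consensus string quoted in the abstract can be read with standard IUPAC nucleotide codes (w = A or T). The sketch below turns it into a regular expression and scans a sequence for exact matches. Several things are assumptions made for illustration: the helper names, the test sequence, and the decision to ignore letter case (case in such consensus strings often marks the degree of conservation, but the abstract does not say so). Exact consensus matching is also far stricter than the profile-based searches typically used in comparative genomics.

```python
import re

# Standard IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
COMPLEMENT = str.maketrans("ACGTRYSWKMBDHVN", "TGCAYRSWMKVHDBN")

NRDR_BOX = "acaCwAtATaTwGtg"  # consensus as given in the abstract


def consensus_to_regex(consensus):
    """Translate an IUPAC consensus into a regex, ignoring letter case."""
    return "".join(IUPAC[symbol] for symbol in consensus.upper())


def reverse_complement(consensus):
    """Reverse complement of an IUPAC string (result is upper case)."""
    return consensus.upper().translate(COMPLEMENT)[::-1]


def find_sites(consensus, sequence):
    """Return (start, site) for every exact, possibly overlapping match."""
    pattern = re.compile("(?=(" + consensus_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]


# Hypothetical sequence with one embedded site, flanked by GG and CC.
hits = find_sites(NRDR_BOX, "ggACACAATATATTGTGcc")
print(hits)

# The 14 bases after the first position read the same on both strands.
core = NRDR_BOX[1:]
print(reverse_complement(core) == core.upper())
```

Under this reading, the 14-base core that follows the leading `a` is its own reverse complement, which is consistent with the abstract's description of the signal as palindromic.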
|
145
|
Artamonova II, Gelfand MS. Evolution of the exon-intron structure and alternative splicing of the MAGE-A family of cancer/testis antigens. J Mol Evol 2005; 59:620-31. [PMID: 15693618 DOI: 10.1007/s00239-004-2654-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 0.8] [Indexed: 01/05/2023]
Abstract
Cancer/testis antigens (CT-antigens) are proteins that are predominantly expressed in cancer and testis and thus are possible targets for immunotherapy. Most of them form large multigene families. The evolution of the MAGE-A family of CT-antigens is characterized by four processes: (1) gene duplications; (2) duplications of the initial exon; (3) point mutations and short insertions/deletions inactivating splicing sites or creating new sites; and (4) deletions removing sites and creating chimeric exons. All of these processes affect the genomic regions upstream of the coding region, creating a wide diversity of isoforms with different 5'-untranslated regions. Many of these isoforms are gene-specific and have emerged due to point mutations in alternative and constitutive splicing sites. There are also examples of chimeric mRNAs, likely produced by splicing of read-through transcripts. Since there is consistent use of homologous sites for different genes and no random, indiscriminate use of preexisting cryptic sites, it is likely that most observed isoforms are functional and do not result from relaxed control in transformed cells.
|
146
|
Rodionov DA, Gelfand MS, Hugouvieux-Cotte-Pattat N. Comparative genomics of the KdgR regulon in Erwinia chrysanthemi 3937 and other gamma-proteobacteria. MICROBIOLOGY-SGM 2005; 150:3571-3590. [PMID: 15528647 DOI: 10.1099/mic.0.27041-0] [Citation(s) in RCA: 99] [Impact Index Per Article: 5.2] [Indexed: 11/18/2022]
Abstract
In the plant-pathogenic enterobacterium Erwinia chrysanthemi, almost all known genes involved in pectin catabolism are controlled by the transcriptional regulator KdgR. In this study, the comparative genomics approach was used to analyse the KdgR regulon in completely sequenced genomes of eight enterobacteria, including Erw. chrysanthemi, and two Vibrio species. Application of a signal recognition procedure complemented by operon structure and protein sequence analysis allowed identification of new candidate genes of the KdgR regulon. Most of these genes were found to be controlled by the cAMP-receptor protein, a global regulator of catabolic genes. At the next step, regulation of these genes in Erw. chrysanthemi was experimentally verified using in vivo transcriptional fusions and an attempt was made to clarify the functional role of the predicted genes in pectin catabolism. Interestingly, it was found that the KdgR protein, previously known as a repressor, positively regulates expression of two new members of the regulon, phosphoenolpyruvate synthase gene ppsA and an adjacent gene, ydiA, of unknown function. Other predicted regulon members, namely chmX, dhfX, gntB, pykF, spiX, sotA, tpfX, yeeO and yjgK, were found to be subject to classical negative regulation by KdgR. Possible roles of newly identified members of the Erw. chrysanthemi KdgR regulon, chmX, dhfX, gntDBMNAC, spiX, tpfX, ydiA, yeeO, ygjV and yjgK, in pectin catabolism are discussed. Finally, complete reconstruction of the KdgR regulons in various gamma-proteobacteria yielded a metabolic map reflecting a globally conserved pathway for the catabolism of pectin and its derivatives with variability in transport and enzymic capabilities among species. In particular, possible non-orthologous substitutes of isomerase KduI and a new oligogalacturonide transporter in the Vibrio species were detected.
|
147
|
Favorov AV, Gelfand MS, Gerasimova AV, Ravcheev DA, Mironov AA, Makeev VJ. A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length. Bioinformatics 2005; 21:2240-5. [PMID: 15728117 DOI: 10.1093/bioinformatics/bti336] [Citation(s) in RCA: 75] [Impact Index Per Article: 3.9] [Indexed: 11/12/2022] Open
Abstract
MOTIVATION: Transcription regulatory protein factors often bind DNA as homo-dimers or hetero-dimers. Thus they recognize structured DNA motifs that are inverted or direct repeats or spaced motif pairs. However, these motifs are often difficult to identify owing to their high divergence. Including the motif structure explicitly in the motif recognition algorithm improves recognition efficiency for highly divergent motifs as well as estimation of motif geometric parameters.
RESULT: We present a modification of the Gibbs sampling motif extraction algorithm, SeSiMCMC (Sequence Similarities by Markov Chain Monte Carlo), which finds structured motifs of these types, as well as non-structured motifs, in a set of unaligned DNA sequences. It employs improved estimators of motif and spacer lengths. The probability that a sequence does not contain any motif is accounted for in a rigorous Bayesian manner. We have applied the algorithm to a set of upstream regions of genes from two Escherichia coli regulons involved in respiration. We have demonstrated that accounting for a symmetric motif structure allows the algorithm to identify weak motifs more accurately. In the examples studied, ArcA binding sites were demonstrated to have the structure of a direct spaced repeat, whereas NarP binding sites exhibited a palindromic structure.
AVAILABILITY: The WWW interface of the program and its FreeBSD (4.0) and Windows 32 console executables are available at http://bioinform.genetika.ru/SeSiMCMC
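To see why building the symmetry into the model can help with weak motifs, consider a palindromic site: position i on one strand and the mirrored position on the other strand describe the same protein-DNA contact, so their counts can be pooled. The sketch below shows only this pooling step on a count matrix. It is not the SeSiMCMC algorithm; the function names and the two toy sites are assumptions made for illustration.

```python
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def count_matrix(sites):
    """Per-position nucleotide counts for equal-length aligned sites."""
    length = len(sites[0])
    return [
        {base: sum(1 for site in sites if site[i] == base) for base in "ACGT"}
        for i in range(length)
    ]


def symmetrize_palindrome(counts):
    """Pool each position with the complement of its mirror position.

    After pooling, column i and column L-1-i carry the same evidence,
    so every column is estimated from twice as many observations.
    """
    length = len(counts)
    return [
        {
            base: counts[i][base] + counts[length - 1 - i][COMPLEMENT[base]]
            for base in "ACGT"
        }
        for i in range(length)
    ]


# Hypothetical aligned sites; the second deviates from a perfect palindrome.
sites = ["TTGCAA", "TTGAAA"]
raw = count_matrix(sites)
pooled = symmetrize_palindrome(raw)
for i, column in enumerate(pooled):
    print(i, column)
```

A direct repeat could be handled in the same spirit by pooling position i with the matching position in the second half of the site rather than with its mirrored complement.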
|
148
|
Kotelnikova EA, Makeev VJ, Gelfand MS. Evolution of transcription factor DNA binding sites. Gene 2005; 347:255-63. [PMID: 15725380 DOI: 10.1016/j.gene.2004.12.013] [Citation(s) in RCA: 15] [Impact Index Per Article: 0.8] [Received: 09/14/2004] [Revised: 11/12/2004] [Accepted: 12/02/2004] [Indexed: 11/17/2022]
Abstract
In bioinformatics, binding of transcription regulatory factors to their cognate binding sites is usually described by a sequence-specific binding energy, which is estimated from a training sample of sites. This model implies that all binding sites with binding energy above some threshold are functional and that site sequence variations should be considered neutral as long as they do not reduce this energy below the threshold. To quantify this energy, the binding profile (positional weight matrix, PWM) model or a consensus-based model is usually applied. Here we show that in many cases the available data are not sufficient to construct a relevant PWM, and a modified consensus-based model could be more effective in describing binding properties. Further, using data on binding sites of several transcription factors, we demonstrate that some non-consensus nucleotides in "orthologous sites" (that is, binding sites of the same factor upstream of orthologous genes), which have been believed to be irrelevant to or even to hinder the regulation, are evolutionarily very stable and specific for the regulated gene. For each pair of genomes considered, the number of substitutions between non-consensus nucleotides is far lower than the expected number of neutral substitutions. Moreover, in several positions of binding sites regulating different genes, there are non-consensus nucleotides conserved in distant genomes. This means that there exists a selection pressure that results in the stability of non-consensus nucleotides.
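The threshold model described above can be sketched with a log-odds weight matrix standing in for binding energy. Everything specific here is an assumption for illustration: the four training sites are invented, the uniform background and pseudocount of 1 are common defaults rather than values from the paper, and the function names are hypothetical.

```python
from math import log2


def build_pwm(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix estimated from aligned training sites."""
    length = len(sites[0])
    total = len(sites) + 4 * pseudocount
    pwm = []
    for i in range(length):
        column = [site[i] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in "ACGT"
        })
    return pwm


def score(pwm, site):
    """Sum of per-position weights; a proxy for binding energy."""
    return sum(column[base] for column, base in zip(pwm, site))


def is_candidate(pwm, site, threshold):
    """Threshold model: any site scoring at or above the cutoff counts."""
    return score(pwm, site) >= threshold


# Hypothetical training sample of four aligned sites.
training_sites = ["TGTGA", "TGTGA", "TGTGC", "TGAGA"]
pwm = build_pwm(training_sites)

print(round(score(pwm, "TGTGA"), 2))  # consensus-like site
print(round(score(pwm, "TGAGA"), 2))  # one non-consensus nucleotide
print(round(score(pwm, "ACACT"), 2))  # unrelated sequence
```

With only four training sites, the pseudocount contributes as much to each column as the data do, which illustrates the abstract's point that small samples may not support a reliable PWM.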
|
149
|
Zinin NV, Serkina AV, Gelfand MS, Shevelev AB, Sineoky SP. Gene cloning, expression and characterization of novel phytase from Obesumbacterium proteus. FEMS Microbiol Lett 2005; 236:283-90. [PMID: 15251209 DOI: 10.1016/j.femsle.2004.05.051] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.5] [Received: 03/19/2004] [Revised: 05/22/2004] [Accepted: 05/31/2004] [Indexed: 11/26/2022] Open
Abstract
The gene phyA encoding phytase was isolated from an Obesumbacterium proteus genomic library and sequenced. The cleavage site of the PhyA signal peptide was predicted and experimentally confirmed. The PhyA protein shows maximum identity of 53% and 47% to phosphoanhydride phosphorylase from Yersinia pestis and phytase AppA from Escherichia coli, respectively. Based on protein sequence similarity of PhyA and its homologs, the phytases form a novel subclass of the histidine acid phosphatase family. To characterize the properties of the PhyA protein, we expressed the phyA gene in E. coli. The specific activity of the purified recombinant PhyA was 310 U mg⁻¹ of protein. Recombinant PhyA showed activity at pH values from 1.5 through 6.5 with the optimum at 4.9. The temperature optimum was 40-45 °C at pH 4.9. The Km value for sodium phytate was 0.34 mM with a Vmax of 435 U mg⁻¹.
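The reported Km and Vmax can be plugged into the Michaelis-Menten rate law, v = Vmax·[S] / (Km + [S]), to estimate activity at a given substrate concentration. The sketch below assumes that PhyA follows simple Michaelis-Menten kinetics under the assay conditions; the function name and the example concentrations are chosen here for illustration.

```python
def michaelis_menten_rate(substrate_mM, km_mM=0.34, vmax=435.0):
    """Reaction rate in U per mg protein at a given substrate concentration.

    Defaults are the Km (mM) and Vmax (U/mg) quoted in the abstract.
    """
    return vmax * substrate_mM / (km_mM + substrate_mM)


# Example concentrations of sodium phytate in mM (illustrative values).
for concentration in (0.1, 0.34, 1.0, 5.0):
    rate = michaelis_menten_rate(concentration)
    print(f"{concentration:>5} mM -> {rate:.1f} U/mg")
```

By definition the rate at [S] = Km is half of Vmax, so 0.34 mM sodium phytate corresponds to about 217.5 U mg⁻¹ under this model.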
|
150
|
Rodionov DA, Dubchak I, Arkin A, Alm E, Gelfand MS. Reconstruction of regulatory and metabolic pathways in metal-reducing delta-proteobacteria. Genome Biol 2004; 5:R90. [PMID: 15535866 PMCID: PMC545781 DOI: 10.1186/gb-2004-5-11-r90] [Citation(s) in RCA: 152] [Impact Index Per Article: 7.6] [Received: 07/02/2004] [Revised: 09/20/2004] [Accepted: 09/30/2004] [Indexed: 12/23/2022] Open
Abstract
A study of the genetic and regulatory factors in several biosynthesis, metal ion homeostasis, stress response, and energy metabolism pathways suggests that phylogenetically diverse δ-proteobacteria have homologous regulatory components.
Background: Relatively little is known about the genetic basis for the unique physiology of metal-reducing genera in the delta subgroup of the proteobacteria. The recent availability of complete finished or draft-quality genome sequences for seven representatives allowed us to investigate the genetic and regulatory factors in a number of key pathways involved in the biosynthesis of building blocks and cofactors, metal-ion homeostasis, stress response, and energy metabolism using a combination of regulatory sequence detection and analysis of genomic context.
Results: In the genomes of δ-proteobacteria, we identified candidate binding sites for four regulators of known specificity (BirA, CooA, HrcA, sigma-32), four types of metabolite-binding riboswitches (RFN-, THI-, B12-elements and S-box), and new binding sites for the FUR, ModE, NikR, PerR, and ZUR transcription factors, as well as for the previously uncharacterized factors HcpR and LysX. After reconstruction of the corresponding metabolic pathways and regulatory interactions, we identified possible functions for a large number of previously uncharacterized genes covering a wide range of cellular functions.
Conclusions: Phylogenetically diverse δ-proteobacteria appear to have homologous regulatory components. This study for the first time demonstrates the adaptability of the comparative genomic approach to de novo reconstruction of a regulatory network in a poorly studied taxonomic group of bacteria. Recent efforts in large-scale functional genomic characterization of Desulfovibrio species will provide a unique opportunity to test and expand our predictions.
|