101
|
Zhang Y, Rodionov DA, Gelfand MS, Gladyshev VN. Comparative genomic analyses of nickel, cobalt and vitamin B12 utilization. BMC Genomics 2009; 10:78. [PMID: 19208259 PMCID: PMC2667541 DOI: 10.1186/1471-2164-10-78] [Citation(s) in RCA: 182] [Impact Index Per Article: 12.1] [Received: 11/02/2008] [Accepted: 02/10/2009] [Indexed: 12/25/2022]
Abstract
BACKGROUND Nickel (Ni) and cobalt (Co) are trace elements required for a variety of biological processes. Ni is directly coordinated by proteins, whereas Co is mainly used as a component of vitamin B12. Although a number of Ni and Co-dependent enzymes have been characterized, systematic evolutionary analyses of utilization of these metals are limited. RESULTS We carried out comparative genomic analyses to examine occurrence and evolutionary dynamics of the use of Ni and Co at the level of (i) transport systems, and (ii) metalloproteomes. Our data show that both metals are widely used in bacteria and archaea. Cbi/NikMNQO is the most common prokaryotic Ni/Co transporter, while Ni-dependent urease and Ni-Fe hydrogenase, and B12-dependent methionine synthase (MetH), ribonucleotide reductase and methylmalonyl-CoA mutase are the most widespread metalloproteins for Ni and Co, respectively. Occurrence of other metalloenzymes showed a mosaic distribution and a new B12-dependent protein family was predicted. Deltaproteobacteria and Methanosarcina generally have larger Ni- and Co-dependent proteomes. On the other hand, utilization of these two metals is limited in eukaryotes, and very few of these organisms utilize both of them. The Ni-utilizing eukaryotes are mostly fungi (except saccharomycotina) and plants, whereas most B12-utilizing organisms are animals. The NiCoT transporter family is the most widespread eukaryotic Ni transporter, and eukaryotic urease and MetH are the most common Ni- and B12-dependent enzymes, respectively. Finally, investigation of environmental and other conditions and identity of organisms that show dependence on Ni or Co revealed that host-associated organisms (particularly obligate intracellular parasites and endosymbionts) have a tendency for loss of Ni/Co utilization. 
CONCLUSION Our data provide information on the evolutionary dynamics of Ni and Co utilization and highlight widespread use of these metals in the three domains of life, yet only a limited number of user proteins.
|
102
|
Sorokin V, Severinov K, Gelfand MS. Systematic prediction of control proteins and their DNA binding sites. Nucleic Acids Res 2008; 37:441-51. [PMID: 19056824 PMCID: PMC2632904 DOI: 10.1093/nar/gkn931] [Citation(s) in RCA: 30] [Impact Index Per Article: 1.9] [Indexed: 11/26/2022]
Abstract
We present here the results of a systematic bioinformatics analysis of control (C) proteins, a class of DNA-binding regulators that control time-delayed transcription of their own genes as well as restriction endonuclease genes in many type II restriction-modification systems. More than 290 C protein homologs were identified and DNA-binding sites for ∼70% of new and previously known C proteins were predicted by a combination of phylogenetic footprinting and motif searches in DNA upstream of C protein genes. Additional analysis revealed that a large proportion of C protein genes are translated from leaderless RNA, which may contribute to the time-delayed nature of genetic switches operated by these proteins. Analysis of genetic contexts of newly identified C protein genes revealed that they are not exclusively associated with restriction-modification genes; numerous instances of associations with genes originating from mobile genetic elements were observed. These instances might be vestiges of ancient horizontal transfers and indicate that during evolution ancestral restriction-modification system genes were the sites of mobile element insertions.
|
103
|
Gelfand MS. Some alphabets easily beat Russian letter count. Nature 2008; 454:691. [DOI: 10.1038/454691a] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
104
|
Sernova NV, Gelfand MS. Identification of replication origins in prokaryotic genomes. Brief Bioinform 2008; 9:376-91. [PMID: 18660512 DOI: 10.1093/bib/bbn031] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 11/12/2022]
Abstract
The availability of hundreds of complete bacterial genomes has created new challenges and, simultaneously, opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, the studies of nucleotide compositional bias and gene bias between strands and replichores paved the way to the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered essentially a bioinformatics problem, where predictions are sufficiently reliable not to run labor-intensive experiments, unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using the comparative genomic methods.
|
105
|
Ramensky VE, Nurtdinov RN, Neverov AD, Mironov AA, Gelfand MS. Positive selection in alternatively spliced exons of human genes. Am J Hum Genet 2008; 83:94-8. [PMID: 18571144 DOI: 10.1016/j.ajhg.2008.05.017] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Received: 01/15/2008] [Revised: 04/08/2008] [Accepted: 05/30/2008] [Indexed: 10/21/2022]
Abstract
Alternative splicing is a well-recognized mechanism of accelerated genome evolution. We have studied single-nucleotide polymorphisms and human-chimpanzee divergence in the exons of 6672 alternatively spliced human genes, with the aim of understanding the forces driving the evolution of alternatively spliced sequences. Here, we show that alternatively spliced exons and exon fragments (alternative exons) from minor isoforms experience lower selective pressure at the amino acid level, accompanied by selection against synonymous sequence variation. The results of the McDonald-Kreitman test suggest that alternatively spliced exons, unlike exons constitutively included in the mRNA, are also subject to positive selection, with up to 27% of amino acids fixed by positive selection.
|
106
|
Rodionov DA, Li X, Rodionova IA, Yang C, Sorci L, Dervyn E, Martynowski D, Zhang H, Gelfand MS, Osterman AL. Transcriptional regulation of NAD metabolism in bacteria: genomic reconstruction of NiaR (YrxA) regulon. Nucleic Acids Res 2008; 36:2032-46. [PMID: 18276644 PMCID: PMC2330245 DOI: 10.1093/nar/gkn046] [Citation(s) in RCA: 60] [Impact Index Per Article: 3.8] [Indexed: 11/13/2022]
Abstract
A comparative genomic approach was used to reconstruct transcriptional regulation of NAD biosynthesis in bacteria containing orthologs of Bacillus subtilis gene yrxA, a previously identified niacin-responsive repressor of NAD de novo synthesis. Members of the YrxA family (re-named here NiaR) are broadly conserved in the Bacillus/Clostridium group and in the deeply branching Fusobacteria and Thermotogales lineages. We analyzed upstream regions of genes associated with NAD biosynthesis to identify candidate NiaR-binding DNA motifs and assess the NiaR regulon content in these species. Representatives of the two distinct types of candidate NiaR-binding sites, characteristic of the Firmicutes and Thermotogales, were verified by an electrophoretic mobility shift assay. In addition to transcriptional control of the nadABC genes, the NiaR regulon in some species extends to niacin salvage (the pncAB genes) and includes uncharacterized membrane proteins possibly involved in niacin transport. The involvement in niacin uptake proposed for one of these proteins (re-named NiaP), encoded by the B. subtilis gene yceI, was experimentally verified. In addition to bacteria, members of the NiaP family are conserved in multicellular eukaryotes, including humans, pointing to possible NiaP involvement in niacin utilization in these organisms. Overall, the analysis of the NiaR and NrtR regulons (described in the accompanying paper) revealed mechanisms of transcriptional regulation of NAD metabolism in nearly a hundred diverse bacteria.
|
107
|
Vitreschak AG, Mironov AA, Lyubetsky VA, Gelfand MS. Comparative genomic analysis of T-box regulatory systems in bacteria. RNA (NEW YORK, N.Y.) 2008; 14:717-35. [PMID: 18359782 PMCID: PMC2271356 DOI: 10.1261/rna.819308] [Citation(s) in RCA: 100] [Impact Index Per Article: 6.3] [Received: 09/06/2007] [Accepted: 12/31/2007] [Indexed: 05/26/2023]
Abstract
T-box antitermination is one of the main mechanisms of regulation of genes involved in amino acid metabolism in Gram-positive bacteria. T-box regulatory sites consist of conserved sequence and RNA secondary structure elements. Using a set of known T-box sites, we constructed the common pattern and used it to scan available bacterial genomes. New T-boxes were found in various Gram-positive bacteria, some Gram-negative bacteria (delta-proteobacteria), and some other bacterial groups (Deinococcales/Thermales, Chloroflexi, Dictyoglomi). The majority of T-box-regulated genes encode aminoacyl-tRNA synthetases. Two other groups of T-box-regulated genes are amino acid biosynthetic genes and transporters, as well as genes with unknown function. Analysis of candidate T-box sites resulted in new functional annotations. We assigned the amino acid specificity to a large number of candidate amino acid transporters and a possible function to amino acid biosynthesis genes. We then studied the evolution of the T-boxes. Analysis of the constructed phylogenetic trees demonstrated that in addition to the normal evolution consistent with the evolution of regulated genes, T-boxes may be duplicated, transferred to other genes, and change specificity. We observed several cases of recent T-box regulon expansion following the loss of a previously existing regulatory system, in particular, arginine regulon in Clostridium difficile and methionine regulon in Lactobacillaceae. Finally, we described a new structural class of T-boxes containing duplicated terminator-antiterminator elements and unusual reduced T-boxes regulating initiation of translation in the Actinobacteria.
MESH Headings
- 5' Untranslated Regions
- Amino Acid Transport Systems/genetics
- Amino Acid Transport Systems/metabolism
- Amino Acids/metabolism
- Bacteria/genetics
- Bacteria/metabolism
- Bacterial Proteins/genetics
- Bacterial Proteins/metabolism
- Base Sequence
- DNA, Bacterial/genetics
- Evolution, Molecular
- Gene Expression Regulation, Bacterial
- Genome, Bacterial
- Genomics
- Models, Biological
- Models, Molecular
- Molecular Sequence Data
- Nucleic Acid Conformation
- Phylogeny
- RNA, Bacterial/chemistry
- RNA, Bacterial/genetics
- RNA, Messenger/chemistry
- RNA, Messenger/genetics
- Regulon
- Sequence Homology, Nucleic Acid
- T-Box Domain Proteins/genetics
- T-Box Domain Proteins/metabolism
|
108
|
Kurmangaliyev YZ, Gelfand MS. Computational analysis of splicing errors and mutations in human transcripts. BMC Genomics 2008; 9:13. [PMID: 18194514 PMCID: PMC2234086 DOI: 10.1186/1471-2164-9-13] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.5] [Received: 04/20/2007] [Accepted: 01/14/2008] [Indexed: 01/10/2023]
Abstract
Background Most retained introns found in human cDNAs generated by high-throughput sequencing projects seem to result from underspliced transcripts, and thus they capture intermediate steps of pre-mRNA splicing. On the other hand, mutations in splice sites cause skipping of the respective exon or activation of pre-existing cryptic sites. Both types of events reflect properties of the splicing mechanism. Results The retained introns were significantly shorter than constitutive ones, and skipped exons were shorter than exons with cryptic sites. Both donor and acceptor splice sites of retained introns were weaker than splice sites of constitutive introns. The authentic acceptor sites affected by mutations were significantly weaker in exons with activated cryptic sites than in skipped exons. The distance from a mutated splice site to the nearest equivalent site was significantly shorter in cases of activated cryptic sites compared to exon skipping events. The prevalence of retained introns within genes monotonically increased in the 5'-to-3' direction (more retained introns close to the 3'-end), consistent with the model of co-transcriptional splicing. The density of exonic splicing enhancers was higher, and the density of exonic splicing silencers lower, in retained introns compared to constitutive ones and in exons with cryptic sites compared to skipped exons. Conclusion Thus the analysis of retained introns in human cDNA, exons skipped due to mutations in splice sites and exons with cryptic sites produced results consistent with the intron definition mechanism of splicing of short introns, co-transcriptional splicing, dependence of splicing efficiency on the splice site strength and the density of candidate exonic splicing enhancers and silencers. These results are consistent with other, recently published analyses.
|
109
|
Ermakova EO, Nurtdinov RN, Gelfand MS. Overlapping alternative donor splice sites in the human genome. J Bioinform Comput Biol 2008; 5:991-1004. [PMID: 17933007 DOI: 10.1142/s0219720007003089] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 01/16/2007] [Revised: 05/30/2007] [Accepted: 06/01/2007] [Indexed: 11/18/2022]
Abstract
Over 50% of donor splice sites in the human genome have a potential alternative donor site at a distance of three to six nucleotides. Conservation of these potential sites is determined by the consensus requirements and by their exonic or intronic location. Several hundred pairs of overlapping sites are confirmed to be alternatively spliced as both sites in a pair are supported by a protein, by a full-length mRNA, or by expressed sequence tags (ESTs) from at least two independent clone libraries. Overlapping sites may clash with consensus requirements. Pairs with a site shift of four nucleotides are the most abundant, despite the frameshift in the protein-coding region that they introduce. The site usage in pairs is usually uneven, and the major site is more frequently conserved in other mammalian genomes. Overlapping alternative donor sites and acceptor sites may have different functional roles: alternative splicing of overlapping acceptor sites leads mainly to microvariations in protein sequences, whereas alternative donor sites often lead to frameshifts and thus either yield major differences in the protein sequence and structure, or generate nonsense-mediated decay-inducing mRNA isoforms likely involved in regulated unproductive splicing pathways.
|
110
|
Nurtdinov RN, Neverov AD, Favorov AV, Mironov AA, Gelfand MS. Conserved and species-specific alternative splicing in mammalian genomes. BMC Evol Biol 2007; 7:249. [PMID: 18154685 PMCID: PMC2231371 DOI: 10.1186/1471-2148-7-249] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.5] [Received: 06/26/2007] [Accepted: 12/22/2007] [Indexed: 11/30/2022]
Abstract
Background Alternative splicing has been shown to be one of the major evolutionary mechanisms for protein diversification and proteome expansion, since a considerable fraction of alternative splicing events appears to be species- or lineage-specific. However, most studies were restricted to the analysis of cassette exons in pairs of genomes and did not analyze functionality of the alternative variants. Results We analyzed conservation of human alternative splice sites and cassette exons in the mouse and dog genomes. Alternative exons, especially minor-isoform ones, were shown to be less conserved than constitutive exons. Frame-shifting alternatives in the protein-coding regions are less conserved than frame-preserving ones. Similarly, the conservation of alternative sites is highest for evenly used alternatives, and higher when the distance between the sites is divisible by three. The rate of alternative-exon and site loss in mouse is slightly higher than in dog, consistent with faster evolution of the former. The evolutionary dynamics of alternative sites was shown to be consistent with the model of random activation of cryptic sites. Conclusion Consistent with other studies, our results show that minor cassette exons are less conserved than major-alternative and constitutive exons. However, our study provides evidence that this is caused not only by exon birth, but also lineage-specific loss of alternative exons and sites, and it depends on exon functionality.
|
111
|
Kovaleva GY, Gelfand MS. Transcriptional regulation of the methionine and cysteine transport and metabolism in streptococci. FEMS Microbiol Lett 2007; 276:207-15. [DOI: 10.1111/j.1574-6968.2007.00934.x] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.3] [Indexed: 11/29/2022]
|
112
|
Kazanov MD, Vitreschak AG, Gelfand MS. Abundance and functional diversity of riboswitches in microbial communities. BMC Genomics 2007; 8:347. [PMID: 17908319 PMCID: PMC2211319 DOI: 10.1186/1471-2164-8-347] [Citation(s) in RCA: 49] [Impact Index Per Article: 2.9] [Received: 07/13/2007] [Accepted: 10/01/2007] [Indexed: 02/07/2023]
Abstract
BACKGROUND Several recently completed large-scale environmental sequencing projects produced a large amount of genetic information about microbial communities ('metagenomes') which is not biased towards cultured organisms. It is a good source for estimation of the abundance of genes and regulatory structures in both known and unknown members of microbial communities. In this study we consider the distribution of RNA regulatory structures, riboswitches, in the Sargasso Sea, Minnesota Soil and Whale Falls metagenomes. RESULTS Over three hundred riboswitches were found in about 2 Gbp metagenome DNA sequences. The abundance of riboswitches in metagenomes was highest for the TPP, B12 and GCVT riboswitches; the S-box, RFN, YKKC/YXKD, YYBP/YKOY regulatory elements showed lower but significant abundance, while the LYS, G-box, GLMS and YKOK riboswitches were rare. Regions downstream of identified riboswitches were scanned for open reading frames. Comparative analysis of identified ORFs revealed new riboswitch-regulated functions for several classes of riboswitches. In particular, we have observed phosphoserine aminotransferase serC (COG1932) and malate synthase glcB (COG2225) to be regulated by the glycine (GCVT) riboswitch; fatty acid desaturase ole1 (COG1398), by the cobalamin (B12) riboswitch; 5-methylthioribose-1-phosphate isomerase ykrS (COG0182), by the SAM-riboswitch. We also identified conserved riboswitches upstream of genes of unknown function: thiamine (TPP), cobalamin (B12), and glycine (GCVT, upstream of genes from COG4198). CONCLUSION This study demonstrates applicability of bioinformatics to the analysis of RNA regulatory structures in metagenomes.
|
113
|
Severinov K, Semenova E, Kazakov A, Kazakov T, Gelfand MS. Low-molecular-weight post-translationally modified microcins. Mol Microbiol 2007. [DOI: 10.1111/j.1365-2958.2007.05938.x] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/28/2022]
|
114
|
Makarova KS, Omelchenko MV, Gaidamakova EK, Matrosova VY, Vasilenko A, Zhai M, Lapidus A, Copeland A, Kim E, Land M, Mavromatis K, Pitluck S, Richardson PM, Detter C, Brettin T, Saunders E, Lai B, Ravel B, Kemner KM, Wolf YI, Sorokin A, Gerasimova AV, Gelfand MS, Fredrickson JK, Koonin EV, Daly MJ. Deinococcus geothermalis: the pool of extreme radiation resistance genes shrinks. PLoS One 2007; 2:e955. [PMID: 17895995 PMCID: PMC1978522 DOI: 10.1371/journal.pone.0000955] [Citation(s) in RCA: 185] [Impact Index Per Article: 10.9] [Received: 07/24/2007] [Accepted: 09/04/2007] [Indexed: 11/19/2022]
Abstract
Bacteria of the genus Deinococcus are extremely resistant to ionizing radiation (IR), ultraviolet light (UV) and desiccation. The mesophile Deinococcus radiodurans was the first member of this group whose genome was completely sequenced. Analysis of the genome sequence of D. radiodurans, however, failed to identify unique DNA repair systems. To further delineate the genes underlying the resistance phenotypes, we report the whole-genome sequence of a second Deinococcus species, the thermophile Deinococcus geothermalis, which at its optimal growth temperature is as resistant to IR, UV and desiccation as D. radiodurans, and a comparative analysis of the two Deinococcus genomes. Many D. radiodurans genes previously implicated in resistance, but for which no sensitive phenotype was observed upon disruption, are absent in D. geothermalis. In contrast, most D. radiodurans genes whose mutants displayed a radiation-sensitive phenotype in D. radiodurans are conserved in D. geothermalis. Supporting the existence of a Deinococcus radiation response regulon, a common palindromic DNA motif was identified in a conserved set of genes associated with resistance, and a dedicated transcriptional regulator was predicted. We present the case that these two species evolved essentially the same diverse set of gene families, and that the extreme stress-resistance phenotypes of the Deinococcus lineage emerged progressively by amassing cell-cleaning systems from different sources, but not by acquisition of novel DNA repair systems. Our reconstruction of the genomic evolution of the Deinococcus-Thermus phylum indicates that the corresponding set of enzymes proliferated mainly in the common ancestor of Deinococcus. 
Results of the comparative analysis weaken the arguments for a role of higher-order chromosome alignment structures in resistance; more clearly define and substantially revise downward the number of uncharacterized genes that might participate in DNA repair and contribute to resistance; and strengthen the case for a role in survival of systems involved in manganese and iron homeostasis.
|
115
|
Artamonova II, Frishman G, Gelfand MS, Frishman D. Mining sequence annotation databanks for association patterns. Bioinformatics 2007; 21 Suppl 3:iii49-57. [PMID: 16306393 DOI: 10.1093/bioinformatics/bti1206] [Citation(s) in RCA: 27] [Impact Index Per Article: 1.6] [Indexed: 11/12/2022]
Abstract
MOTIVATION Millions of protein sequences currently being deposited to sequence databanks will never be annotated manually. Similarity-based annotation generated by automatic software pipelines unavoidably contains spurious assignments due to the imperfection of bioinformatics methods. Examples of such annotation errors include over- and underpredictions caused by the use of fixed recognition thresholds and incorrect annotations caused by transitivity-based information transfer to unrelated proteins or transfer of errors already accumulated in databases. One of the most difficult and timely challenges in bioinformatics is the development of intelligent systems aimed at improving the quality of automatically generated annotation. A possible approach to this problem is to detect anomalies in annotation items based on association rule mining. RESULTS We present the first large-scale analysis of association rules derived from two large protein annotation databases, Swiss-Prot and PEDANT, and reveal novel, previously unknown tendencies of rule strength distributions. Most of the rules are either very strong or very weak, with rules in the medium strength range being relatively infrequent. Based on dynamics of error correction in subsequent Swiss-Prot releases and on our own manual analysis we demonstrate that exceptions from strong rules are, indeed, significantly enriched in annotation errors and can be used to automatically flag them. We identify different strength dependencies of rules derived from different fields in Swiss-Prot. A compositional breakdown of association rules generated from PEDANT in terms of their constituent items indicates that most of the errors that can be corrected are related to gene functional roles. Swiss-Prot errors are usually caused by under-annotation owing to its conservative approach, whereas automatically generated PEDANT annotation suffers from over-annotation.
AVAILABILITY All data generated in this study are available for download and browsing at http://pedant.gsf.de/ARIA/index.htm.
|
116
|
Severinov K, Semenova E, Kazakov A, Kazakov T, Gelfand MS. Low-molecular-weight post-translationally modified microcins. Mol Microbiol 2007; 65:1380-94. [PMID: 17711420 DOI: 10.1111/j.1365-2958.2007.05874.x] [Citation(s) in RCA: 117] [Impact Index Per Article: 6.9] [Indexed: 11/30/2022]
Abstract
Microcins are a class of ribosomally synthesized antibacterial peptides produced by Enterobacteriaceae and active against closely related bacterial species. While some microcins are active as unmodified peptides, others are heavily modified by dedicated maturation enzymes. Low-molecular-weight microcins from the post-translationally modified group target essential molecular machines inside the cells. In this review, available structural and functional data about three such microcins (microcin J25, microcin B17 and microcin C7-C51) are discussed. While all three low-molecular-weight post-translationally modified microcins are produced by Escherichia coli, inferences based on sequence and structural similarities with peptides encoded or produced by phylogenetically diverse bacteria are made whenever possible to put these compounds into a larger perspective.
|
117
|
Kolesov G, Wunderlich Z, Laikova ON, Gelfand MS, Mirny LA. How gene order is influenced by the biophysics of transcription regulation. Proc Natl Acad Sci U S A 2007; 104:13948-53. [PMID: 17709750 PMCID: PMC1955771 DOI: 10.1073/pnas.0700672104] [Citation(s) in RCA: 121] [Impact Index Per Article: 7.1] [Indexed: 11/18/2022]
Abstract
What are the forces that shape the structure of prokaryotic genomes: the order of genes, their proximity, and their orientation? Coregulation and coordinated horizontal gene transfer are believed to promote the proximity of functionally related genes and the formation of operons. However, forces that influence the structure of the genome beyond the level of a single operon remain unknown. Here, we show that the biophysical mechanism by which regulatory proteins search for their sites on DNA can impose constraints on genome structure. Using simulations, we demonstrate that rapid and reliable gene regulation requires that the transcription factor (TF) gene be close to the site on DNA the TF has to bind, thus promoting the colocalization of TF genes and their targets on the genome. We use parameters that have been measured in recent experiments to estimate the relevant length and time scales of this process and demonstrate that the search for a cognate site may be prohibitively slow if a TF has a low copy number and is not colocalized. We also analyze TFs and their sites in a number of bacterial genomes, confirm that they are colocalized significantly more often than expected, and show that this observation cannot be attributed to the pressure for coregulation or formation of selfish gene clusters, thus supporting the role of the biophysical constraint in shaping the structure of prokaryotic genomes. Our results demonstrate how spatial organization can influence timing and noise in gene expression.
|
118
|
Artamonova II, Gelfand MS. Comparative Genomics and Evolution of Alternative Splicing: The Pessimists' Science. Chem Rev 2007; 107:3407-30. [PMID: 17645315 DOI: 10.1021/cr068304c] [Citation(s) in RCA: 45] [Impact Index Per Article: 2.6] [Indexed: 11/29/2022]
|
119
|
Enikeeva FN, Kotelnikova EA, Gelfand MS, Makeev VJ. A model of evolution with constant selective pressure for regulatory DNA sites. BMC Evol Biol 2007; 7:125. [PMID: 17662135 PMCID: PMC1978210 DOI: 10.1186/1471-2148-7-125] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/18/2007] [Accepted: 07/27/2007] [Indexed: 11/10/2022]
Abstract
BACKGROUND Molecular evolution is usually described assuming a neutral or weakly non-neutral substitution model. Recently, new data have become available on evolution of sequence regions under a selective pressure, e.g. transcription factor binding sites. To reconstruct the evolutionary history of such sequences, one needs evolutionary models that take into account a substantial constant selective pressure. RESULTS We present a simple evolutionary model with a single preferred (consensus) nucleotide and the neutral substitution model adopted for all other nucleotides. This evolutionary model has a rate matrix in which all substitutions that do not involve the consensus nucleotide occur with the same rate. The model has two time scales for achieving a stationary distribution; in the general case only one of the two rate parameters can be evaluated from the stationary distribution. In the middle-time zone, a counterintuitive behavior was observed for some parameter values, with a probability of conservation for a non-consensus nucleotide greater than that for the consensus nucleotide. Such an effect can be observed only in the case of weak preference for the consensus nucleotide, when the probability of observing the consensus nucleotide in the stationary distribution is less than 1/2. If the substitution rate is represented as a product of mutation and fixation, only the fixation can be calculated from the stationary distribution. The exhibited conservation of non-consensus nucleotides does not take place if the elements of the mutation matrix are identical, and can be related to the reduced mutation rate between the non-consensus nucleotides. This bias can have no effect on the stationary distribution of nucleotide frequencies calculated over the ensemble of multiple alignments, e.g. transcription factor binding sites upstream of different sets of co-regulated orthologous genes.
CONCLUSION The derived model can be used as a null model when analyzing the evolution of orthologous transcription factor binding sites. In particular, our findings show that a nucleotide preferred at some position of a multiple alignment of binding sites for some transcription factor in the same genome is not necessarily the most conserved nucleotide in an alignment of orthologous sites from different species. However, this effect can take place only in the case of a mutation matrix whose elements are not identical.
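The behavior described in this abstract can be illustrated with a minimal sketch. The parametrization below is an assumption made for illustration, not the authors' notation: `a` is the rate of substitution from any non-consensus nucleotide to the consensus, `b` the rate from the consensus to each non-consensus nucleotide, and `c` the common rate between non-consensus nucleotides. Under that assumption the four-state chain has closed-form conservation probabilities with two relaxation rates, `a + 3b` and `a + 3c`, and a stationary distribution that depends only on `a` and `b`.

```python
import math


def stationary(a, b):
    """Return (pi_consensus, pi_each_non_consensus) for the 4-state chain.

    Assumed rates (hypothetical names): a = non-consensus -> consensus,
    b = consensus -> each non-consensus nucleotide.
    """
    total = a + 3.0 * b
    return a / total, b / total


def conservation(a, b, c, t):
    """Return (P_cc(t), P_nn(t)).

    P_cc: probability that a site starting at the consensus nucleotide
    shows the consensus at time t.
    P_nn: probability that a site starting at a given non-consensus
    nucleotide shows that same nucleotide at time t.
    c is the assumed rate between any two non-consensus nucleotides.
    """
    pi_c, _ = stationary(a, b)
    # Relaxation between the consensus and the non-consensus group.
    mix_cn = math.exp(-(a + 3.0 * b) * t)
    # Relaxation among the three non-consensus nucleotides.
    mix_nn = math.exp(-(a + 3.0 * c) * t)
    p_cc = pi_c + (1.0 - pi_c) * mix_cn
    p_nn = ((1.0 - pi_c) + pi_c * mix_cn) / 3.0 + 2.0 * mix_nn / 3.0
    return p_cc, p_nn


if __name__ == "__main__":
    # Illustrative values only: weak preference for the consensus (a slightly
    # above b) and a reduced rate between non-consensus nucleotides (small c).
    a, b, c = 1.2, 1.0, 0.1
    print("stationary:", stationary(a, b))
    for t in (0.1, 0.5, 2.0, 10.0):
        print(t, conservation(a, b, c, t))
```

With these illustrative values the consensus is still the most frequent nucleotide at stationarity, yet at intermediate times a non-consensus nucleotide is more likely to be conserved than the consensus, which matches the qualitative effect the abstract reports.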
|
120
|
Gvakharia BO, Permina EA, Gelfand MS, Bottomley PJ, Sayavedra-Soto LA, Arp DJ. Global transcriptional response of Nitrosomonas europaea to chloroform and chloromethane. Appl Environ Microbiol 2007; 73:3440-5. [PMID: 17369330 PMCID: PMC1907119 DOI: 10.1128/aem.02831-06] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Indexed: 11/20/2022] Open
Abstract
Upon exposure of Nitrosomonas europaea to chloroform (7 µM, 1 h), transcripts for 175 of 2,460 genes were found at higher levels in treated cells than in untreated cells and transcripts for 501 genes were found at lower levels. With chloromethane (3.2 mM, 1 h), transcripts for 67 genes were at higher levels and transcripts for 148 genes were at lower levels. Transcripts for 37 genes were at higher levels following both treatments and included genes for heat shock proteins, sigma-factors of the extracytoplasmic function subfamily, and toxin-antitoxin loci. N. europaea has higher levels of transcripts for a variety of defense genes when exposed to chloroform or chloromethane.
|
121
|
Ravcheev DA, Gerasimova AV, Mironov AA, Gelfand MS. Comparative genomic analysis of regulation of anaerobic respiration in ten genomes from three families of gamma-proteobacteria (Enterobacteriaceae, Pasteurellaceae, Vibrionaceae). BMC Genomics 2007; 8:54. [PMID: 17313674 PMCID: PMC1805755 DOI: 10.1186/1471-2164-8-54] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.4] [Received: 08/16/2006] [Accepted: 02/21/2007] [Indexed: 01/08/2023] Open
Abstract
BACKGROUND Gamma-proteobacteria, such as Escherichia coli, can use a variety of respiratory substrates employing numerous aerobic and anaerobic respiratory systems controlled by multiple transcription regulators. Thus, in E. coli, global control of respiration is mediated by four transcription factors, Fnr, ArcA, NarL and NarP. However, in other Gamma-proteobacteria the composition of global respiration regulators may be different. RESULTS In this study we applied a comparative genomic approach to the analysis of three global regulatory systems, Fnr, ArcA and NarP. These systems were studied in available genomes containing these three regulators, but lacking NarL. We therefore considered several representatives of Pasteurellaceae, Vibrionaceae and Yersinia spp. As a result, we identified new regulon members functioning in respiration, central metabolism (glycolysis, gluconeogenesis, pentose phosphate pathway, citrate cycle, metabolism of pyruvate and lactate), metabolism of carbohydrates and fatty acids, transcriptional regulation and transport, in particular the ATP synthase operon atpIBEFHAGCD, the Na+-exporting NADH dehydrogenase operon nqrABCDEF, and the D-amino acid dehydrogenase operon dadAX. Using an extension of the comparative technique, we demonstrated taxon-specific changes in regulatory interactions and predicted taxon-specific regulatory cascades. CONCLUSION A comparative genomic technique was applied to the analysis of global regulation of respiration in ten gamma-proteobacterial genomes. Three structurally different but functionally related regulatory systems were described. A correlation between the regulon size and the position of a transcription factor in regulatory cascades was observed: regulators with larger regulons tend to occupy top positions in the cascades. On the other hand, there is no obvious link to differences in the species' lifestyles and metabolic capabilities.
|
122
|
Johnston AWB, Todd JD, Curson AR, Lei S, Nikolaidou-Katsaridou N, Gelfand MS, Rodionov DA. Living without Fur: the subtlety and complexity of iron-responsive gene regulation in the symbiotic bacterium Rhizobium and other α-proteobacteria. Biometals 2007; 20:501-11. [PMID: 17310401 DOI: 10.1007/s10534-007-9085-8] [Citation(s) in RCA: 66] [Impact Index Per Article: 3.9] [Received: 01/05/2007] [Accepted: 01/16/2007] [Indexed: 10/23/2022]
Abstract
The alpha-proteobacteria include several important genera, including the symbiotic N2-fixing "rhizobia", the plant pathogen Agrobacterium, the mammalian pathogens Brucella and Bartonella, as well as many others that are of environmental or other interest, including Rhodobacter, Caulobacter and the hugely abundant marine genus Pelagibacter. Only a few species, mainly different members of the rhizobia, have been analyzed directly for their ability to use and to respond to iron. These studies, however, have shown that at least some of the "alphas" differ fundamentally in the ways in which they regulate their genes in response to Fe availability. In this paper, we build on our own work on Rhizobium leguminosarum (the symbiont of peas, beans and clovers) and on Bradyrhizobium japonicum, which nodulates soybeans and which has been studied in Buffalo and Zürich. In the former species, the predominant Fe-responsive regulator is not Fur, but RirA, a member of the Rrf2 protein family that likely has an FeS cluster cofactor. In addition, there are several R. leguminosarum genes that are expressed at higher levels in Fe-replete conditions and at least some of these are regulated by Irr, a member of the Fur superfamily that has the unusual property of being degraded in the presence of heme. In silico analyses of the genome sequences of other bacteria indicate that Irr occurs in all members of the Rhizobiales and the Rhodobacterales and that RirA is found in all but one branch of these two lineages, the exception being the clade that includes B. japonicum. Nearly all the Rhizobiales and the Rhodobacterales contain a gene whose product resembles bona fide Fur. However, direct genetic studies show that in most of the Rhizobiales and in the Rhodobacterales it is a "Mur" (a manganese-responsive repressor of a small number of genes involved in Mn uptake) or, in Bradyrhizobium, it recognizes the operator sequences of only a few genes that are involved in Fe metabolism.
We propose that the Rhizobiales and the Rhodobacterales have relegated Fur to a far more minor role than in (say) E. coli and that they employ Irr and, in the Rhizobiales, RirA as their global Fe-responsive transcriptional regulators. In contrast to the direct interaction between Fe2+ and conventional Fur, we suggest that these bacteria sense Fe more indirectly as functions of the intracellular concentrations of FeS clusters and of heme. Thus, their "iron-omes" may be more accurately linked to the real-time needs for the metal and not just to its absolute concentration in the environment.
|
123
|
Sevostyanova A, Djordjevic M, Kuznedelov K, Naryshkina T, Gelfand MS, Severinov K, Minakhin L. Temporal regulation of viral transcription during development of Thermus thermophilus bacteriophage phiYS40. J Mol Biol 2007; 366:420-35. [PMID: 17187825 PMCID: PMC1885378 DOI: 10.1016/j.jmb.2006.11.050] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 09/07/2006] [Revised: 11/03/2006] [Accepted: 11/14/2006] [Indexed: 11/28/2022]
Abstract
Regulation of gene expression of the lytic bacteriophage phiYS40, which infects the thermophilic bacterium Thermus thermophilus, was investigated, and three temporal classes of phage genes (early, middle, and late) were revealed. phiYS40 does not encode an RNA polymerase (RNAP) and must rely on host RNAP for transcription of its genes. Bioinformatic analysis using a model of Thermus promoters predicted 43 putative sigma(A)-dependent -10/-35 class phage promoters. A randomly chosen subset of those promoters was shown to be functional in vivo and in vitro and to belong to the early temporal class. Macroarray analysis, primer extension, and bioinformatic predictions identified 36 viral middle and late promoters. These promoters have a single common consensus element, which resembles the -10 promoter consensus element of the host sigma(A) RNAP holoenzyme. The mechanism responsible for the temporal control of the three classes of promoters remains unknown, since host sigma(A) RNAP holoenzyme purified from either infected or uninfected cells efficiently transcribed all phiYS40 promoters in vitro. Interestingly, our data showed that during infection, there is a significant increase and decrease of transcript amounts of host translation initiation factors IF2 and IF3, respectively. This finding, together with the fact that most middle and late phiYS40 transcripts were found to be leaderless, suggests that the shift to late viral gene expression may also occur at the level of mRNA translation.
|
124
|
Mamirova L, Popadin K, Gelfand MS. Purifying selection in mitochondria, free-living and obligate intracellular proteobacteria. BMC Evol Biol 2007; 7:17. [PMID: 17295908 PMCID: PMC1803777 DOI: 10.1186/1471-2148-7-17] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.4] [Received: 09/14/2006] [Accepted: 02/12/2007] [Indexed: 12/02/2022] Open
Abstract
BACKGROUND The effectiveness of elimination of slightly deleterious mutations depends mainly on drift and recombination frequency. Here we analyze the influence of these two factors on the strength of the purifying selection in mitochondrial and proteobacterial orthologous genes, taking into account the differences in the organism lifestyles. RESULTS (I) We found that the probability of fixation of nonsynonymous substitutions (Kn/Ks) in mitochondria is significantly lower compared to obligate intracellular bacteria and even marginally significantly lower compared to free-living bacteria. The comparison of bacteria of different lifestyles demonstrates more effective elimination of slightly deleterious mutations in (II) free-living bacteria as compared to obligate intracellular species and in (III) obligate intracellular parasites as compared to obligate intracellular symbionts. (IV) Finally, we observed that the level of the purifying selection (i.e. 1-Kn/Ks) increases with the density of mobile elements in bacterial genomes. CONCLUSION This study shows that the comparison of patterns of molecular evolution of orthologous genes between ecologically different groups of organisms allows one to elucidate the genetic consequences of their various lifestyles. Comparing the strength of the purifying selection among proteobacteria with different lifestyles, we obtained results that are in concordance with theoretical expectations: (II) low effective population size and level of recombination in obligate intracellular proteobacteria lead to less effective elimination of mutations compared to free-living relatives; (III) rare horizontal transmissions, i.e. effectively zero recombination level, in symbiotic obligate intracellular bacteria lead to less effective purifying selection than in parasitic obligate intracellular bacteria; (IV) the increased frequency of recombination in bacterial genomes with high mobile element density leads to a more effective elimination of slightly deleterious mutations. At the same time, (I) more effective purifying selection in relatively small populations of nonrecombining mitochondria as compared to large populations of recombining proteobacteria was unexpected. We hypothesize that additional features such as the high number of protein-protein interactions or female germ-cell atresia increase evolutionary constraints and maintain the effective purifying selection in mitochondria, but more work is needed to definitely establish these additional features.
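The quantity used in this abstract as the level of purifying selection, 1-Kn/Ks, can be written as a short sketch. The function name and the input values below are hypothetical and chosen only for illustration; they are not data from the study, and estimating Kn and Ks from alignments is outside the scope of this example.

```python
def purifying_selection_level(kn, ks):
    """Return 1 - Kn/Ks, the level of purifying selection named in the abstract.

    kn: rate of nonsynonymous substitutions (hypothetical input).
    ks: rate of synonymous substitutions (hypothetical input, must be > 0).
    """
    if ks <= 0:
        raise ValueError("Ks must be positive to form the ratio Kn/Ks")
    return 1.0 - kn / ks


if __name__ == "__main__":
    # Illustrative numbers only: a lower Kn/Ks ratio gives a higher level.
    for label, kn, ks in (("example A", 0.05, 0.5), ("example B", 0.20, 0.5)):
        print(label, purifying_selection_level(kn, ks))
```

A value close to 1 corresponds to strong elimination of nonsynonymous changes relative to synonymous ones, and a value close to 0 corresponds to nearly neutral evolution.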
|
125
|
Kovaleva GY, Bazykin GA, Brudno M, Gelfand MS. Comparative genomics of transcriptional regulation in yeasts and its application to identification of a candidate alpha-isopropylmalate transporter. J Bioinform Comput Biol 2007; 4:981-98. [PMID: 17099937 DOI: 10.1142/s0219720006002284] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 02/07/2006] [Revised: 05/17/2006] [Accepted: 06/21/2006] [Indexed: 01/14/2023]
Abstract
Conservation rates in non-protein-coding regions of five yeast genomes of the genus Saccharomyces were analyzed using multiple whole-genome alignments. This analysis confirmed the previously reported decrease in conservation rates immediately upstream of the translation start point and downstream of the stop codon. Further, there was a sharp conservation peak in the upstream regions, likely related to the core promoter (-35 bp to +35 bp around the TSS), and a conservation peak downstream of the stop codon whose function is not yet clear. Regulation of leucine and methionine biosynthesis controlled by the global regulator Gcn4p and pathway-specific regulators was analyzed in detail. A candidate alpha-isopropylmalate carrier, YOR271cp, was identified based on conservation of Leu3p binding sites, analysis of ChIP-chip data, protein localization and sequence similarity.
|