1
Miralles-Robledillo JM, Martínez-Espinosa RM, Pire C. Transcriptomic profiling of haloarchaeal denitrification through RNA-Seq analysis. Appl Environ Microbiol 2024:e0057124. [PMID: 38814058 DOI: 10.1128/aem.00571-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 04/26/2024] [Indexed: 05/31/2024] Open
Abstract
Denitrification, a crucial biochemical pathway prevalent among haloarchaea in hypersaline ecosystems, has garnered considerable attention in recent years due to its ecological implications. Nevertheless, the underlying molecular mechanisms and genetic regulation governing this respiration/detoxification process in haloarchaea remain largely unexplored. In this study, RNA-sequencing was used to compare the transcriptomes of the haloarchaeon Haloferax mediterranei under oxic and denitrifying conditions, shedding light on the intricate metabolic alterations occurring within the cell, such as the accurate control of the metal homeostasis. Furthermore, the investigation identifies several genes encoding transcriptional regulators and potential accessory proteins with putative roles in denitrification. Among these are bacterioopsin-like transcriptional activators, proteins harboring a domain of unknown function (DUF2249), and cyanoglobin. In addition, the study delves into the genetic regulation of denitrification, finding a regulatory motif within promoter regions that activates numerous denitrification-related genes. This research serves as a starting point for future molecular biology studies in haloarchaea, offering a promising avenue to unravel the intricate mechanisms governing haloarchaeal denitrification, a pathway of paramount ecological importance.
IMPORTANCE: Denitrification, a fundamental process within the nitrogen cycle, has been subject to extensive investigation due to its close association with anthropogenic activities, and its contribution to the global warming issue, mainly through the release of N2O emissions. Although our comprehension of denitrification and its implications is generally well established, most studies have been conducted in non-extreme environments with mesophilic microorganisms. Consequently, there is a significant knowledge gap concerning extremophilic denitrifiers, particularly those inhabiting hypersaline environments. The significance of this research was to delve into the process of haloarchaeal denitrification, utilizing the complete denitrifier haloarchaeon Haloferax mediterranei as a model organism. This research led to the analysis of the metabolic state of this microorganism under denitrifying conditions and the identification of regulatory signals and genes encoding proteins potentially involved in this pathway, serving as a valuable resource for future molecular studies.
Affiliation(s)
- Jose María Miralles-Robledillo
  - Biochemistry, Molecular Biology, Edaphology and Agricultural Chemistry Department, Faculty of Sciences, Universitat d'Alacant, Alicante, Spain
- Rosa María Martínez-Espinosa
  - Biochemistry, Molecular Biology, Edaphology and Agricultural Chemistry Department, Faculty of Sciences, Universitat d'Alacant, Alicante, Spain
  - Multidisciplinary Institute for Environmental Studies "Ramón Margalef", University of Alicante, Alicante, Spain
- Carmen Pire
  - Biochemistry, Molecular Biology, Edaphology and Agricultural Chemistry Department, Faculty of Sciences, Universitat d'Alacant, Alicante, Spain
  - Multidisciplinary Institute for Environmental Studies "Ramón Margalef", University of Alicante, Alicante, Spain
2
Shaskolskiy B, Kravtsov D, Kandinov I, Dementieva E, Gryadunov D. Genomic Diversity and Chromosomal Rearrangements in Neisseria gonorrhoeae and Neisseria meningitidis. Int J Mol Sci 2022; 23:ijms232415644. [PMID: 36555284 PMCID: PMC9778887 DOI: 10.3390/ijms232415644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/20/2022] [Revised: 11/18/2022] [Accepted: 12/07/2022] [Indexed: 12/14/2022] Open
Abstract
Chromosomal rearrangements in N. gonorrhoeae and N. meningitidis were studied with the determination of mobile elements and their role in rearrangements. The results of whole-genome sequencing and de novo genome assembly for 50 N. gonorrhoeae isolates collected in Russia were compared with 96 genomes of N. gonorrhoeae and 138 genomes of N. meningitidis from the databases. Rearrangement events with the determination of the coordinates of syntenic blocks were analyzed using the SibeliaZ software v.1.2.5; the minimum number of events that allow one genome to pass into another was calculated under the DCJ-indel model using the UniMoG program v.1.0. Population-level analysis revealed a stronger correlation between changes in the gene order and phylogenetic proximity for N. meningitidis in contrast to N. gonorrhoeae. Mobile elements were identified, including Correa elements; Spencer-Smith elements (in N. gonorrhoeae); Neisserial intergenic mosaic elements; IS elements of IS5, IS30, IS110, IS1595 groups; Nf1-Nf3 prophages; NgoΦ1-NgoΦ9 prophages; and Mu-like prophages Pnm1, Pnm2, MuMenB (in N. meningitidis). More than 44% of the observed rearrangements most likely occurred with the participation of mobile elements, including prophages. No differences were found between the Russian and global N. gonorrhoeae population both in terms of rearrangement events and in the number of transposable elements in genomes.
3
Zhang Z, Quan S, Niu J, Guo C, Kang C, Liu J, Yuan X. Comprehensive Identification and Analyses of the GRF Gene Family in the Whole-Genome of Four Juglandaceae Species. Int J Mol Sci 2022; 23:ijms232012663. [PMID: 36293519 PMCID: PMC9604165 DOI: 10.3390/ijms232012663] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/22/2022] [Revised: 10/15/2022] [Accepted: 10/19/2022] [Indexed: 11/05/2022] Open
Abstract
The GRF gene family plays an important role in plant growth and development as regulators involved in plant hormone signaling and metabolism. However, the Juglandaceae GRF gene family remains to be studied. Here, we identified 15, 15, 19, and 20 GRF genes in J. regia, C. illinoinensis, J. sigillata, and J. mandshurica, respectively. The phylogeny shows that the Juglandaceae family GRF is divided into two subfamilies, the ε-group and the non-ε-group, and that selection pressure analysis did not detect amino acid loci subject to positive selection pressure. In addition, we found that the duplications of the Juglandaceae family GRF genes were all segmental duplication events, and a total of 79 orthologous gene pairs and one paralogous homologous gene pair were identified in the four Juglandaceae species. The Ka/Ks ratios between these homologous gene pairs were further analyzed, and the Ka/Ks values were all less than 1, indicating that purifying selection plays an important role in the evolution of the Juglandaceae family GRF genes. The codon bias of genes in the GRF family of Juglandaceae species is weak, and is affected by both natural selection pressure and base mutation, and translation selection plays a dominant role in the mutation pressure in codon usage. Finally, expression analysis showed that GRF genes play important roles in pecan embryo development and walnut male and female flower bud development, but with different expression patterns. In conclusion, this study will serve as a rich genetic resource for exploring the molecular mechanisms of flower bud differentiation and embryo development in Juglandaceae. In addition, this is the first study to report the GRF gene family in the Juglandaceae family; therefore, our study will provide guidance for future comparative and functional genomic studies of the GRF gene family in the Juglandaceae species.
Affiliation(s)
- Zhongrong Zhang
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
- Shaowen Quan
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
- Jianxin Niu
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
  - Correspondence:
- Caihua Guo
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
- Chao Kang
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
- Jinming Liu
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
- Xing Yuan
  - Department of Horticulture, College of Agriculture, Shihezi University, Shihezi 832003, China
  - Xinjiang Production and Construction Corps Key Laboratory of Special Fruits and Vegetables Cultivation Physiology and Germplasm Resources Utilization, Shihezi 832003, China
4
Escorcia-Rodríguez JM, Esposito M, Freyre-González JA, Moreno-Hagelsieb G. Non-synonymous to synonymous substitutions suggest that orthologs tend to keep their functions, while paralogs are a source of functional novelty. PeerJ 2022; 10:e13843. [PMID: 36065404 PMCID: PMC9440661 DOI: 10.7717/peerj.13843] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/25/2020] [Accepted: 07/14/2022] [Indexed: 01/18/2023] Open
Abstract
Orthologs separate after lineages split from each other and paralogs after gene duplications. Thus, orthologs are expected to remain more functionally coherent across lineages, while paralogs have been proposed as a source of new functions. Because protein functional divergence follows from non-synonymous substitutions, we performed an analysis based on the ratio of non-synonymous to synonymous substitutions (dN/dS), as proxy for functional divergence. We used five working definitions of orthology, including reciprocal best hits (RBH), among other definitions based on network analyses and clustering. The results showed that orthologs, by all definitions tested, had values of dN/dS noticeably lower than those of paralogs, suggesting that orthologs generally tend to be more functionally stable than paralogs. The differences in dN/dS ratios remained suggesting the functional stability of orthologs after eliminating gene comparisons with potential problems, such as genes with high codon usage biases, low coverage of either of the aligned sequences, or sequences with very high similarities. Separation by percent identity of the encoded proteins showed that the differences between the dN/dS ratios of orthologs and paralogs were more evident at high sequence identity, less so as identity dropped. The last results suggest that the differences between dN/dS ratios were partially related to differences in protein identity. However, they also suggested that paralogs undergo functional divergence relatively early after duplication. Our analyses indicate that choosing orthologs as probably functionally coherent remains the right approach in comparative genomics.
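Reciprocal best hits (RBH), one of the working definitions of orthology named in this abstract, can be illustrated with a minimal sketch. The gene names and similarity scores below are hypothetical; real pipelines derive the scores from sequence-alignment output, and this is not the authors' implementation.

```python
def best_hits(scores):
    """Map each query gene to its highest-scoring subject gene.

    `scores` maps (query, subject) pairs to a similarity score.
    On ties, the first pair encountered wins (an arbitrary choice for this sketch).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (gene_a, gene_b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)


# Hypothetical similarity scores between genomes A and B (both search directions).
a_vs_b = {("a1", "b1"): 900, ("a1", "b2"): 300, ("a2", "b2"): 500, ("a3", "b2"): 450}
b_vs_a = {("b1", "a1"): 880, ("b2", "a2"): 510, ("b2", "a3"): 440}

rbh_pairs = reciprocal_best_hits(a_vs_b, b_vs_a)
print(rbh_pairs)
```

Here `a3` also hits `b2` best, but `b2` prefers `a2`, so only the mutually best pairs are kept as candidate orthologs.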
Affiliation(s)
- Juan M. Escorcia-Rodríguez
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
- Mario Esposito
  - Department of Biology, Wilfrid Laurier University, Waterloo, Canada
- Julio A. Freyre-González
  - Regulatory Systems Biology Research Group, Program of Systems Biology, Center for Genomic Sciences, Universidad Nacional Autónoma de México, Cuernavaca, Morelos, México
5
Badonyi M, Marsh JA. Large protein complex interfaces have evolved to promote cotranslational assembly. eLife 2022; 11:79602. [PMID: 35899946 PMCID: PMC9365393 DOI: 10.7554/elife.79602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2022] [Accepted: 07/27/2022] [Indexed: 11/13/2022] Open
Abstract
Assembly pathways of protein complexes should be precise and efficient to minimise misfolding and unwanted interactions with other proteins in the cell. One way to achieve this efficiency is by seeding assembly pathways during translation via the cotranslational assembly of subunits. While recent evidence suggests that such cotranslational assembly is widespread, little is known about the properties of protein complexes associated with the phenomenon. Here, using a combination of proteome-specific protein complex structures and publicly available ribosome profiling data, we show that cotranslational assembly is particularly common between subunits that form large intermolecular interfaces. To test whether large interfaces have evolved to promote cotranslational assembly, as opposed to cotranslational assembly being a non-adaptive consequence of large interfaces, we compared the sizes of first and last translated interfaces of heteromeric subunits in bacterial, yeast, and human complexes. When considering all together, we observe the N-terminal interface to be larger than the C-terminal interface 54% of the time, increasing to 64% when we exclude subunits with only small interfaces, which are unlikely to cotranslationally assemble. This strongly suggests that large interfaces have evolved as a means to maximise the chance of successful cotranslational subunit binding.
Affiliation(s)
- Mihaly Badonyi
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
- Joseph A Marsh
  - MRC Human Genetics Unit, Institute of Genetics and Cancer, University of Edinburgh, Edinburgh, United Kingdom
6
L.B. Almeida B, M. Bahrudeen MN, Chauhan V, Dash S, Kandavalli V, Häkkinen A, Lloyd-Price J, S.D. Cristina P, Baptista ISC, Gupta A, Kesseli J, Dufour E, Smolander OP, Nykter M, Auvinen P, Jacobs HT, M.D. Oliveira S, S. Ribeiro A. The transcription factor network of E. coli steers global responses to shifts in RNAP concentration. Nucleic Acids Res 2022; 50:6801-6819. [PMID: 35748858 PMCID: PMC9262627 DOI: 10.1093/nar/gkac540] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/07/2022] [Revised: 06/02/2022] [Accepted: 06/14/2022] [Indexed: 12/24/2022] Open
Abstract
The robustness and sensitivity of gene networks to environmental changes is critical for cell survival. How gene networks produce specific, chronologically ordered responses to genome-wide perturbations, while robustly maintaining homeostasis, remains an open question. We analysed if short- and mid-term genome-wide responses to shifts in RNA polymerase (RNAP) concentration are influenced by the known topology and logic of the transcription factor network (TFN) of Escherichia coli. We found that, at the gene cohort level, the magnitude of the single-gene, mid-term transcriptional responses to changes in RNAP concentration can be explained by the absolute difference between the gene's numbers of activating and repressing input transcription factors (TFs). Interestingly, this difference is strongly positively correlated with the number of input TFs of the gene. Meanwhile, short-term responses showed only weak influence from the TFN. Our results suggest that the global topological traits of the TFN of E. coli shape which gene cohorts respond to genome-wide stresses.
Affiliation(s)
- Bilena L.B. Almeida
  - Correspondence may also be addressed to Bilena L.B. Almeida. Tel: +358 2945211;
- Vinodh Kandavalli
  - Department of Cell and Molecular Biology, Uppsala University, Uppsala, Sweden
- Antti Häkkinen
  - Research Program in Systems Oncology, Research Programs Unit, Faculty of Medicine, University of Helsinki, FI-00014 Helsinki, Finland
- Palma S.D. Cristina
  - Laboratory of Biosystem Dynamics, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Ines S C Baptista
  - Laboratory of Biosystem Dynamics, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Abhishekh Gupta
  - Center for Quantitative Medicine and Department of Cell Biology, University of Connecticut School of Medicine, 263 Farmington Av., Farmington, CT 06030-6033, USA
- Juha Kesseli
  - Prostate Cancer Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Tays Cancer Center, Tampere University Hospital, Tampere, Finland
- Eric Dufour
  - Mitochondrial bioenergetics and metabolism, BioMediTech, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland
- Olli-Pekka Smolander
  - Department of Chemistry and Biotechnology, Tallinn University of Technology, Tallinn, Estonia
  - Institute of Biotechnology, University of Helsinki, Viikinkaari 5D, 00790 Helsinki, Finland
- Matti Nykter
  - Prostate Cancer Research Center, Faculty of Medicine and Health Technology, Tampere University, Tampere, Finland; Tays Cancer Center, Tampere University Hospital, Tampere, Finland
- Petri Auvinen
  - Institute of Biotechnology, University of Helsinki, Viikinkaari 5D, 00790 Helsinki, Finland
- Howard T Jacobs
  - Faculty of Medicine and Health Technology, FI-33014 Tampere University, Finland; Department of Environment and Genetics, La Trobe University, Melbourne, Victoria 3086, Australia
- Samuel M.D. Oliveira
  - Department of Electrical and Computer Engineering, Boston University, Boston, MA, USA
7
Elhabashy H, Merino F, Alva V, Kohlbacher O, Lupas AN. Exploring protein-protein interactions at the proteome level. Structure 2022; 30:462-475. [DOI: 10.1016/j.str.2022.02.004] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 08/09/2021] [Revised: 10/26/2021] [Accepted: 02/02/2022] [Indexed: 02/08/2023]
8
Taboada-Castro H, Castro-Mondragón JA, Aguilar-Vera A, Hernández-Álvarez AJ, van Helden J, Encarnación-Guevara S. RhizoBindingSites, a Database of DNA-Binding Motifs in Nitrogen-Fixing Bacteria Inferred Using a Footprint Discovery Approach. Front Microbiol 2020; 11:567471. [PMID: 33250866 PMCID: PMC7674921 DOI: 10.3389/fmicb.2020.567471] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/29/2020] [Accepted: 10/13/2020] [Indexed: 11/30/2022] Open
Abstract
Basic knowledge of transcriptional regulation is needed to understand the mechanisms governing biological processes, e.g., nitrogen fixation by Rhizobiales bacteria in symbiosis with leguminous plants. The RhizoBindingSites database is a computer-assisted framework providing motif-gene-associated conserved sequences potentially implicated in transcriptional regulation in nine symbiotic species. A dyad analysis algorithm was used to deduce motifs in the upstream regulatory region of orthologous genes, and only motifs also located in the gene seed promoter with a p-value of 1e-4 were accepted. A genomic scan analysis of the upstream sequences with these motifs was performed. These predicted binding sites were categorized according to low, medium and high homology between the matrix and the upstream regulatory sequence. On average, 62.7% of the genes had a motif, accounting for 80.44% of the genes per genome, with 19613 matrices (a matrix is a representation of a motif). The RhizoBindingSites database provides motif and gene information, motif conservation in the order Rhizobiales, matrices, motif logos, regulatory networks constructed from theoretical or experimental data, a criterion for selecting motifs and a guide for users. The RhizoBindingSites database is freely available online at rhizobindingsites.ccg.unam.mx.
Affiliation(s)
- Alejandro Aguilar-Vera
  - Center for Genomic Sciences, National Autonomous University of Mexico, Cuernavaca, Mexico
- Jacques van Helden
  - CNRS, IFB-core, UMS 3601, Institut Français de Bioinformatique, Évry, France; Laboratoire Theory and Approaches of Genome Complexity (TAGC), Inserm, Aix-Marseille Univ, Marseille, France
9
Esch R, Merkl R. Conserved genomic neighborhood is a strong but no perfect indicator for a direct interaction of microbial gene products. BMC Bioinformatics 2020; 21:5. [PMID: 31900122 PMCID: PMC6941341 DOI: 10.1186/s12859-019-3200-z] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 07/05/2019] [Accepted: 11/08/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND: The order of genes in bacterial genomes is not random; for example, the products of genes belonging to an operon work together in the same pathway. The cotranslational assembly of protein complexes is deemed to conserve genomic neighborhoods even stronger than a common function. This is why a conserved genomic neighborhood can be utilized to predict, whether gene products form protein complexes.
RESULTS: We were interested to assess the performance of a neighborhood-based classifier that analyzes a large number of genomes. Thus, we determined for the genes encoding the subunits of 494 experimentally verified hetero-dimers their local genomic context. In order to generate phylogenetically comprehensive genomic neighborhoods, we utilized the tools offered by the Enzyme Function Initiative. For each subunit, a sequence similarity network was generated and the corresponding genome neighborhood network was analyzed to deduce the most frequent gene product. This was predicted as interaction partner, if its abundance exceeded a threshold, which was the frequency giving rise to the maximal Matthews correlation coefficient. For the threshold of 16%, the true positive rate was 45%, the false positive rate 0.06%, and the precision 55%. For approximately 20% of the subunits, the interaction partner was not found in a neighborhood of ± 10 genes.
CONCLUSIONS: Our phylogenetically comprehensive analysis confirmed that complex formation is a strong evolutionary factor that conserves genome neighborhoods. On the other hand, for 55% of the cases analyzed here, classification failed. Either, the interaction partner was not present in a ± 10 gene window or was not the most frequent gene product.
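The threshold selection described in this abstract, choosing the frequency cut-off that maximises the Matthews correlation coefficient (MCC), can be sketched as follows. The frequencies, labels, and candidate thresholds are hypothetical illustration data, and the function names are assumptions rather than anything from the cited study.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0  # common convention when any margin of the matrix is empty
    return (tp * tn - fp * fn) / denom


def best_threshold(frequencies, labels, candidates):
    """Return (threshold, mcc) maximising MCC.

    A gene product is predicted as interaction partner when its
    neighbourhood frequency strictly exceeds the threshold.
    """
    best = (None, -1.0)
    for t in candidates:
        pairs = list(zip(frequencies, labels))
        tp = sum(1 for f, y in pairs if f > t and y)
        fp = sum(1 for f, y in pairs if f > t and not y)
        fn = sum(1 for f, y in pairs if f <= t and y)
        tn = sum(1 for f, y in pairs if f <= t and not y)
        score = mcc(tp, fp, tn, fn)
        if score > best[1]:
            best = (t, score)
    return best


# Hypothetical neighbourhood frequencies and true interaction labels.
freqs = [0.05, 0.10, 0.20, 0.40, 0.60, 0.08]
labels = [False, False, True, True, True, False]

threshold, score = best_threshold(freqs, labels, [0.05, 0.15, 0.50])
print(threshold, score)
```

MCC is a natural criterion here because true interaction partners are rare relative to all neighbouring gene products, and MCC stays informative under that kind of class imbalance.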
Affiliation(s)
- Robert Esch
  - Faculty of Mathematics and Computer Science, University of Hagen, D-58084, Hagen, Germany
- Rainer Merkl
  - Institute of Biophysics and Physical Biochemistry, University of Regensburg, D-93040, Regensburg, Germany
10
Sousa A, Gonçalves E, Mirauta B, Ochoa D, Stegle O, Beltrao P. Multi-omics Characterization of Interaction-mediated Control of Human Protein Abundance levels. Mol Cell Proteomics 2019; 18:S114-S125. [PMID: 31239291 PMCID: PMC6692786 DOI: 10.1074/mcp.ra118.001280] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.8] [Received: 12/13/2018] [Revised: 06/07/2019] [Indexed: 11/13/2022] Open
Abstract
Proteogenomic studies of cancer samples have shown that copy-number variation can be attenuated at the protein level for a large fraction of the proteome, likely due to the degradation of unassembled protein complex subunits. Such interaction-mediated control of protein abundance remains poorly characterized. To study this, we compiled genomic, (phospho)proteomic and structural data for hundreds of cancer samples and find that up to 42% of 8,124 analyzed proteins show signs of post-transcriptional control. We find evidence of interaction-dependent control of protein abundance, correlated with interface size, for 516 protein pairs, with some interactions further controlled by phosphorylation. Finally, these findings in cancer were reflected in variation in protein levels in normal tissues. Importantly, expression differences due to natural genetic variation were increasingly buffered from phenotype differences for highly attenuated proteins. Altogether, this study further highlights the importance of posttranscriptional control of protein abundance in cancer and healthy cells.
Affiliation(s)
- Abel Sousa
  - Instituto de Investigação e Inovação em Saúde da Universidade do Porto (i3s), Rua Alfredo Allen 208, 4200-135, Porto, Portugal; Institute of Molecular Pathology and Immunology of the University of Porto (IPATIMUP), Rua Júlio Amaral de Carvalho 45, 4200-135, Porto, Portugal; Graduate Program in Areas of Basic and Applied Biology (GABBA), Abel Salazar Biomedical Sciences Institute, University of Porto, Rua de Jorge Viterbo Ferreira 228, 4050-313, Porto, Portugal; European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
- Bogdan Mirauta
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
- David Ochoa
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
- Oliver Stegle
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK; European Molecular Biology Laboratory, Genome Biology Unit, 69117 Heidelberg, Germany; Division of Computational Genomics and Systems Genetics, German Cancer Research Center (DKFZ), 69120, Heidelberg, Germany
- Pedro Beltrao
  - European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, CB10 1SD, Cambridge, UK
11
Zaidi SSA, Zhang X. Computational operon prediction in whole-genomes and metagenomes. Brief Funct Genomics 2018; 16:181-193. [PMID: 27659221 DOI: 10.1093/bfgp/elw034] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Indexed: 12/13/2022] Open
Abstract
Microbial diversity in unique environmental settings enables abrupt responses catalysed by altering the gene regulation and formation of gene clusters called operons. Operons increase bacterial adaptability, which in turn increases their survival. This review article presents the emergence of computational operon prediction methods for whole microbial genomes and metagenomes, and discusses their strengths and limitations. Most of the whole-genome operon prediction methods struggle to generalize on unrelated genomes. The applicability of universal whole-genome operon prediction methods to metagenomic data is an interesting yet less investigated question. We have evaluated the potential of various operon prediction features for genomic and metagenomic data. Most operon prediction methods with high accuracy have been compiled into databases. Despite the high predictive performance, the data among many databases are not completely consistent for similar species. We performed a correlation analysis between the computationally predicted operon databases and experimentally validated data for Escherichia coli, Bacillus subtilis and Mycobacterium tuberculosis. Operon prediction for most of the less characterized microbes cannot be verified due to absence of experimentally validated operons. The generation of validated information for other microbes would test the authenticity of operon databases for other less annotated microbes as well. Advances in sequencing technologies and development of better analysis methods will help researchers to overcome the technological hurdles (such as long sequencing reads and improved contig size) and further improve operon predictions and better utilize operonic information.
12
Transcriptional Modulation of Transport- and Metabolism-Associated Gene Clusters Leading to Utilization of Benzoate in Preference to Glucose in Pseudomonas putida CSV86. Appl Environ Microbiol 2017; 83:AEM.01280-17. [PMID: 28733285 DOI: 10.1128/aem.01280-17] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 06/08/2017] [Accepted: 07/16/2017] [Indexed: 11/20/2022] Open
Abstract
The effective elimination of xenobiotic pollutants from the environment can be achieved by efficient degradation by microorganisms even in the presence of sugars or organic acids. Soil isolate Pseudomonas putida CSV86 displays a unique ability to utilize aromatic compounds prior to glucose. The draft genome and transcription analyses revealed that glucose uptake and benzoate transport and metabolism genes are clustered at the glc and ben loci, respectively, as two distinct operons. When grown on glucose plus benzoate, CSV86 displayed significantly higher expression of the ben locus in the first log phase and of the glc locus in the second log phase. Kinetics of substrate uptake and metabolism matched the transcription profiles. The inability of succinate to suppress benzoate transport and metabolism resulted in coutilization of succinate and benzoate. When challenged with succinate or benzoate, glucose-grown cells showed rapid reduction in glc locus transcription, glucose transport, and metabolic activity, with succinate being more effective at the functional level. Benzoate and succinate failed to interact with or inhibit the activities of glucose transport components or metabolic enzymes. The data suggest that succinate and benzoate suppress glucose transport and metabolism at the transcription level, enabling P. putida CSV86 to preferentially metabolize benzoate. This strain thus has the potential to be an ideal host to engineer diverse metabolic pathways for efficient bioremediation.
IMPORTANCE: Pseudomonas strains play an important role in carbon cycling in the environment and display a hierarchy in carbon utilization: organic acids first, followed by glucose, and aromatic substrates last. This limits their exploitation for bioremediation. This study demonstrates the substrate-dependent modulation of ben and glc operons in Pseudomonas putida CSV86, wherein benzoate suppresses glucose transport and metabolism at the transcription level, leading to preferential utilization of benzoate over glucose. Interestingly, succinate and benzoate are cometabolized. These properties are unique to this strain compared to other pseudomonads and open up avenues to unravel novel regulatory processes. Strain CSV86 can serve as an ideal host to engineer and facilitate efficient removal of recalcitrant pollutants even in the presence of simpler carbon sources.
|
13
|
Affiliation(s)
- Søren A Ladefoged
- Department of Medical Microbiology and Immunology, University of Aarhus, Denmark
- Department of Clinical Biochemistry, University Hospital of Aarhus, Denmark
|
14
|
Regulation, evolution and consequences of cotranslational protein complex assembly. Curr Opin Struct Biol 2016; 42:90-97. [PMID: 27969102 DOI: 10.1016/j.sbi.2016.11.023] [Citation(s) in RCA: 49] [Impact Index Per Article: 6.1] [Received: 10/24/2016] [Accepted: 11/28/2016] [Indexed: 01/05/2023]
Abstract
Most proteins assemble into complexes, which are involved in almost all cellular processes. Thus it is crucial for cell viability that mechanisms for correct assembly exist. The timing of assembly plays a key role in determining the fate of the protein: if the protein is allowed to diffuse into the crowded cellular milieu, it runs the risk of forming non-specific interactions, potentially leading to aggregation or other deleterious outcomes. It is therefore expected that strong regulatory mechanisms should exist to ensure efficient assembly. In this review we discuss the cotranslational assembly of protein complexes and discuss how it occurs, ways in which it is regulated, potential disadvantages of cotranslational interactions between proteins and the implications for the inheritance of dominant-negative genetic disorders.
|
15
|
Inferring Functional Relationships from Conservation of Gene Order. Methods Mol Biol 2016. [PMID: 27896735 DOI: 10.1007/978-1-4939-6613-4_3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0]
Abstract
Predicting functional associations using the Gene Neighbor Method depends on the simple idea that if genes are conserved next to each other in evolutionarily distant prokaryotes they might belong to a polycistronic transcription unit. The procedure presented in this chapter starts with the organization of the genes within genomes into pairs of adjacent genes. Then, the pairs of adjacent genes in a genome of interest are mapped to their corresponding orthologs in other, informative, genomes. The final step is to verify if the mapped orthologs are also pairs of adjacent genes in the informative genomes.
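The three steps described in this abstract (pair adjacent genes, map each pair to its orthologs, check adjacency in the informative genomes) can be sketched roughly as follows. This is a minimal illustration rather than the chapter's actual procedure: the function names, the input layout (ordered gene lists plus a precomputed ortholog table), and the decision to ignore strand and chromosome circularity are all assumptions made for the example.

```python
from typing import Dict, FrozenSet, List, Set, Tuple


def adjacent_pairs(gene_order: List[str]) -> Set[FrozenSet[str]]:
    """Step 1: unordered pairs of genes that sit next to each other.

    Simplification (assumption): the genome is treated as linear and
    strand is ignored.
    """
    return {frozenset((a, b)) for a, b in zip(gene_order, gene_order[1:])}


def conserved_neighbor_pairs(
    query_order: List[str],
    informative_orders: Dict[str, List[str]],
    orthologs: Dict[str, Dict[str, str]],
) -> Dict[Tuple[str, str], List[str]]:
    """Steps 2 and 3: map each adjacent query pair to its orthologs and
    record the informative genomes in which the orthologs are also adjacent.

    orthologs[genome][query_gene] gives the ortholog of query_gene in that
    genome (hypothetical, precomputed input).
    """
    informative_pairs = {
        name: adjacent_pairs(order) for name, order in informative_orders.items()
    }
    support: Dict[Tuple[str, str], List[str]] = {}
    for a, b in zip(query_order, query_order[1:]):
        key = (a, b) if a <= b else (b, a)
        hits: List[str] = []
        for name, pairs in informative_pairs.items():
            mapping = orthologs.get(name, {})
            # Both genes need an ortholog, and the orthologs must be neighbours.
            if a in mapping and b in mapping:
                if frozenset((mapping[a], mapping[b])) in pairs:
                    hits.append(name)
        support[key] = hits
    return support
```

In this sketch, a query pair supported by many evolutionarily distant informative genomes would be the stronger candidate for a functional association; how to weight that support is left open here.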
|
16
|
Kumar A, Manivelan V, Bansal M. Structural features of DNA are conserved in the promoter region of orthologous genes across different strains of Helicobacter pylori. FEMS Microbiol Lett 2016; 363:fnw207. [DOI: 10.1093/femsle/fnw207] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Accepted: 08/25/2016] [Indexed: 12/19/2022]
|
17
|
Wells JN, Bergendahl LT, Marsh JA. Operon Gene Order Is Optimized for Ordered Protein Complex Assembly. Cell Rep 2016; 14:679-685. [PMID: 26804901 PMCID: PMC4742563 DOI: 10.1016/j.celrep.2015.12.085] [Citation(s) in RCA: 67] [Impact Index Per Article: 8.4] [Received: 08/25/2015] [Revised: 11/07/2015] [Accepted: 12/17/2015] [Indexed: 01/07/2023]
Abstract
The assembly of heteromeric protein complexes is an inherently stochastic process in which multiple genes are expressed separately into proteins, which must then somehow find each other within the cell. Here, we considered one of the ways by which prokaryotic organisms have attempted to maximize the efficiency of protein complex assembly: the organization of subunit-encoding genes into operons. Using structure-based assembly predictions, we show that operon gene order has been optimized to match the order in which protein subunits assemble. Exceptions to this are almost entirely highly expressed proteins for which assembly is less stochastic and for which precisely ordered translation offers less benefit. Overall, these results show that ordered protein complex assembly pathways are of significant biological importance and represent a major evolutionary constraint on operon gene organization.
- Operon-encoded subunits tend to be encoded by neighboring genes and form large interfaces
- Operon gene order is often optimized for the order of protein complex assembly
- Exceptions are mostly highly expressed proteins for which assembly is less stochastic
Affiliation(s)
- Jonathan N Wells
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
- L Therese Bergendahl
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom
- Joseph A Marsh
- MRC Human Genetics Unit, Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom.
|
18
|
Touchon M, Rocha EPC. Coevolution of the Organization and Structure of Prokaryotic Genomes. Cold Spring Harb Perspect Biol 2016; 8:a018168. [PMID: 26729648 DOI: 10.1101/cshperspect.a018168] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Indexed: 11/25/2022]
Abstract
The cytoplasm of prokaryotes contains many molecular machines interacting directly with the chromosome. These vital interactions depend on the chromosome structure, as a molecule, and on the genome organization, as a unit of genetic information. Strong selection for the organization of the genetic elements implicated in these interactions drives replicon ploidy, gene distribution, operon conservation, and the formation of replication-associated traits. The genomes of prokaryotes are also very plastic with high rates of horizontal gene transfer and gene loss. The evolutionary conflicts between plasticity and organization lead to the formation of regions with high genetic diversity whose impact on chromosome structure is poorly understood. Prokaryotic genomes are remarkable documents of natural history because they carry the imprint of all of these selective and mutational forces. Their study allows a better understanding of molecular mechanisms, their impact on microbial evolution, and how they can be tinkered in synthetic biology.
Affiliation(s)
- Marie Touchon
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France CNRS, UMR3525, 75015 Paris, France
- Eduardo P C Rocha
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France CNRS, UMR3525, 75015 Paris, France
|
19
|
Predicting Functional Interactions Among Genes in Prokaryotes by Genomic Context. ADVANCES IN EXPERIMENTAL MEDICINE AND BIOLOGY 2015; 883:97-106. [PMID: 26621463 DOI: 10.1007/978-3-319-23603-2_5] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/19/2022]
Abstract
Genomic context methods for finding functions of unannotated genes were implemented very early after the publication of the first few prokaryotic genomes. The ideas behind these methods include gene fusions, conservation of gene adjacency, and the patterns of co-occurrence of genes across available genomes. A later addition was the prediction of features related to functional organization, such as operons, stretches of genes co-transcribed into a single messenger RNA. The ideas behind these methods tend to be easy to understand, while the strategies for transforming those basic ideas into predictions can vary in complexity, mostly because genes whose products are known to functionally interact vary in the way they relate to those basic ideas. We present here a view of genomic context methods for predicting functional interactions, with simple examples of their implementation as compared and evaluated using genes whose products are known to functionally interact.
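One of the ideas listed in this abstract, co-occurrence of genes across genomes, can be illustrated with a small sketch. The scoring below uses the Jaccard similarity of presence sets, which is an assumption chosen for simplicity and not necessarily the measure used in the chapter; the function name and input layout are likewise hypothetical.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def cooccurrence_scores(presence: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Score every pair of gene families by how similar their phyletic
    patterns are.

    presence[family] is the set of genomes in which that family occurs
    (hypothetical, precomputed input). The score is the Jaccard index:
    shared genomes divided by genomes containing either family.
    """
    scores: Dict[Tuple[str, str], float] = {}
    for fam_a, fam_b in combinations(sorted(presence), 2):
        union = presence[fam_a] | presence[fam_b]
        shared = presence[fam_a] & presence[fam_b]
        scores[(fam_a, fam_b)] = len(shared) / len(union) if union else 0.0
    return scores
```

Families that are gained and lost together score close to 1 and become candidates for a functional interaction. A family present in nearly every genome will score highly with many partners, so in practice such profiles carry little information.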
|
20
|
Cabezón E, Ripoll-Rozada J, Peña A, de la Cruz F, Arechaga I. Towards an integrated model of bacterial conjugation. FEMS Microbiol Rev 2014; 39:81-95. [PMID: 25154632 DOI: 10.1111/1574-6976.12085] [Citation(s) in RCA: 120] [Impact Index Per Article: 12.0] [Indexed: 11/29/2022]
Abstract
Bacterial conjugation is one of the main mechanisms for horizontal gene transfer. It constitutes a key element in the dissemination of antibiotic resistance and virulence genes to human pathogenic bacteria. DNA transfer is mediated by a membrane-associated macromolecular machinery called Type IV secretion system (T4SS). T4SSs are involved not only in bacterial conjugation but also in the transport of virulence factors by pathogenic bacteria. Thus, the search for specific inhibitors of different T4SS components opens a novel approach to restrict plasmid dissemination. This review highlights recent biochemical and structural findings that shed new light on the molecular mechanisms of DNA and protein transport by T4SS. Based on these data, a model for pilus biogenesis and substrate transfer in conjugative systems is proposed. This model provides a renewed view of the mechanism that might help to envisage new strategies to curb the threatening expansion of antibiotic resistance.
Affiliation(s)
- Elena Cabezón
- Departamento de Biología Molecular, Instituto de Biomedicina y Biotecnología de Cantabria, IBBTEC, (Universidad de Cantabria, CSIC) Santander, Spain
- Jorge Ripoll-Rozada
- Departamento de Biología Molecular, Instituto de Biomedicina y Biotecnología de Cantabria, IBBTEC, (Universidad de Cantabria, CSIC) Santander, Spain
- Alejandro Peña
- Departamento de Biología Molecular, Instituto de Biomedicina y Biotecnología de Cantabria, IBBTEC, (Universidad de Cantabria, CSIC) Santander, Spain
- Fernando de la Cruz
- Departamento de Biología Molecular, Instituto de Biomedicina y Biotecnología de Cantabria, IBBTEC, (Universidad de Cantabria, CSIC) Santander, Spain
- Ignacio Arechaga
- Departamento de Biología Molecular, Instituto de Biomedicina y Biotecnología de Cantabria, IBBTEC, (Universidad de Cantabria, CSIC) Santander, Spain
|
21
|
Klinman JP, Bonnot F. Intrigues and intricacies of the biosynthetic pathways for the enzymatic quinocofactors: PQQ, TTQ, CTQ, TPQ, and LTQ. Chem Rev 2014; 114:4343-65. [PMID: 24350630 PMCID: PMC3999297 DOI: 10.1021/cr400475g] [Citation(s) in RCA: 127] [Impact Index Per Article: 12.7] [Indexed: 12/19/2022]
Affiliation(s)
- Judith P. Klinman
- Department of Chemistry University of California, Berkeley, California 94720, U.S.A. Supported by the National Institutes of Health (GM025765) to J.P.K
- Department of Molecular and Cell Biology University of California, Berkeley, California 94720, U.S.A. Supported by the National Institutes of Health (GM025765) to J.P.K
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, California 94720, U.S.A. Supported by the National Institutes of Health (GM025765) to J.P.K
- Florence Bonnot
- Department of Chemistry University of California, Berkeley, California 94720, U.S.A. Supported by the National Institutes of Health (GM025765) to J.P.K
- California Institute for Quantitative Biosciences (QB3), University of California, Berkeley, California 94720, U.S.A. Supported by the National Institutes of Health (GM025765) to J.P.K
|
22
|
Rokicki J, Knox D, Dowell RD, Copley SD. CodaChrome: a tool for the visualization of proteome conservation across all fully sequenced bacterial genomes. BMC Genomics 2014; 15:65. [PMID: 24460813 PMCID: PMC3908345 DOI: 10.1186/1471-2164-15-65] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 09/02/2013] [Accepted: 01/11/2014] [Indexed: 02/08/2023]
Abstract
Background The relationships between bacterial genomes are complicated by rampant horizontal gene transfer, varied selection pressures, acquisition of new genes, loss of genes, and divergence of genes, even in closely related lineages. As more and more bacterial genomes are sequenced, organizing and interpreting the incredible amount of relational information that connects them becomes increasingly difficult. Results We have developed CodaChrome (http://www.sourceforge.com/p/codachrome), a one-versus-all proteome comparison tool that allows the user to visually investigate the relationship between a bacterial proteome of interest and the proteomes encoded by every other bacterial genome recorded in GenBank in a massive interactive heat map. This tool has allowed us to rapidly identify the most highly conserved proteins encoded in the bacterial pan-genome, fast-clock genes useful for subtyping of bacterial species, the evolutionary history of an indel in the Sphingobium lineage, and an example of horizontal gene transfer from a member of the genus Enterococcus to a recent ancestor of Helicobacter pylori. Conclusion CodaChrome is a user-friendly and powerful tool for simultaneously visualizing relationships between thousands of proteomes.
Affiliation(s)
- Robin D Dowell
- Department of Molecular, Cellular and Developmental Biology, University of Colorado at Boulder, Boulder CO, USA.
|
23
|
Shifman A, Ninyo N, Gophna U, Snir S. Phylo SI: a new genome-wide approach for prokaryotic phylogeny. Nucleic Acids Res 2013; 42:2391-404. [PMID: 24243847 PMCID: PMC3936750 DOI: 10.1093/nar/gkt1138] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 11/13/2022]
Abstract
The evolutionary history of all life forms is usually represented as a vertical tree-like process. In prokaryotes, however, the vertical signal is partly obscured by the massive influence of horizontal gene transfer (HGT). The HGT creates widespread discordance between evolutionary histories of different genes as genomes become mosaics of gene histories. Thus, the Tree of Life (TOL) has been questioned as an appropriate representation of the evolution of prokaryotes. Nevertheless a common hypothesis is that prokaryotic evolution is primarily tree-like, and a routine effort is made to place new isolates in their appropriate location in the TOL. Moreover, it appears desirable to exploit non–tree-like evolutionary processes for the task of microbial classification. In this work, we present a novel technique that builds on the straightforward observation that gene order conservation (‘synteny’) decreases in time as a result of gene mobility. This is particularly true in prokaryotes, mainly due to HGT. Using a ‘synteny index’ (SI) that measures the average synteny between a pair of genomes, we developed the phylogenetic reconstruction tool ‘Phylo SI’. Phylo SI offers several attractive properties such as easy bootstrapping, high sensitivity in cases where phylogenetic signal is weak and computational efficiency. Phylo SI was tested both on simulated data and on two bacterial data sets and compared with two well-established phylogenetic methods. Phylo SI is particularly efficient on short evolutionary distances where synteny footprints remain detectable, whereas the nucleotide substitution signal is too weak for reliable sequence-based phylogenetic reconstruction. The method is publicly available at http://research.haifa.ac.il/ssagi/software/PhyloSI.zip.
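The abstract describes the synteny index only as a measure of the average synteny between a pair of genomes. One plausible reading is sketched below: for each gene shared by two genomes, compare its neighbourhood of k genes on either side in both genomes, and average the fraction of neighbours in common. The window size `k`, the normalisation by `2 * k`, the circular treatment of gene order, and the requirement that gene identifiers be unique within a genome are all assumptions of this example and may differ from the published Phylo SI definition.

```python
from typing import List, Set


def neighborhood(order: List[str], idx: int, k: int) -> Set[str]:
    """Genes within k positions of order[idx], wrapping around the ends
    (assumes a circular replicon)."""
    n = len(order)
    genes = {order[(idx + d) % n] for d in range(-k, k + 1) if d != 0}
    genes.discard(order[idx])  # guard against wrap-around on tiny genomes
    return genes


def synteny_index(order_a: List[str], order_b: List[str], k: int = 2) -> float:
    """Average, over genes shared by both genomes, of the fraction of the
    2k neighbours that the two genomes have in common."""
    pos_a = {gene: i for i, gene in enumerate(order_a)}
    pos_b = {gene: i for i, gene in enumerate(order_b)}
    shared = pos_a.keys() & pos_b.keys()
    if not shared:
        return 0.0
    total = 0.0
    for gene in shared:
        common = neighborhood(order_a, pos_a[gene], k) & neighborhood(order_b, pos_b[gene], k)
        total += len(common) / (2 * k)
    return total / len(shared)
```

Under these assumptions, `1 - synteny_index(a, b)` could serve as a pairwise distance for a distance-based tree-building method. Reversing a circular gene order leaves every neighbourhood unchanged, so a whole-genome inversion does not lower the score in this sketch.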
Affiliation(s)
- Anton Shifman
- Department of Evolutionary & Environmental Biology, University of Haifa, Haifa 31905 Israel, Department of Molecular Microbiology and Biotechnology Tel Aviv University, Tel Aviv 69978, Israel and National Evolutionary Synthesis Center, 2024 W. Main Street A200, Durham, NC 27705, USA
|
24
|
Muley VY, Ranjan A. Evaluation of physical and functional protein-protein interaction prediction methods for detecting biological pathways. PLoS One 2013; 8:e54325. [PMID: 23349851 PMCID: PMC3547882 DOI: 10.1371/journal.pone.0054325] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.0] [Received: 03/07/2012] [Accepted: 12/11/2012] [Indexed: 11/18/2022]
Abstract
BACKGROUND Cellular activities are governed by the physical and the functional interactions among several proteins involved in various biological pathways. With the availability of sequenced genomes and high-throughput experimental data one can identify genome-wide protein-protein interactions using various computational techniques. Comparative assessments of these techniques in predicting protein interactions have been frequently reported in the literature but not their ability to elucidate a particular biological pathway. METHODS Towards the goal of understanding the prediction capabilities of interactions among the specific biological pathway proteins, we report the analyses of 14 biological pathways of Escherichia coli catalogued in KEGG database using five protein-protein functional linkage prediction methods. These methods are phylogenetic profiling, gene neighborhood, co-presence of orthologous genes in the same gene clusters, a mirrortree variant, and expression similarity. CONCLUSIONS Our results reveal that the prediction of metabolic pathway protein interactions continues to be a challenging task for all methods which possibly reflect flexible/independent evolutionary histories of these proteins. These methods have predicted functional associations of proteins involved in amino acids, nucleotide, glycans and vitamins & co-factors pathways slightly better than the random performance on carbohydrate, lipid and energy metabolism. We also make similar observations for interactions involved among the environmental information processing proteins. On the contrary, genetic information processing or specialized processes such as motility related protein-protein linkages that occur in the subset of organisms are predicted with comparable accuracy. Metabolic pathways are best predicted by using neighborhood of orthologous genes whereas phyletic pattern is good enough to reconstruct central dogma pathway protein interactions. 
We have also shown that the effective use of a particular prediction method depends on the pathway under investigation. In case one is not focused on specific pathway, gene expression similarity method is the best option.
Affiliation(s)
- Vijaykumar Yogesh Muley
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India
- Akash Ranjan
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, India
|
25
|
Zhang Y, Lin K. A phylogenomic analysis of Escherichia coli / Shigella group: implications of genomic features associated with pathogenicity and ecological adaptation. BMC Evol Biol 2012; 12:174. [PMID: 22958895 PMCID: PMC3444427 DOI: 10.1186/1471-2148-12-174] [Citation(s) in RCA: 31] [Impact Index Per Article: 2.6] [Received: 05/20/2012] [Accepted: 08/28/2012] [Indexed: 01/28/2023]
Abstract
Background The Escherichia coli species contains a variety of commensal and pathogenic strains, and its intraspecific diversity is extraordinarily high. With the availability of an increasing number of E. coli strain genomes, a more comprehensive concept of their evolutionary history and ecological adaptation can be developed using phylogenomic analyses. In this study, we constructed two types of whole-genome phylogenies based on 34 E. coli strains using collinear genomic segments. The first phylogeny was based on the concatenated collinear regions shared by all of the studied genomes, and the second phylogeny was based on the variable collinear regions that are absent from at least one genome. Intuitively, the first phylogeny is likely to reveal the lineal evolutionary history among these strains (i.e., an evolutionary phylogeny), whereas the latter phylogeny is likely to reflect the whole-genome similarities of extant strains (i.e., a similarity phylogeny). Results Within the evolutionary phylogeny, the strains were clustered in accordance with known phylogenetic groups and phenotypes. When comparing evolutionary and similarity phylogenies, a concept emerges that Shigella may have originated from at least three distinct ancestors and evolved into a single clade. By scrutinizing the properties that are shared amongst Shigella strains but missing in other E. coli genomes, we found that the common regions of the Shigella genomes were mainly influenced by mobile genetic elements, implying that they may have experienced convergent evolution via horizontal gene transfer. Based on an inspection of certain key branches of interest, we identified several collinear regions that may be associated with the pathogenicity of specific strains. Moreover, by examining the annotated genes within these regions, further detailed evidence associated with pathogenicity was revealed. 
Conclusions Collinear regions are reliable genomic features used for phylogenomic analysis among closely related genomes while linking the genomic diversity with phenotypic differences in a meaningful way. The pathogenicity of a strain may be associated with both the arrival of virulence factors and the modification of genomes via mutations. Such phylogenomic studies that compare collinear regions of whole genomes will help to better understand the evolution and adaptation of closely related microbes and E. coli in particular.
Affiliation(s)
- Yan Zhang
- College of Life Sciences, Beijing Normal University, No 19 Xinjiekouwai Street, Beijing 100875, China
|
26
|
Muley VY, Ranjan A. Effect of reference genome selection on the performance of computational methods for genome-wide protein-protein interaction prediction. PLoS One 2012; 7:e42057. [PMID: 22844541 PMCID: PMC3406042 DOI: 10.1371/journal.pone.0042057] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 12/03/2011] [Accepted: 07/02/2012] [Indexed: 12/20/2022]
Abstract
Background Recent progress in computational methods for predicting physical and functional protein-protein interactions has provided new insights into the complexity of biological processes. Most of these methods assume that functionally interacting proteins are likely to have a shared evolutionary history. This history can be traced out for the protein pairs of a query genome by correlating different evolutionary aspects of their homologs in multiple genomes known as the reference genomes. These methods include phylogenetic profiling, gene neighborhood and co-occurrence of the orthologous protein coding genes in the same cluster or operon. These are collectively known as genomic context methods. On the other hand a method called mirrortree is based on the similarity of phylogenetic trees between two interacting proteins. Comprehensive performance analyses of these methods have been frequently reported in literature. However, very few studies provide insight into the effect of reference genome selection on detection of meaningful protein interactions. Methods We analyzed the performance of four methods and their variants to understand the effect of reference genome selection on prediction efficacy. We used six sets of reference genomes, sampled in accordance with phylogenetic diversity and relationship between organisms from 565 bacteria. We used Escherichia coli as a model organism and the gold standard datasets of interacting proteins reported in DIP, EcoCyc and KEGG databases to compare the performance of the prediction methods. Conclusions Higher performance for predicting protein-protein interactions was achievable even with 100–150 bacterial genomes out of 565 genomes. Inclusion of archaeal genomes in the reference genome set improves performance. We find that in order to obtain a good performance, it is better to sample few genomes of related genera of prokaryotes from the large number of available genomes. 
Moreover, such a sampling allows for selecting 50–100 genomes for comparable accuracy of predictions when computational resources are limited.
Affiliation(s)
- Vijaykumar Yogesh Muley
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, Andhra Pradesh, India
- Department of Biotechnology, Dr. Babasaheb Ambedkar Marathwada University, Sub-centre, Osmanabad, Maharashtra, India
- Akash Ranjan
- Computational and Functional Genomics Group, Centre for DNA Fingerprinting and Diagnostics, Hyderabad, Andhra Pradesh, India
|
27
|
Shen YQ, Bonnot F, Imsand EM, RoseFigura JM, Sjölander K, Klinman JP. Distribution and properties of the genes encoding the biosynthesis of the bacterial cofactor, pyrroloquinoline quinone. Biochemistry 2012; 51:2265-75. [PMID: 22324760 DOI: 10.1021/bi201763d] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.5] [Indexed: 01/24/2023]
Abstract
Pyrroloquinoline quinone (PQQ) is a small, redox active molecule that serves as a cofactor for several bacterial dehydrogenases, introducing pathways for carbon utilization that confer a growth advantage. Early studies had implicated a ribosomally translated peptide as the substrate for PQQ production. This study presents a sequence- and structure-based analysis of the components of the pqq operon. We find the necessary components for PQQ production are present in 126 prokaryotes, most of which are Gram-negative and a number of which are pathogens. A total of five gene products, PqqA, PqqB, PqqC, PqqD, and PqqE, are identified as being obligatory for PQQ production. Three of the gene products in the pqq operon, PqqB, PqqC, and PqqE, are members of large protein superfamilies. By combining evolutionary conservation patterns with information from three-dimensional structures, we are able to differentiate the gene products involved in PQQ biosynthesis from those with divergent functions. The observed persistence of a conserved gene order within analyzed operons strongly suggests a role for protein-protein interactions in the course of cofactor biosynthesis. These studies propose previously unidentified roles for several of the gene products, as well as identifying possible new targets for antibiotic design and application.
Affiliation(s)
- Yao-Qing Shen
- Department of Chemistry, University of California, Berkeley, California 94720, United States
|
28
|
Conservation and Occurrence of Trans-Encoded sRNAs in the Rhizobiales. Genes (Basel) 2011; 2:925-56. [PMID: 24710299 PMCID: PMC3927594 DOI: 10.3390/genes2040925] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.4] [Received: 08/31/2011] [Revised: 10/24/2011] [Accepted: 10/26/2011] [Indexed: 12/13/2022]
Abstract
Post-transcriptional regulation by trans-encoded sRNAs, for example via base-pairing with target mRNAs, is a common feature in bacteria and influences various cell processes, e.g., response to stress factors. Several studies based on computational and RNA-seq approaches identified approximately 180 trans-encoded sRNAs in Sinorhizobium meliloti. The initial point of this report is a set of 52 trans-encoded sRNAs derived from the former studies. Sequence homology combined with structural conservation analyses were applied to elucidate the occurrence and distribution of conserved trans-encoded sRNAs in the order of Rhizobiales. This approach resulted in 39 RNA family models (RFMs) which showed various taxonomic distribution patterns. Whereas the majority of RFMs was restricted to Sinorhizobium species or the Rhizobiaceae, members of a few RFMs were more widely distributed in the Rhizobiales. Access to this data is provided via the RhizoGATE portal [1,2].
|
29
|
Yelton AP, Thomas BC, Simmons SL, Wilmes P, Zemla A, Thelen MP, Justice N, Banfield JF. A semi-quantitative, synteny-based method to improve functional predictions for hypothetical and poorly annotated bacterial and archaeal genes. PLoS Comput Biol 2011; 7:e1002230. [PMID: 22028637 PMCID: PMC3197636 DOI: 10.1371/journal.pcbi.1002230] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.2] [Received: 03/29/2011] [Accepted: 08/30/2011] [Indexed: 11/19/2022]
Abstract
During microbial evolution, genome rearrangement increases with increasing sequence divergence. If the relationship between synteny and sequence divergence can be modeled, gene clusters in genomes of distantly related organisms exhibiting anomalous synteny can be identified and used to infer functional conservation. We applied the phylogenetic pairwise comparison method to establish and model a strong correlation between synteny and sequence divergence in all 634 available Archaeal and Bacterial genomes from the NCBI database and four newly assembled genomes of uncultivated Archaea from an acid mine drainage (AMD) community. In parallel, we established and modeled the trend between synteny and functional relatedness in the 118 genomes available in the STRING database. By combining these models, we developed a gene functional annotation method that weights evolutionary distance to estimate the probability of functional associations of syntenous proteins between genome pairs. The method was applied to the hypothetical proteins and poorly annotated genes in newly assembled acid mine drainage Archaeal genomes to add or improve gene annotations. This is the first method to assign possible functions to poorly annotated genes through quantification of the probability of gene functional relationships based on synteny at a significant evolutionary distance, and has the potential for broad application.
Affiliation(s)
- Alexis P. Yelton
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, United States of America
- Brian C. Thomas
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, United States of America
- Sheri L. Simmons
- Department of Earth and Planetary Sciences, University of California, Berkeley, California, United States of America
- Paul Wilmes
- Department of Earth and Planetary Sciences, University of California, Berkeley, California, United States of America
- Adam Zemla
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
- Michael P. Thelen
- Physical and Life Sciences Directorate, Lawrence Livermore National Laboratory, Livermore, California, United States of America
- Nicholas Justice
- Department of Plant and Microbial Biology, University of California, Berkeley, California, United States of America
- Jillian F. Banfield
- Department of Environmental Science, Policy, and Management, University of California, Berkeley, California, United States of America
- Department of Earth and Planetary Sciences, University of California, Berkeley, California, United States of America
|
30
|
Rajewska M, Wegrzyn K, Konieczny I. AT-rich region and repeated sequences - the essential elements of replication origins of bacterial replicons. FEMS Microbiol Rev 2011; 36:408-34. [PMID: 22092310 DOI: 10.1111/j.1574-6976.2011.00300.x] [Citation(s) in RCA: 78] [Impact Index Per Article: 6.0] [Received: 03/15/2011] [Accepted: 07/07/2011] [Indexed: 11/27/2022] Open
Abstract
Repeated sequences are commonly present in the sites for DNA replication initiation in bacterial, archaeal, and eukaryotic replicons. Those motifs are usually the binding places for replication initiation proteins or replication regulatory factors. In prokaryotic replication origins, the most abundant repeated sequences are DnaA boxes which are the binding sites for chromosomal replication initiation protein DnaA, iterons which bind plasmid or phage DNA replication initiators, defined motifs for site-specific DNA methylation, and 13-nucleotide-long motifs of a not too well-characterized function, which are present within a specific region of replication origin containing higher than average content of adenine and thymine residues. In this review, we specify methods allowing identification of a replication origin, basing on the localization of an AT-rich region and the arrangement of the origin's structural elements. We describe the regularity of the position and structure of the AT-rich regions in bacterial chromosomes and plasmids. The importance of 13-nucleotide-long repeats present at the AT-rich region, as well as other motifs overlapping them, was pointed out to be essential for DNA replication initiation including origin opening, helicase loading and replication complex assembly. We also summarize the role of AT-rich region repeated sequences for DNA replication regulation.
Affiliation(s)
- Magdalena Rajewska
- Department of Molecular and Cellular Biology, Intercollegiate Faculty of Biotechnology, University of Gdansk, Gdansk, Poland
|
31
|
Rubinstein ND, Zeevi D, Oren Y, Segal G, Pupko T. The operonic location of auto-transcriptional repressors is highly conserved in bacteria. Mol Biol Evol 2011; 28:3309-18. [PMID: 21690561 DOI: 10.1093/molbev/msr163] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Indexed: 01/10/2023] Open
Abstract
Bacterial genes are commonly encoded in clusters, known as operons, which share transcriptional regulatory control and often encode functionally related proteins that take part in certain biological pathways. Operons that are coregulated are known to colocalize in the genome, suggesting that their spatial organization is under selection for efficient expression regulation. However, the internal order of genes within operons is believed to be poorly conserved, and hence expression requirements are claimed to be too weak to oppose gene rearrangements. In light of these opposing views, we set out to investigate whether the internal location of the regulatory genes within operons is under selection. Our analysis shows that transcription factors (TFs) are preferentially encoded as either first or last in their operons, in the two diverged model bacteria Escherichia coli and Bacillus subtilis. In a higher resolution, we find that TFs that repress transcription of the operon in which they are encoded (autorepressors), contribute most of this signal by specific preference of the first operon position. We show that this trend is strikingly conserved throughout highly diverged bacterial phyla. Moreover, these autorepressors regulate operons that carry out highly diverse biological functions. We propose a model according to which autorepressors are selected to be located first in their operons in order to optimize transcription regulation. Specifically, the first operon position helps autorepressors to minimize leaky transcription of the operon structural genes, thus minimizing energy waste. Our analysis provides statistically robust evidence for a paradigm of bacterial autorepressor preferential operonic location. Corroborated with our suggested model, an additional layer of operon expression control that is common throughout the bacterial domain is revealed.
Affiliation(s)
- Nimrod D Rubinstein
- Department of Cell Research and Immunology, George S. Wise Faculty of Life Sciences, Tel Aviv University, Ramat Aviv, Tel Aviv, Israel
|
32
|
Kumar M, Balaji PV. Comparative genomics analysis of completely sequenced microbial genomes reveals the ubiquity of N-linked glycosylation in prokaryotes. Mol Biosyst 2011; 7:1629-45. [PMID: 21387023 DOI: 10.1039/c0mb00259c] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.6] [Indexed: 11/21/2022]
Abstract
Glycosylation of proteins in prokaryotes has been known for the last few decades. Glycan structures and/or the glycosylation pathways have been experimentally characterized in only a small number of prokaryotes. Even this has become possible only during the last decade or so, primarily due to technological and methodological developments. Glycosylated proteins are diverse in their function and localization. Glycosylation has been shown to be associated with a wide range of biological phenomena. Characterization of the various types of glycans and the glycosylation machinery is critical to understand such processes. Such studies can help in the identification of novel targets for designing drugs, diagnostics, and engineering of therapeutic proteins. In view of this, the experimentally characterized pgl system of Campylobacter jejuni, responsible for N-linked glycosylation, has been used in this study to identify glycosylation loci in 865 prokaryotes whose genomes have been completely sequenced. Results from the present study show that only a small number of organisms have homologs for all the pgl enzymes and a few others have homologs for none of the pgl enzymes. Most of the organisms have homologs for only a subset of the pgl enzymes. There is no specific pattern for the presence or absence of pgl homologs vis-à-vis the 16S rRNA sequence-based phylogenetic tree. This may be due to differences in the glycan structures, high sequence divergence, horizontal gene transfer or non-orthologous gene displacement. Overall, the presence of homologs for pgl enzymes in a large number of organisms irrespective of their habitat, pathogenicity, energy generation mechanism, etc., hints towards the ubiquity of N-linked glycosylation in prokaryotes.
Affiliation(s)
- Manjeet Kumar
- Department of Biosciences and Bioengineering, Indian Institute of Technology Bombay, Powai, Mumbai 400 076, India
|
33
|
Weng FC, Su CH, Hsu MT, Wang TY, Tsai HK, Wang D. Reanalyze unassigned reads in Sanger based metagenomic data using conserved gene adjacency. BMC Bioinformatics 2010; 11:565. [PMID: 21083935 PMCID: PMC3098102 DOI: 10.1186/1471-2105-11-565] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 12/24/2009] [Accepted: 11/18/2010] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Investigation of metagenomes provides greater insight into uncultured microbial communities. The improvement in sequencing technology, which yields a large amount of sequence data, has led to major breakthroughs in the field. However, at present, taxonomic binning tools for metagenomes discard 30-40% of Sanger sequencing data due to the stringency of BLAST cut-offs. In an attempt to provide a comprehensive overview of metagenomic data, we re-analyzed the discarded metagenomes by using less stringent cut-offs. Additionally, we introduced a new criterion, namely, the evolutionary conservation of adjacency between neighboring genes. To evaluate the feasibility of our approach, we re-analyzed discarded contigs and singletons from several environments with different levels of complexity. We also compared the consistency between our taxonomic binning and those reported in the original studies. RESULTS Among the discarded data, we found that 23.7 ± 3.9% of singletons and 14.1 ± 1.0% of contigs were assigned to taxa. The recovery rates for singletons were higher than those for contigs. The Pearson correlation coefficient revealed a high degree of similarity (0.94 ± 0.03 at the phylum rank and 0.80 ± 0.11 at the family rank) between the proposed taxonomic binning approach and those reported in original studies. In addition, an evaluation using simulated data demonstrated the reliability of the proposed approach. CONCLUSIONS Our findings suggest that taking account of conserved neighboring gene adjacency improves taxonomic assignment when analyzing metagenomes using Sanger sequencing. In other words, utilizing the conserved gene order as a criterion will reduce the amount of data discarded when analyzing metagenomes.
Affiliation(s)
- Francis C Weng
- Biodiversity Research Center, Academia Sinica, Taipei, Taiwan
|
34
|
Fondi M, Emiliani G, Fani R. Origin and evolution of operons and metabolic pathways. Res Microbiol 2009; 160:502-12. [DOI: 10.1016/j.resmic.2009.05.001] [Citation(s) in RCA: 43] [Impact Index Per Article: 2.9] [Received: 03/27/2009] [Revised: 05/07/2009] [Accepted: 05/08/2009] [Indexed: 10/20/2022]
|
35
|
Abstract
Comparative genomics and systems biology offer unprecedented opportunities for testing central tenets of evolutionary biology formulated by Darwin in the Origin of Species in 1859 and expanded in the Modern Synthesis 100 years later. Evolutionary-genomic studies show that natural selection is only one of the forces that shape genome evolution and is not quantitatively dominant, whereas non-adaptive processes are much more prominent than previously suspected. Major contributions of horizontal gene transfer and diverse selfish genetic elements to genome evolution undermine the Tree of Life concept. An adequate depiction of evolution requires the more complex concept of a network or ‘forest’ of life. There is no consistent tendency of evolution towards increased genomic complexity, and when complexity increases, this appears to be a non-adaptive consequence of evolution under weak purifying selection rather than an adaptation. Several universals of genome evolution were discovered including the invariant distributions of evolutionary rates among orthologous genes from diverse genomes and of paralogous gene family sizes, and the negative correlation between gene expression level and sequence evolution rate. Simple, non-adaptive models of evolution explain some of these universals, suggesting that a new synthesis of evolutionary biology might become feasible in a not so remote future.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
36
|
Koonin EV. Evolution of genome architecture. Int J Biochem Cell Biol 2009; 41:298-306. [PMID: 18929678 PMCID: PMC3272702 DOI: 10.1016/j.biocel.2008.09.015] [Citation(s) in RCA: 137] [Impact Index Per Article: 9.1] [Received: 07/11/2008] [Revised: 09/16/2008] [Accepted: 09/16/2008] [Indexed: 11/26/2022]
Abstract
Charles Darwin believed that all traits of organisms have been honed to near perfection by natural selection. The empirical basis underlying Darwin's conclusions consisted of numerous observations made by him and other naturalists on the exquisite adaptations of animals and plants to their natural habitats and on the impressive results of artificial selection. Darwin fully appreciated the importance of heredity but was unaware of the nature and, in fact, the very existence of genomes. A century and a half after the publication of the "Origin", we have the opportunity to draw conclusions from the comparisons of hundreds of genome sequences from all walks of life. These comparisons suggest that the dominant mode of genome evolution is quite different from that of the phenotypic evolution. The genomes of vertebrates, those purported paragons of biological perfection, turned out to be veritable junkyards of selfish genetic elements where only a small fraction of the genetic material is dedicated to encoding biologically relevant information. In sharp contrast, genomes of microbes and viruses are incomparably more compact, with most of the genetic material assigned to distinct biological functions. However, even in these genomes, the specific genome organization (gene order) is poorly conserved. The results of comparative genomics lead to the conclusion that the genome architecture is not a straightforward result of continuous adaptation but rather is determined by the balance between the selection pressure, that is itself dependent on the effective population size and mutation rate, the level of recombination, and the activity of selfish elements. Although genes and, in many cases, multigene regions of genomes possess elaborate architectures that ensure regulation of expression, these arrangements are evolutionarily volatile and typically change substantially even on short evolutionary scales when gene sequences diverge minimally. 
Thus, the observed genome architectures are, mostly, products of neutral processes or epiphenomena of more general selective processes, such as selection for genome streamlining in successful lineages with large populations. Selection for specific gene arrangements (elements of genome architecture) seems only to modulate the results of these processes.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20894, USA.
|
37
|
Ling X, He X, Xin D. Detecting gene clusters under evolutionary constraint in a large number of genomes. Bioinformatics 2009; 25:571-7. [PMID: 19158161 DOI: 10.1093/bioinformatics/btp027] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.7] [Indexed: 11/12/2022]
Abstract
MOTIVATION Spatial clusters of genes conserved across multiple genomes provide important clues to gene functions and evolution of genome organization. Existing methods of identifying these clusters often made restrictive assumptions, such as exact conservation of gene order, and relied on heuristic algorithms. RESULTS We developed a very efficient algorithm based on a 'gene teams' model that allows genes in the clusters to appear in different orders. This allows us to detect conserved gene clusters under flexible evolutionary constraints in a large number of genomes. Our statistical evaluation incorporates the evolutionary relationship among genomes, a key aspect that has been missing in most previous studies. We conducted a large-scale analysis of 133 bacterial genomes. Our results confirm that our approach is an effective way of uncovering functionally related genes. The comparison with known operons and the analysis of the structural properties of our predicted clusters suggest that operons are an important source of constraint, but there are also other forces that determine evolution of gene order and arrangement. Using our method, we predicted functions of many poorly characterized genes in bacteria. The combined algorithmic and statistical methods we present here provide a rigorous framework for systematically studying evolutionary constraints of genomic contexts. AVAILABILITY The software, data and the full results of this article are available online at http://www.ews.uiuc.edu/~xuling/mcmusec.
Affiliation(s)
- Xu Ling
- Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL 61801, USA.
|
38
|
Trends in prokaryotic evolution revealed by comparison of closely related bacterial and archaeal genomes. J Bacteriol 2008; 191:65-73. [PMID: 18978059 DOI: 10.1128/jb.01237-08] [Citation(s) in RCA: 97] [Impact Index Per Article: 6.1] [Indexed: 11/20/2022] Open
Abstract
In order to explore microevolutionary trends in bacteria and archaea, we constructed a data set of 41 alignable tight genome clusters (ATGCs). We show that the ratio of the medians of nonsynonymous to synonymous substitution rates (dN/dS) that is used as a measure of the purifying selection pressure on protein sequences is a stable characteristic of the ATGCs. In agreement with previous findings, parasitic bacteria, notwithstanding the sometimes dramatic genome shrinkage caused by gene loss, are typically subjected to relatively weak purifying selection, presumably owing to relatively small effective population sizes and frequent bottlenecks. However, no evidence of genome streamlining caused by strong selective pressure was found in any of the ATGCs. On the contrary, a significant positive correlation between the genome size, as well as gene size, and selective pressure was observed, although a variety of free-living prokaryotes with very close selective pressures span nearly the entire range of genome sizes. In addition, we examined the connections between the sequence evolution rate and other genomic features. Although gene order changes much faster than protein sequences during the evolution of prokaryotes, a strong positive correlation was observed between the "rearrangement distance" and the amino acid distance, suggesting that at least some of the events leading to genome rearrangement are subjected to the same type of selective constraints as the evolution of amino acid sequences.
|
39
|
Koonin EV, Wolf YI. Genomics of bacteria and archaea: the emerging dynamic view of the prokaryotic world. Nucleic Acids Res 2008; 36:6688-719. [PMID: 18948295 PMCID: PMC2588523 DOI: 10.1093/nar/gkn668] [Citation(s) in RCA: 534] [Impact Index Per Article: 33.4] [Indexed: 01/04/2023] Open
Abstract
The first bacterial genome was sequenced in 1995, and the first archaeal genome in 1996. Soon after these breakthroughs, an exponential rate of genome sequencing was established, with a doubling time of approximately 20 months for bacteria and approximately 34 months for archaea. Comparative analysis of the hundreds of sequenced bacterial and dozens of archaeal genomes leads to several generalizations on the principles of genome organization and evolution. A crucial finding that enables functional characterization of the sequenced genomes and evolutionary reconstruction is that the majority of archaeal and bacterial genes have conserved orthologs in other, often, distant organisms. However, comparative genomics also shows that horizontal gene transfer (HGT) is a dominant force of prokaryotic evolution, along with the loss of genetic material resulting in genome contraction. A crucial component of the prokaryotic world is the mobilome, the enormous collection of viruses, plasmids and other selfish elements, which are in constant exchange with more stable chromosomes and serve as HGT vehicles. Thus, the prokaryotic genome space is a tightly connected, although compartmentalized, network, a novel notion that undermines the ‘Tree of Life’ model of evolution and requires a new conceptual framework and tools for the study of prokaryotic evolution.
Affiliation(s)
- Eugene V Koonin
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD, USA.
|
40
|
Abstract
The idea behind the gene neighbor method is that conservation of gene order in evolutionarily distant prokaryotes indicates functional association. The procedure presented here starts with the organization of all the genomes into pairs of adjacent genes. Then, pairs of genes in a genome of interest are mapped to their corresponding orthologs in other, informative, genomes. The final step is to determine whether the orthologs of each original pair of genes are also adjacent in the informative genome.
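The three steps described in this abstract can be sketched in Python. This is a minimal illustration, not the published procedure: the function names, the representation of a genome as an ordered list of gene identifiers, and the `orthologs` mapping are all assumptions made for the example, and strand, circularity and scoring are ignored.

```python
def adjacent_pairs(gene_order):
    """Step 1: organize a genome (ordered list of gene IDs) into unordered pairs of adjacent genes."""
    return {frozenset(pair) for pair in zip(gene_order, gene_order[1:])}


def conserved_neighbors(query_order, informative_order, orthologs):
    """Return the adjacent gene pairs of the query genome whose orthologs are
    also adjacent in the informative genome.

    orthologs: hypothetical dict mapping a query gene ID to its ortholog ID
    in the informative genome (genes without an ortholog are absent).
    """
    informative_pairs = adjacent_pairs(informative_order)
    conserved = []
    for gene_a, gene_b in zip(query_order, query_order[1:]):
        # Step 2: map each gene of the pair to its ortholog, if any.
        orth_a, orth_b = orthologs.get(gene_a), orthologs.get(gene_b)
        if orth_a is None or orth_b is None:
            continue
        # Step 3: check whether the orthologs are adjacent as well.
        if frozenset((orth_a, orth_b)) in informative_pairs:
            conserved.append((gene_a, gene_b))
    return conserved


# Toy data (invented for illustration only).
query = ["a1", "a2", "a3", "a4"]
informative = ["b3", "b9", "b1", "b2"]
orthologs = {"a1": "b1", "a2": "b2", "a3": "b3"}
conserved = conserved_neighbors(query, informative, orthologs)
print(conserved)
```

In this toy case only the pair (a1, a2) is reported, because b1 and b2 are neighbors in the informative genome, whereas b2 and b3 are not and a4 has no ortholog.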
|
41
|
Berthon J, Cortez D, Forterre P. Genomic context analysis in Archaea suggests previously unrecognized links between DNA replication and translation. Genome Biol 2008; 9:R71. [PMID: 18400081 PMCID: PMC2643942 DOI: 10.1186/gb-2008-9-4-r71] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.3] [Received: 12/21/2007] [Revised: 02/22/2008] [Accepted: 04/09/2008] [Indexed: 11/05/2022] Open
Abstract
Specific functional interactions of proteins involved in DNA replication and/or DNA repair or transcription might occur in Archaea, suggesting a previously unrecognized regulatory network coupling DNA replication and translation, which might also exist in Eukarya. Background Comparative analysis of genomes is valuable to explore evolution of genomes, deduce gene functions, or predict functional linking between proteins. Here, we have systematically analyzed the genomic environment of all known DNA replication genes in 27 archaeal genomes to infer new connections for DNA replication proteins from conserved genomic associations. Results Two distinct sets of DNA replication genes frequently co-localize in archaeal genomes: the first includes the genes for PCNA, the small subunit of the DNA primase (PriS), and Gins15; the second comprises the genes for MCM and Gins23. Other genomic associations of genes encoding proteins involved in informational processes that may be functionally relevant at the cellular level have also been noted; in particular, the association between the genes for PCNA, transcription factor S, and NudF. Surprisingly, a conserved cluster of genes coding for proteins involved in translation or ribosome biogenesis (S27E, L44E, aIF-2 alpha, Nop10) is almost systematically contiguous to the group of genes coding for PCNA, PriS, and Gins15. The functional relevance of this cluster encoding proteins conserved in Archaea and Eukarya is strongly supported by statistical analysis. Interestingly, the gene encoding the S27E protein, also known as metallopanstimulin 1 (MPS-1) in human, is overexpressed in multiple cancer cell lines. Conclusion Our genome context analysis suggests specific functional interactions for proteins involved in DNA replication between each other or with proteins involved in DNA repair or transcription. 
Furthermore, it suggests a previously unrecognized regulatory network coupling DNA replication and translation in Archaea that may also exist in Eukarya.
Affiliation(s)
- Jonathan Berthon
- Univ. Paris-Sud 11, CNRS, UMR8621, Institut de Génétique et Microbiologie, 91405 Orsay CEDEX, France.
|
42
|
Fong C, Rohmer L, Radey M, Wasnick M, Brittnacher MJ. PSAT: a web tool to compare genomic neighborhoods of multiple prokaryotic genomes. BMC Bioinformatics 2008; 9:170. [PMID: 18366802 PMCID: PMC2358893 DOI: 10.1186/1471-2105-9-170] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Received: 09/28/2007] [Accepted: 03/26/2008] [Indexed: 11/10/2022] Open
Abstract
Background The conservation of gene order among prokaryotic genomes can provide valuable insight into gene function, protein interactions, or events by which genomes have evolved. Although some tools are available for visualizing and comparing the order of genes between genomes of study, few support an efficient and organized analysis between large numbers of genomes. The Prokaryotic Sequence homology Analysis Tool (PSAT) is a web tool for comparing gene neighborhoods among multiple prokaryotic genomes. Results PSAT utilizes a database that is preloaded with gene annotation, BLAST hit results, and gene-clustering scores designed to help identify regions of conserved gene order. Researchers use the PSAT web interface to find a gene of interest in a reference genome and efficiently retrieve the sequence homologs found in other bacterial genomes. The tool generates a graphic of the genomic neighborhood surrounding the selected gene and the corresponding regions for its homologs in each comparison genome. Homologs in each region are color coded to assist users with analyzing gene order among various genomes. In contrast to common comparative analysis methods that filter sequence homolog data based on alignment score cutoffs, PSAT leverages gene context information for homologs, including those with weak alignment scores, enabling a more sensitive analysis. Features for constraining or ordering results are designed to help researchers browse results from large numbers of comparison genomes in an organized manner. PSAT has been demonstrated to be useful for helping to identify gene orthologs and potential functional gene clusters, and detecting genome modifications that may result in loss of function. Conclusion PSAT allows researchers to investigate the order of genes within local genomic neighborhoods of multiple genomes. 
A PSAT web server for public use is available for performing analyses on a growing set of reference genomes through any web browser with no client side software setup or installation required. Source code is freely available to researchers interested in setting up a local version of PSAT for analysis of genomes not available through the public server. Access to the public web server and instructions for obtaining source code can be found at .
Affiliation(s)
- Christine Fong
- Department of Genome Sciences, University of Washington, Box 357710, Seattle, Washington 98195, USA.
|
43
|
Laing E, Sidhu K, Hubbard SJ. Predicted transcription factor binding sites as predictors of operons in Escherichia coli and Streptomyces coelicolor. BMC Genomics 2008; 9:79. [PMID: 18269733 PMCID: PMC2276206 DOI: 10.1186/1471-2164-9-79] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/09/2007] [Accepted: 02/12/2008] [Indexed: 11/18/2022] Open
Abstract
Background As a polycistronic transcriptional unit of one or more adjacent genes, operons play a key role in regulation and function in prokaryotic biology, and a better understanding of how they are constituted and controlled is needed. Recent efforts have attempted to predict operonic status in sequenced genomes using a variety of techniques and data sources. To date, non-homology based operon prediction strategies have mainly used predicted promoters and terminators present at the extremities of transcriptional unit as predictors, with reasonable success. However, transcription factor binding sites (TFBSs), typically found upstream of the first gene in an operon, have not yet been evaluated. Results Here we apply a method originally developed for the prediction of TFBSs in Escherichia coli that minimises the need for prior knowledge and tests its ability to predict operons in E. coli and the 'more complex', pharmaceutically important, Streptomyces coelicolor. We demonstrate that through building genome specific TFBS position-specific-weight-matrices (PSWMs) it is possible to predict operons in E. coli and S. coelicolor with 83% and 93% accuracy respectively, using only TFBS as delimiters of operons. Additionally, the 'palindromicity' of TFBS footprint data of E. coli is characterised. Conclusion TFBS are proposed as novel independent features for use in prokaryotic operon prediction (whether alone or as part of a set of features) given their efficacy as operon predictors in E. coli and S. coelicolor. We also show that TFBS footprint data in E. coli generally contains inverted repeats with significantly (p < 0.05) greater palindromicity than random sequences. Consequently, the palindromicity of putative TFBSs predicted can also enhance operon predictions.
Affiliation(s)
- Emma Laing
- Faculty of Life Sciences, The University of Manchester, Michael Smith Building, Oxford Road, Manchester, M13 9PT, UK.
|
44
|
Lemoine F, Lespinet O, Labedan B. Assessing the evolutionary rate of positional orthologous genes in prokaryotes using synteny data. BMC Evol Biol 2007; 7:237. [PMID: 18047665 PMCID: PMC2238764 DOI: 10.1186/1471-2148-7-237] [Citation(s) in RCA: 31] [Impact Index Per Article: 1.8] [Received: 06/22/2007] [Accepted: 11/29/2007] [Indexed: 11/15/2022] Open
Abstract
Background Comparison of completely sequenced microbial genomes has revealed how fluid these genomes are. Detecting synteny blocks requires reliable methods to determining the orthologs among the whole set of homologs detected by exhaustive comparisons between each pair of completely sequenced genomes. This is a complex and difficult problem in the field of comparative genomics but will help to better understand the way prokaryotic genomes are evolving. Results We have developed a suite of programs that automate three essential steps to study conservation of gene order, and validated them with a set of 107 bacteria and archaea that cover the majority of the prokaryotic taxonomic space. We identified the whole set of shared homologs between two or more species and computed the evolutionary distance separating each pair of homologs. We applied two strategies to extract from the set of homologs a collection of valid orthologs shared by at least two genomes. The first computes the Reciprocal Smallest Distance (RSD) using the PAM distances separating pairs of homologs. The second method groups homologs in families and reconstructs each family's evolutionary tree, distinguishing bona fide orthologs as well as paralogs created after the last speciation event. Although the phylogenetic tree method often succeeds where RSD fails, the reverse could occasionally be true. Accordingly, we used the data obtained with either methods or their intersection to number the orthologs that are adjacent in for each pair of genomes, the Positional Orthologous Genes (POGs), and to further study their properties. Once all these synteny blocks have been detected, we showed that POGs are subject to more evolutionary constraints than orthologs outside synteny groups, whichever the taxonomic distance separating the compared organisms. 
Conclusion The suite of programs described in this paper allows a reliable detection of orthologs and is useful for evaluating gene order conservation in prokaryotes whatever their taxonomic distance. Thus, our approach will ease the rapid identification of POGs in the next few years, as we expect to be inundated with thousands of completely sequenced microbial genomes.
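The Reciprocal Smallest Distance step named in this abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' program suite: the function name `reciprocal_smallest_distance`, the dictionary input layout, and the toy distances are all assumptions, and the distances would in practice be the PAM distances the abstract mentions.

```python
def reciprocal_smallest_distance(dist):
    """Return pairs (a, b) that are each other's closest homolog.

    dist maps (gene_in_genome_A, gene_in_genome_B) -> evolutionary distance.
    The input layout is hypothetical, chosen only for this sketch.
    """
    best_for_a = {}  # gene in A -> (closest gene in B, distance)
    best_for_b = {}  # gene in B -> (closest gene in A, distance)
    for (a, b), d in dist.items():
        if a not in best_for_a or d < best_for_a[a][1]:
            best_for_a[a] = (b, d)
        if b not in best_for_b or d < best_for_b[b][1]:
            best_for_b[b] = (a, d)
    # Keep a pair only when the choice is reciprocal.
    return sorted(
        (a, b) for a, (b, _) in best_for_a.items() if best_for_b[b][0] == a
    )


# Toy distances (made up for illustration).
toy = {
    ("a1", "b1"): 0.1,
    ("a1", "b2"): 0.5,
    ("a2", "b1"): 0.3,
    ("a2", "b2"): 0.2,
    ("a3", "b2"): 0.4,
}
pairs = reciprocal_smallest_distance(toy)
```

In the toy data, `a3` is closest to `b2`, but `b2` is closest to `a2`, so `a3` is left without a putative ortholog. Ties are resolved here by keeping the first pair seen, which is a simplification.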
Affiliation(s)
- Frédéric Lemoine
- Institut de Génétique et Microbiologie, CNRS UMR 8621, Bâtiment 400, Université Paris Sud XI, 91405 Orsay Cedex, France.
|
45
|
Vignais PM, Billoud B. Occurrence, Classification, and Biological Function of Hydrogenases: An Overview. Chem Rev 2007; 107:4206-72. [PMID: 17927159 DOI: 10.1021/cr050196r] [Citation(s) in RCA: 1009] [Impact Index Per Article: 59.4] [Indexed: 02/07/2023]
Affiliation(s)
- Paulette M. Vignais
- CEA Grenoble, Laboratoire de Biochimie et Biophysique des Systèmes Intégrés, UMR CEA/CNRS/UJF 5092, Institut de Recherches en Technologies et Sciences pour le Vivant (iRTSV), 17 rue des Martyrs, 38054 Grenoble cedex 9, France, and Atelier de BioInformatique Université Pierre et Marie Curie (Paris 6), 12 rue Cuvier, 75005 Paris, France
- Bernard Billoud
- CEA Grenoble, Laboratoire de Biochimie et Biophysique des Systèmes Intégrés, UMR CEA/CNRS/UJF 5092, Institut de Recherches en Technologies et Sciences pour le Vivant (iRTSV), 17 rue des Martyrs, 38054 Grenoble cedex 9, France, and Atelier de BioInformatique Université Pierre et Marie Curie (Paris 6), 12 rue Cuvier, 75005 Paris, France
|
46
|
Larsen J, Kuhnert P, Frey J, Christensen H, Bisgaard M, Olsen JE. Analysis of gene order data supports vertical inheritance of the leukotoxin operon and genome rearrangements in the 5' flanking region in genus Mannheimia. BMC Evol Biol 2007; 7:184. [PMID: 17915007 PMCID: PMC2228313 DOI: 10.1186/1471-2148-7-184] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 01/21/2007] [Accepted: 10/03/2007] [Indexed: 12/30/2022]
Abstract
Background The Mannheimia subclades belong to the same bacterial genus, but have taken divergent paths toward their distinct lifestyles. For example, M. haemolytica + M. glucosida are potential pathogens of the respiratory tract in the mammalian suborder Ruminantia, whereas M. ruminalis, the supposed sister group, lives as a commensal in the ovine rumen. We have tested the hypothesis that vertical inheritance of the leukotoxin (lktCABD) operon has occurred from the last common ancestor of genus Mannheimia to any ancestor of the diverging subclades by exploring gene order data.
Results We examined the gene order in the 5' flanking region of the leukotoxin operon and found that the 5' flanking gene strings, hslVU-lapB-artJ-lktC and xylAB-lktC, are peculiar to M. haemolytica + M. glucosida and M. granulomatis, respectively, whereas the gene string hslVU-lapB-lktC is present in M. ruminalis, the supposed sister group of M. haemolytica + M. glucosida, and in the most ancient subclade M. varigena. In M. granulomatis, we found remnants of the gene string hslVU-lapB-lktC in the xylB-lktC intergenic region.
Conclusion These observations indicate that the gene string hslVU-lapB-lktC is more ancient than the hslVU-lapB-artJ-lktC and xylAB-lktC gene strings. The presence of (remnants of) the ancient gene string hslVU-lapB-lktC among any subclades within genus Mannheimia supports the view that it has been vertically inherited from the last common ancestor of genus Mannheimia to any ancestor of the diverging subclades, thus reaffirming the hypothesis of vertical inheritance of the leukotoxin operon. The presence of individual 5' flanking regions in M. haemolytica + M. glucosida and M. granulomatis reflects later genome rearrangements within each subclade. The evolution of the novel 5' flanking region in M. haemolytica + M. glucosida resulted in transcriptional coupling between the divergently arranged artJ and lkt promoters. We propose that the chimeric promoter has led to high-level expression of the leukotoxin operon, which could explain the increased potential of certain M. haemolytica + M. glucosida strains to cause a particular type of infection.
Affiliation(s)
- Jesper Larsen
- Department of Veterinary Pathobiology, Faculty of Life Sciences, University of Copenhagen, Stigbøjlen 4, DK-1870 Frederiksberg C, Denmark
- Peter Kuhnert
- Institute of Veterinary Bacteriology, University of Berne, Länggass-Strasse 122, CH-3012 Berne, Switzerland
- Joachim Frey
- Institute of Veterinary Bacteriology, University of Berne, Länggass-Strasse 122, CH-3012 Berne, Switzerland
- Henrik Christensen
- Department of Veterinary Pathobiology, Faculty of Life Sciences, University of Copenhagen, Stigbøjlen 4, DK-1870 Frederiksberg C, Denmark
- Magne Bisgaard
- Department of Veterinary Pathobiology, Faculty of Life Sciences, University of Copenhagen, Stigbøjlen 4, DK-1870 Frederiksberg C, Denmark
- John E Olsen
- Department of Veterinary Pathobiology, Faculty of Life Sciences, University of Copenhagen, Stigbøjlen 4, DK-1870 Frederiksberg C, Denmark
|
47
|
Suen G, Arshinoff BI, Taylor RG, Welch RD. Practical Applications of Bacterial Functional Genomics. Biotechnol Genet Eng Rev 2007; 24:213-42. [DOI: 10.1080/02648725.2007.10648101] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 10/27/2022]
|
48
|
Wächtershäuser G. From volcanic origins of chemoautotrophic life to Bacteria, Archaea and Eukarya. Philos Trans R Soc Lond B Biol Sci 2006; 361:1787-806; discussion 1806-8. [PMID: 17008219 PMCID: PMC1664677 DOI: 10.1098/rstb.2006.1904] [Citation(s) in RCA: 143] [Impact Index Per Article: 7.9] [Indexed: 11/12/2022]
Abstract
The theory of a chemoautotrophic origin of life in a volcanic iron-sulphur world postulates a pioneer organism at sites of reducing volcanic exhalations. The pioneer organism is characterized by a composite structure with an inorganic substructure and an organic superstructure. Within the surfaces of the inorganic substructure iron, cobalt, nickel and other transition metal centres with sulphido, carbonyl and other ligands were catalytically active and promoted the growth of the organic superstructure through carbon fixation, driven by the reducing potential of the volcanic exhalations. This pioneer metabolism was reproductive by an autocatalytic feedback mechanism. Some organic products served as ligands for activating catalytic metal centres whence they arose. The unitary structure-function relationship of the pioneer organism later gave rise to two major strands of evolution: cellularization and emergence of the genetic machinery. This early phase of evolution ended with segregation of the domains Bacteria, Archaea and Eukarya from a rapidly evolving population of pre-cells. Thus, life started with an initial, direct, deterministic chemical mechanism of evolution giving rise to a later, indirect, stochastic, genetic mechanism of evolution and the upward evolution of life by increase of complexity is grounded ultimately in the synthetic redox chemistry of the pioneer organism.
|
49
|
Laing E, Mersinias V, Smith CP, Hubbard SJ. Analysis of gene expression in operons of Streptomyces coelicolor. Genome Biol 2006; 7:R46. [PMID: 16749941 PMCID: PMC1779546 DOI: 10.1186/gb-2006-7-6-r46] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Received: 12/22/2005] [Revised: 03/03/2006] [Accepted: 05/09/2006] [Indexed: 11/12/2022]
Abstract
Analysis of the relative transcript levels of intra-operonic genes in Streptomyces coelicolor suggests significant levels of internal regulation.
Background Recent studies have shown that microarray-derived gene-expression data are useful for operon prediction. However, it is apparent that genes within an operon do not conform to the simple notion that they have equal levels of expression.
Results To investigate the relative transcript levels of intra-operonic genes, we have used a Z-score approach to normalize the expression levels of all genes within an operon to expression of the first gene of that operon. Here we demonstrate that there is a general downward trend in expression from the first to the last gene in Streptomyces coelicolor operons, in contrast to what we observe in Escherichia coli. Combining transcription-factor binding-site prediction with the identification of operonic genes that exhibited higher transcript levels than the first gene of the same operon enabled the discovery of putative internal promoters. The presence of transcription terminators and abundance of putative transcriptional control sequences in S. coelicolor operons are also described.
Conclusion Here we have demonstrated a polarity of expression in operons of S. coelicolor not seen in E. coli, which calls for caution from those who apply operon prediction strategies based on E. coli 'equal-expression' to divergent species. We speculate that this general difference in transcription behavior could reflect the contrasting lifestyles of the two organisms and, in the case of Streptomyces, might also be influenced by its high G+C content genome. Identification of putative internal promoters, previously thought to cause problems in operon prediction strategies, has also been enabled.
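One way to picture normalizing intra-operonic expression to the first gene is sketched below. The abstract does not give the exact formula, so this is only one plausible reading and not the authors' procedure: it assumes expression values are Z-scored across all measured genes and then expressed relative to the first gene of each operon. The function name `relative_z_scores`, the input layout, and the toy values are hypothetical.

```python
from statistics import mean, pstdev


def relative_z_scores(expression, operons):
    """Z-score every gene, then subtract the Z-score of each operon's first gene.

    expression: gene -> expression level (e.g. a log intensity); assumed input.
    operons: operon name -> list of genes in transcription order; assumed input.
    """
    values = list(expression.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        raise ValueError("expression values have zero variance")
    z = {gene: (level - mu) / sigma for gene, level in expression.items()}
    # The first gene of each operon becomes the zero reference.
    return {
        name: [z[gene] - z[genes[0]] for gene in genes]
        for name, genes in operons.items()
    }


# Toy data (made up): expression falls along the operon.
toy_expression = {"g1": 10.0, "g2": 8.0, "g3": 6.0, "g4": 8.0}
toy_operons = {"op1": ["g1", "g2", "g3"]}
result = relative_z_scores(toy_expression, toy_operons)
```

Under this reading, negative values for downstream genes would correspond to the downward trend the abstract reports, and a positive value would flag a gene expressed above the first gene, the kind of signal the abstract associates with putative internal promoters.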
Affiliation(s)
- Emma Laing
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
- Current Address: School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Vassilis Mersinias
- Functional Genomics Laboratory, School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Colin P Smith
- Functional Genomics Laboratory, School of Biomedical and Molecular Sciences, University of Surrey, Guildford GU2 7XH, UK
- Simon J Hubbard
- Faculty of Life Sciences, The University of Manchester, Manchester M13 9PT, UK
|
50
|
Fernández-López R, Garcillán-Barcia MP, Revilla C, Lázaro M, Vielva L, de la Cruz F. Dynamics of the IncW genetic backbone imply general trends in conjugative plasmid evolution. FEMS Microbiol Rev 2006; 30:942-66. [PMID: 17026718 DOI: 10.1111/j.1574-6976.2006.00042.x] [Citation(s) in RCA: 124] [Impact Index Per Article: 6.9] [Indexed: 01/22/2023]
Abstract
Plasmids cannot be understood as mere tools for genetic exchange: they are themselves subject to the forces of evolution. Their genomic and phylogenetic features have been less studied in this respect. Focusing on the IncW incompatibility group, which includes the smallest known conjugative plasmids, we attempt to unveil some common trends in plasmid evolution. The functional modules of IncW genetic backbone are described, with emphasis on their architecture and relationships to other plasmid groups. Some plasmid regions exhibit strong phylogenetic mosaicism, in striking contrast to others of unusual synteny conservation. The presence of genes of unknown function that are widely distributed in plasmid genomes is also emphasized, exposing the existence of ill-defined yet conserved plasmid functions. Conjugation is an essential hallmark of IncW plasmid biology and special attention is given to the organization and evolution of its transfer modules. Genetic exchange between plasmids and their hosts is analysed by following the evolution of the type IV secretion system. Adaptation of the trw conjugative machinery to pathogenicity functions in Bartonella is discussed as an example of how plasmids can change their host modus vivendi. Starting from the phage paradigm, our analysis articulates novel concepts that apply to plasmid evolution.
Affiliation(s)
- Raúl Fernández-López
- Departamento de Biología Molecular (Unidad Asociada al C.I.B., C.S.I.C.), Universidad de Cantabria, Santander, Spain
|