1
Gendron EMS, Qing X, Sevigny JL, Li H, Liu Z, Blaxter M, Powers TO, Thomas WK, Porazinska DL. Comparative mitochondrial genomics in Nematoda reveal astonishing variation in compositional biases and substitution rates indicative of multi-level selection. BMC Genomics 2024; 25:615. [PMID: 38890582 PMCID: PMC11184840 DOI: 10.1186/s12864-024-10500-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Accepted: 06/05/2024] [Indexed: 06/20/2024] Open
Abstract
BACKGROUND Nematodes are the most abundant and diverse metazoans on Earth, and are known to significantly affect ecosystem functioning. A better understanding of their biology and ecology, including potential adaptations to diverse habitats and lifestyles, is key to understanding their response to global change scenarios. Mitochondrial genomes offer high species level characterization, low cost of sequencing, and an ease of data handling that can provide insights into nematode evolutionary pressures. RESULTS Generally, nematode mitochondrial genomes exhibited similar structural characteristics (e.g., gene size and GC content), but displayed remarkable variability around these general patterns. Compositional strand biases showed strong codon position specific G skews and relationships with nematode life traits (especially parasitic feeding habits) equal to or greater than with predicted phylogeny. On average, nematode mitochondrial genomes showed low non-synonymous substitution rates, but also high clade specific deviations from these means. Despite the presence of significant mutational saturation, non-synonymous (dN) and synonymous (dS) substitution rates could still be significantly explained by feeding habit and/or habitat. Low ratios of dN:dS rates, particularly associated with the parasitic lifestyles, suggested the presence of strong purifying selection. CONCLUSIONS Nematode mitochondrial genomes demonstrated a capacity to accumulate diversity in composition, structure, and content while still maintaining functional genes. Moreover, they demonstrated a capacity for rapid evolutionary change pointing to a potential interaction between multi-level selection pressures and rapid evolution. In conclusion, this study helps establish a background for our understanding of the potential evolutionary pressures shaping nematode mitochondrial genomes, while outlining likely routes of future inquiry.
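The abstract above reads low dN:dS ratios as a sign of purifying selection. A minimal sketch of that conventional reading follows, assuming dN and dS have already been estimated by some other tool. The function names and the tolerance `tol` are hypothetical and are not taken from the paper.

```python
def dn_ds_ratio(dn: float, ds: float) -> float:
    """Return omega = dN / dS for pre-computed substitution rates."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def classify_selection(dn: float, ds: float, tol: float = 0.05) -> str:
    """Conventional reading of omega; `tol` is an arbitrary illustrative band."""
    omega = dn_ds_ratio(dn, ds)
    if omega < 1 - tol:
        return "purifying"   # non-synonymous changes are under-represented
    if omega > 1 + tol:
        return "positive"    # non-synonymous changes are over-represented
    return "neutral"         # omega close to 1
```

Because the abstract also reports significant mutational saturation, a ratio computed this way may be unreliable when dS is saturated; the sketch only illustrates how the ratio is read.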
Affiliation(s)
- Eli M S Gendron
- Department of Entomology and Nematology, University of Florida, Gainesville, FL, USA.
- Xue Qing
- Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China.
- Joseph L Sevigny
- Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
- Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH, USA
- Hongmei Li
- Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China
- Zhiyin Liu
- Department of Plant Pathology, Nanjing Agricultural University, Nanjing, China
- Thomas O Powers
- Department of Plant Pathology, University of Nebraska, Lincoln, NE, USA
- W Kelly Thomas
- Molecular, Cellular, and Biomedical Sciences, University of New Hampshire, Durham, NH, USA
- Hubbard Center for Genome Studies, University of New Hampshire, Durham, NH, USA
- Dorota L Porazinska
- Department of Entomology and Nematology, University of Florida, Gainesville, FL, USA
2
Ding H, Gao J, Yang J, Zhang S, Han S, Yi R, Ye Y, Kan X. Genome evolution of Buchnera aphidicola (Gammaproteobacteria): Insights into strand compositional asymmetry, codon usage bias, and phylogenetic implications. Int J Biol Macromol 2023; 253:126738. [PMID: 37690648 DOI: 10.1016/j.ijbiomac.2023.126738] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/02/2023] [Revised: 08/15/2023] [Accepted: 08/25/2023] [Indexed: 09/12/2023]
Abstract
Taxa of Buchnera aphidicola (hereafter "Buchnera") are mutualistic intracellular symbionts of aphids, known for their remarkable biological traits such as genome reduction, strand compositional asymmetry, and symbiont-host coevolution. With the growing availability of genomic data, we performed a comprehensive analysis of 103 genomes of Buchnera strains from 12 host subfamilies, focusing on the genomic characterizations, codon usage patterns, and phylogenetic implications. Our findings revealed consistent features among all genomes, including small genome sizes, low GC contents, and gene losses. We also identified strong strand compositional asymmetries in all strains at the genome level. Further investigation suggested that mutation pressure may have played a crucial role in shaping codon usage of Buchnera. Moreover, the genomic asymmetries were reflected in asymmetric codon usage preferences within chromosomal genes. Notably, the levels of these asymmetries were varied among strains and were significantly influenced by the degrees of genome shrinkages. Lastly, our phylogenetic analyses presented an alternative topology of Aphididae, based on the Buchnera symbionts, providing robust confirmation of the paraphylies of Eriosomatinae, and Macrosiphini. Our objectives are to further understand the strand compositional asymmetry and codon usage bias of Buchnera taxa, and provide new perspectives for phylogenetic studies of Aphididae.
Affiliation(s)
- Hengwu Ding
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; Key Laboratory of Development and Application of Rural Renewable Energy, Biogas Institute of Ministry of Agriculture and Rural Affairs, Chengdu 610041, China
- Jinming Gao
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Jianke Yang
- School of Basic Medical Sciences, Wannan Medical College, Wuhu 241000, China
- Sijia Zhang
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Shiyun Han
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Ran Yi
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Yuanxin Ye
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China
- Xianzhao Kan
- Anhui Provincial Key Laboratory of the Conservation and Exploitation of Biological Resources, College of Life Sciences, Anhui Normal University, Wuhu 241000, China; The Institute of Bioinformatics, College of Life Sciences, Anhui Normal University, Wuhu 241000, China.
3
Toft CJ, Moreau MJJ, Perutka J, Mandapati S, Enyeart P, Sorenson AE, Ellington AD, Schaeffer PM. Delineation of the Ancestral Tus-Dependent Replication Fork Trap. Int J Mol Sci 2021; 22:ijms222413533. [PMID: 34948327 PMCID: PMC8707476 DOI: 10.3390/ijms222413533] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 11/16/2021] [Revised: 12/10/2021] [Accepted: 12/15/2021] [Indexed: 12/28/2022] Open
Abstract
In Escherichia coli, DNA replication termination is orchestrated by two clusters of Ter sites forming a DNA replication fork trap when bound by Tus proteins. The formation of a ‘locked’ Tus–Ter complex is essential for halting incoming DNA replication forks. However, the absence of replication fork arrest at some Ter sites raised questions about their significance. In this study, we examined the genome-wide distribution of Tus and found that only the six innermost Ter sites (TerA–E and G) were significantly bound by Tus. We also found that a single ectopic insertion of TerB in its non-permissive orientation could not be achieved, advocating against a need for ‘back-up’ Ter sites. Finally, examination of the genomes of a variety of Enterobacterales revealed a new replication fork trap architecture mostly found outside the Enterobacteriaceae family. Taken together, our data enabled the delineation of a narrow ancestral Tus-dependent DNA replication fork trap consisting of only two Ter sites.
Affiliation(s)
- Casey J. Toft
- Molecular and Cell Biology, College of Public Health, Medical and Veterinary Sciences, James Cook University, Douglas, QLD 4811, Australia; (C.J.T.); (M.J.J.M.); (A.E.S.)
- Centre of Tropical Bioinformatics and Molecular Biology, James Cook University, Douglas, QLD 4811, Australia
- Morgane J. J. Moreau
- Molecular and Cell Biology, College of Public Health, Medical and Veterinary Sciences, James Cook University, Douglas, QLD 4811, Australia; (C.J.T.); (M.J.J.M.); (A.E.S.)
- Jiri Perutka
- Institute for Cell and Molecular Biology, University of Texas, Austin, TX 78712, USA; (J.P.); (S.M.); (P.E.); (A.D.E.)
- Savitri Mandapati
- Institute for Cell and Molecular Biology, University of Texas, Austin, TX 78712, USA; (J.P.); (S.M.); (P.E.); (A.D.E.)
- Peter Enyeart
- Institute for Cell and Molecular Biology, University of Texas, Austin, TX 78712, USA; (J.P.); (S.M.); (P.E.); (A.D.E.)
- Alanna E. Sorenson
- Molecular and Cell Biology, College of Public Health, Medical and Veterinary Sciences, James Cook University, Douglas, QLD 4811, Australia; (C.J.T.); (M.J.J.M.); (A.E.S.)
- Andrew D. Ellington
- Institute for Cell and Molecular Biology, University of Texas, Austin, TX 78712, USA; (J.P.); (S.M.); (P.E.); (A.D.E.)
- Patrick M. Schaeffer
- Molecular and Cell Biology, College of Public Health, Medical and Veterinary Sciences, James Cook University, Douglas, QLD 4811, Australia; (C.J.T.); (M.J.J.M.); (A.E.S.)
- Centre of Tropical Bioinformatics and Molecular Biology, James Cook University, Douglas, QLD 4811, Australia
- Correspondence: ; Tel.: +61-(0)-7-4781-4448; Fax: +61-(0)-7-4781-6078
4
Baquero F, Martínez JL, Lanza VF, Rodríguez-Beltrán J, Galán JC, San Millán A, Cantón R, Coque TM. Evolutionary Pathways and Trajectories in Antibiotic Resistance. Clin Microbiol Rev 2021; 34:e0005019. [PMID: 34190572 PMCID: PMC8404696 DOI: 10.1128/cmr.00050-19] [Citation(s) in RCA: 95] [Impact Index Per Article: 23.8] [Indexed: 12/15/2022] Open
Abstract
Evolution is the hallmark of life. Descriptions of the evolution of microorganisms have provided a wealth of information, but knowledge regarding "what happened" has precluded a deeper understanding of "how" evolution has proceeded, as in the case of antimicrobial resistance. The difficulty in answering the "how" question lies in the multihierarchical dimensions of evolutionary processes, nested in complex networks, encompassing all units of selection, from genes to communities and ecosystems. At the simplest ontological level (as resistance genes), evolution proceeds by random (mutation and drift) and directional (natural selection) processes; however, sequential pathways of adaptive variation can occasionally be observed, and under fixed circumstances (particular fitness landscapes), evolution is predictable. At the highest level (such as that of plasmids, clones, species, microbiotas), the systems' degrees of freedom increase dramatically, related to the variable dispersal, fragmentation, relatedness, or coalescence of bacterial populations, depending on heterogeneous and changing niches and selective gradients in complex environments. Evolutionary trajectories of antibiotic resistance find their way in these changing landscapes subjected to random variations, becoming highly entropic and therefore unpredictable. However, experimental, phylogenetic, and ecogenetic analyses reveal preferential frequented paths (highways) where antibiotic resistance flows and propagates, allowing some understanding of evolutionary dynamics, modeling and designing interventions. Studies on antibiotic resistance have an applied aspect in improving individual health, One Health, and Global Health, as well as an academic value for understanding evolution. Most importantly, they have a heuristic significance as a model to reduce the negative influence of anthropogenic effects on the environment.
Affiliation(s)
- F. Baquero
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. L. Martínez
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- V. F. Lanza
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- Central Bioinformatics Unit, Ramón y Cajal Institute for Health Research (IRYCIS), Madrid, Spain
- J. Rodríguez-Beltrán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- J. C. Galán
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- A. San Millán
- National Center for Biotechnology (CNB-CSIC), Madrid, Spain
- R. Cantón
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
- T. M. Coque
- Department of Microbiology, Ramón y Cajal University Hospital, Ramón y Cajal Institute for Health Research (IRYCIS), Network Center for Research in Epidemiology and Public Health (CIBERESP), Madrid, Spain
5
Chromosomal Recombination Targets in Chlamydia Interspecies Lateral Gene Transfer. J Bacteriol 2019; 201:JB.00365-19. [PMID: 31501285 DOI: 10.1128/jb.00365-19] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 05/30/2019] [Accepted: 09/03/2019] [Indexed: 12/12/2022] Open
Abstract
Lateral gene transfer (LGT) among Chlamydia trachomatis strains is common, in both isolates generated in the laboratory and those examined directly from patients. In contrast, there are very few examples of recent acquisition of DNA by any Chlamydia spp. from any other species. Interspecies LGT in this system was analyzed using crosses of tetracycline (Tc)-resistant C. trachomatis L2/434 and chloramphenicol (Cam)-resistant C. muridarum VR-123. Parental C. muridarum strains were created using a plasmid-based Himar transposition system, which led to integration of the Camr marker randomly across the chromosome. Fragments encompassing 79% of the C. muridarum chromosome were introduced into a C. trachomatis background, with the total coverage contained on 142 independent recombinant clones. Genome sequence analysis of progeny strains identified candidate recombination hot spots, a property not consistent with in vitro C. trachomatis × C. trachomatis (intraspecies) crosses. In both interspecies and intraspecies crosses, there were examples of duplications, mosaic recombination endpoints, and recombined sequences that were not linked to the selection marker. Quantitative analysis of the distribution and constitution of inserted sequences indicated that there are different constraints on interspecies LGT than on intraspecies crosses. These constraints may help explain why there is so little evidence of interspecies genetic exchange in this system, which is in contrast to very widespread intraspecies exchange in C. trachomatis. IMPORTANCE Genome sequence analysis has demonstrated that there is widespread lateral gene transfer among strains within the species C. trachomatis and with other closely related Chlamydia species in laboratory experiments. This is in contrast to the complete absence of foreign DNA in the genomes of sequenced clinical C. trachomatis strains. There is no understanding of any mechanisms of genetic transfer in this important group of pathogens. In this report, we demonstrate that interspecies genetic exchange can occur but that the nature of the fragments exchanged is different than those observed in intraspecies crosses. We also generated a large hybrid strain library that can be exploited to examine important aspects of chlamydial disease.
6
Complete mitochondrial genome of Ophichthus brevicaudatus reveals novel gene order and phylogenetic relationships of Anguilliformes. Int J Biol Macromol 2019; 135:609-618. [PMID: 31132441 DOI: 10.1016/j.ijbiomac.2019.05.139] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.2] [Received: 03/19/2019] [Revised: 04/30/2019] [Accepted: 05/21/2019] [Indexed: 11/20/2022]
Abstract
Generally, a teleostean group possesses only one type, or a set of similar types, of mitochondrial gene arrangement. However, two types of gene arrangement have been identified in the mitochondrial genomes (mitogenomes) of Anguilliformes. Here, a newly sequenced mitogenome of Ophichthus brevicaudatus (Anguilliformes; Ophichthidae) is presented. The total length of the O. brevicaudatus mitogenome was 17,773 bp, and it contained 13 protein-coding genes (PCGs), two ribosomal RNA (rRNA) genes, 22 transfer RNA (tRNA) genes, and two identical control regions (CRs). The gene order differed from that of typical vertebrate mitogenomes. The gene ND6 and the conjoint trnE were translocated to a position between trnT and trnP, and one of the duplicated CRs was translocated upstream of ND6. The duplication-random loss model was adopted to explain the gene rearrangement events in this mitogenome. The most comprehensive phylogenetic trees of Anguilliformes based on complete mitogenomes were constructed. The non-monophyly of Congridae was well supported, whereas the non-monophyly of Derichthyidae and Chlopsidae was not supported. These results provide insight into gene arrangement features of anguilliform mitogenomes and lay the foundation for further phylogenetic studies on Anguilliformes.
7
Compositional dynamics and codon usage pattern of BRCA1 gene across nine mammalian species. Genomics 2019; 111:167-176. [DOI: 10.1016/j.ygeno.2018.01.013] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 09/01/2017] [Revised: 12/22/2017] [Accepted: 01/22/2018] [Indexed: 11/19/2022]
8
Joesch-Cohen LM, Robinson M, Jabbari N, Lausted CG, Glusman G. Novel metrics for quantifying bacterial genome composition skews. BMC Genomics 2018; 19:528. [PMID: 29996771 PMCID: PMC6042203 DOI: 10.1186/s12864-018-4913-5] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 01/16/2018] [Accepted: 07/02/2018] [Indexed: 11/17/2022] Open
Abstract
Background Bacterial genomes have characteristic compositional skews, which are differences in nucleotide frequency between the leading and lagging DNA strands across a segment of a genome. It is thought that these strand asymmetries arise as a result of mutational biases and selective constraints, particularly for energy efficiency. Analysis of compositional skews in a diverse set of bacteria provides a comparative context in which mutational and selective environmental constraints can be studied. These analyses typically require finished and well-annotated genomic sequences. Results We present three novel metrics for examining genome composition skews; all three metrics can be computed for unfinished or partially-annotated genomes. The first two metrics, (dot-skew and cross-skew) depend on sequence and gene annotation of a single genome, while the third metric (residual skew) highlights unusual genomes by subtracting a GC content-based model of a library of genome sequences. We applied these metrics to 7738 available bacterial genomes, including partial drafts, and identified outlier species. A phylogenetically diverse set of these outliers (i.e., Borrelia, Ehrlichia, Kinetoplastibacterium, and Phytoplasma) display similar skew patterns but share lifestyle characteristics, such as intracellularity and biosynthetic dependence on their hosts. Conclusions Our novel metrics appear to reflect the effects of biosynthetic constraints and adaptations to life within one or more hosts on genome composition. We provide results for each analyzed genome, software and interactive visualizations at http://db.systemsbiology.net/gestalt/skew_metrics. Electronic supplementary material The online version of this article (10.1186/s12864-018-4913-5) contains supplementary material, which is available to authorized users.
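The abstract above defines compositional skew as a difference in nucleotide frequency between strands across a genome segment. As an illustration only, here is a minimal sketch of the classic GC skew, (G - C) / (G + C), computed per window along one strand. This is not the paper's dot-skew, cross-skew, or residual skew; the function names and window parameters are assumptions made for the example.

```python
from itertools import accumulate


def gc_skew(seq: str) -> float:
    """Classic GC skew, (G - C) / (G + C), for one stretch of sequence."""
    s = seq.upper()
    g, c = s.count("G"), s.count("C")
    return 0.0 if g + c == 0 else (g - c) / (g + c)


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC skew in consecutive windows; a trailing partial window is dropped."""
    if window <= 0 or step <= 0:
        raise ValueError("window and step must be positive")
    return [gc_skew(seq[i:i + window])
            for i in range(0, len(seq) - window + 1, step)]


def cumulative_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """Running sum of windowed skew values along the sequence."""
    return list(accumulate(windowed_gc_skew(seq, window, step)))
```

In many bacterial chromosomes the extremes of the cumulative curve are commonly used to approximate the replication origin and terminus, which is why the running sum is included alongside the per-window values.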
Affiliation(s)
- Lena M Joesch-Cohen
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA; Brown University, Providence, RI, 02912, USA
- Max Robinson
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Neda Jabbari
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA
- Gustavo Glusman
- Institute for Systems Biology, 401 Terry Ave N, Seattle, WA, 98109, USA.
9
Paul P, Malakar AK, Chakraborty S. Compositional bias coupled with selection and mutation pressure drives codon usage in Brassica campestris genes. Food Sci Biotechnol 2017; 27:725-733. [PMID: 30263798 DOI: 10.1007/s10068-017-0285-x] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 07/29/2017] [Revised: 11/28/2017] [Accepted: 12/03/2017] [Indexed: 11/25/2022] Open
Abstract
The plant Brassica campestris includes the vegetables turnip and Chinese cabbage, both economically important crops. Here, we have analysed the codon usage bias of B. campestris for 116 protein coding genes. Neutrality analysis showed that B. campestris had a wide range of GC3s, and a significant correlation was observed between GC12 and GC3. The Nc versus GC3s plot showed a few genes on or close to the expected curve, but the majority of points were scattered far from it. Correspondence analysis on codon usage revealed that the position of codons in multidimensional space depends entirely on the presence of A and T at the synonymous third codon position. Taken together, these results suggest that composition bias along with selection (major) and mutation pressure (minor) affects the codon usage pattern of the protein coding genes in Brassica campestris.
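The neutrality analysis mentioned above compares GC content at the first two codon positions (GC12) with GC content at the third position (GC3). A minimal sketch of those two quantities for a single in-frame coding sequence follows. The function name is hypothetical, and the sketch computes plain GC3 over all codons rather than GC3s, which is restricted to synonymous third positions.

```python
def gc12_and_gc3(cds: str) -> tuple[float, float]:
    """Return (GC12, GC3) for an in-frame coding sequence.

    GC12 is the mean GC fraction of codon positions 1 and 2;
    GC3 is the GC fraction of codon position 3.
    Trailing bases that do not complete a codon are ignored.
    """
    s = cds.upper()
    usable = len(s) - len(s) % 3
    n_codons = usable // 3
    if n_codons == 0:
        raise ValueError("sequence must contain at least one full codon")
    counts = [0, 0, 0]  # G/C counts at codon positions 1, 2, 3
    for i in range(usable):
        if s[i] in "GC":
            counts[i % 3] += 1
    gc1, gc2, gc3 = (c / n_codons for c in counts)
    return (gc1 + gc2) / 2, gc3
```

A neutrality plot would collect one such pair per gene and regress GC12 on GC3; a slope near 1 is conventionally read as mutation pressure dominating, and a slope near 0 as selection dominating.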
Affiliation(s)
- Prosenjit Paul
- Department of Biotechnology, Assam University, Silchar, Assam 788011 India
- Arup Kumar Malakar
- Department of Biotechnology, Assam University, Silchar, Assam 788011 India
10
Kaehler BD. Full reconstruction of non-stationary strand-symmetric models on rooted phylogenies. J Theor Biol 2017; 420:144-151. [PMID: 28286217 DOI: 10.1016/j.jtbi.2017.03.007] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 11/14/2016] [Revised: 03/06/2017] [Accepted: 03/08/2017] [Indexed: 10/20/2022]
Abstract
Understanding the evolutionary relationship among species is of fundamental importance to the biological sciences. The location of the root in any phylogenetic tree is critical as it gives an order to evolutionary events. None of the popular models of nucleotide evolution currently used in likelihood or Bayesian methods are able to infer the location of the root without exogenous information. It is known that the most general Markov models of nucleotide substitution also cannot identify the location of the root or be fitted to multiple sequence alignments with fewer than three sequences. We prove that the location of the root and the full model can be identified and statistically consistently estimated for a non-stationary, strand-symmetric substitution model given a multiple sequence alignment with two or more sequences. We also generalise earlier work to provide a practical means of overcoming the computationally intractable problem of labelling hidden states in a phylogenetic model.
Affiliation(s)
- Benjamin D Kaehler
- Research School of Biology, Australian National University, Canberra, Australian Capital Territory, Australia.
11
Dia N, Lavie L, Faye N, Méténier G, Yeramian E, Duroure C, Toguebaye BS, Frutos R, Niang MN, Vivarès CP, Ben Mamoun C, Cornillot E. Subtelomere organization in the genome of the microsporidian Encephalitozoon cuniculi: patterns of repeated sequences and physicochemical signatures. BMC Genomics 2016; 17:34. [PMID: 26744270 PMCID: PMC4704409 DOI: 10.1186/s12864-015-1920-7] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 01/28/2015] [Accepted: 09/11/2015] [Indexed: 12/23/2022] Open
Abstract
Background The microsporidian Encephalitozoon cuniculi is an obligate intracellular eukaryotic pathogen with a small nuclear genome (2.9 Mbp) consisting of 11 chromosomes. Although each chromosome end is known to contain a single rDNA unit, the incomplete assembly of subtelomeric regions following sequencing of the genome identified only 3 of the 22 expected rDNA units. While chromosome end assembly remains a difficult process in most eukaryotic genomes, it is of significant importance for pathogens because these regions encode factors important for virulence and host evasion. Results Here we report the first complete assembly of E. cuniculi chromosome ends, and describe a novel mosaic structure of segmental duplications (EXT repeats) in these regions. EXT repeats range in size between 3.5 and 23.8 kbp and contain four multigene families encoding membrane associated proteins. Twenty-one recombination sites were identified in the sub-terminal region of E. cuniculi chromosomes. Our analysis suggests that these sites contribute to the diversity of chromosome ends organization through Double Strand Break repair mechanisms. The region containing EXT repeats at chromosome extremities can be differentiated based on gene composition, GC content, recombination sites density and chromosome landscape. Conclusion Together this study provides the complete structure of the chromosome ends of E. cuniculi GB-M1, and identifies important factors, which could play a major role in parasite diversity and host-parasite interactions. Comparison with other eukaryotic genomes suggests that terminal regions could be distinguished precisely based on gene content, genetic instability and base composition bias. The diversity of processes associated with chromosome extremities and their biological consequences, as they are presented in the present study, emphasize the fact that great effort will be necessary in the future to characterize more carefully these regions during whole genome sequencing efforts. Electronic supplementary material The online version of this article (doi:10.1186/s12864-015-1920-7) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Ndongo Dia
- Unité de Virologie Médicale, Institut Pasteur de Dakar, 36 Avenue Pasteur, B.P. 220, Dakar, Sénégal.
- Laurence Lavie
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes, Génome et Environnement, UMR 6023, CNRS, 63177, Aubière, France.
- Ngor Faye
- Laboratoire de Parasitologie Générale, Département de Biologie Animale, Faculté des Sciences et Technologies, Université Cheikh Anta Diop, Dakar, Sénégal.
- Guy Méténier
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes, Génome et Environnement, UMR 6023, CNRS, 63177, Aubière, France.
- Edouard Yeramian
- Unité de Bioinformatique Structurale, UMR 3528 CNRS, Institut Pasteur, 25-28, rue du Dr Roux, 75015, Paris, France.
- Christophe Duroure
- Laboratoire de Météorologie Physique, OPGC UMR 6016 CNRS-Université Blaise Pascal, 24 Avenue des Landais, 63177, Aubière Cedex, France.
- Bhen S Toguebaye
- Laboratoire de Parasitologie Générale, Département de Biologie Animale, Faculté des Sciences et Technologies, Université Cheikh Anta Diop, Dakar, Sénégal.
- Roger Frutos
- CIRAD, UMR 17, Cirad-Ird, TA-A17/G, Campus International de Baillarguet, 34398, Montpellier, France.
- Mbayame N Niang
- Unité de Virologie Médicale, Institut Pasteur de Dakar, 36 Avenue Pasteur, B.P. 220, Dakar, Sénégal.
- Christian P Vivarès
- Clermont Université, Université Blaise Pascal, Laboratoire Microorganismes, Génome et Environnement, UMR 6023, CNRS, 63177, Aubière, France.
- Choukri Ben Mamoun
- Section of Infectious Disease and Department of Microbial Pathogenesis, Winchester Building WWW403D, Yale School of Medicine, 15 York St., New Haven, CT, 06520, USA.
- Emmanuel Cornillot
- Institut de Recherche en Cancérologie de Montpellier, IRCM - INSERM U1194 & Université de Montpellier & ICM, Institut régional du Cancer Montpellier, Campus Val d'Aurelle, 34298, Montpellier cedex 5, France; Institut de Biologie Computationnelle, IBC, Campus Saint Priest, 34090, Montpellier, France.
12
Touchon M, Rocha EPC. Coevolution of the Organization and Structure of Prokaryotic Genomes. Cold Spring Harb Perspect Biol 2016; 8:a018168. [PMID: 26729648 DOI: 10.1101/cshperspect.a018168] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.2] [Indexed: 11/25/2022]
Abstract
The cytoplasm of prokaryotes contains many molecular machines interacting directly with the chromosome. These vital interactions depend on the chromosome structure, as a molecule, and on the genome organization, as a unit of genetic information. Strong selection for the organization of the genetic elements implicated in these interactions drives replicon ploidy, gene distribution, operon conservation, and the formation of replication-associated traits. The genomes of prokaryotes are also very plastic with high rates of horizontal gene transfer and gene loss. The evolutionary conflicts between plasticity and organization lead to the formation of regions with high genetic diversity whose impact on chromosome structure is poorly understood. Prokaryotic genomes are remarkable documents of natural history because they carry the imprint of all of these selective and mutational forces. Their study allows a better understanding of molecular mechanisms, their impact on microbial evolution, and how they can be tinkered in synthetic biology.
Affiliation(s)
- Marie Touchon
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France; CNRS, UMR3525, 75015 Paris, France
- Eduardo P C Rocha
- Microbial Evolutionary Genomics, Institut Pasteur, 75015 Paris, France; CNRS, UMR3525, 75015 Paris, France
|
13
|
Uddin A, Chakraborty S. Synonymous codon usage pattern in mitochondrial CYB gene in pisces, aves, and mammals. Mitochondrial DNA A DNA Mapp Seq Anal 2015; 28:187-196. [DOI: 10.3109/19401736.2015.1115842] [Citation(s) in RCA: 24] [Impact Index Per Article: 2.4] [Indexed: 12/16/2022]
Affiliation(s)
- Arif Uddin
- Department of Biotechnology, Assam University, Silchar, Assam, India
|
14
|
Lewis SC, Joers P, Willcox S, Griffith JD, Jacobs HT, Hyman BC. A rolling circle replication mechanism produces multimeric lariats of mitochondrial DNA in Caenorhabditis elegans. PLoS Genet 2015; 11:e1004985. [PMID: 25693201 PMCID: PMC4334201 DOI: 10.1371/journal.pgen.1004985] [Citation(s) in RCA: 33] [Impact Index Per Article: 3.3] [Received: 08/20/2014] [Accepted: 01/05/2015] [Indexed: 11/24/2022] Open
Abstract
Mitochondrial DNA (mtDNA) encodes respiratory complex subunits essential to almost all eukaryotes; hence respiratory competence requires faithful duplication of this molecule. However, the mechanism(s) of its synthesis remain hotly debated. Here we have developed Caenorhabditis elegans as a convenient animal model for the study of metazoan mtDNA synthesis. We demonstrate that C. elegans mtDNA replicates exclusively by a phage-like mechanism, in which multimeric molecules are synthesized from a circular template. In contrast to previous mammalian studies, we found that mtDNA synthesis in the C. elegans gonad produces branched-circular lariat structures with multimeric DNA tails; we were able to detect multimers up to four mtDNA genome unit lengths. Further, we did not detect elongation from a displacement-loop or analogue of 7S DNA, suggesting a clear difference from human mtDNA in regard to the site(s) of replication initiation. We also identified cruciform mtDNA species that are sensitive to cleavage by the resolvase RusA; we suggest these four-way junctions may have a role in concatemer-to-monomer resolution. Overall these results indicate that mtDNA synthesis in C. elegans does not conform to any previously documented metazoan mtDNA replication mechanism, but instead are strongly suggestive of rolling circle replication, as employed by bacteriophages. As several components of the metazoan mitochondrial DNA replisome are likely phage-derived, these findings raise the possibility that the rolling circle mtDNA replication mechanism may be ancestral among metazoans.
Affiliation(s)
- Samantha C. Lewis
- Department of Biology and Interdepartmental Graduate Program in Genetics, Genomics and Bioinformatics, University of California Riverside, Riverside, California, United States of America
- BioMediTech and Tampere University Hospital, University of Tampere, Tampere, Finland
- Priit Joers
- BioMediTech and Tampere University Hospital, University of Tampere, Tampere, Finland
- Estonian Biocentre, Tartu, Estonia
- Smaranda Willcox
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Jack D. Griffith
- Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, Chapel Hill, North Carolina, United States of America
- Howard T. Jacobs
- BioMediTech and Tampere University Hospital, University of Tampere, Tampere, Finland
- Molecular Neurology Research Program, University of Helsinki, Helsinki, Finland
- Bradley C. Hyman
- Department of Biology and Interdepartmental Graduate Program in Genetics, Genomics and Bioinformatics, University of California Riverside, Riverside, California, United States of America
|
15
|
Hyrien O, Rappailles A, Guilbaud G, Baker A, Chen CL, Goldar A, Petryk N, Kahli M, Ma E, d'Aubenton-Carafa Y, Audit B, Thermes C, Arneodo A. From simple bacterial and archaeal replicons to replication N/U-domains. J Mol Biol 2013; 425:4673-89. [PMID: 24095859 DOI: 10.1016/j.jmb.2013.09.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 08/12/2013] [Revised: 09/15/2013] [Accepted: 09/19/2013] [Indexed: 10/26/2022]
Abstract
The Replicon Theory proposed 50 years ago has proven to apply for replicons of the three domains of life. Here, we review our knowledge of genome organization into single and multiple replicons in bacteria, archaea and eukarya. Bacterial and archaeal replicator/initiator systems are quite specific and efficient, whereas eukaryotic replicons show degenerate specificity and efficiency, allowing for complex regulation of origin firing time. We expand on recent evidence that ~50% of the human genome is organized as ~1,500 megabase-sized replication domains with a characteristic parabolic (U-shaped) replication timing profile and linear (N-shaped) gradient of replication fork polarity. These N/U-domains correspond to self-interacting segments of the chromatin fiber bordered by open chromatin zones and replicate by cascades of origin firing initiating at their borders and propagating to their center, possibly by fork-stimulated initiation. The conserved occurrence of this replication pattern in the germline of mammals has resulted over evolutionary times in the formation of megabase-sized domains with an N-shaped nucleotide compositional skew profile due to replication-associated mutational asymmetries. Overall, these results reveal an evolutionarily conserved but developmentally plastic organization of replication that is driving mammalian genome evolution.
Affiliation(s)
- Olivier Hyrien
- Ecole Normale Supérieure, IBENS UMR8197 U1024, Paris 75005, France.
|
16
|
Marsolier-Kergoat MC. Asymmetry indices for analysis and prediction of replication origins in eukaryotic genomes. PLoS One 2012; 7:e45050. [PMID: 23028755 PMCID: PMC3459929 DOI: 10.1371/journal.pone.0045050] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.8] [Received: 05/15/2012] [Accepted: 08/15/2012] [Indexed: 01/15/2023] Open
Abstract
DNA replication was recently shown to induce the formation of compositional skews in the genomes of the yeasts Saccharomyces cerevisiae and Kluyveromyces lactis. In this work, I have characterized further GC and TA skew variations in the vicinity of S. cerevisiae replication origins and termination sites, and defined asymmetry indices for origin analysis and prediction. The presence of skew jumps at some termination sites in the S. cerevisiae genome was established. The majority of S. cerevisiae replication origins are marked by an oriented consensus sequence called ACS, but no evidence could be found for asymmetric origin firing that would be linked to ACS orientation. Asymmetry indices related to GC and TA skews were defined, and a global asymmetry index IGC,TA was described. IGC,TA was found to strongly correlate with origin efficiency in S. cerevisiae and to allow the determination of sets of intergenes significantly enriched in origin loci. The generalized use of asymmetry indices for origin prediction in naive genomes implies the determination of the direction of the skews, i.e. the identification of which strand, leading or lagging, is enriched in G and which one is enriched in T. Recent work indicates that in Candida albicans and in several related species, centromeres contain early and efficient replication origins. It has been proposed that the skew jumps observed at these positions would reflect the activity of these origins, thus allowing to determine the direction of the skews in these genomes. However, I show here that the skew jumps at C. albicans centromeres are not related to replication and that replication-associated GC and TA skews in C. albicans have in fact the opposite directions of what was proposed.
|
17
|
Arakawa K, Tomita M. Measures of compositional strand bias related to replication machinery and its applications. Curr Genomics 2012; 13:4-15. [PMID: 22942671 PMCID: PMC3269016 DOI: 10.2174/138920212799034749] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 07/23/2011] [Revised: 09/10/2011] [Accepted: 09/20/2011] [Indexed: 11/22/2022] Open
Abstract
The compositional asymmetry of complementary bases in nucleotide sequences implies the existence of a mutational or selectional bias in the two strands of the DNA duplex, which is commonly shaped by strand-specific mechanisms in transcription or replication. Such strand bias in genomes, frequently visualized by GC skew graphs, is used for the computational prediction of transcription start sites and replication origins, as well as for comparative evolutionary genomics studies. The use of measures of compositional strand bias in order to quantify the degree of strand asymmetry is crucial, as it is the basis for determining the applicability of compositional analysis and comparing the strength of the mutational bias in different biological machineries in various species. Here, we review the measures of strand bias that have been proposed to date, including the ∆GC skew, the B1 index, the predictability score of linear discriminant analysis for gene orientation, the signal-to-noise ratio of the oligonucleotide bias, and the GC skew index. These measures have been predominantly designed for and applied to the analysis of replication-related mutational processes in prokaryotes, but we also give research examples in eukaryotes.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
|
18
|
Incidence of genome structure, DNA asymmetry, and cell physiology on T-DNA integration in chromosomes of the phytopathogenic fungus Leptosphaeria maculans. G3-GENES GENOMES GENETICS 2012; 2:891-904. [PMID: 22908038 PMCID: PMC3411245 DOI: 10.1534/g3.112.002048] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.2] [Received: 01/21/2012] [Accepted: 06/07/2012] [Indexed: 11/18/2022]
Abstract
The ever-increasing generation of sequence data is accompanied by unsatisfactory functional annotation, and complex genomes, such as those of plants and filamentous fungi, show a large number of genes with no predicted or known function. For functional annotation of unknown or hypothetical genes, the production of collections of mutants using Agrobacterium tumefaciens–mediated transformation (ATMT) associated with genotyping and phenotyping has gained wide acceptance. ATMT is also widely used to identify pathogenicity determinants in pathogenic fungi. A systematic analysis of T-DNA borders was performed in an ATMT-mutagenized collection of the phytopathogenic fungus Leptosphaeria maculans to evaluate the features of T-DNA integration in its particular transposable element-rich compartmentalized genome. A total of 318 T-DNA tags were recovered and analyzed for biases in chromosome and genic compartments, existence of CG/AT skews at the insertion site, and occurrence of microhomologies between the T-DNA left border (LB) and the target sequence. Functional annotation of targeted genes was done using the Gene Ontology annotation. The T-DNA integration mainly targeted gene-rich, transcriptionally active regions, and it favored biological processes consistent with the physiological status of a germinating spore. T-DNA integration was strongly biased toward regulatory regions, and mainly promoters. Consistent with the T-DNA intranuclear-targeting model, the density of T-DNA insertion correlated with CG skew near the transcription initiation site. The existence of microhomologies between promoter sequences and the T-DNA LB flanking sequence was also consistent with T-DNA integration to host DNA mediated by homologous recombination based on the microhomology-mediated end-joining pathway.
|
19
|
Kono N, Arakawa K, Tomita M. Validation of bacterial replication termination models using simulation of genomic mutations. PLoS One 2012; 7:e34526. [PMID: 22509315 PMCID: PMC3317982 DOI: 10.1371/journal.pone.0034526] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 11/16/2011] [Accepted: 03/05/2012] [Indexed: 11/21/2022] Open
Abstract
In bacterial circular chromosomes and most plasmids, the replication is known to be terminated when either of the following occurs: the forks progressing in opposite directions meet at the distal end of the chromosome or the replication forks become trapped by Tus proteins bound to Ter sites. Most bacterial genomes have various polarities in their genomic structures. The most notable feature is polar genomic compositional asymmetry of the bases G and C in the leading and lagging strands, called GC skew. This asymmetry is caused by replication-associated mutation bias, and this "footprint" of the replication machinery suggests that, in contrast to the two known mechanisms, replication termination occurs near the chromosome dimer resolution site dif. To understand this difference between the known replication machinery and genomic compositional bias, we undertook a simulation study of genomic mutations, and we report here how different replication termination models contribute to the generation of replication-related genomic compositional asymmetry. Contrary to naive expectations, our results show that a single finite termination site at dif or at the GC skew shift point is not sufficient to reconstruct the genomic compositional bias as observed in published sequences. The results also show that the known replication mechanisms are sufficient to explain the position of the GC skew shift point.
Affiliation(s)
- Nobuaki Kono
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa, Japan
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa, Japan
- Masaru Tomita
- Institute for Advanced Biosciences, Keio University, Fujisawa, Kanagawa, Japan
|
20
|
Raj A, Dewar M, Palacios G, Rabadan R, Wiggins CH. Identifying hosts of families of viruses: a machine learning approach. PLoS One 2011; 6:e27631. [PMID: 22174744 PMCID: PMC3235098 DOI: 10.1371/journal.pone.0027631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Received: 06/11/2011] [Accepted: 10/20/2011] [Indexed: 01/02/2023] Open
Abstract
Identifying emerging viral pathogens and characterizing their transmission is essential to developing effective public health measures in response to an epidemic. Phylogenetics, though currently the most popular tool used to characterize the likely host of a virus, can be ambiguous when studying species very distant to known species and when there is very little reliable sequence information available in the early stages of the outbreak of disease. Motivated by an existing framework for representing biological sequence information, we learn sparse, tree-structured models, built from decision rules based on subsequences, to predict viral hosts from protein sequence data using popular discriminative machine learning tools. Furthermore, the predictive motifs robustly selected by the learning algorithm are found to show strong host-specificity and occur in highly conserved regions of the viral proteome.
Affiliation(s)
- Anil Raj
- Department of Applied Physics and Applied Mathematics, Columbia University, New York, New York, United States of America.
|
21
|
CAGO: a software tool for dynamic visual comparison and correlation measurement of genome organization. PLoS One 2011; 6:e27080. [PMID: 22114666 PMCID: PMC3219657 DOI: 10.1371/journal.pone.0027080] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 06/16/2011] [Accepted: 10/10/2011] [Indexed: 11/26/2022] Open
Abstract
CAGO (Comparative Analysis of Genome Organization) is developed to address two critical shortcomings of conventional genome atlas plotters: lack of dynamic exploratory functions and absence of signal analysis for genomic properties. With dynamic exploratory functions, users can directly manipulate chromosome tracks of a genome atlas and intuitively identify distinct genomic signals by visual comparison. Signal analysis of genomic properties can further detect inconspicuous patterns from noisy genomic properties and calculate correlations between genomic properties across various genomes. To implement dynamic exploratory functions, CAGO presents each genome atlas in Scalable Vector Graphics (SVG) format and allows users to interact with it using a SVG viewer through JavaScript. Signal analysis functions are implemented using R statistical software and a discrete wavelet transformation package waveslim. CAGO is not only a plotter for generating complex genome atlases, but also a platform for exploring genome atlases with dynamic exploratory functions for visual comparison and with signal analysis for comparing genomic properties across multiple organisms. The web-based application of CAGO, its source code, user guides, video demos, and live examples are publicly available and can be accessed at http://cbs.ym.edu.tw/cago.
|
22
|
Suzuki H, Lefébure T, Hubisz MJ, Pavinski Bitar P, Lang P, Siepel A, Stanhope MJ. Comparative genomic analysis of the Streptococcus dysgalactiae species group: gene content, molecular adaptation, and promoter evolution. Genome Biol Evol 2011; 3:168-85. [PMID: 21282711 PMCID: PMC3056289 DOI: 10.1093/gbe/evr006] [Citation(s) in RCA: 41] [Impact Index Per Article: 2.9] [Indexed: 11/13/2022] Open
Abstract
Comparative genomics of closely related bacterial species with different pathogenesis and host preference can provide a means of identifying the specifics of adaptive differences. Streptococcus dysgalactiae (SD) is comprised of two subspecies: S. dysgalactiae subsp. equisimilis is both a human commensal organism and a human pathogen, and S. dysgalactiae subsp. dysgalactiae is strictly an animal pathogen. Here, we present complete genome sequences for both taxa, with analyses involving other species of Streptococcus but focusing on adaptation in the SD species group. We found little evidence for enrichment in biochemical categories of genes carried by each SD strain, however, differences in the virulence gene repertoire were apparent. Some of the differences could be ascribed to prophage and integrative conjugative elements. We identified approximately 9% of the nonrecombinant core genome to be under positive selection, some of which involved known virulence factors in other bacteria. Analyses of proteomes by pooling data across genes, by biochemical category, clade, or branch, provided evidence for increased rates of evolution in several gene categories, as well as external branches of the tree. Promoters were primarily evolving under purifying selection but with certain categories of genes evolving faster. Many of these fast-evolving categories were the same as those associated with rapid evolution in proteins. Overall, these results suggest that adaptation to changing environments and new hosts in the SD species group has involved the acquisition of key virulence genes along with selection of orthologous protein-coding loci and operon promoters.
Affiliation(s)
- Haruo Suzuki
- Department of Population Medicine and Diagnostic Sciences, College of Veterinary Medicine, Cornell University, Ithaca, New York
|
23
|
Polak P, Querfurth R, Arndt PF. The evolution of transcription-associated biases of mutations across vertebrates. BMC Evol Biol 2010; 10:187. [PMID: 20565875 PMCID: PMC2927911 DOI: 10.1186/1471-2148-10-187] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.1] [Received: 07/07/2009] [Accepted: 06/18/2010] [Indexed: 02/03/2024] Open
Abstract
Background The interplay between transcription and mutational processes can lead to particular mutation patterns in transcribed regions of the genome. Transcription introduces several biases in mutational patterns; in particular it invokes strand specific mutations. In order to understand the forces that have shaped transcripts during evolution, one has to study mutation patterns associated with transcription across animals. Results Using multiple alignments of related species we estimated the regional single-nucleotide substitution patterns along genes in four vertebrate taxa: primates, rodents, laurasiatheria and bony fishes. Our analysis is focused on intronic and intergenic regions and reveals differences in the patterns of substitution asymmetries between mammals and fishes. In mammals, the levels of asymmetries are stronger for genes starting within CpG islands than in genes lacking this property. In contrast to all other species analyzed, we found a mutational pressure in dog and stickleback, promoting an increase of GC-contents in the proximity to transcriptional start sites. Conclusions We propose that the asymmetric patterns in transcribed regions are results of transcription associated mutagenic processes and transcription coupled repair, which both seem to evolve in a taxon related manner. We also discuss alternative mechanisms that can generate strand biases and involve error-prone DNA polymerases and reverse transcription. A localized increase of the GC content near the transcription start site is a signature of biased gene conversion (BGC) that occurs during recombination and heteroduplex formation. Since dog and stickleback are known to be subject to rapid adaptations due to population bottlenecks and breeding, we further hypothesize that an increase in recombination rates near gene starts has been part of an adaptive process.
Affiliation(s)
- Paz Polak
- Department of Computational Molecular Biology, Max Planck Institute for Molecular Genetics, Berlin, Germany.
|
24
|
Arakawa K, Suzuki H, Tomita M. Quantitative analysis of replication-related mutation and selection pressures in bacterial chromosomes and plasmids using generalised GC skew index. BMC Genomics 2009; 10:640. [PMID: 20042086 PMCID: PMC2804667 DOI: 10.1186/1471-2164-10-640] [Citation(s) in RCA: 29] [Impact Index Per Article: 1.8] [Received: 06/19/2009] [Accepted: 12/30/2009] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Due to their bi-directional replication machinery starting from a single finite origin, bacterial genomes show characteristic nucleotide compositional bias between the two replichores, which can be visualised through GC skew or (C-G)/(C+G). Although this polarisation is used for computational prediction of replication origins in many bacterial genomes, the degree of GC skew visibility varies widely among different species, necessitating a quantitative measurement of GC skew strength in order to provide confidence measures for GC skew-based predictions of replication origins. RESULTS Here we discuss a quantitative index for the measurement of GC skew strength, named the generalised GC skew index (gGCSI), which is applicable to genomes of any length, including bacterial chromosomes and plasmids. We demonstrate that gGCSI is independent of the window size and can thus be used to compare genomes with different sizes, such as bacterial chromosomes and plasmids. It can suggest the existence of different replication mechanisms in archaea and of rolling-circle replication in plasmids. Correlation of gGCSI values between plasmids and their corresponding host chromosomes suggests that within the same strain, these replicons have reproduced using the same replication machinery and thus exhibit similar strengths of replication strand skew. CONCLUSIONS gGCSI can be applied to genomes of any length and thus allows comparative study of replication-related mutation and selection pressures in genomes of different lengths such as bacterial chromosomes and plasmids. Using gGCSI, we showed that replication-related mutation or selection pressure is similar for replicons with similar machinery.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, 252-8520, Japan.
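The GC skew quantity named in the abstract above, (C-G)/(C+G), can be illustrated with a minimal sketch. This is not the gGCSI index from the paper, only the per-window skew and its running sum, which is a common way to visualise the shift between replichores. The function names, the window size, and the toy sequence are all assumptions made for illustration, and note that some tools use the opposite sign convention, (G-C)/(G+C).

```python
from typing import List


def gc_skew(seq: str) -> float:
    """Return (C - G) / (C + G) for a sequence, or 0.0 if it contains no G or C."""
    s = seq.upper()
    c, g = s.count("C"), s.count("G")
    return (c - g) / (c + g) if (c + g) else 0.0


def windowed_gc_skew(seq: str, window: int) -> List[float]:
    """GC skew in consecutive non-overlapping windows; a trailing partial window is dropped."""
    return [gc_skew(seq[i:i + window]) for i in range(0, len(seq) - window + 1, window)]


def cumulative_gc_skew(skews: List[float]) -> List[float]:
    """Running sum of window skews; its extremum marks where the skew changes sign."""
    total, out = 0.0, []
    for value in skews:
        total += value
        out.append(total)
    return out


if __name__ == "__main__":
    # Hypothetical toy sequence: a C-rich half followed by a G-rich half.
    toy = "CCCCCCGG" * 4 + "GGGGGGCC" * 4
    skews = windowed_gc_skew(toy, window=8)
    print(skews)
    print(cumulative_gc_skew(skews))
```

On the toy input the window skews switch sign halfway through, and the cumulative curve peaks at the switch, which is the kind of shift point that skew-based origin prediction looks for.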
|
25
|
Polak P, Arndt PF. Long-range bidirectional strand asymmetries originate at CpG islands in the human genome. Genome Biol Evol 2009; 1:189-97. [PMID: 20333189 PMCID: PMC2817419 DOI: 10.1093/gbe/evp024] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.1] [Accepted: 07/22/2009] [Indexed: 12/24/2022] Open
Abstract
In the human genome, CpG islands (CGIs), which are GC- and CpG-rich sequences, are associated with transcription starting sites (TSSs); in addition, there is evidence that CGIs harbor origins of bidirectional replication (OBRs) and are preferred sites for heteroduplex formation during recombination. Transcription, replication, and recombination processes are known to induce specific mutational patterns in various genomes, and therefore, these patterns are expected to be found around CGIs. We use triple alignments of human, chimp, and macaque to compute the rates of nucleotide substitutions in up to 1 Mbps long intergenic regions on both sides of CGIs. Our analysis revealed that around a CGI there is an asymmetry between complementary substitution rates that is similar to the one found around the OBR in bacteria. We hypothesize that these asymmetries are induced by differences in the replication of the leading and lagging strand and that a significant number of CGIs overlap OBRs. Within CGIs, we observed a mutational signature of GC-biased gene conversion that is associated with recombination. We suggest that recombination has played a major role in the creation of CGIs.
Affiliation(s)
- Paz Polak
- Max Planck Institute for Molecular Genetics, Berlin, Germany.
|
26
|
Kvikstad EM, Chiaromonte F, Makova KD. Ride the wavelet: A multiscale analysis of genomic contexts flanking small insertions and deletions. Genome Res 2009; 19:1153-64. [PMID: 19502380 DOI: 10.1101/gr.088922.108] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.6] [Indexed: 11/24/2022]
Abstract
Recent studies have revealed that insertions and deletions (indels) are more different in their formation than previously assumed. What remains enigmatic is how the local DNA sequence context contributes to these differences. To investigate the relative impact of various molecular mechanisms to indel formation, we analyzed sequence contexts of indels in the non protein- or RNA-coding, nonrepetitive (NCNR) portion of the human genome. We considered small (≤30-bp) indels occurring in the human lineage since its divergence from chimpanzee and used wavelet techniques to study, simultaneously for multiple scales, the spatial patterns of short sequence motifs associated with indel mutagenesis. In particular, we focused on motifs associated with DNA polymerase activity, topoisomerase cleavage, double-strand breaks (DSBs), and their repair. We came to the following conclusions. First, many motifs are characterized by unique enrichment profiles in the vicinity of indels vs. indel-free portions of the genome, verifying the importance of sequence context in indel mutagenesis. Second, only limited similarity in motif frequency profiles is evident flanking insertions vs. deletions, confirming differences in their mutagenesis. Third, substantial similarity in frequency profiles exists between pairs of individual motifs flanking insertions (and separately deletions), suggesting "cooperation" among motifs, and thus molecular mechanisms, during indel formation. Fourth, the wavelet analyses demonstrate that all these patterns are highly dependent on scale (the size of an interval considered). Finally, our results depict a model of indel mutagenesis comprising both replication and recombination (via repair of paused replication forks and site-specific recombination).
Affiliation(s)
- Erika M Kvikstad
- Center for Comparative Genomics and Bioinformatics, Penn State University, University Park, Pennsylvania 16802, USA
|
27
|
Duggin IG, Bell SD. Termination structures in the Escherichia coli chromosome replication fork trap. J Mol Biol 2009; 387:532-9. [PMID: 19233209 DOI: 10.1016/j.jmb.2009.02.027] [Citation(s) in RCA: 63] [Impact Index Per Article: 3.9] [Received: 12/02/2008] [Revised: 02/10/2009] [Accepted: 02/12/2009] [Indexed: 11/19/2022]
Abstract
The Escherichia coli chromosome contains two opposed sets of unidirectional DNA replication pause (Ter) sites that, according to the replication fork trap theory, control the termination of chromosome replication by restricting replication fork fusion to the terminus region. In contrast, a recent hypothesis suggested that termination occurs at the dif locus instead. Using two-dimensional agarose gel electrophoresis, we examined DNA replication intermediates at the Ter sites and at dif in wild-type cells. Two definitive signatures of site-specific termination--specific replication fork arrest and converging replication forks--were clearly detected at Ter sites, but not at dif. We also detected a significant pause during the latter stages of replication fork convergence at Ter sites. Quantification of fork pausing at the Ter sites in both their native chromosomal context and the plasmid context further supported the fork trap model.
Affiliation(s)
- Iain G Duggin
- Medical Research Council Cancer Cell Unit, Hutchison-Medical Research Council Research Centre, Hills Road, Cambridge, UK.
|
28
|
Abstract
Many bacterial cellular processes interact intimately with the chromosome. Such interplay is the major driving force of genome structure or organization. Interactions take place at different scales-local for gene expression, global for replication-and lead to the differentiation of the chromosome into organizational units such as operons, replichores, or macrodomains. These processes are intermingled in the cell and create complex higher-level organizational features that are adaptive because they favor the interplay between the processes. The surprising result of selection for genome organization is that gene repertoires change much more quickly than chromosomal structure. Comparative genomics and experimental genomic manipulations are untangling the different cellular and evolutionary mechanisms causing such resilience to change. Since organization results from cellular processes, a better understanding of chromosome organization will help unravel the underlying cellular processes and their diversity.
Affiliation(s)
- Eduardo P C Rocha
- Institut Pasteur, Microbial Evolutionary Genomics, F-75015 Paris, France.
|
29
|
Arakawa K, Tamaki S, Kono N, Kido N, Ikegami K, Ogawa R, Tomita M. Genome Projector: zoomable genome map with multiple views. BMC Bioinformatics 2009; 10:31. [PMID: 19166610 PMCID: PMC2636772 DOI: 10.1186/1471-2105-10-31] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.0] [Received: 11/20/2008] [Accepted: 01/23/2009] [Indexed: 01/24/2023] Open
Abstract
BACKGROUND Molecular biology data exist on diverse scales, from the level of molecules to -omics. At the same time, the data at each scale can be categorised into multiple layers, such as the genome, transcriptome, proteome, metabolome, and biochemical pathways. Due to the highly multi-layer and multi-dimensional nature of biological information, software interfaces for database browsing should be intuitive and allow for rapid migration across different views and scales. The Zoomable User Interface (ZUI) and tabbed browsing have proven successful for this purpose in other areas, especially for navigating the vast information on the World Wide Web. RESULTS This paper presents Genome Projector, a Web-based gateway for genomics information with a zoomable user interface using Google Maps API, equipped with four seamlessly accessible and searchable views: a circular genome map, a traditional genome map, a biochemical pathways map, and a DNA walk map. The Web application for 320 bacterial genomes is available at http://www.g-language.org/GenomeProjector/. All data and software, including the source code, documentation, and development API, are freely available under the GNU General Public License. Zoomable maps can be easily created from any image file using the development API, and an online data mapping service for Genome Projector is also available at our Web site. CONCLUSION Genome Projector is an intuitive Web application for browsing genomics information, implemented with a zoomable user interface and tabbed browsing utilising Google Maps API and Asynchronous JavaScript and XML (AJAX) technology.
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa, 252-8520, Japan.
|
30
|
Duggin IG, Wake RG, Bell SD, Hill TM. The replication fork trap and termination of chromosome replication. Mol Microbiol 2008; 70:1323-33. [PMID: 19019156 DOI: 10.1111/j.1365-2958.2008.06500.x] [Citation(s) in RCA: 82] [Impact Index Per Article: 4.8] [Indexed: 11/28/2022]
Abstract
Bacteria that have a circular chromosome with a bidirectional DNA replication origin are thought to utilize a 'replication fork trap' to control termination of replication. The fork trap is an arrangement of replication pause sites that ensures that the two replication forks fuse within the terminus region of the chromosome, approximately opposite the origin on the circular map. However, the biological significance of the replication fork trap has been mysterious, as its inactivation has no obvious consequence. Here we review the research that led to the replication fork trap theory, and we aim to integrate several recent findings that contribute towards an understanding of the physiological roles of the replication fork trap. Likely roles include the prevention of over-replication, and the optimization of post-replicative mechanisms of chromosome segregation, such as that involving FtsK in Escherichia coli.
Affiliation(s)
- Iain G Duggin
- Sir William Dunn School of Pathology, University of Oxford, Oxford OX1 3RE, UK.
|
31
|
Sernova NV, Gelfand MS. Identification of replication origins in prokaryotic genomes. Brief Bioinform 2008; 9:376-91. [PMID: 18660512 DOI: 10.1093/bib/bbn031] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.8] [Indexed: 11/12/2022] Open
Abstract
The availability of hundreds of complete bacterial genomes has created new challenges and, simultaneously, opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, studies of nucleotide compositional bias and gene bias between strands and replichores paved the way for the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered an essentially bioinformatic problem, where predictions are sufficiently reliable not to run labor-intensive experiments unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using comparative genomic methods.
Affiliation(s)
- Natalia V Sernova
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoi Karetny pereulok, 19, Moscow, 127994, Russia
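The skew-based origin prediction mentioned in the abstract above can be illustrated with a minimal sketch. It is not taken from Sernova and Gelfand's review or from any specific tool; it assumes the common cumulative GC skew convention, in which the windowed skew (G - C)/(G + C) is summed along the sequence, the minimum of the running sum is read as a candidate oriC, and the maximum as a candidate terC. The function names, the window size, and the toy sequence are all hypothetical.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return (window end positions, cumulative GC skew) for a DNA string."""
    seq = seq.upper()
    ends, cumulative = [], []
    total = 0.0
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        if g + c:  # skip windows with no G or C to avoid dividing by zero
            total += (g - c) / (g + c)
        ends.append(min(start + window, len(seq)))
        cumulative.append(total)
    return ends, cumulative


def predict_ori_ter(seq, window=1000):
    """Guess oriC (cumulative skew minimum) and terC (maximum) positions.

    Assumption: the leading strand is G-rich, so the skew changes sign
    at the origin and at the terminus.
    """
    ends, cumulative = cumulative_gc_skew(seq, window)
    indices = range(len(cumulative))
    ori = ends[min(indices, key=cumulative.__getitem__)]
    ter = ends[max(indices, key=cumulative.__getitem__)]
    return ori, ter


# Toy sequence (hypothetical): C-rich up to 3000, G-rich up to 8000, C-rich after.
toy = "CCCG" * 750 + "GGGC" * 1250 + "CCCG" * 500
print(predict_ori_ter(toy, window=1000))  # (3000, 8000)
```

On real chromosomes the skew signal is noisy and some genomes show no clear extrema, which is consistent with the abstract's point that chromosomes differ in how predictable their origins are from skew plots.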
|