1
|
Comparative Mitogenome Analyses Uncover Mitogenome Features and Phylogenetic Implications of the Parrotfishes (Perciformes: Scaridae). Biology 2023; 12:410. [PMID: 36979102 PMCID: PMC10044791 DOI: 10.3390/biology12030410] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/31/2023] [Revised: 02/28/2023] [Accepted: 03/02/2023] [Indexed: 03/09/2023]
Abstract
In order to investigate the molecular evolution of mitogenomes among the family Scaridae, the complete mitogenome sequences of twelve parrotfish species were determined and compared with those of seven other parrotfish species. The comparative analysis revealed that the general features and organization of the mitogenome were similar among the 19 parrotfish species. The base composition was similar among the parrotfishes, with the exception of the genus Calotomus, which exhibited an unusual negative AT skew in the whole mitogenome. The PCGs showed similar codon usage, and all of them underwent a strong purifying selection. The gene rearrangement typical of the parrotfishes was detected, with the tRNAMet inserted between the tRNAIle and tRNAGln, and the tRNAGln was followed by a putative tRNAMet pseudogene. The parrotfish mitogenomes displayed conserved gene overlaps and secondary structure in most tRNA genes, while the non-coding intergenic spacers varied among species. Phylogenetic analysis based on the thirteen PCGs and two rRNAs strongly supported the hypothesis that the parrotfishes could be subdivided into two clades with distinct ecological adaptations. The early divergence of the sea grass and coral reef clades occurred in the late Oligocene, probably related to the expansion of sea grass habitat. Later diversification within the coral reef clade could be dated back to the Miocene, likely associated with the geomorphology alternation since the closing of the Tethys Ocean. This work provided fundamental molecular data that will be useful for species identification, conservation, and further studies on the evolution of parrotfishes.
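The abstract above reports AT skew without defining it. As a minimal sketch, assuming the conventional definitions AT skew = (A − T)/(A + T) and GC skew = (G − C)/(G + C) computed on a single strand (the paper's exact procedure is not given here), the values could be obtained as follows. The function name `base_skews` and the toy sequence are illustrative only.

```python
from collections import Counter


def base_skews(seq: str) -> dict:
    """Return AT skew and GC skew for one strand of a nucleotide sequence.

    Assumes the conventional definitions (A - T)/(A + T) and (G - C)/(G + C).
    Ambiguous bases such as N are ignored.
    """
    counts = Counter(seq.upper())
    a, t, g, c = (counts[base] for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return {"AT_skew": at_skew, "GC_skew": gc_skew}


# Invented toy sequence, not real mitogenome data.
toy = "ATTTGCCCTA"
print(base_skews(toy))  # both skews are negative for this toy input
```

A negative AT skew, as reported for Calotomus, simply means that T outnumbers A on the strand being counted.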
|
2
|
Almirantis Y, Provata A, Li W. Noether's Theorem as a Metaphor for Chargaff's 2nd Parity Rule in Genomics. J Mol Evol 2022; 90:231-238. [PMID: 35704064 DOI: 10.1007/s00239-022-10062-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/26/2022] [Accepted: 05/18/2022] [Indexed: 10/18/2022]
Abstract
In the present note, the genomic compositional rule largely known as 'Chargaff's 2nd parity rule' (asserting equimolarity between Adenine-Thymine and Guanine-Cytosine in any of the two DNA strands) is regarded in association with Noether's theorem linking symmetries with conservation laws in physics. In the case of the genome, the strict physical and mathematical prerequisites of Noether's theorem do not hold. However, we conclude that a metaphor can be established with Noether's theorem, as inter-strand symmetry concerning DNA functionality engenders specific features in genome composition. Inversely, when inter-strand symmetry does not hold, the corresponding quantitative relations fail to appear. This association is also considered from the point of view of the existence of emergent laws and properties in evolutionary genomics.
Affiliation(s)
- Yannis Almirantis
- Theoretical Biology and Computational Genomics Laboratory, Institute of Bioscience and Applications, National Center for Scientific Research "Demokritos", 15341, Athens, Greece.
- Astero Provata
- Statistical Mechanics and Dynamical Systems Laboratory, Institute of Nanoscience and Nanotechnology, National Center for Scientific Research, "Demokritos", 15341, Athens, Greece
- Wentian Li
- The Robert S. Boas Center for Genomics and Human Genetics, The Feinstein Institutes for Medical Research, Northwell Health, Manhasset, NY, USA
|
3
|
Shi M, Qi L, He LS. Comparative Analysis of the Mitochondrial Genome of Galatheanthemum sp. MT-2020 (Actiniaria Galatheanthemidae) From a Depth of 9,462 m at the Mariana Trench. Front Genet 2022; 13:854009. [PMID: 35754826 PMCID: PMC9213748 DOI: 10.3389/fgene.2022.854009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2022] [Accepted: 04/26/2022] [Indexed: 11/30/2022]
Abstract
The hadal zone, which represents the deepest marine habitat on Earth (6,000–11,000 m), is a harsh environment mainly characterized by extremely high hydrostatic pressure, and this habitat is believed to have a high degree of endemism. The deep-sea anemone family Galatheanthemidae comprises two valid species exclusively from the hadal zone; however, no other information about this family is currently available. In the present study, a sea anemone was collected from a depth of 9,462 m at the Mariana Trench and was defined as Galatheanthemum sp. MT-2020 (Actiniaria Galatheanthemidae). The mitochondrial genome of Galatheanthemum sp. MT-2020 was circular, was 16,633 bp in length, and contained two ribosomal RNA genes, 13 protein-coding genes and two transfer RNA genes. The order of the genes of Galatheanthemum sp. MT-2020 was identical to that of the majority of the species of the order Actiniaria. The value of the AT-skew was the lowest in the whole mitochondrial genome, with a positive GC skew value for the atp8 gene, while other species, except Antholoba achates, had negative GC skew values. Galatheanthemum sp. MT-2020 was clustered with another abyssal species, Paraphelliactis xishaensis, in the phylogenetic tree, and these species diverged in the early Jurassic approximately 200 Mya from the shallow-sea species. The usage ratio of valine, which is one of the five amino acids with the strongest barophilic properties, in the mitochondrial genomes of the two abyssal species was significantly higher than that in other species with habitats above the depth of 3,000 m. The ω (dN/dS) ratio of the genomes was 2.45-fold higher than that of the shallow-sea species, indicating a higher evolutionary rate. Overall, the present study is the first to provide a complete mitogenome of sea anemones from the hadal zone and to reveal some characteristics that may be associated with adaptation to an extreme environment.
Affiliation(s)
- Mengke Shi
- Institute of Deep-sea Science and Engineering, Chinese Academy of Sciences, Sanya, China; University of Chinese Academy of Sciences, Beijing, China
- Li Qi
- Institute of Deep-sea Science and Engineering, Chinese Academy of Sciences, Sanya, China; University of Chinese Academy of Sciences, Beijing, China
- Li-Sheng He
- Institute of Deep-sea Science and Engineering, Chinese Academy of Sciences, Sanya, China
|
4
|
Ataeian M, Vadlamani A, Haines M, Mosier D, Dong X, Kleiner M, Strous M, Hawley AK. Proteome and strain analysis of cyanobacterium Candidatus "Phormidium alkaliphilum" reveals traits for success in biotechnology. iScience 2021; 24:103405. [PMID: 34877483 PMCID: PMC8633866 DOI: 10.1016/j.isci.2021.103405] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/28/2021] [Revised: 08/27/2021] [Accepted: 11/03/2021] [Indexed: 11/18/2022]
Abstract
Cyanobacteria encompass a diverse group of photoautotrophic bacteria with important roles in nature and biotechnology. Here we characterized Candidatus “Phormidium alkaliphilum,” an abundant member in alkaline soda lake microbial communities globally. The complete, circular whole-genome sequence of Ca. “P. alkaliphilum” was obtained using combined Nanopore and Illumina sequencing of a Ca. “P. alkaliphilum” consortium. Strain-level diversity of Ca. “P. alkaliphilum” was shown to contribute to photobioreactor robustness under different operational conditions. Comparative genomics of closely related species showed that adaptation to high pH was not attributed to specific genes. Proteomics at high and low pH showed only minimal changes in gene expression, but higher productivity at high pH. Diverse photosystem antennae proteins and a high-affinity terminal oxidase, compared with other soda lake cyanobacteria, appear to contribute to the success of Ca. “P. alkaliphilum” in photobioreactors and biotechnology applications.
Highlights
- Closed genome of the cyanobacteria Ca. P. alkaliphilum from high-pH photobioreactor
- Genetic factors lead this Phormidium to outcompete other cyanobacteria in photobioreactor
- Adaptation to high pH and alkalinity is not linked to specific genes
- Strain-level diversity contributes to Ca. P. alkaliphilum success in changing conditions
Affiliation(s)
- Maryam Ataeian
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Marianne Haines
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Damon Mosier
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Xiaoli Dong
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Manuel Kleiner
- Department of Plant and Microbial Biology, North Carolina State University, Raleigh, NC 27695, USA
- Marc Strous
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- Alyse K. Hawley
- Department of Geoscience, University of Calgary, Calgary, AB, Canada
- School of Engineering, University of British Columbia Okanagan, Kelowna, BC, Canada
- Corresponding author
|
5
|
An interplay between compositional constraint and natural selection dictates the codon usage pattern among select Galliformes. Biosystems 2021; 204:104390. [PMID: 33636205 DOI: 10.1016/j.biosystems.2021.104390] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 01/28/2021] [Accepted: 02/18/2021] [Indexed: 11/20/2022]
Abstract
Galliformes are believed to be the first avian order that started living in human association and became domesticated. Members of this order range from common to rare species. Next-generation sequencing has provided researchers with the whole genome sequences of five Galliformes: chicken, helmeted guinea fowl, turkey, Japanese quail, and peafowl. Bioinformatic analysis based on codon usage, evolution, and species-specific functional enrichment can provide crucial information for understanding their genomic strategies. In this study, we investigated the genomic features of chicken, helmeted guinea fowl, turkey, and Japanese quail. Their genomes were AT biased although the potentially highly expressed genes contained more GC than AT. Cytosine dominated the third position of frequently used optimal codons. Mutational pressures in the analyzed Galliformes were in the range of 0.2-0.6%. Neutrality plot, translational selection index, and mutational responsive index indicated the dominance of selection pressure over mutational pressure among Galliformes. A pair of di-nucleotides, TpA and CpG, was found to be used less frequently than others in protein-coding genes since both of them are associated with the conversion of euchromatin to heterochromatin. Functional enrichment analysis revealed the dominance of proteins associated with fundamental biological processes. In turkey, chicken and helmeted guinea fowl, proteins with immunity-boosting capacity prevailed along with proteins needed for signal transduction and maintenance of the central dogma. Evolutionary analysis indicated a bias towards synonymous over non-synonymous substitution.
|
6
|
Revisiting the Relationships Between Genomic G + C Content, RNA Secondary Structures, and Optimal Growth Temperature. J Mol Evol 2020; 89:165-171. [PMID: 33216148 DOI: 10.1007/s00239-020-09974-w] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 09/16/2020] [Accepted: 11/09/2020] [Indexed: 10/23/2022]
Abstract
Over twenty years ago Galtier and Lobry published a manuscript entitled "Relationships between Genomic G + C Content, RNA Secondary Structure, and Optimal Growth Temperature" in the Journal of Molecular Evolution that showcased the lack of a relationship between genomic G + C content and optimal growth temperature (OGT) in a set of about 200 prokaryotes. Galtier and Lobry also assessed the relationship between RNA secondary structures (rRNA stems, tRNAs) and OGT, and in this case a clear relationship emerged. Increasing structured RNA G + C content (particularly in regions that are double-stranded) correlates with increased OGT. Both of these fundamental relationships have withstood the test of many additional sequences and spawned a variety of applications, including prediction of OGT from rRNA sequence and computational ncRNA identification approaches. In this work, I present the motivation behind Galtier and Lobry's original paper and the larger questions addressed by the work, how these questions have evolved over the last two decades, and the impact of Galtier and Lobry's manuscript in fields beyond these questions.
|
7
|
Blanca L, Christo-Foroux E, Rigou S, Legendre M. Comparative Analysis of the Circular and Highly Asymmetrical Marseilleviridae Genomes. Viruses 2020; 12:E1270. [PMID: 33171839 PMCID: PMC7695187 DOI: 10.3390/v12111270] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 10/07/2020] [Revised: 11/05/2020] [Accepted: 11/05/2020] [Indexed: 12/11/2022]
Abstract
Marseilleviridae members are large dsDNA viruses with icosahedral particles 250 nm in diameter infecting Acanthamoeba. Their 340 to 390 kb genomes encode 450 to 550 protein-coding genes. Since the discovery of marseillevirus (the prototype of the family) in 2009, several strains were isolated from various locations, among which 13 are now fully sequenced. This allows the organization of their genomes to be deciphered through comparative genomics. Here, we first experimentally demonstrate that the Marseilleviridae genomes are circular. We then acknowledge a strong bias in sequence conservation, revealing two distinct genomic regions. One gathers most Marseilleviridae paralogs and has undergone genomic rearrangements, while the other, enriched in core genes, exhibits the opposite pattern. Most of the genes whose protein products compose the viral particles are located in the conserved region. They are also strongly biased toward a late gene expression pattern. We finally discuss the potential advantages of Marseilleviridae having a circular genome, and the possible link between the biased distribution of their genes and the transcription as well as DNA replication mechanisms that remain to be characterized.
Affiliation(s)
- Matthieu Legendre
- CNRS, IGS, Information Génomique & Structurale (UMR7256), Institut de Microbiologie de la Méditerranée (FR 3489), Aix Marseille Univ., 13288 Marseille, France; (L.B.); (E.C.-F.); (S.R.)
|
8
|
Comparative Genomics Unveils Regionalized Evolution of the Faustovirus Genomes. Viruses 2020; 12:577. [PMID: 32456325 PMCID: PMC7290515 DOI: 10.3390/v12050577] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 04/10/2020] [Revised: 05/19/2020] [Accepted: 05/22/2020] [Indexed: 11/17/2022]
Abstract
Faustovirus is a recently discovered genus of large DNA virus infecting the amoeba Vermamoeba vermiformis, which is phylogenetically related to Asfarviridae. To better understand the diversity and evolution of this viral group, we sequenced six novel Faustovirus strains, mined published metagenomic datasets and performed a comparative genomic analysis. Genomic sequences revealed three consistent phylogenetic groups, within which genetic diversity was moderate. The comparison of the major capsid protein (MCP) genes unveiled between 13 and 18 type-I introns that likely evolved through a still-active birth and death process mediated by intron-encoded homing endonucleases that began before the Faustovirus radiation. Genome-wide alignments indicated that despite genomes retaining high levels of gene collinearity, the central region containing the MCP gene together with the extremities of the chromosomes evolved at a faster rate due to increased indel accumulation and local rearrangements. The fluctuation of the nucleotide composition along the Faustovirus (FV) genomes is mostly imprinted by the consistent nucleotide bias of coding sequences and provided no evidence for a single DNA replication origin like in circular bacterial genomes.
|
9
|
Ohbayashi R, Hirooka S, Onuma R, Kanesaki Y, Hirose Y, Kobayashi Y, Fujiwara T, Furusawa C, Miyagishima SY. Evolutionary Changes in DnaA-Dependent Chromosomal Replication in Cyanobacteria. Front Microbiol 2020; 11:786. [PMID: 32411117 PMCID: PMC7198777 DOI: 10.3389/fmicb.2020.00786] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 01/17/2020] [Accepted: 04/02/2020] [Indexed: 12/02/2022]
Abstract
Replication of the circular bacterial chromosome is initiated at a unique origin (oriC) in a DnaA-dependent manner in which replication proceeds bidirectionally from oriC to ter. The nucleotide compositions of most bacteria differ between the leading and lagging DNA strands. Thus, the chromosomal DNA sequence typically exhibits an asymmetric GC skew profile. Further, free-living bacteria whose genomes lack dnaA were unknown. Thus, a DnaA-oriC-dependent replication initiation mechanism may be essential for most bacteria. However, most cyanobacterial genomes exhibit irregular GC skew profiles. We previously found that the Synechococcus elongatus chromosome, which exhibits a regular GC skew profile, is replicated in a DnaA-oriC-dependent manner, whereas chromosomes of Synechocystis sp. PCC 6803 and Nostoc sp. PCC 7120, which exhibit an irregular GC skew profile, are replicated from multiple origins in a DnaA-independent manner. Here we investigate the variation in the mechanisms of cyanobacterial chromosome replication. We found that the genomes of certain free-living species do not encode dnaA and that such species, including Cyanobacterium aponinum PCC 10605 and Geminocystis sp. NIES-3708, replicate their chromosomes from multiple origins. Synechococcus sp. PCC 7002, which is phylogenetically closely related to dnaA-lacking free-living species as well as to dnaA-encoding but DnaA-oriC-independent Synechocystis sp. PCC 6803, possesses dnaA. In Synechococcus sp. PCC 7002, dnaA was not essential and its chromosomes were replicated from a unique origin in a DnaA-oriC-independent manner. Our results also suggest that loss of DnaA-oriC dependency occurred independently multiple times during cyanobacterial evolution and raise the possibility that the loss of dnaA or of DnaA-oriC dependency is correlated with an increase in ploidy level.
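The "regular" versus "irregular" GC skew profiles mentioned above are usually inspected as a cumulative curve along the chromosome. The following is a minimal sketch of that general technique, not the authors' pipeline: it sums the per-window GC skew (G − C)/(G + C) and reports the windows where the cumulative curve is lowest and highest, which under the common convention of a G-rich leading strand are candidate positions for the origin and terminus. The function names, window size, and toy sequence are assumptions made for illustration.

```python
def cumulative_gc_skew(seq: str, window: int = 1000) -> list:
    """Cumulative sum of per-window GC skew, (G - C) / (G + C)."""
    seq = seq.upper()
    total, curve = 0.0, []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        total += (g - c) / (g + c) if (g + c) else 0.0
        curve.append(total)
    return curve


def candidate_ori_ter(curve: list) -> tuple:
    """Window indices of the curve minimum (candidate origin) and maximum (candidate terminus)."""
    ori = min(range(len(curve)), key=curve.__getitem__)
    ter = max(range(len(curve)), key=curve.__getitem__)
    return ori, ter


# Invented toy chromosome: C-rich, then G-rich, then C-rich again.
toy = "ACCC" * 500 + "AGGG" * 1000 + "ACCC" * 500
curve = cumulative_gc_skew(toy, window=1000)
print(curve)
print(candidate_ori_ter(curve))
```

A chromosome replicated from a single origin tends to give one clear minimum and one clear maximum, whereas a profile with many small reversals would be the "irregular" case described in the abstract.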
Affiliation(s)
- Ryudo Ohbayashi
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan
- Shunsuke Hirooka
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan
- Ryo Onuma
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan
- Yu Kanesaki
- Research Institute of Green Science and Technology, Shizuoka University, Shizuoka, Japan
- Yuu Hirose
- Department of Applied Chemistry and Life Science, Toyohashi University of Technology, Toyohashi, Japan
- Yusuke Kobayashi
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan
- Takayuki Fujiwara
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan; Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), Shizuoka, Japan
- Chikara Furusawa
- Center for Biosystems Dynamics Research, RIKEN, Osaka, Japan; Universal Biology Institute, Graduate School of Science, The University of Tokyo, Tokyo, Japan
- Shin-Ya Miyagishima
- Department of Gene Function and Phenomics, National Institute of Genetics, Shizuoka, Japan; Department of Genetics, The Graduate University for Advanced Studies (SOKENDAI), Shizuoka, Japan
|
10
|
Quan CL, Gao F. Quantitative analysis and assessment of base composition asymmetry and gene orientation bias in bacterial genomes. FEBS Lett 2019; 593:918-925. [PMID: 30941752 DOI: 10.1002/1873-3468.13374] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/31/2019] [Revised: 03/28/2019] [Accepted: 03/31/2019] [Indexed: 11/10/2022]
Abstract
Base composition asymmetry and gene orientation bias are two common genomic structures in bacterial genomes. Here, correlation coefficients between nucleotide disparities and coding sequence (CDS) skew have been calculated, which provides insights into their relationship from an individual genome perspective. Consequently, we find GC and RY disparities correlate significantly with CDS skew, since around 60% of the bacterial genomes under study have correlation coefficients > 0.9. Then, we present a model for quantitative assessment of nucleotide disparity and CDS skew in which a numerical index R2 is used for evaluation. We find that skew curves with higher R2 perform better on the prediction of replication origins in bacteria.
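The correlation described above can be illustrated with a small sketch. This is not the authors' model: it assumes, for illustration only, that the GC disparity curve is the running sum of (G − C) counts over successive genes and that the CDS skew curve is the running sum of gene strand signs (+1 forward, −1 reverse), then takes the Pearson coefficient between the two curves. The function names and toy annotation are hypothetical, and the squared coefficient computed here may differ from the R2 index defined in the paper.

```python
from itertools import accumulate
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient (raises ZeroDivisionError if either input is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)


def disparity_vs_cds_skew(seq, genes):
    """Correlate a cumulative GC disparity curve with a cumulative CDS skew curve.

    genes: list of (start, end, strand) with 0-based half-open coordinates
    and strand given as +1 or -1 (an assumed, simplified annotation format).
    """
    gc_steps, cds_steps = [], []
    for start, end, strand in sorted(genes):
        region = seq[start:end].upper()
        gc_steps.append(region.count("G") - region.count("C"))
        cds_steps.append(strand)
    gc_curve = list(accumulate(gc_steps))
    cds_curve = list(accumulate(cds_steps))
    return pearson(gc_curve, cds_curve)


# Invented toy genome: G-rich half with forward genes, C-rich half with reverse genes.
toy = "AGGG" * 6 + "ACCC" * 6
toy_genes = [(0, 8, 1), (8, 16, 1), (16, 24, 1),
             (24, 32, -1), (32, 40, -1), (40, 48, -1)]
r = disparity_vs_cds_skew(toy, toy_genes)
print(r, r ** 2)
```

In this toy case the two curves rise and fall together, so the coefficient is essentially 1; real genomes would be expected to show weaker agreement.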
Affiliation(s)
- Chun-Lan Quan
- Department of Physics, School of Science, Tianjin University, China
- Feng Gao
- Department of Physics, School of Science, Tianjin University, China; Key Laboratory of Systems Bioengineering, Ministry of Education, Tianjin University, China; SynBio Research Platform, Collaborative Innovation Center of Chemical Science and Engineering (Tianjin), China
|
11
|
Yu P, Zhou L, Zhou XY, Yang WT, Zhang J, Zhang XJ, Wang Y, Gui JF. Unusual AT-skew of Sinorhodeus microlepis mitogenome provides new insights into mitogenome features and phylogenetic implications of bitterling fishes. Int J Biol Macromol 2019; 129:339-350. [PMID: 30738158 DOI: 10.1016/j.ijbiomac.2019.01.200] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.7] [Received: 11/13/2018] [Revised: 01/17/2019] [Accepted: 01/29/2019] [Indexed: 12/25/2022]
Abstract
Sinorhodeus microlepis (S. microlepis) was recently described as a new species and represents a new genus, Sinorhodeus, of the subfamily Acheilognathinae. In this study, we sequenced the complete mitogenome of S. microlepis for the first time and compared it with 29 other bitterling mitogenomes. The S. microlepis mitogenome is 16,591 bp in length and contains 37 genes. The gene distribution pattern is identical among the 30 bitterling mitogenomes. A significant linear correlation between A+T% and AT-skew was found among 29 bitterling mitogenomes; the exception, S. microlepis, shows an unusual AT-skew that is slightly negative in tRNAs and PCGs. Bitterling mitogenomes exhibit highly conserved usage bias of start codons, relative synonymous codons and amino acids, as well as conserved overlaps and non-coding intergenic spacers. Phylogenetic trees constructed from 13 PCGs strongly support the polyphyly of the genus Acheilognathus and the paraphyly of Rhodeus and Tanakia. Taken together, the unusual characters of the S. microlepis mitogenome and the phylogenetic trees indicate that S. microlepis is a sister species to the genus Rhodeus, from which it might have diverged about 13.69 Ma (95% HPD: 12.96-14.48 Ma).
Affiliation(s)
- Peng Yu
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China; College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, China
- Li Zhou
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Xiao-Ya Zhou
- College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, China
- Wen-Tao Yang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China
- Jun Zhang
- College of Animal Science and Technology, Anhui Agricultural University, Hefei 230036, China
- Xiao-Juan Zhang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China
- Yang Wang
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China.
- Jian-Fang Gui
- State Key Laboratory of Freshwater Ecology and Biotechnology, Institute of Hydrobiology, The Innovation Academy of Seed Design, Chinese Academy of Sciences, Wuhan 430072, China; University of Chinese Academy of Sciences, Beijing 100049, China.
|
12
|
Apostolou-Karampelis K, Nikolaou C, Almirantis Y. A novel skew analysis reveals substitution asymmetries linked to genetic code GC-biases and PolIII α-subunit isoforms. DNA Res 2016; 23:353-63. [PMID: 27345720 PMCID: PMC4991834 DOI: 10.1093/dnares/dsw021] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 02/11/2016] [Accepted: 05/09/2016] [Indexed: 11/30/2022]
Abstract
Strand biases reflect deviations from a null expectation of DNA evolution that assumes strand-symmetric substitution rates. Here, we present strong evidence that nearest-neighbour preferences are a strand-biased feature of bacterial genomes, indicating neighbour-dependent substitution asymmetries. To detect such asymmetries we introduce an alignment free index (relative abundance skews). The profiles of relative abundance skews along coding sequences can trace the phylogenetic relations of bacteria, suggesting that the patterns of neighbour-dependent substitution strand-biases are not common among different lineages, but are rather species-specific. Analysis of neighbour-dependent and codon-site skews sheds light on the origins of substitution asymmetries. Via a simple model we argue that the structure of the genetic code imposes position-dependent substitution strand-biases along coding sequences, as a response to GC mutation pressure. Thus, the organization of the genetic code per se can lead to an uneven distribution of nucleotides among different codon sites, even when requirements for specific codons and amino-acids are not accounted for. Moreover, our results suggest that strand-biases in replication fidelity of PolIII α-subunit induce substitution asymmetries, both neighbour-dependent and independent, on a genome scale. The role of DNA repair systems, such as transcription-coupled repair, is also considered.
Affiliation(s)
- Christoforos Nikolaou
- Computational Genomics Group, Department of Biology, University of Crete, 71409 Heraklion, Greece
- Yannis Almirantis
- Institute of Biosciences and Applications, National Center for Scientific Research "Demokritos", 15310 Athens, Greece
|
13
|
Multiple Factors Drive Replicating Strand Composition Bias in Bacterial Genomes. Int J Mol Sci 2015; 16:23111-26. [PMID: 26404268 PMCID: PMC4613354 DOI: 10.3390/ijms160923111] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.9] [Received: 07/27/2015] [Revised: 08/18/2015] [Accepted: 09/18/2015] [Indexed: 11/18/2022]
Abstract
Composition bias from Chargaff’s second parity rule (PR2) has long been found in sequenced genomes, and is believed to relate strongly with the replication process in microbial genomes. However, some disagreement on the underlying reason for strand composition bias remains. We performed an integrative analysis of various genomic features that might influence composition bias using a large-scale dataset of 1111 genomes. Our results indicate (1) the bias was stronger in obligate intracellular bacteria than in other free-living species (p-value = 0.0305); (2) Fusobacteria and Firmicutes had the highest average bias among the 24 microbial phyla analyzed; (3) the strength of selected codon usage bias and generation times were not observably related to strand composition bias (p-value = 0.3247); (4) significant negative relationships were found between GC content, genome size, rearrangement frequency, Clusters of Orthologous Groups (COG) functional subcategories A, C, I, Q, and composition bias (p-values < 1.0 × 10⁻⁸); (5) gene density and COG functional subcategories D, F, J, L, and V were positively related with composition bias (p-value < 2.2 × 10⁻¹⁶); and (6) gene density made the most important contribution to composition bias, indicating transcriptional bias was associated strongly with strand composition bias. Therefore, strand composition bias was found to be influenced by multiple factors with varying weights.
|
14
|
Shen H, Braband A, Scholtz G. The complete mitogenomes of lobsters and crayfish (Crustacea: Decapoda: Astacidea) reveal surprising differences in closely related taxa and convergences to Priapulida. J Zool Syst Evol Res 2015. [DOI: 10.1111/jzs.12106] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 01/07/2023]
Affiliation(s)
- Hong Shen
- Institut für Biologie/Vergleichende Zoologie; Humboldt-Universität zu Berlin; Berlin Germany
- Anke Braband
- Institut für Biologie/Vergleichende Zoologie; Humboldt-Universität zu Berlin; Berlin Germany
- Gerhard Scholtz
- Institut für Biologie/Vergleichende Zoologie; Humboldt-Universität zu Berlin; Berlin Germany
|
15
|
Rapoport AE, Trifonov EN. Compensatory nature of Chargaff’s second parity rule. J Biomol Struct Dyn 2013; 31:1324-36. [DOI: 10.1080/07391102.2012.736757] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 10/27/2022]
|
16
|
Nikolaou C, Bermúdez I, Manichanh C, García-Martinez J, Guigó R, Pérez-Ortín JE, Roca J. Topoisomerase II regulates yeast genes with singular chromatin architectures. Nucleic Acids Res 2013; 41:9243-56. [PMID: 23935120 PMCID: PMC3814376 DOI: 10.1093/nar/gkt707] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.9] [Indexed: 12/14/2022]
Abstract
Eukaryotic topoisomerase II (topo II) is the essential decatenase of newly replicated chromosomes and the main relaxase of nucleosomal DNA. Apart from these general tasks, topo II participates in more specialized functions. In mammals, topo IIα interacts with specific RNA polymerases and chromatin-remodeling complexes, whereas topo IIβ regulates developmental genes in conjunction with chromatin remodeling and heterochromatin transitions. Here we show that in budding yeast, topo II regulates the expression of specific gene subsets. To uncover this, we carried out a genomic transcription run-on shortly after the thermal inactivation of topo II. We identified a modest number of genes not involved in the general stress response but strictly dependent on topo II. These genes present distinctive functional and structural traits in comparison with the genome average. Yeast topo II is a positive regulator of genes with well-defined promoter architecture that associates to chromatin remodeling complexes; it is a negative regulator of genes extremely hypo-acetylated with complex promoters and undefined nucleosome positioning, many of which are involved in polyamine transport. These findings indicate that yeast topo II operates on singular chromatin architectures to activate or repress DNA transcription and that this activity produces functional responses to ensure chromatin stability.
Affiliation(s)
- Christoforos Nikolaou
- Molecular Biology Institute of Barcelona, CSIC, 08028 Barcelona, Spain, Department of Biology, University of Crete, 71409 Heraklion, Greece, Department of Genetics and ERI Biotecmed, University of Valencia, 46100 Burjassot, Spain, Centre for Genomic Regulation (CRG), 08003 Barcelona, Spain and Department of Biochemistry and Molecular Biology and ERI Biotecmed, University of Valencia, 46100 Burjassot, Spain
17
Song TJ, Wang Y, Shen JG, Pan JP, Huang J. Genomic comparisons between paired bacterial strains with strong and weak GC skews. J Basic Microbiol 2013; 54:111-9. [PMID: 23457112 DOI: 10.1002/jobm.201200252] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/09/2012] [Accepted: 09/19/2012] [Indexed: 11/07/2022]
Abstract
The majority of known eubacterial genomes are characterized by GC skew, i.e., the leading strand has an excess of G over C. The cause of this compositional bias is still not very clear. In this study, we chose five pairs of genomes from distantly related bacterial genera, i.e., Buchnera, Haemophilus, Mycoplasma, Mycobacterium, and Synechococcus, each pair containing one genome with strong GC skew and one with weak GC skew. Through comparison of the orthologous genes in these genera, we found that neither chromosomal rearrangement nor CDS skew has a direct relationship with GC skew.
Affiliation(s)
- Tie-Jun Song
- Department of Clinic Lab, Sir Run Run Shaw Hospital, School of Medicine, Zhejiang University, Hangzhou, China
18
Patterns of nucleotide asymmetries in plant and animal genomes. Biosystems 2013; 111:181-9. [PMID: 23438636 DOI: 10.1016/j.biosystems.2013.02.001] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.0] [Received: 09/28/2012] [Revised: 11/29/2012] [Accepted: 02/07/2013] [Indexed: 11/20/2022]
Abstract
Symmetry in biology provides many intriguing puzzles to the scientist's mind. Chargaff's second parity rule states a symmetric distribution of oligonucleotides within a single strand of double-stranded DNA. While this rule has been verified in a wide range of microbial genomes, it still awaits explanation. In our study, we inquired into patterns of mono- and trinucleotide intra-strand parity in complex plant genomic sequences that became available during the last few years, and compared these to equally complex animal genomes. The degree and patterns of deviation from Chargaff's second rule were different between plant and animal species. We observed a universal inter-chromosomal homogeneity of mononucleotide skews in coding sequences of plant chromosomes, while the base composition of animal coding sequences differed between chromosomes even within a single species. We also found differences in the base composition of dicot introns in comparison to those of monocots. These genome-wide patterns were limited to genic regions and were not encountered in inter-genic sequences. We discuss the implications of our findings in relation to hypotheses about functional correlations of intra-strand parity which have hitherto been put forward. Furthermore, we propose more recent polyploidization and subsequent homogenization of homoeologues as a possible reason for more homogeneous skew patterns in plants.
19
Seligmann H. Coding constraints modulate chemically spontaneous mutational replication gradients in mitochondrial genomes. Curr Genomics 2012; 13:37-54. [PMID: 22942674 PMCID: PMC3269015 DOI: 10.2174/138920212799034802] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Received: 06/30/2011] [Revised: 09/07/2011] [Accepted: 09/20/2011] [Indexed: 11/30/2022]
Abstract
Distances from heavy and light strand replication origins determine how long mitochondrial DNA remains single-stranded during replication. Hydrolytic deaminations from A→G and C→T occur more on single- than double-stranded DNA. Corresponding replicational nucleotide gradients exist across mitochondrial genomes, most at 3rd, least at 2nd codon positions. DNA single-strandedness during RNA transcription causes gradients mainly in long-lived species with relatively slow metabolism (high transcription/replication ratios). Third codon nucleotide contents, evolutionary results of mutation cumulation, follow replicational, not transcriptional gradients in Homo; observed human mutations follow transcriptional gradients. Synonymous third codon position transitions potentially alter adaptive off-frame information. No mutational gradients occur at synonymous positions forming off-frame stops (these adaptively stop early accidental frameshifted protein synthesis), nor in regions coding for putative overlapping genes according to an overlapping genetic code reassigning stop codons to amino acids. Deviation of 3rd codon nucleotide contents from deamination gradients increases with coding importance of main frame 3rd codon positions in overlapping genes (greatest if these are 2nd position in overlapping genes). Third codon position deamination gradients calculated separately for each codon family are strongest where synonymous transitions are rarely pathogenic; weakest where transitions are frequently pathogenic. Synonymous mutations affect translational accuracy, such as error compensation of misloaded tRNAs by codon-anticodon mismatches (prevents amino acid misinsertion despite tRNA misacylation), a potential cause of pathogenic mutations at synonymous codon positions. Indeed, codon-family-specific gradients are inversely proportional to error compensation associated with gradient-promoted transitions.
Deamination gradients reflect spontaneous chemical reactions in single-stranded DNA, but functional coding constraints modulate gradients.
Affiliation(s)
- Hervé Seligmann
- National Collections of Natural History at the Hebrew University of Jerusalem, Jerusalem 91404; Department of Life Sciences, Ben Gurion University, 84105 Beer Sheva, Israel
20
Arakawa K, Tomita M. Measures of compositional strand bias related to replication machinery and its applications. Curr Genomics 2012; 13:4-15. [PMID: 22942671 PMCID: PMC3269016 DOI: 10.2174/138920212799034749] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 07/23/2011] [Revised: 09/10/2011] [Accepted: 09/20/2011] [Indexed: 11/22/2022]
Abstract
The compositional asymmetry of complementary bases in nucleotide sequences implies the existence of a mutational or selectional bias in the two strands of the DNA duplex, which is commonly shaped by strand-specific mechanisms in transcription or replication. Such strand bias in genomes, frequently visualized by GC skew graphs, is used for the computational prediction of transcription start sites and replication origins, as well as for comparative evolutionary genomics studies. The use of measures of compositional strand bias in order to quantify the degree of strand asymmetry is crucial, as it is the basis for determining the applicability of compositional analysis and comparing the strength of the mutational bias in different biological machineries in various species. Here, we review the measures of strand bias that have been proposed to date, including the ∆GC skew, the B1 index, the predictability score of linear discriminant analysis for gene orientation, the signal-to-noise ratio of the oligonucleotide bias, and the GC skew index. These measures have been predominantly designed for and applied to the analysis of replication-related mutational processes in prokaryotes, but we also give research examples in eukaryotes.
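As a rough illustration of the kind of quantity plotted in a GC skew graph, the sketch below computes the commonly used skew ratio (G − C)/(G + C), its AT counterpart, and a sliding-window profile. The function names, window size, and step are illustrative assumptions; this is not one of the specific indices reviewed in the abstract above.

```python
def skew(seq: str, x: str, y: str) -> float:
    """Return (count(x) - count(y)) / (count(x) + count(y)) for one strand."""
    s = seq.upper()
    nx, ny = s.count(x), s.count(y)
    total = nx + ny
    # A window containing neither base has no defined skew; report 0.0 here.
    return (nx - ny) / total if total else 0.0


def gc_skew(seq: str) -> float:
    """GC skew of a single-strand sequence: (G - C) / (G + C)."""
    return skew(seq, "G", "C")


def at_skew(seq: str) -> float:
    """AT skew of a single-strand sequence: (A - T) / (A + T)."""
    return skew(seq, "A", "T")


def windowed_gc_skew(seq: str, window: int, step: int) -> list[float]:
    """GC skew in consecutive windows, the values a skew graph would plot."""
    return [
        gc_skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, step)
    ]


if __name__ == "__main__":
    toy = "GGGGCCCC"  # hypothetical toy sequence, not real genomic data
    print(gc_skew("GGGC"))
    print(windowed_gc_skew(toy, window=4, step=4))
```

A positive value means G (or A) outnumbers its complement on the strand analyzed; a sign change along the profile is what skew-based analyses look for.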
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
21
Khrustalev VV, Barkovsky EV. A blueprint for a mutationist theory of replicative strand asymmetries formation. Curr Genomics 2012; 13:55-64. [PMID: 22942675 PMCID: PMC3269017 DOI: 10.2174/138920212799034730] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/01/2011] [Revised: 09/15/2011] [Accepted: 09/29/2011] [Indexed: 11/26/2022]
Abstract
In the present review, we summarize current knowledge on replicative strand asymmetries in prokaryotic genomes and outline the groundwork for a theory of their formation. According to our recent works, the probability of nonsense mutation caused by replication-associated mutational pressure is higher for genes from lagging strands than for genes from leading strands of both bacterial and archaeal genomes. Lower density of open reading frames in lagging strands can be explained by faster rates of nonsense mutations in genes situated on them. According to the asymmetries in nucleotide usage in fourfold and twofold degenerate sites, the direction of replication-associated mutational pressure for genes from lagging strands is usually the same as the direction of transcription-associated mutational pressure. This means that lagging strands should accumulate more 8-oxo-G, uracil and 5-formyl-uracil, respectively. In our opinion, consequences of cytosine deamination (C to T transitions) do not lead to a decrease of cytosine usage in genes from lagging strands because of the consequences of thymine oxidation (T to C transitions), while guanine oxidation (causing G to T transversions) makes the main contribution to the decrease of guanine usage in fourfold degenerate sites of genes from lagging strands. Nucleotide usage asymmetries and bias in density of coding regions can be found in archaeal genomes, although the percentage of "inversed" asymmetries is much higher for them than for bacterial genomes. "Homogenized" and "inversed" replicative strand asymmetries in archaeal genomes can be used as retrospective indexes for detection of OriC translocations and large inversions.
Affiliation(s)
- Vladislav V Khrustalev
- Department of General Chemistry, Belarussian State Medical University, Dzerzinskogo 83, Minsk, Belarus
22
Guo FB. [Strong strand specific composition bias-a genomic character of some obligate parasites or symbionts]. Yi Chuan = Hereditas 2011; 33:1039-1047. [PMID: 21993278 DOI: 10.3724/sp.j.1005.2011.01039] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/31/2023]
Abstract
DNA replication includes a set of asymmetric mechanisms, notably the division into leading and lagging strands. The former is synthesized continuously whereas synthesis of the latter is discontinuous. Such an asymmetric mechanism leads to distinct nucleotide compositions of these two strands. Strand-specific nucleotide composition bias was originally found in genomes of echinoderm and vertebrate mitochondria and then in several bacterial genomes. With the rapid growth in the number of sequenced genomes, many bacteria and even eukaryotes have been found to have a consistent strand composition bias. In some bacteria, the extent of strand-specific composition bias is so strong that genes on the two replicating strands can be separated according to their codon usages. To date, 11 obligate intracellular bacteria have been found to have separate codon usages according to whether genes are located on the leading or lagging strand. However, there is still no well-accepted theory that explains why separate codon usages occur in some bacterial genomes and not in others. This paper reviews the related works and points out the open problems.
Affiliation(s)
- Feng-Biao Guo
- University of Electronic Science and Technology of China, Chengdu, China.
23
Charneski CA, Honti F, Bryant JM, Hurst LD, Feil EJ. Atypical AT skew in Firmicute genomes results from selection and not from mutation. PLoS Genet 2011; 7:e1002283. [PMID: 21935355 PMCID: PMC3174206 DOI: 10.1371/journal.pgen.1002283] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 02/25/2011] [Accepted: 07/12/2011] [Indexed: 11/18/2022]
Abstract
The second parity rule states that, if there is no bias in mutation or selection, then within each strand of DNA complementary bases are present at approximately equal frequencies. In bacteria, however, there is commonly an excess of G (over C) and, to a lesser extent, T (over A) in the replicatory leading strand. The low G+C Firmicutes, such as Staphylococcus aureus, are unusual in displaying an excess of A over T on the leading strand. As mutation has been established as a major force in the generation of such skews across various bacterial taxa, this anomaly has been assumed to reflect unusual mutation biases in Firmicute genomes. Here we show that this is not the case and that mutation bias does not explain the atypical AT skew seen in S. aureus. First, recently arisen intergenic SNPs predict the classical replication-derived equilibrium enrichment of T relative to A, contrary to what is observed. Second, sites predicted to be under weak purifying selection display only weak AT skew. Third, AT skew is primarily associated with largely non-synonymous first and second codon sites and is seen with respect to their sense direction, not which replicating strand they lie on. The atypical AT skew we show to be a consequence of the strong bias for genes to be co-oriented with the replicating fork, coupled with the selective avoidance of both stop codons and costly amino acids, which tend to have T-rich codons. That intergenic sequence has more A than T, while at mutational equilibrium a preponderance of T is expected, points to a possible further unresolved selective source of skew. When considering a single strand of DNA, it is not necessarily the case that the frequency of each base should equal its complementary partner, such that A = T and G = C. For the leading strand, it is typically the case that Gs are more common than Cs, and Ts more common than As. This bias is widely thought to arise due to different mutational biases during replication. 
The Firmicutes exhibit an atypical preference for A over T on the leading strand, and here we show that selection, rather than mutation, can explain this exception. For those bases within coding regions, selection acts to inflate the frequency of A over T in order to avoid stop codons and to use metabolically cheap amino acids. Because genes are not orientated randomly, this manifests as an overall enrichment of A on the leading strand. Furthermore, a direct examination of mutational patterns is inconsistent with the observed enrichment of As. Curiously, our data also point to an unresolved source of selection on synonymous and intergenic sites, which are widely assumed to be neutral.
24
Rangannan V, Bansal M. PromBase: a web resource for various genomic features and predicted promoters in prokaryotic genomes. BMC Res Notes 2011; 4:257. [PMID: 21781326 PMCID: PMC3160392 DOI: 10.1186/1756-0500-4-257] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 01/27/2011] [Accepted: 07/22/2011] [Indexed: 12/19/2022]
Abstract
Background: As more and more genomes are being sequenced, an overview of their genomic features and annotation of their functional elements, which control the expression of each gene or transcription unit of the genome, is a fundamental challenge in genomics and bioinformatics. Findings: Relative stability of DNA sequence has been used to predict promoter regions in 913 microbial genomic sequences with GC-content ranging from 16.6% to 74.9%. Irrespective of the genome GC-content, the relative stability based promoter prediction method has already been proven to be robust in terms of recall and precision. The predicted promoter regions for the 913 microbial genomes have been accumulated in a database called PromBase. Promoter search can be carried out in PromBase either by specifying the gene name or the genomic position. Each predicted promoter region has been assigned to a reliability class (low, medium, high, very high and highest) based on the difference between its average free energy and that of the downstream region. The recall and precision values for each class are shown graphically in PromBase. In addition, PromBase provides detailed information about base composition, CDS and CG/TA skews for each genome and various DNA sequence dependent structural properties (average free energy, curvature and bendability) in the vicinity of all annotated translation start sites (TLS). Conclusion: PromBase is a database that contains predicted promoter regions and detailed analysis of various genomic features for 913 microbial genomes. PromBase can serve as a valuable resource for comparative genomics studies and help the experimentalist to rapidly access detailed information on various genomic features and putative promoter regions in any given genome. This database is freely accessible to academic and non-academic users at http://nucleix.mbu.iisc.ernet.in/prombase/.
Affiliation(s)
- Vetriselvi Rangannan
- Molecular Biophysics Unit, Indian Institute of Science, Bangalore-560 012, India.
25
Wei SJ, Shi M, Chen XX, Sharkey MJ, van Achterberg C, Ye GY, He JH. New views on strand asymmetry in insect mitochondrial genomes. PLoS One 2010; 5:e12708. [PMID: 20856815 PMCID: PMC2939890 DOI: 10.1371/journal.pone.0012708] [Citation(s) in RCA: 205] [Impact Index Per Article: 13.7] [Received: 10/05/2009] [Accepted: 08/20/2010] [Indexed: 01/16/2023]
Abstract
Strand asymmetry in nucleotide composition is a remarkable feature of animal mitochondrial genomes. Understanding the mutation processes that shape strand asymmetry is essential for comprehensive knowledge of genome evolution, demographical population history and accurate phylogenetic inference. Previous studies found that the relative contributions of different substitution types to strand asymmetry are associated with replication alone or both replication and transcription. However, the relative contributions of replication and transcription to strand asymmetry remain unclear. Here we conducted a broad survey of strand asymmetry across 120 insect mitochondrial genomes, with special reference to the correlation between the signs of skew values and replication orientation/gene direction. The results show that the sign of GC skew on entire mitochondrial genomes is reversed in all species of three distantly related families of insects, Philopteridae (Phthiraptera), Aleyrodidae (Hemiptera) and Braconidae (Hymenoptera); the replication-related elements in the A+T-rich regions of these species are inverted, confirming that reversal of strand asymmetry (GC skew) was caused by inversion of replication origin; and finally, the sign of GC skew value is associated with replication orientation but not with gene direction, while that of AT skew value varies with gene direction, replication and codon positions used in analyses. These findings show that deaminations during replication and other mutations contribute more than selection on amino acid sequences to strand compositions of G and C, and that the replication process has a stronger effect on A and T content than does transcription. Our results may contribute to genome-wide studies of replication and transcription mechanisms.
Affiliation(s)
- Shu-Jun Wei
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Institute of Plant and Environmental Protection, Beijing Academy of Agriculture and Forestry Sciences, Beijing, China
- Min Shi
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Xue-Xin Chen
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Michael J. Sharkey
- Department of Entomology, University of Kentucky, Lexington, Kentucky, United States of America
- Gong-Yin Ye
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
- Jun-Hua He
- State Key Laboratory of Rice Biology and Ministry of Agriculture Key Laboratory of Molecular Biology of Crop Pathogens and Insects, Institute of Insect Sciences, Zhejiang University, Hangzhou, China
26
The probability of nonsense mutation caused by replication-associated mutational pressure is much higher for bacterial genes from lagging than from leading strands. Genomics 2010; 96:173-80. [DOI: 10.1016/j.ygeno.2010.06.002] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.5] [Received: 02/09/2010] [Revised: 04/21/2010] [Accepted: 06/12/2010] [Indexed: 11/19/2022]
27
Powdel BR, Satapathy SS, Kumar A, Jha PK, Buragohain AK, Borah M, Ray SK. A study in entire chromosomes of violations of the intra-strand parity of complementary nucleotides (Chargaff's second parity rule). DNA Res 2009; 16:325-43. [PMID: 19861381 PMCID: PMC2780954 DOI: 10.1093/dnares/dsp021] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.4] [Indexed: 01/05/2023]
Abstract
Chargaff's rule of intra-strand parity (ISP) between complementary mono/oligonucleotides in chromosomes is well established in the scientific literature. Although a large number of papers have been published citing works and discussions on ISP in the genomic era, scientists are yet to find all the factors responsible for such a universal phenomenon in chromosomes. In the present work, we have tried to address the issue from a new perspective, which is a parallel feature to ISP. The compositional abundance values of mono/oligonucleotides were determined in all non-overlapping sub-chromosomal regions of a specific size. The frequency distributions of the mono/oligonucleotides among the regions were also compared using the Kolmogorov–Smirnov test. Interestingly, the frequency distributions of complementary mono/oligonucleotides revealed statistical similarity, which we named intra-strand frequency distribution parity (ISFDP). ISFDP was observed as a general feature in chromosomes of bacteria, archaea and eukaryotes. Violation of ISFDP was also observed in several chromosomes. Chromosomes of different strains belonging to a species in bacteria/archaea (Haemophilus influenzae, Xylella fastidiosa etc.) and chromosomes of a eukaryote were found to differ from each other with respect to ISFDP violation. ISFDP correlates weakly with ISP in chromosomes, suggesting that the latter is not entirely responsible for the former. Asymmetry of replication topography and composition of forward-encoded sequences between the strands in chromosomes were found to be insufficient to explain the ISFDP feature in all chromosomes. This suggests that multiple factors in chromosomes are responsible for establishing ISFDP.
Affiliation(s)
- B R Powdel
- Department of Mathematical Sciences, Tezpur University, Tezpur, Assam 784 028, India
28
Poptsova MS, Larionov SA, Ryadchenko EV, Rybalko SD, Zakharov IA, Loskutov A. Hidden chromosome symmetry: in silico transformation reveals symmetry in 2D DNA walk trajectories of 671 chromosomes. PLoS One 2009; 4:e6396. [PMID: 19636424 PMCID: PMC2712679 DOI: 10.1371/journal.pone.0006396] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Received: 04/07/2009] [Accepted: 06/23/2009] [Indexed: 11/18/2022]
Abstract
Maps of 2D DNA walk of 671 examined chromosomes show composition complexity change from symmetrical half-turn in bacteria to pseudo-random trajectories in archaea, fungi and humans. In silico transformation of gene order and strand position returns most of the analyzed chromosomes to a symmetrical bacterial-like state with one transition point. The transformed chromosomal sequences also reveal remarkable segmental compositional symmetry between regions from different strands located equidistantly from the transition point. Despite extensive chromosome rearrangement the relation of gene numbers on opposite strands for chromosomes of different taxa varies in narrow limits around unity with Pearson coefficient r = 0.98. Similar relation is observed for total genes' length (r = 0.86) and cumulative GC (r = 0.95) and AT (r = 0.97) skews. This is also true for human coding sequences (CDS), which comprise only several percent of the entire chromosome length. We found that frequency distributions of the length of gene clusters, continuously located on the same strand, have close values for both strands. Eukaryotic gene distribution is believed to be non-random. Contribution of different subsystems to the noted symmetries and distributions, and evolutionary aspects of symmetry are discussed.
Affiliation(s)
- Maria S Poptsova
- University of Connecticut, Storrs, Connecticut, United States of America.
29
Sernova NV, Gelfand MS. Identification of replication origins in prokaryotic genomes. Brief Bioinform 2008; 9:376-91. [PMID: 18660512 DOI: 10.1093/bib/bbn031] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.5] [Indexed: 11/12/2022]
Abstract
The availability of hundreds of complete bacterial genomes has created new challenges and simultaneously opportunities for bioinformatics. In the area of statistical analysis of genomic sequences, the studies of nucleotide compositional bias and gene bias between strands and replichores paved the way for the development of tools for prediction of bacterial replication origins. Only a few (about 20) origin regions for eubacteria and archaea have been proven experimentally. One reason for that may be that this is now considered essentially a bioinformatics problem, where predictions are sufficiently reliable not to run labor-intensive experiments unless specifically needed. Here we describe the main existing approaches to the identification of replication origin (oriC) and termination (terC) loci in prokaryotic chromosomes and characterize a number of computational tools based on various skew types and other types of evidence. We also classify the eubacterial and archaeal chromosomes by predictability of their replication origins using skew plots. Finally, we discuss possible combined approaches to the identification of the oriC sites that may be used to improve the prediction tools, in particular, the analysis of DnaA binding sites using comparative genomic methods.
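To make the idea of skew-based origin prediction concrete, here is a minimal sketch of one widely used heuristic: walk along one strand accumulating +1 for G and −1 for C, then take the minimum of the cumulative curve as a candidate origin and the maximum as a candidate terminus. It assumes the usual G-over-C excess on the leading strand; the function names are hypothetical, and the tools characterized in the abstract above combine skew with other evidence that this sketch omits.

```python
def cumulative_gc_skew(seq: str) -> list[int]:
    """Running total of (+1 for G, -1 for C) along one strand."""
    total = 0
    curve = []
    for base in seq.upper():
        if base == "G":
            total += 1
        elif base == "C":
            total -= 1
        curve.append(total)
    return curve


def predict_ori_ter(seq: str) -> tuple[int, int]:
    """Return 0-based indices of the cumulative-skew minimum and maximum.

    Under the assumption that the leading strand is G-rich, the minimum is
    a candidate replication origin and the maximum a candidate terminus.
    """
    if not seq:
        raise ValueError("sequence must be non-empty")
    curve = cumulative_gc_skew(seq)
    positions = range(len(curve))
    ori = min(positions, key=curve.__getitem__)
    ter = max(positions, key=curve.__getitem__)
    return ori, ter


if __name__ == "__main__":
    # Hypothetical toy sequence: C-rich, then G-rich, then C-rich again.
    toy = "CCCC" + "GGGGGGGG" + "CCCC"
    print(predict_ori_ter(toy))
```

On a circular chromosome the choice of start coordinate shifts the curve but not the location of its extremes, which is one reason the cumulative form is preferred over raw windowed values for this purpose.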
Affiliation(s)
- Natalia V Sernova
- Institute for Information Transmission Problems (Kharkevich Institute), Russian Academy of Sciences, Bolshoi Karetny pereulok, 19, Moscow, 127994, Russia
30
Savalia D, Westblade LF, Goel M, Florens L, Kemp P, Akulenko N, Pavlova O, Padovan JC, Chait BT, Washburn MP, Ackermann HW, Mushegian A, Gabisonia T, Molineux I, Severinov K. Genomic and proteomic analysis of phiEco32, a novel Escherichia coli bacteriophage. J Mol Biol 2008; 377:774-89. [PMID: 18294652 PMCID: PMC2587145 DOI: 10.1016/j.jmb.2007.12.077] [Citation(s) in RCA: 55] [Impact Index Per Article: 3.2] [Received: 11/02/2007] [Revised: 12/19/2007] [Accepted: 12/29/2007] [Indexed: 10/22/2022]
Abstract
A novel bacteriophage infecting Escherichia coli was isolated during a large-scale screen for bacteriophages that may be used for therapy of mastitis in cattle. The 77,554-bp genome of the bacteriophage, named phiEco32, was sequenced and annotated, and its virions were characterized by electron microscopy and proteomics. Two phiEco32-encoded proteins that interact with host RNA polymerase were identified. One of them is an ECF family sigma factor that may be responsible for transcription of some viral genes. Another RNA polymerase-binding protein is a novel transcription inhibitor whose mechanism of action remains to be defined.
Affiliation(s)
- Dhruti Savalia
- Waksman Institute for Microbiology, Piscataway, NJ 08854
- Manisha Goel
- Stowers Institute for Medical Research, Kansas City, MO 64110
- Priscilla Kemp
- Molecular Genetics and Microbiology, and Institute for Cell and Molecular Biology, University of Texas, Austin TX 78712
- Olga Pavlova
- Institute of Molecular Genetics, Moscow 123182, Russia
- Julio C. Padovan
- Laboratory for Mass Spectrometry and Gaseous Ion Chemistry, Rockefeller University, New York, NY 10065
- Brian T. Chait
- Laboratory for Mass Spectrometry and Gaseous Ion Chemistry, Rockefeller University, New York, NY 10065
- Hans-W. Ackermann
- Félix d'Herelle Reference Center for Bacterial Viruses, Faculty of Medicine, Laval University, Quebec, Qc, Canada
- Arcady Mushegian
- Stowers Institute for Medical Research, Kansas City, MO 64110
- Department of Microbiology, Kansas University Medical Center, Kansas City KS 66160
- Ian Molineux
- Molecular Genetics and Microbiology, and Institute for Cell and Molecular Biology, University of Texas, Austin TX 78712
- Konstantin Severinov
- Waksman Institute for Microbiology, Piscataway, NJ 08854
- Institute of Molecular Genetics, Moscow 123182, Russia
- Institute of Gene Biology, Moscow 117312, Russia
Collapse
|
31
|
Evans KJ. Strand bias structure in mouse DNA gives a glimpse of how chromatin structure affects gene expression. BMC Genomics 2008; 9:16. [PMID: 18194530 PMCID: PMC2266913 DOI: 10.1186/1471-2164-9-16] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 08/16/2007] [Accepted: 01/14/2008] [Indexed: 12/20/2022] Open
Abstract
Background: On a single strand of genomic DNA the number of As is usually about equal to the number of Ts (and similarly for Gs and Cs), but deviations have been noted for transcribed regions and origins of replication. Results: The mouse genome is shown to have a segmented structure defined by strand bias. Transcription is known to cause a strand bias, and numerous analyses are presented to show that the strand bias in question is not caused by transcription. However, these strand bias segments influence the position of genes and their unspliced length. The position of genes within the strand bias structure affects the probability that a gene is switched on and its expression level. Transcription has a highly directional flow within this structure and the peak volume of transcription is around 20 kb from the A-rich/T-rich segment boundary on the T-rich side, directed away from the boundary. The A-rich/T-rich boundaries are SATB1 binding regions, whereas the T-rich/A-rich boundary regions are not. Conclusion: The direct cause of the strand bias structure may be DNA replication. The strand bias segments represent a further biological feature, the chromatin structure, which in turn influences the ease of transcription.
Affiliation(s)
- Kenneth J Evans
- School of Crystallography, Birkbeck College, University of London, Malet Street, London, WC1E 7HX, UK.
|
32
|
Quantitative determination of gene strand bias in prokaryotic genomes. Genomics 2007; 90:733-40. [DOI: 10.1016/j.ygeno.2007.07.010] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.3] [Received: 04/19/2007] [Revised: 07/09/2007] [Accepted: 07/23/2007] [Indexed: 11/19/2022]
|
33
|
Touchon M, Rocha EPC. From GC skews to wavelets: a gentle guide to the analysis of compositional asymmetries in genomic data. Biochimie 2007; 90:648-59. [PMID: 17988781 DOI: 10.1016/j.biochi.2007.09.015] [Citation(s) in RCA: 32] [Impact Index Per Article: 1.8] [Received: 06/18/2007] [Accepted: 09/21/2007] [Indexed: 12/29/2022]
Abstract
Compositional asymmetries are pervasive in DNA sequences. They are the result of the asymmetric interactions between DNA and cellular mechanisms such as replication and transcription. Here, we review many of the methods that have been proposed over the years to analyse compositional asymmetries in DNA sequences. Among these we list GC skews, oligonucleotide skews and wavelets, which among other uses have been extensively employed to delimitate origins and termini of replication in genomes. We also review the use of multivariate methods, such as factorial correspondence analysis, discriminant analysis and analysis of variance, which allow assigning compositional strand asymmetries to the different biological processes shaping sequence composition. Finally, we review methods that have been used to infer substitution matrices and allow understanding the mutational processes underlying strand asymmetry. We focus on replication asymmetries because they have been more thoroughly studied, but the methods may be adapted, and often are, to other problems. Although strand asymmetry has been studied more frequently through compositional skews of nucleotides or oligonucleotides, we recall that, depending on the goal of the analysis, other methods may be more appropriate to answer certain biological questions. We also refer to programs freely available to analyse strand asymmetry.
Affiliation(s)
- Marie Touchon
- Atelier de Bioinformatique, Université Pierre et Marie Curie-Paris 6, Paris, France
|
34
|
Prozorov AA. Regularities of the location of genes having different functions and of some other nucleotide sequences in the bacterial chromosome. Microbiology (Reading) 2007. [DOI: 10.1134/s0026261707040017] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 11/22/2022] Open
|
35
|
Hu J, Zhao X, Yu J. Replication-associated purine asymmetry may contribute to strand-biased gene distribution. Genomics 2007; 90:186-94. [PMID: 17532183 DOI: 10.1016/j.ygeno.2007.04.002] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.4] [Received: 10/22/2006] [Revised: 03/09/2007] [Accepted: 04/02/2007] [Indexed: 11/19/2022]
Abstract
Among prokaryotic genomes, the distribution of genes on the leading and lagging strands of the replication fork is known to be biased. Several hypotheses explaining this strand-biased gene distribution (SGD) have been proposed, but none have been tested or supported by sufficient data analyses. In this work we have analyzed 211 prokaryotic genomes in terms of compositional strand asymmetries and the presence or absence of polC and have found that SGD correlates not only with polC, but also with purine asymmetry (PAS). Furthermore, SGD, PAS, and polC are all features associated with a group of low-GC, gram-positive bacteria (Firmicutes). We conclude that PAS is a characteristic of organisms with a heterodimeric DNA polymerase III alpha-subunit constituted by polC and dnaE, which may play a direct role in the maintenance of SGD.
Affiliation(s)
- Jianfei Hu
- College of Life Sciences, Peking University, Beijing 100871, China.
|
36
|
Evolutionary implications of inversions that have caused intra-strand parity in DNA. BMC Genomics 2007; 8:160. [PMID: 17562011 PMCID: PMC1913523 DOI: 10.1186/1471-2164-8-160] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.1] [Received: 01/25/2007] [Accepted: 06/11/2007] [Indexed: 11/22/2022] Open
Abstract
Background: Chargaff's rule of DNA base composition, stating that DNA comprises equal amounts of adenine and thymine (%A = %T) and of guanine and cytosine (%C = %G), is well known because it was fundamental to the conception of the Watson-Crick model of DNA structure. His second parity rule, stating that the base proportions of double-stranded DNA are also reflected in single-stranded DNA (%A = %T, %C = %G), is more obscure, likely because its biological basis and significance are still unresolved. Within each strand, the symmetry of single nucleotide composition extends even further, being demonstrated in the balance of di-, tri-, and multi-nucleotides with their respective complementary oligonucleotides. Results: Here, we propose that inversions are sufficient to account for the symmetry within each single-stranded DNA. Human mitochondrial DNA does not demonstrate such intra-strand parity, and we consider how its different functional drivers may relate to our theory. This concept is supported by the recent observation that inversions occur frequently. Conclusion: Along with chromosomal duplications, inversions must have been shaping the architecture of genomes since the origin of life.
|
37
|
Wang HF, Hou WR, Niu DK. Strand compositional asymmetries in vertebrate large genes. Mol Biol Rep 2007; 35:163-9. [PMID: 17420956 DOI: 10.1007/s11033-007-9066-6] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.1] [Received: 11/08/2006] [Accepted: 02/26/2007] [Indexed: 10/23/2022]
Abstract
Both transcription-associated and replication-associated strand compositional asymmetries have recently been shown in vertebrate genomes. In this paper, we illustrate that transcription-associated strand compositional asymmetries and replication-associated ones coexist in most vertebrate large genes, although in most cases the former conceals the latter. Furthermore, we found that the transcription-associated strand compositional asymmetries of housekeeping genes are stronger than those of somatic cell expressed genes. Together with other evidence, we suggest that germline transcription-associated strand asymmetric mutations may be the main cause of the transcription-associated strand compositional asymmetries.
Affiliation(s)
- Hai-Fang Wang
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing, 100875, China
|
38
|
Nikolaou C, Almirantis Y. Deviations from Chargaff's second parity rule in organellar DNA: Insights into the evolution of organellar genomes. Gene 2006; 381:34-41. [PMID: 16893615 DOI: 10.1016/j.gene.2006.06.010] [Citation(s) in RCA: 42] [Impact Index Per Article: 2.2] [Received: 03/07/2006] [Revised: 04/18/2006] [Accepted: 06/13/2006] [Indexed: 10/24/2022]
Abstract
Chargaff's second parity rule (PR2) states that complementary nucleotides are met with almost equal frequencies in single-stranded DNA. This is indeed the case for all bacterial and eukaryotic genomes studied, although the genomic patterns may differ among genomes in terms of local deviations. The behaviour of organellar genomes regarding the second parity rule has not been studied in detail up to now. We tested all available organellar genomes and found that a large number of mitochondrial genomes significantly deviate from the second parity rule in contrast to the eubacterial ones, although mitochondria are believed to have evolved from proteobacteria. Moreover, mitochondria may be divided into three distinct sub-groups according to their overall deviation from the aforementioned parity rule. On the other hand, chloroplast genomes share the pattern of eubacterial genomes and, interestingly, so do mitochondrial genomes originating from plants and some fungi. The deviation from the second parity rule is found to be weakly correlated with the overall excess of purines against pyrimidines. The behaviour of the large majority of the mitochondrial genomes may be attributed to their distinct mode of replication, which is fundamentally different from the one of the eubacteria. Differences between chloroplast and mitochondrial genomes might also be explained on the basis of different replication mechanisms and correlated to differences in the genome size and compaction. The results presented herein may provide some insight into different modes of evolution of genome structure between chloroplasts and mitochondria.
Affiliation(s)
- Christoforos Nikolaou
- Computational Genomics Group, Institute of Biology, NCSR Demokritos, 15310 Athens, Greece.
|
39
|
Hou WR, Wang HF, Niu DK. Replication-associated strand asymmetries in vertebrate genomes and implications for replicon size, DNA replication origin, and termination. Biochem Biophys Res Commun 2006; 344:1258-62. [PMID: 16650814 DOI: 10.1016/j.bbrc.2006.04.039] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.3] [Received: 03/29/2006] [Accepted: 04/17/2006] [Indexed: 11/16/2022]
Abstract
Strand compositional asymmetry has been observed in prokaryotes and used in predicting prokaryotic DNA replication origins and termini. However, it was not found in eukaryotic genomes by the same methods. We propose that transcription-associated strand asymmetries mask the replication-associated ones. By analyzing the nucleotide composition of intergenic sequences larger than 50 kb by cumulative skew diagrams (CSD), we found replication-associated strand asymmetry in vertebrate genomes. Furthermore, we found that the most common replicon sizes in vertebrates are 50-100 kb, and show evidence that the replication origin and termination regions of vertebrate genomes range from a discrete site to a broad zone.
Affiliation(s)
- Wen-Ru Hou
- Ministry of Education Key Laboratory for Biodiversity Science and Ecological Engineering, College of Life Sciences, Beijing Normal University, Beijing 100875, China
|