1
Hasenauer F, Barreto H, Lotton C, Matic I. Genome-wide mapping of spontaneous DNA replication error-hotspots using mismatch repair proteins in rapidly proliferating Escherichia coli. Nucleic Acids Res 2025; 53:gkae1196. [PMID: 39660654; PMCID: PMC11754648; DOI: 10.1093/nar/gkae1196] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2024] [Revised: 11/12/2024] [Accepted: 11/19/2024] [Indexed: 12/12/2024]
Abstract
Fidelity of DNA replication is crucial for the accurate transmission of genetic information across generations, yet errors still occur despite multiple control mechanisms. This study investigated the factors influencing spontaneous replication errors across the Escherichia coli genome. We detected errors using the MutS and MutL mismatch repair proteins in rapidly proliferating mutH-deficient cells, where errors can be detected but not corrected. Our findings reveal that replication error hotspots are non-randomly distributed along the chromosome and are enriched in sequences with distinct features: lower thermal stability, which facilitates DNA strand separation; mononucleotide repeats prone to DNA polymerase slippage; and sequences prone to forming secondary structures such as cruciforms and G4 structures, which increase the likelihood of DNA polymerase stalling. These hotspots showed enrichment for binding sites of nucleoid-associated proteins, RpoB and GyrA, as well as for highly expressed genes, and depletion of GATC sequences. Finally, the enrichment of single-stranded DNA stretches in the hotspot regions establishes a nexus between the formation of secondary structures, transcriptional activity and replication stress. In conclusion, this study provides a comprehensive genome-wide map of replication error hotspots, offering a holistic perspective on the intricate interplay between various mechanisms that can compromise the faithful transmission of genetic information.
Affiliation(s)
- Flavia C Hasenauer
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Hugo C Barreto
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Chantal Lotton
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France
- Ivan Matic
  - Université Paris Cité, CNRS, Inserm, Institut Cochin, F-75014 Paris, France

2
Yella VR, Vanaja A. Computational analysis on the dissemination of non-B DNA structural motifs in promoter regions of 1180 cellular genomes. Biochimie 2023; 214:101-111. [PMID: 37311475; DOI: 10.1016/j.biochi.2023.06.002] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/29/2022] [Revised: 05/05/2023] [Accepted: 06/05/2023] [Indexed: 06/15/2023]
Abstract
The promoter regions of gene regulation are under evolutionary constraints and earlier studies uncovered that they are characterized by enrichment of functional non-B DNA structural signatures like curved DNA, cruciform DNA, G-quadruplex, triple-helical DNA, slipped DNA structures, and Z-DNA. However, these studies are restricted to a few model organisms, single non-B DNA motif types, or whole genomic sequences, and their comparative accumulation in promoter regions of different domains of life has not been reported comprehensively. In this study, for the first time, we investigated the preponderance of non-B DNA-prone motifs in promoter regions in 1180 genomes belonging to 28 taxonomic groups using the non-B DNA Motif Search Tool (nBMST). The trends suggest that they are predominant in promoters compared to the upstream and downstream regions of all three domains of life and variably linked to taxonomic groups. Cruciform DNA motif is the most abundant form of non-B DNA, spanning from archaea to lower eukaryotes. Curved DNA motifs are prominent in host-associated bacteria, and suppressed in mammals. Triplex-DNA and slipped DNA structure repeats are discretely dispersed in all lineages. G-quadruplex motifs are significantly enriched in mammals. We also observed that the unique enrichment of non-B DNA in promoters is strongly linked to genome GC, size, evolutionary time divergence, and ecological adaptations. Overall, our work systematically reports the unique non-B DNA structural landscape of cellular organisms from the perspective of the cis-regulatory code of genomes.
Affiliation(s)
- Venkata Rajesh Yella
  - Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India
- Akkinepally Vanaja
  - Department of Biotechnology, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India
  - KL College of Pharmacy, Koneru Lakshmaiah Education Foundation, Guntur, 522302, Andhra Pradesh, India

3
Getz LJ, Brown JM, Sobot L, Chow A, Mahendrarajah J, Thomas N. Attenuation of a DNA cruciform by a conserved regulator directs T3SS1 mediated virulence in Vibrio parahaemolyticus. Nucleic Acids Res 2023; 51:6156-6171. [PMID: 37158250; PMCID: PMC10325908; DOI: 10.1093/nar/gkad370] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/17/2022] [Revised: 04/23/2023] [Accepted: 04/27/2023] [Indexed: 05/10/2023]
Abstract
Pathogenic Vibrio species account for 3-5 million annual life-threatening human infections. Virulence is driven by bacterial hemolysin and toxin gene expression, which is often positively regulated by the winged helix-turn-helix (wHTH) HlyU transcriptional regulator family and silenced by histone-like nucleoid structural protein (H-NS). In the case of Vibrio parahaemolyticus, HlyU is required for virulence gene expression associated with type 3 secretion system-1 (T3SS1), although its mechanism of action is not understood. Here, we provide evidence for DNA cruciform attenuation mediated by HlyU binding to support concomitant virulence gene expression. Genetic and biochemical experiments revealed that upon HlyU-mediated DNA cruciform attenuation, an intergenic cryptic promoter became accessible, allowing for exsA mRNA expression and initiation of an ExsA autoactivation feedback loop at a separate ExsA-dependent promoter. Using a heterologous E. coli expression system, we reconstituted the dual promoter elements, which revealed that HlyU binding and DNA cruciform attenuation were strictly required to initiate the ExsA autoactivation loop. The data indicate that HlyU acts to attenuate a transcriptionally repressive DNA cruciform to support T3SS1 virulence gene expression and reveal a non-canonical extricating gene regulation mechanism in pathogenic Vibrio species.
Affiliation(s)
- Landon J Getz
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Justin M Brown
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Lauren Sobot
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Alexandra Chow
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Jastina Mahendrarajah
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
- Nikhil A Thomas
  - Department of Microbiology and Immunology, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada
  - Department of Medicine, Faculty of Medicine, Dalhousie University, Halifax, NS, Canada

4
Abstract
Repetitive elements in the human genome, once considered 'junk DNA', are now known to adopt more than a dozen alternative (that is, non-B) DNA structures, such as self-annealed hairpins, left-handed Z-DNA, three-stranded triplexes (H-DNA) or four-stranded guanine quadruplex structures (G4 DNA). These dynamic conformations can act as functional genomic elements involved in DNA replication and transcription, chromatin organization and genome stability. In addition, recent studies have revealed a role for these alternative structures in triggering error-generating DNA repair processes, thereby actively enabling genome plasticity. As a driving force for genetic variation, non-B DNA structures thus contribute to both disease aetiology and evolution.
Affiliation(s)
- Guliang Wang
  - Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Paediatric Research Institute, Austin, TX, USA
- Karen M Vasquez
  - Division of Pharmacology and Toxicology, College of Pharmacy, The University of Texas at Austin, Dell Paediatric Research Institute, Austin, TX, USA

5
Murray A, Mendieta JP, Vollmers C, Schmitz RJ. Simple and accurate transcriptional start site identification using Smar2C2 and examination of conserved promoter features. Plant J 2022; 112:583-596. [PMID: 36030508; PMCID: PMC9827901; DOI: 10.1111/tpj.15957] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 07/11/2022] [Revised: 08/12/2022] [Accepted: 08/22/2022] [Indexed: 06/15/2023]
Abstract
The precise and accurate identification and quantification of transcriptional start sites (TSSs) is key to understanding the control of transcription. The core promoter consists of the TSS and proximal non-coding sequences, which are critical in transcriptional regulation. Therefore, the accurate identification of TSSs is important for understanding the molecular regulation of transcription. Existing protocols for TSS identification are challenging and expensive, leaving high-quality data available for a small subset of organisms. This sparsity of data impairs study of TSS usage across tissues or in an evolutionary context. To address these shortcomings, we developed Smart-Seq2 Rolling Circle to Concatemeric Consensus (Smar2C2), which identifies and quantifies TSSs and transcription termination sites. Smar2C2 incorporates unique molecular identifiers that allowed for the identification of as many as 70 million sites, with no known upper limit. We have also generated TSS data sets from as little as 40 pg of total RNA, which was the smallest input tested. In this study, we used Smar2C2 to identify TSSs in Glycine max (soybean), Oryza sativa (rice), Sorghum bicolor (sorghum), Triticum aestivum (wheat) and Zea mays (maize) across multiple tissues. This wide panel of plant TSSs facilitated the identification of evolutionarily conserved features, such as novel patterns in the dinucleotides that compose the initiator element (Inr), that correlated with promoter expression levels across all species examined. We also discovered sequence variations in known promoter motifs that are positioned reliably close to the TSS, such as differences in the TATA box and in the Inr that may prove significant to our understanding and control of transcription initiation. Smar2C2 allows for the easy study of these critical sequences, providing a tool to facilitate discovery.
Affiliation(s)
- Andrew Murray
  - Department of Plant Biology, University of Georgia, Athens, GA 30602, USA
- Chris Vollmers
  - Department of Biomolecular Engineering, University of California Santa Cruz, Santa Cruz, CA 95064, USA

6
Bowater RP, Bohálová N, Brázda V. Interaction of Proteins with Inverted Repeats and Cruciform Structures in Nucleic Acids. Int J Mol Sci 2022; 23:ijms23116171. [PMID: 35682854; PMCID: PMC9180970; DOI: 10.3390/ijms23116171] [Citation(s) in RCA: 12] [Impact Index Per Article: 4.0] [Received: 04/29/2022] [Revised: 05/26/2022] [Accepted: 05/30/2022] [Indexed: 01/27/2023]
Abstract
Cruciforms occur when inverted repeat sequences in double-stranded DNA adopt intra-strand hairpins on opposing strands. Biophysical and molecular studies of these structures confirm their characterization as four-way junctions and have demonstrated that several factors influence their stability, including overall chromatin structure and DNA supercoiling. Here, we review our understanding of processes that influence the formation and stability of cruciforms in genomes, covering the range of sequences shown to have biological significance. It is challenging to accurately sequence repetitive DNA sequences, but recent advances in sequencing methods have deepened understanding about the amounts of inverted repeats in genomes from all forms of life. We highlight that, in the majority of genomes, inverted repeats are present in higher numbers than is expected from a random occurrence. It is, therefore, becoming clear that inverted repeats play important roles in regulating many aspects of DNA metabolism, including replication, gene expression, and recombination. Cruciforms are targets for many architectural and regulatory proteins, including topoisomerases, p53, Rif1, and others. Notably, some of these proteins can induce the formation of cruciform structures when they bind to DNA. Inverted repeat sequences also influence the evolution of genomes, and growing evidence highlights their significance in several human diseases, suggesting that the inverted repeat sequences and/or DNA cruciforms could be useful therapeutic targets in some cases.
Affiliation(s)
- Richard P. Bowater
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK
- Natália Bohálová
  - Department of Biophysical Chemistry and Molecular Oncology, Institute of Biophysics of the Czech Academy of Sciences, 61265 Brno, Czech Republic
  - Department of Experimental Biology, Faculty of Science, Masaryk University, Kamenice 5, 62500 Brno, Czech Republic
- Václav Brázda
  - Department of Biophysical Chemistry and Molecular Oncology, Institute of Biophysics of the Czech Academy of Sciences, 61265 Brno, Czech Republic

7
Brázda V, Bartas M, Bowater RP. Evolution of Diverse Strategies for Promoter Regulation. Trends Genet 2021; 37:730-744. [PMID: 33931265; DOI: 10.1016/j.tig.2021.04.003] [Citation(s) in RCA: 28] [Impact Index Per Article: 7.0] [Received: 01/29/2021] [Revised: 03/31/2021] [Accepted: 04/01/2021] [Indexed: 12/15/2022]
Abstract
DNA is fundamentally important for all cellular organisms due to its role as a store of hereditary genetic information. The precise and accurate regulation of gene transcription depends primarily on promoters, which vary significantly within and between genomes. Some promoters are rich in specific types of bases, while others have more varied, complex sequence characteristics. However, it is not only base sequence but also epigenetic modifications and altered DNA structure that regulate promoter activity. Significantly, many promoters across all organisms contain sequences that can form intrastrand hairpins (cruciforms) or four-stranded structures (G-quadruplex or i-motif). In this review we integrate recent studies on promoter regulation that highlight the importance of DNA structure in the evolutionary adaptation of promoter sequences.
Affiliation(s)
- Václav Brázda
  - Institute of Biophysics of the Czech Academy of Sciences, Královopolská 135, 612 65 Brno, Czech Republic
- Martin Bartas
  - Department of Biology and Ecology/Institute of Environmental Technologies, Faculty of Science, University of Ostrava, 710 00 Ostrava, Czech Republic
- Richard P Bowater
  - School of Biological Sciences, University of East Anglia, Norwich Research Park, Norwich NR4 7TJ, UK

8
Cruciform Formable Sequences within Pou5f1 Enhancer Are Indispensable for Mouse ES Cell Integrity. Int J Mol Sci 2021; 22:ijms22073399. [PMID: 33810223; PMCID: PMC8036336; DOI: 10.3390/ijms22073399] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 02/17/2021] [Revised: 03/22/2021] [Accepted: 03/22/2021] [Indexed: 01/04/2023]
Abstract
DNA can adopt various structures besides the B-form. Among them, cruciform structures are formed on inverted repeat (IR) sequences. While cruciform formable IRs (CFIRs) are sometimes found in regulatory regions of transcription, their function in transcription remains elusive, especially in eukaryotes. We found a cluster of CFIRs within the mouse Pou5f1 enhancer. Here, we demonstrate that this cluster or some member(s) plays an active role in the transcriptional regulation of not only Pou5f1, but also Sox2, Nanog, Klf4 and Esrrb. To clarify in vivo function of the cluster, we performed genome editing using mouse ES cells, in which each of the CFIRs was altered to the corresponding mirror repeat sequence. The alterations reduced the level of the Pou5f1 transcript in the genome-edited cell lines, and elevated those of Sox2, Nanog, Klf4 and Esrrb. Furthermore, transcription of non-coding RNAs (ncRNAs) within the enhancer was also upregulated in the genome-edited cell lines, in a similar manner to Sox2, Nanog, Klf4 and Esrrb. These ncRNAs are hypothesized to control the expression of these four pluripotency genes. The CFIRs present in the Pou5f1 enhancer seem to be important to maintain the integrity of ES cells.
9
Global analysis of inverted repeat sequences in human gene promoters reveals their non-random distribution and association with specific biological pathways. Genomics 2020; 112:2772-2777. [DOI: 10.1016/j.ygeno.2020.03.014] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 11/04/2019] [Revised: 01/02/2020] [Accepted: 03/20/2020] [Indexed: 12/11/2022]

10
Fleming AM, Zhu J, Jara-Espejo M, Burrows CJ. Cruciform DNA Sequences in Gene Promoters Can Impact Transcription upon Oxidative Modification of 2'-Deoxyguanosine. Biochemistry 2020; 59:2616-2626. [PMID: 32567845; DOI: 10.1021/acs.biochem.0c00387] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Indexed: 11/29/2022]
Abstract
Sequences of DNA typically adopt B-form duplexes in genomes, although noncanonical structures such as G-quadruplexes, i-motifs, Z-DNA, and cruciform structures can occur. A challenge is to determine the functions of these various structures in cellular processes. We and others have hypothesized that G-rich G-quadruplex-forming sequences in human genome promoters serve to sense oxidative damage generated during oxidative stress impacting gene regulation. Herein, chemical tools and a cell-based assay were used to study the oxidation of guanine to 8-oxo-7,8-dihydroguanine (OG) in the context of a cruciform-forming sequence in a gene promoter to determine the impact on transcription. We found that OG in the nontemplate strand in the loop of a cruciform-forming sequence could induce gene expression; conversely when OG was in the same sequence on the template strand, gene expression was inhibited. A model for the transcriptional changes observed is proposed in which OG focuses the DNA repair process on the promoter to impact expression. Our cellular and biophysical studies and literature sources support the idea that removal of OG from duplex DNA by OGG1 yields an abasic site (AP) that triggers a structural shift to the cruciform fold. The AP-bearing cruciform structure is presented to APE1, which functions as a conduit between DNA repair and gene regulation. The significance is enhanced by a bioinformatic study of all human gene promoters and transcription termination sites for inverted repeats (IRs). Comparison of the two regions showed that promoters have stable and G-rich IRs at a low frequency and termination sites have many AT-rich IRs with low stability.
Affiliation(s)
- Aaron M Fleming
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850, United States
- Judy Zhu
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850, United States
- Manuel Jara-Espejo
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850, United States
  - Department of Morphology, Piracicaba Dental School, University of Campinas-UNICAMP, Av. Limeira 901, Piracicaba, CEP 13414-018 Sao Paulo, Brazil
- Cynthia J Burrows
  - Department of Chemistry, University of Utah, 315 South 1400 East, Salt Lake City, Utah 84112-0850, United States

11
Miura O, Ogake T, Yoneyama H, Kikuchi Y, Ohyama T. A strong structural correlation between short inverted repeat sequences and the polyadenylation signal in yeast and nucleosome exclusion by these inverted repeats. Curr Genet 2018; 65:575-590. [PMID: 30498953; PMCID: PMC6420913; DOI: 10.1007/s00294-018-0907-8] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 09/18/2018] [Revised: 11/14/2018] [Accepted: 11/15/2018] [Indexed: 11/22/2022]
Abstract
DNA sequences that read the same from 5′ to 3′ in either strand are called inverted repeat sequences or simply IRs. They are found throughout a wide variety of genomes, from prokaryotes to eukaryotes. Despite extensive research, their in vivo functions, if any, remain unclear. Using Saccharomyces cerevisiae, we performed genome-wide analyses of the distribution, occurrence frequency, sequence characteristics and relevance to chromatin structure of the IRs that reportedly have a cruciform-forming potential. Here, we provide the first comprehensive map of these IRs in the S. cerevisiae genome. Statistically significant enrichment of the IRs was found in the close vicinity of the DNA positions corresponding to polyadenylation [poly(A)] sites and ~30 to ~60 bp downstream of start codon-coding sites (referred to as ‘start codons’). In the former, ApT- or TpA-rich IRs and A-tract- or T-tract-rich IRs are enriched, while in the latter, different IRs are enriched. Furthermore, we found a strong structural correlation between the former IRs and the poly(A) signal. In the chromatin formed on the gene end regions, the majority of the IRs cause low nucleosome occupancy. The IRs in the region ~30 to ~60 bp downstream of start codons are located in the +1 nucleosomes. In contrast, fewer IRs are present in the adjacent region downstream of start codons. The current study suggests that the IRs play similar roles in Escherichia coli and S. cerevisiae to regulate or complete transcription at the RNA level.
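The definition of an IR in the first sentence of this abstract (a sequence that reads the same 5′ to 3′ on either strand, i.e. one equal to its own reverse complement) can be illustrated with a minimal Python sketch. This is not the method of the cited study: the function names are hypothetical, the input is assumed to be an uppercase A/C/G/T string, and only perfect IRs without a spacer are detected, whereas cruciform-forming IRs typically include a short spacer between the two arms.

```python
# Illustrative sketch only; helper names are hypothetical and not taken from the cited study.
# Assumes uppercase DNA strings containing only A, C, G and T.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_inverted_repeat(seq: str) -> bool:
    """True if seq reads the same 5'->3' on both strands (perfect IR, no spacer)."""
    return seq == reverse_complement(seq)


def find_inverted_repeats(genome: str, length: int) -> list[tuple[int, str]]:
    """Return (start, sequence) for every window of the given length that is a perfect IR."""
    hits = []
    for start in range(len(genome) - length + 1):
        window = genome[start:start + length]
        if is_inverted_repeat(window):
            hits.append((start, window))
    return hits
```

For example, `is_inverted_repeat("GAATTC")` holds because the reverse complement of GAATTC is GAATTC itself, and scanning `"AAGAATTCAA"` with `length=6` reports that single window at offset 2.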
Affiliation(s)
- Osamu Miura
  - Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Toshihiro Ogake
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Hiroki Yoneyama
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Yo Kikuchi
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
- Takashi Ohyama
  - Department of Biology, Faculty of Education and Integrated Arts and Sciences, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan
  - Major in Integrative Bioscience and Biomedical Engineering, Graduate School of Science and Engineering, Waseda University, 2-2 Wakamatsu-cho, Shinjuku-ku, Tokyo, 162-8480, Japan