1
The intricate relationship of G-Quadruplexes and bacterial pathogenicity islands. eLife 2024; 12:RP91985. [PMID: 38391174 PMCID: PMC10942614 DOI: 10.7554/elife.91985] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/24/2024] Open
Abstract
The dynamic interplay between guanine-quadruplex (G4) structures and pathogenicity islands (PAIs) represents a captivating area of research with implications for understanding the molecular mechanisms underlying pathogenicity. This study conducted a comprehensive analysis of a large-scale dataset from 89 reported pathogenic strains of bacteria to investigate the potential interactions between G4 structures and PAIs. G4 structures exhibited an uneven and non-random distribution within the PAIs and were consistently conserved within the same pathogenic strains. Additionally, this investigation identified positive correlations between the number and frequency of G4 structures and the GC content across different genomic features, including the genome, promoters, genes, tRNA, and rRNA regions, indicating a potential relationship between G4 structures and the GC-associated regions of the genome. The observed differences in GC content between PAIs and the core genome further highlight the unique nature of PAIs and underlying factors, such as DNA topology. High-confidence G4 structures within regulatory regions of Escherichia coli were identified, modulating the efficiency or specificity of DNA integration events within PAIs. Collectively, these findings pave the way for future research to unravel the intricate molecular mechanisms and functional implications of G4-PAI interactions, thereby advancing our understanding of bacterial pathogenicity and the role of G4 structures in pathogenic diseases.
2
Abundance of G-Quadruplex Forming Sequences in the Hepatitis Delta Virus Genomes. ACS Omega 2024; 9:4096-4101. [PMID: 38284014 PMCID: PMC10809645 DOI: 10.1021/acsomega.3c09288] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/21/2023] [Revised: 12/15/2023] [Accepted: 12/19/2023] [Indexed: 01/30/2024]
Abstract
Hepatitis delta virus (HDV) is a highly unusual RNA satellite virus that depends on the presence of hepatitis B virus (HBV) to be infectious. Its compact and variable single-stranded RNA genome consists of eight major genotypes distributed unevenly across different continents. The significance of noncanonical secondary structures such as G-quadruplexes (G4s) is increasingly recognized at the DNA and RNA levels, particularly for transcription, replication, and translation. G4s are formed from guanine-rich sequences and have been identified in the vast majority of viral, eukaryotic, and prokaryotic genomes. In this study, we analyzed the G4 propensity of HDV genomes by using G4Hunter. Unlike HBV, which has a G4 density similar to that of the human genome, HDV displays a significantly higher number of potential quadruplex-forming sequences (PQS), with a density more than four times greater than that of the human genome. This finding suggests a critical role for G4s in HDV, especially given that the PQS regions are conserved across HDV genotypes. Furthermore, the prevalence of G4-forming sequences may represent a promising target for therapeutic interventions to control HDV replication.
3
A sodium/potassium switch for G4-prone G/C-rich sequences. Nucleic Acids Res 2024; 52:448-461. [PMID: 37986223 PMCID: PMC10783510 DOI: 10.1093/nar/gkad1073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/10/2023] [Revised: 10/19/2023] [Accepted: 11/03/2023] [Indexed: 11/22/2023] Open
Abstract
Metal ions are essential components for the survival of living organisms. For most species, intracellular and extracellular ionic conditions differ significantly. As G-quadruplexes (G4s) are ion-dependent structures, changes in the [Na+]/[K+] ratio may affect the folding of genomic G4s. More than 11000 putative G4 sequences in the human genome (hg19) contain at least two runs of three continuous cytosines, and these mixed G/C-rich sequences may form a quadruplex or a competing hairpin structure based on G-C base pairing. In this study, we examine how the [Na+]/[K+] ratio influences the structures of G/C-rich sequences. The natural G4 structure with a 9-nt long central loop, CEBwt, was chosen as a model sequence, and the loop bases were gradually replaced by cytosines. The series of CEB mutations revealed that the presence of cytosines in G4 loops does not prevent G4 folding or decrease G4 stability but increases the probability of forming a competing structure, either a hairpin or an intermolecular duplex. Slow conversion to the quadruplex in vitro (in a potassium-rich buffer) and cells was demonstrated by NMR. 'Shape-shifting' sequences may respond to [Na+]/[K+] changes with delayed kinetics.
4
Genetic variations in G-quadruplex forming sequences affect the transcription of human disease-related genes. Nucleic Acids Res 2023; 51:12124-12139. [PMID: 37930868 PMCID: PMC10711447 DOI: 10.1093/nar/gkad948] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 09/22/2023] [Accepted: 10/12/2023] [Indexed: 11/08/2023] Open
Abstract
Guanine-rich DNA strands can fold into non-canonical four-stranded secondary structures named G-quadruplexes (G4s). G4s folded in proximal promoter regions (PPRs) are associated with either positive or negative transcriptional regulation. Given that single nucleotide variants (SNVs) affecting G4 folding (G4-Vars) may alter gene transcription, and that SNVs are associated with the onset of human diseases, we undertook a novel comprehensive genome-wide study of G4-Vars (the G4-variome) to find disease-associated G4-Vars located in PPRs. We developed a bioinformatics strategy to find disease-related SNVs located in PPRs that simultaneously overlap with putative G4-forming sequences (PQSs). We studied five G4-Vars that disturb, in vitro, the folding and stability of G4s located in PPRs and that had formerly been associated with sporadic Alzheimer's disease (GRIN2B), a severe familial coagulopathy (F7), atopic dermatitis (CSF2), myocardial infarction (SIRT1) and deafness (LHFPL5). Results obtained in cultured cells for these five G4-Vars suggest that the changes in the G4s affect transcription, potentially contributing to the development of these diseases. Collectively, the data reinforce the general idea that G4-Vars may affect susceptibility to the onset of human genetic diseases, and could be novel targets for diagnosis and drug design in precision medicine.
5
Quadruplexes and aging: G4-binding proteins regulate the presence of miRNA in small extracellular vesicles (sEVs). Biochimie 2023; 214:69-72. [PMID: 36690199 DOI: 10.1016/j.biochi.2023.01.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2022] [Revised: 01/08/2023] [Accepted: 01/18/2023] [Indexed: 01/22/2023]
Abstract
The interaction between proteins and nucleic acids is a core element of life. Many proteins bind nucleic acids via a sequence-specific manner, but there are also many types of proteins that recognize various structural motifs. Researchers have recently found that proteins that can recognize DNA and RNA G-quadruplexes (G4s) are very important for basic cellular processes, particularly in eukaryotes. Some of these proteins are located outside the nucleus and interact with RNA, potentially affecting miRNA functions in intercellular communication, which is facilitated by small extracellular vesicles (sEVs). Imbalances in the production of sEVs are associated with various pathologies and senescence in humans. The distribution of miRNA into sEVs is regulated by two RNA-binding proteins, Alyref and FUS. Both proteins possess G-rich recognition motifs that are compatible with the formation of RNA parallel G4 structures. This lends credence to the new hypothesis that G4-formation in RNAs and their interaction with G4-binding proteins can affect the fate of miRNAs and control their distribution in sEVs that are associated with senescence and aging.
6
Noncanonical DNA structures are drivers of genome evolution. Trends Genet 2023; 39:109-124. [PMID: 36604282 PMCID: PMC9877202 DOI: 10.1016/j.tig.2022.11.005] [Citation(s) in RCA: 26] [Impact Index Per Article: 26.0] [Received: 06/27/2022] [Revised: 11/04/2022] [Accepted: 11/28/2022] [Indexed: 01/05/2023]
Abstract
In addition to the canonical right-handed double helix, other DNA structures, termed 'non-B DNA', can form in the genomes across the tree of life. Non-B DNA regulates multiple cellular processes, including replication and transcription, yet its presence is associated with elevated mutagenicity and genome instability. These discordant cellular roles fuel the enormous potential of non-B DNA to drive genomic and phenotypic evolution. Here we discuss recent studies establishing non-B DNA structures as novel functional elements subject to natural selection, affecting evolution of transposable elements (TEs), and specifying centromeres. By highlighting the contributions of non-B DNA to repeated evolution and adaptation to changing environments, we conclude that evolutionary analyses should include a perspective of not only DNA sequence, but also its structure.
7
Analysis of G-Quadruplex-Forming Sequences in Drought Stress-Responsive Genes, and Synthesis Genes of Phenolic Compounds in Arabidopsis thaliana. Life (Basel, Switzerland) 2023; 13:life13010199. [PMID: 36676148 PMCID: PMC9865073 DOI: 10.3390/life13010199] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Received: 12/10/2022] [Revised: 12/30/2022] [Accepted: 01/08/2023] [Indexed: 01/11/2023]
Abstract
Sequences of nucleic acids with the potential to form four-stranded G-quadruplex structures are intensively studied mainly in the context of human diseases, pathogens, or extremophile organisms; nonetheless, the knowledge about their occurrence and putative role in plants is still limited. This work is focused on G-quadruplex-forming sites in two gene sets of interest: drought stress-responsive genes, and genes related to the production/biosynthesis of phenolic compounds in the model plant organism Arabidopsis thaliana. In addition, 20 housekeeping genes were analyzed as well, where the constitutive gene expression was expected (with no need for precise regulation depending on internal or external factors). The results have shown that none of the tested gene sets differed significantly in the content of G-quadruplex-forming sites, however, the highest frequency of G-quadruplex-forming sites was found in the 5'-UTR regions of phenolic compounds' biosynthesis genes, which indicates the possibility of their regulation at the mRNA level. In addition, mainly within the introns and 1000 bp flanks downstream gene regions, G-quadruplex-forming sites were highly underrepresented. Finally, cluster analysis allowed us to observe similarities between particular genes in terms of their PQS characteristics. We believe that the original approach used in this study may become useful for further and more comprehensive bioinformatic studies in the field of G-quadruplex genomics.
8
Beyond the Primary Structure of Nucleic Acids: Potential Roles of Epigenetics and Noncanonical Structures in the Regulations of Plant Growth and Stress Responses. Methods Mol Biol 2023; 2642:331-361. [PMID: 36944887 DOI: 10.1007/978-1-0716-3044-0_18] [Citation(s) in RCA: 2] [Impact Index Per Article: 2.0] [Indexed: 03/23/2023]
Abstract
Epigenetics deals with changes in gene expression that are not caused by modifications in the primary sequence of nucleic acids. These changes beyond primary structures of nucleic acids not only include DNA/RNA methylation, but also other reversible conversions, together with histone modifications or RNA interference. In addition, under particular conditions (such as specific ion concentrations or protein-induced stabilization), the right-handed double-stranded DNA helix (B-DNA) can form noncanonical structures commonly described as "non-B DNA" structures. These structures comprise, for example, cruciforms, i-motifs, triplexes, and G-quadruplexes. Their formation often leads to significant differences in replication and transcription rates. Noncanonical RNA structures have also been documented to play important roles in translation regulation and the biology of noncoding RNAs. In human and animal studies, the frequency and dynamics of noncanonical DNA and RNA structures are intensively investigated, especially in the field of cancer research and neurodegenerative diseases. In contrast, noncanonical DNA and RNA structures in plants have been on the fringes of interest for a long time and only a few studies deal with their formation, regulation, and physiological importance for plant stress responses. Herein, we present a review focused on the main fields of epigenetics in plants and their possible roles in stress responses and signaling, with special attention dedicated to noncanonical DNA and RNA structures.
9
Biological solution conditions and flanking sequence modulate LLPS of RNA G-quadruplex structures. RNA (New York, N.Y.) 2022; 28:1197-1209. [PMID: 35760522 PMCID: PMC9380743 DOI: 10.1261/rna.079196.122] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 04/05/2022] [Accepted: 06/13/2022] [Indexed: 05/17/2023]
Abstract
Guanine-rich regions of DNA or RNA can form structures with two or more consecutive G-quartets called G-quadruplexes (GQs). Recent studies reveal the potential for these structures to aggregate in vitro. Here, we report the effects of in vivo concentrations of additives (amino acids, nucleotides, and crowding agents) on the structure and solution behavior of RNAs containing GQ-forming sequences. We found that cytosine nucleotides destabilize a model GQ structure at biological salt concentrations, while free amino acids and other nucleotides do not do so to a substantial degree. We also report that the tendency of folded GQs to form droplets or to aggregate depends on the nature of the flanking sequence and the presence of additives. Notably, in the presence of biological amounts of polyamines, flanking regions on the 5'-end of the RNA drive more droplet-like phase separation, while flanking regions on the 3'-end, as well as on both the 5'- and 3'-ends, induce more condensed, granular structures. Finally, we provide an example of a biological sequence in the presence of polyamines and show that crowders such as PEG and dextran can selectively cause its phase separation. These findings have implications for the participation of GQs in LLPS in vivo.
10
GAIA: G-quadruplexes in alive creature database. Nucleic Acids Res 2022; 51:D135-D140. [PMID: 35971612 PMCID: PMC9825426 DOI: 10.1093/nar/gkac657] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/01/2022] [Revised: 07/08/2022] [Accepted: 08/08/2022] [Indexed: 01/29/2023] Open
Abstract
G-quadruplexes (G4) are 3D structures that are found in both DNA and RNA. Interest in this structure has grown over the past few years due to both its implication in diverse biological mechanisms and its potential use as a therapeutic target, to name two examples. G4s in humans have been widely studied; however, the level of their study in other species remains relatively minimal. That said, progress in this field has resulted in the prediction of G4s structures in various species, ranging from bacteria to eukaryotes. These predictions were analysed in a previous study which revealed that G4s are present in all living kingdoms. To date, eleven different databases have grouped the various G4s depending on either their structures, on the proteins that might bind them, or on their location in the various genomes. However, none of these databases contains information on their location in the transcriptome of many of the implicated species. The GAIA database was designed so as to make this data available online in a user-friendly manner. Through its web interface, users can query GAIA to filter G4s, which, we hope, will help the research in this field. GAIA is available at: https://gaia.cobius.usherbrooke.ca.
11
The Newly Sequenced Genome of Pisum sativum Is Replete with Potential G-Quadruplex-Forming Sequences-Implications for Evolution and Biological Regulation. Int J Mol Sci 2022; 23:8482. [PMID: 35955617 PMCID: PMC9369095 DOI: 10.3390/ijms23158482] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/06/2022] [Revised: 07/25/2022] [Accepted: 07/28/2022] [Indexed: 11/20/2022] Open
Abstract
G-quadruplexes (G4s) have been long considered rare and physiologically unimportant in vitro curiosities, but recent methodological advances have proved their presence and functions in vivo. Moreover, in addition to their functional relevance in bacteria and animals, including humans, their importance has been recently demonstrated in evolutionarily distinct plant species. In this study, we analyzed the genome of Pisum sativum (garden pea, or the so-called green pea), a unique member of the Fabaceae family. Our results showed that this genome contained putative G4 sequences (PQSs). Interestingly, these PQSs were located nonrandomly in the nuclear genome. We also found PQSs in mitochondrial (mt) and chloroplast (cp) DNA, and we experimentally confirmed G4 formation for sequences found in these two organelles. The frequency of PQSs for nuclear DNA was 0.42 PQSs per thousand base pairs (kbp), in the same range as for cpDNA (0.53/kbp), but significantly lower than what was found for mitochondrial DNA (1.58/kbp). In the nuclear genome, PQSs were mainly associated with regulatory regions, including 5'UTRs, and upstream of the rRNA region. In contrast to genomic DNA, PQSs were located around RNA genes in cpDNA and mtDNA. Interestingly, PQSs were also associated with specific transposable elements such as TIR and LTR and around them, pointing to their role in their spreading in nuclear DNA. The nonrandom localization of PQSs uncovered their evolutionary and functional significance in the Pisum sativum genome.
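The PQS frequencies quoted above are densities, that is, motif counts normalized per thousand base pairs. A minimal sketch of that normalization follows. It assumes a simple regular-expression motif (four runs of at least three guanines separated by loops of 1 to 7 nucleotides), which is a common illustrative definition and not necessarily the prediction method used in the study. The names `PQS_PATTERN` and `pqs_per_kbp` are hypothetical, and only the given strand is scanned.

```python
import re

# Assumed illustrative motif: four G-runs (>= 3 G each) separated by
# loops of 1-7 nucleotides. This is NOT taken from the cited study.
PQS_PATTERN = re.compile(r"(?:G{3,}[ACGT]{1,7}){3}G{3,}")


def pqs_per_kbp(sequence):
    """Return non-overlapping motif matches per 1000 bases of `sequence`.

    Only the strand as given is scanned; a real analysis would also
    scan the reverse complement (C-rich motifs).
    """
    seq = sequence.upper()
    if not seq:
        return 0.0
    count = sum(1 for _ in PQS_PATTERN.finditer(seq))
    return count / (len(seq) / 1000)


if __name__ == "__main__":
    # One telomere-like motif embedded in a 1000-base toy sequence.
    toy = "GGGTTAGGGTTAGGGTTAGGG" + "A" * 979
    print(pqs_per_kbp(toy))
```

Reporting a density instead of a raw count is what makes genomes of very different sizes, such as nuclear, chloroplast, and mitochondrial DNA, directly comparable.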
12
Iso-FRET: an isothermal competition assay to analyze quadruplex formation in vitro. Nucleic Acids Res 2022; 50:e93. [PMID: 35670668 PMCID: PMC9458428 DOI: 10.1093/nar/gkac465] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/26/2021] [Revised: 04/26/2022] [Accepted: 05/16/2022] [Indexed: 11/23/2022] Open
Abstract
Algorithms have been widely used to predict G-quadruplex (G4)-prone sequences. However, an experimental validation of these predictions is generally required. We previously reported a high-throughput technique, called FRET-MC, to demonstrate G4 formation in vitro. This method, while convenient and reproducible, has one known weakness: its inability to pinpoint G4 motifs of low thermal stability. As such quadruplexes may still be biologically relevant if formed at physiological temperature, we wanted to develop an independent assay to overcome this limitation. To this end, we introduced an isothermal version of the competition assay, called iso-FRET, based on a duplex-quadruplex competition and a well-characterized bis-quinolinium G4 ligand, PhenDC3. G4-forming competitors act as decoys for PhenDC3, lowering its ability to stabilize the G4-forming motif reporter oligonucleotide conjugated to a fluorescence quencher (37Q). The decrease in available G4 ligand concentration restores the ability of 37Q to hybridize to its FAM-labeled short complementary C-rich strand (F22), leading to a decrease in fluorescence signal. In contrast, when no G4-forming competitor is present, PhenDC3 remains available to stabilize the 37Q quadruplex, preventing the formation of the F22 + 37Q complex. Iso-FRET was first applied to a reference panel of 70 sequences, and then used to investigate 23 different viral sequences.
13
Abstract
The noncanonical structures, G-quadruplexes (GQs), formed in the guanine-rich region of nucleic acids regulate various biological and molecular functions in prokaryotes and eukaryotes. Neisseria meningitidis is a commensal residing in a human's upper respiratory tract but occasionally becomes virulent, causing life-threatening septicemia and meningitis. The factors causing these changes in phenotypes are not fully understood. At the molecular level, regulatory components help in a clearer understanding of the pathogen's virulence and pathogenesis. Herein, genome analysis followed by biophysical assays and cell-based experiments revealed the presence of conserved GQ motifs in N. meningitidis. These GQs are linked to the essential genes involved in cell adhesion, pathogenesis, virulence, transport, DNA repair, and recombination. Primer extension stop assay, reporter assays, and quantitative real-time polymerase chain reaction (qRT-PCR) further affirmed the formation of stable GQs in vitro and in vivo. These results support the existence of evolutionarily conserved GQ motifs in N. meningitidis and uphold the usage of GQ-specific ligands as novel antimeningococcal therapeutics.
14
G-quadruplex occurrence and conservation: more than just a question of guanine–cytosine content. NAR Genom Bioinform 2022; 4:lqac010. [PMID: 35261973 PMCID: PMC8896161 DOI: 10.1093/nargab/lqac010] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/10/2021] [Revised: 12/06/2021] [Accepted: 02/25/2022] [Indexed: 12/14/2022] Open
Abstract
G-quadruplexes are motifs found in DNA and RNA that can fold into tertiary structures. Until now, they have been studied experimentally mainly in humans and a few other species. Recently, predictions have been made with bacterial and archaeal genomes. Nevertheless, a global comparison of predicted G4s (pG4s) across and within the three living kingdoms has not been addressed. In this study, we aimed to predict G4s in genes and transcripts of all kingdoms of living organisms and investigated the differences in their distributions. The relation of the predictions with GC content was studied. It appears that GC content is not the only parameter impacting G4 predictions and abundance. The distribution of pG4 densities varies depending on the class of transcripts and the group of species. Indeed, we have observed that, in coding transcripts, there are more predicted G4s than expected for eukaryotes but not for archaea and bacteria, while in noncoding transcripts, there are as many or fewer predicted G4s in all species groups. We even noticed that some species with the same GC content presented different pG4 profiles. For instance, Leishmania major and Chlamydomonas reinhardtii both have 60% of GC content, but the former has a pG4 density of 0.07 and the latter 1.16.
15
G-quadruplexes in helminth parasites. Nucleic Acids Res 2022; 50:2719-2735. [PMID: 35234933 PMCID: PMC8934627 DOI: 10.1093/nar/gkac129] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 06/30/2021] [Revised: 02/07/2022] [Accepted: 02/25/2022] [Indexed: 12/12/2022] Open
Abstract
Parasitic helminths infecting humans are highly prevalent infecting ∼2 billion people worldwide, causing inflammatory responses, malnutrition and anemia that are the primary cause of morbidity. In addition, helminth infections of cattle have a significant economic impact on livestock production, milk yield and fertility. The etiological agents of helminth infections are mainly Nematodes (roundworms) and Platyhelminths (flatworms). G-quadruplexes (G4) are unusual nucleic acid structures formed by G-rich sequences that can be recognized by specific G4 ligands. Here we used the G4Hunter Web Tool to identify and compare potential G4 sequences (PQS) in the nuclear and mitochondrial genomes of various helminths to identify G4 ligand targets. PQS are nonrandomly distributed in these genomes and often located in the proximity of genes. Unexpectedly, a Nematode, Ascaris lumbricoides, was found to be highly enriched in stable PQS. This species can tolerate high-stability G4 structures, which are not counter selected at all, in stark contrast to most other species. We experimentally confirmed G4 formation for sequences found in four different parasitic helminths. Small molecules able to selectively recognize G4 were found to bind to Schistosoma mansoni G4 motifs. Two of these ligands demonstrated potent activity both against larval and adult stages of this parasite.
16
Major Achievements in the Design of Quadruplex-Interactive Small Molecules. Pharmaceuticals (Basel) 2022; 15:ph15030300. [PMID: 35337098 PMCID: PMC8953082 DOI: 10.3390/ph15030300] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 01/27/2022] [Revised: 02/22/2022] [Accepted: 02/24/2022] [Indexed: 12/17/2022] Open
Abstract
Organic small molecules that can recognize and bind to G-quadruplex and i-Motif nucleic acids have great potential as selective drugs or as tools in drug target discovery programs, or even in the development of nanodevices for medical diagnosis. Hundreds of quadruplex-interactive small molecules have been reported, and the challenges in their design vary with the intended application. Herein, we survey the major achievements on the therapeutic potential of such quadruplex ligands, their mode of binding, effects upon interaction with quadruplexes, and consider the opportunities and challenges for their exploitation in drug discovery.
17
Novel G-quadruplex prone sequences emerge in the complete assembly of the human X chromosome. Biochimie 2021; 191:87-90. [PMID: 34508825 DOI: 10.1016/j.biochi.2021.09.004] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 08/03/2021] [Revised: 09/01/2021] [Accepted: 09/05/2021] [Indexed: 12/13/2022]
Abstract
G-quadruplexes are non-B secondary structures with regulatory functions and therapeutic potential. Improvements in sequencing methods recently allowed the completion of the first human chromosome which is now available as a gapless, end-to-end assembly, with the previously remaining spaces filled and newly identified regions added. We compared the presence of G-quadruplex forming sequences in the current human reference genome (GRCh38) and in the new end-to-end assembly of the X chromosome constructed by high-coverage ultra-long-read nanopore sequencing. This comparison revealed that, even though the corrected length of the chromosome X assembly is surprisingly 1.14% shorter than expected, the number of G-quadruplex forming sequences found in this gapless chromosome is significantly higher, with 493 new motifs having G4Hunter scores above 1.4 and 23 new sequences with G4Hunter scores above 3.5. This observation reflects an improved precision of the new sequencing approaches and points to an underestimation of G-quadruplex propensity in the previous, widely used version of the human genome assembly, especially for motifs with a high G4Hunter score, expected to be very stable. These G-quadruplex forming sequences probably remained undiscovered in earlier genome datasets due to previously unsolved G-rich and repetitive genomic regions. These observations allow a precise targeting of these important regulatory regions.
18
SARS-CoV-2 Nsp3 unique domain SUD interacts with guanine quadruplexes and G4-ligands inhibit this interaction. Nucleic Acids Res 2021; 49:7695-7712. [PMID: 34232992 PMCID: PMC8287907 DOI: 10.1093/nar/gkab571] [Citation(s) in RCA: 36] [Impact Index Per Article: 12.0] [Received: 02/20/2021] [Revised: 06/15/2021] [Accepted: 06/19/2021] [Indexed: 12/16/2022] Open
Abstract
The multidomain non-structural protein 3 (Nsp3) is the largest protein encoded by coronavirus (CoV) genomes and several regions of this protein are essential for viral replication. Of note, SARS-CoV Nsp3 contains a SARS-Unique Domain (SUD), which can bind Guanine-rich non-canonical nucleic acid structures called G-quadruplexes (G4) and is essential for SARS-CoV replication. We show herein that the SARS-CoV-2 Nsp3 protein also contains a SUD domain that interacts with G4s. Indeed, interactions between SUD proteins and both DNA and RNA G4s were evidenced by G4 pull-down, Surface Plasmon Resonance and Homogenous Time Resolved Fluorescence. These interactions can be disrupted by mutations that prevent oligonucleotides from folding into G4 structures and, interestingly, by molecules known as specific ligands of these G4s. Structural models for these interactions are proposed and reveal significant differences with the crystallographic and modeled 3D structures of the SARS-CoV SUD-NM/G4 interaction. Altogether, our results pave the way for further studies on the role of SUD/G4 interactions during SARS-CoV-2 replication and the use of inhibitors of these interactions as potential antiviral compounds.
19
G-Quadruplex in Gene Encoding Large Subunit of Plant RNA Polymerase II: A Billion-Year-Old Story. Int J Mol Sci 2021; 22:ijms22147381. [PMID: 34299001 PMCID: PMC8306923 DOI: 10.3390/ijms22147381] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/28/2021] [Revised: 06/24/2021] [Accepted: 07/05/2021] [Indexed: 12/12/2022] Open
Abstract
G-quadruplexes have long been perceived as rare and physiologically unimportant nucleic acid structures. However, several studies have revealed their importance in molecular processes, suggesting their possible role in replication and gene expression regulation. Pathways involving G-quadruplexes are intensively studied, especially in the context of human diseases, while their involvement in gene expression regulation in plants remains largely unexplored. Here, we conducted a bioinformatic study and performed a complex circular dichroism measurement to identify a stable G-quadruplex in the gene RPB1, coding for the RNA polymerase II large subunit. We found that this G-quadruplex-forming locus is highly evolutionarily conserved amongst plants sensu lato (Archaeplastida) that share a common ancestor more than one billion years old. Finally, we discussed a new hypothesis regarding G-quadruplexes interacting with UV light in plants to potentially form an additional layer of the regulatory network.
|
20
|
Abstract
Guanine-rich DNA and RNA sequences can fold into noncanonical nucleic acid structures called G-quadruplexes (G4s). Since the discovery that these structures may act as scaffolds for the binding of specific ligands, G4s have attracted the attention of a growing number of scientists. The versatile roles of G4 structures in viral replication, transcription, and translation suggest direct applications in therapy or diagnostics. G4-interacting molecules (proteins or small molecules) may also affect the balance between latent and lytic phases, and increasing evidence reveals that G4s are implicated in generally suppressing viral processes, such as replication, transcription, translation, or reverse transcription. In this review, we focus on the discovery of G4s in viruses and the role of G4 ligands in the antiviral drug discovery process. After assessing the role of viral G4s, we argue that host G4s participate in immune modulation, viral tumorigenesis, cellular pathways involved in virus maturation, and DNA integration of viral genomes, which can potentially be employed for antiviral therapeutics. Furthermore, we scrutinize the impediments and shortcomings in the process of studying G4 ligands and drug discovery. Finally, some unanswered questions regarding viral G4s are highlighted for prospective future projects. SIGNIFICANCE STATEMENT: G-quadruplexes (G4s) are noncanonical nucleic acid structures that have gained increasing recognition during the last few decades. First identified as relevant targets in oncology, their importance in virology is now increasingly clear. A number of G-quadruplex ligands are known: viral transcription and replication are the main targets of these ligands. Both viral and cellular G4s may be targeted; this review embraces the different aspects of G-quadruplexes in both host and viral contexts.
|
21
|
Abstract
The control of DNA topology is a prerequisite for all DNA transactions, such as DNA replication, repair, recombination, and transcription. This global control is carried out by essential enzymes, named DNA topoisomerases, that are mandatory for genome stability. For many decades, the Archaea have provided a significant panel of new types of topoisomerases, such as the reverse gyrase, the type IIB or the type IC. These more or less recent discoveries have largely contributed to changing the understanding of the role of DNA topoisomerases across the living world. Despite their very different lifestyles, Archaea share a quasi-homogeneous set of DNA topoisomerases, except thermophilic organisms, which possess at least one reverse gyrase that is considered a marker of thermophily. Here, we discuss the effect of the lifestyle of Archaea on DNA structure and topology, and then we review the content of these essential enzymes across archaeal diversity based on the available completely sequenced genomes. Finally, we discuss their roles, in particular in the processes involved in both archaeal adaptation and the preservation of genome stability.
|
22
|
Evolution of Diverse Strategies for Promoter Regulation. Trends Genet 2021; 37:730-744. [PMID: 33931265 DOI: 10.1016/j.tig.2021.04.003] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/29/2021] [Revised: 03/31/2021] [Accepted: 04/01/2021] [Indexed: 12/15/2022]
Abstract
DNA is fundamentally important for all cellular organisms due to its role as a store of hereditary genetic information. The precise and accurate regulation of gene transcription depends primarily on promoters, which vary significantly within and between genomes. Some promoters are rich in specific types of bases, while others have more varied, complex sequence characteristics. However, it is not only base sequence but also epigenetic modifications and altered DNA structure that regulate promoter activity. Significantly, many promoters across all organisms contain sequences that can form intrastrand hairpins (cruciforms) or four-stranded structures (G-quadruplex or i-motif). In this review we integrate recent studies on promoter regulation that highlight the importance of DNA structure in the evolutionary adaptation of promoter sequences.
|
23
|
Recurrent Potential G-Quadruplex Sequences in Archaeal Genomes. Front Microbiol 2021; 12:647851. [PMID: 33868206 PMCID: PMC8044849 DOI: 10.3389/fmicb.2021.647851] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2020] [Accepted: 03/03/2021] [Indexed: 11/23/2022] Open
Abstract
Evolutionary conservation or over-representation of the potential G-quadruplex sequences (PQS) in genomes is usually considered a sign of the functional relevance of these sequences. However, uneven base distribution (GC-content) along the genome may result in a seeming abundance of PQSs over the genome average. Apart from this, a number of other conserved functional signals that are encoded in the GC-rich genomic regions may inadvertently result in the emergence of G-quadruplex compatible sequences. Here, we analyze the genomes of archaea, focusing our search on repetitive PQS (rPQS) motifs within each organism. The probability of occurrence of several identical PQSs within a relatively short archaeal genome is low and, thus, the structure and genomic location of such rPQSs may become a direct indication of their functionality. We have found that the majority of the genomes of the Methanomicrobiaceae family of archaea contained multiple copies of interspersed, highly similar PQSs. Short oligonucleotides corresponding to the rPQS formed the G-quadruplex (G4) structure in the presence of potassium ions, as demonstrated by circular dichroism (CD) and enzymatic probing. However, further analysis of the genomic context of the rPQS revealed a 10–12 nt cytosine-rich track adjacent to the 3'-end of each rPQS. Synthetic DNA fragments that included the C-rich track tended to fold into alternative structures, such as a hairpin structure and an antiparallel triplex, that were in equilibrium with the G4 structure depending on the presence of potassium ions in solution. Structural properties of the found repetitive sequences, their location in the genomes of archaea, and possible functions are discussed.
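The idea of searching for PQSs that recur within one genome can be illustrated with a minimal sketch. The abstract does not state how PQSs were detected, so the regular expression below (four runs of at least three guanines separated by loops of 1 to 7 nt) is a commonly used stand-in and an assumption, as are the function name `repeated_pqs` and the `min_copies` parameter. The sketch scans only the given strand and counts exact duplicates, which is stricter than the "highly similar" copies the abstract describes.

```python
import re
from collections import Counter

# Assumed PQS definition (not taken from the abstract): four G-runs of
# length >= 3 separated by loops of 1-7 nucleotides.
PQS_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)


def repeated_pqs(genome, min_copies=2):
    """Return PQS motifs that occur at least `min_copies` times.

    Only non-overlapping matches on the given strand are counted, and
    copies must be identical to be grouped together.
    """
    counts = Counter(
        match.group(0) for match in PQS_PATTERN.finditer(genome.upper())
    )
    return {motif: n for motif, n in counts.items() if n >= min_copies}


if __name__ == "__main__":
    # Toy sequence made up for illustration: two identical PQS copies
    # separated by a spacer.
    toy = "GGGAGGGAGGGAGGG" + "T" * 10 + "GGGAGGGAGGGAGGG"
    print(repeated_pqs(toy))
```

A fuller analysis would also scan the reverse complement and cluster near-identical motifs rather than require exact matches.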
|
24
|
Analyses of viral genomes for G-quadruplex forming sequences reveal their correlation with the type of infection. Biochimie 2021; 186:13-27. [PMID: 33839192 DOI: 10.1016/j.biochi.2021.03.017] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/18/2021] [Revised: 03/30/2021] [Accepted: 03/31/2021] [Indexed: 12/12/2022]
Abstract
G-quadruplexes contribute to the regulation of key molecular processes. Their utilization for antiviral therapy is an emerging field of contemporary research. Here we present comprehensive analyses of the presence and localization of putative G-quadruplex forming sequences (PQS) in all viral genomes currently available in the NCBI database (including subviral agents). The G4Hunter algorithm was applied to a pool of 11,000 accessible viral genomes representing 350 Mbp in total. PQS frequencies differ across evolutionary groups of viruses, and PQS are enriched in repeats, replication origins, 5'UTRs and 3'UTRs. Importantly, PQS presence and localization are connected to viral lifecycles and correspond to the type of viral infection rather than to nucleic acid type; while viruses routinely causing persistent infections in Metazoa hosts are enriched for PQS, viruses causing acute infections are significantly depleted for PQS. The unique localization of PQS identifies the importance of G-quadruplex-based regulation of viral replication and life cycle, providing a tool for potential therapeutic targeting.
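For readers unfamiliar with G4Hunter, the following is a minimal sketch of its scoring idea as commonly described: each G scores the length of its run (capped at 4), each C scores the negative of its run length (capped at 4), other bases score 0, and the mean score over a sliding window is compared with a threshold. The window of 25 nt and threshold of 1.2 are widely used defaults and are assumptions here, since the abstract does not give the settings; the function names are illustrative and this is not the published implementation.

```python
from itertools import groupby


def g4hunter_base_scores(seq):
    """Per-base scores: G runs positive, C runs negative, capped at 4."""
    scores = []
    for base, run in groupby(seq.upper()):
        length = len(list(run))
        if base == "G":
            value = min(length, 4)
        elif base == "C":
            value = -min(length, 4)
        else:
            value = 0
        scores.extend([value] * length)
    return scores


def g4hunter_windows(seq, window=25, threshold=1.2):
    """Return (start, mean_score) for windows whose |mean| >= threshold.

    `window` and `threshold` are assumed defaults, not values taken
    from the abstract.
    """
    scores = g4hunter_base_scores(seq)
    hits = []
    for start in range(len(scores) - window + 1):
        mean = sum(scores[start:start + window]) / window
        if abs(mean) >= threshold:
            hits.append((start, round(mean, 3)))
    return hits


if __name__ == "__main__":
    # Toy 25-nt sequence made up for illustration.
    print(g4hunter_windows("GGGT" * 6 + "G"))
```

Positive window means indicate G-rich candidates on the given strand, and negative means indicate candidates on the complementary strand. A PQS frequency can then be expressed as hits per 1,000 bp after merging overlapping windows, which this sketch leaves out.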
|
25
|
Tracing dsDNA Virus-Host Coevolution through Correlation of Their G-Quadruplex-Forming Sequences. Int J Mol Sci 2021; 22:ijms22073433. [PMID: 33810462 PMCID: PMC8036883 DOI: 10.3390/ijms22073433] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 03/02/2021] [Revised: 03/17/2021] [Accepted: 03/23/2021] [Indexed: 12/12/2022] Open
Abstract
The importance of gene expression regulation in viruses based upon G-quadruplex may point to its potential utilization in therapeutic targeting. Here, we present analyses as to the occurrence of putative G-quadruplex-forming sequences (PQS) in all reference viral dsDNA genomes and evaluate their dependence on PQS occurrence in host organisms using the G4Hunter tool. PQS frequencies differ across host taxa without regard to GC content. The overlay of PQS with annotated regions reveals the localization of PQS in specific regions. While abundance in some, such as repeat regions, is shared by all groups, others are unique. There is abundance within introns of Eukaryota-infecting viruses, but depletion of PQS in introns of bacteria-infecting viruses. We reveal a significant positive correlation between PQS frequencies in dsDNA viruses and corresponding hosts from archaea, bacteria, and eukaryotes. A strong relationship between PQS in a virus and its host indicates their close coevolution and evolutionarily reciprocal mimicking of genome organization.
|
26
|
CNBP Binds and Unfolds In Vitro G-Quadruplexes Formed in the SARS-CoV-2 Positive and Negative Genome Strands. Int J Mol Sci 2021; 22:ijms22052614. [PMID: 33807682 PMCID: PMC7961906 DOI: 10.3390/ijms22052614] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 01/15/2021] [Revised: 02/20/2021] [Accepted: 02/20/2021] [Indexed: 12/11/2022] Open
Abstract
The Coronavirus Disease 2019 (COVID-19) pandemic has become a global health emergency with no effective medical treatment and with incipient vaccines. It is caused by a new positive-sense RNA virus called severe acute respiratory syndrome-related coronavirus 2 (SARS-CoV-2). G-quadruplexes (G4s) are nucleic acid secondary structures involved in the control of a variety of biological processes including viral replication. Using several G4 prediction tools, we identified highly putative G4 sequences (PQSs) within the positive-sense (+gRNA) and negative-sense (−gRNA) RNA strands of SARS-CoV-2 that are conserved in related betacoronaviruses. By using multiple biophysical techniques, we confirmed the formation of two G4s in the +gRNA and provide the first evidence of G4 formation by two PQSs in the −gRNA of SARS-CoV-2. Finally, biophysical and molecular approaches were used to demonstrate for the first time that CNBP, the main human cellular protein bound to the SARS-CoV-2 RNA genome, binds and promotes the unfolding of G4s formed by both strands of the SARS-CoV-2 RNA genome. Our results suggest that G4s found in the SARS-CoV-2 RNA genome and its negative-sense replicative intermediates, as well as the cellular proteins that interact with them, are relevant factors for viral gene expression and the replication cycle, and may constitute interesting targets for antiviral drug development.
|
27
|
G-quadruplex motifs are functionally conserved in cis-regulatory regions of pathogenic bacteria: An in-silico evaluation. Biochimie 2021; 184:40-51. [PMID: 33548392 DOI: 10.1016/j.biochi.2021.01.017] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 09/14/2020] [Revised: 01/28/2021] [Accepted: 01/29/2021] [Indexed: 02/06/2023]
Abstract
The role of G-quadruplexes in the cellular physiology of human pathogenesis is an intriguing area of research. Nonetheless, their functional roles and evolutionary conservation have not been compared comprehensively in pathogenic forms of various bacterial genera and species. In the current in silico study, we addressed the role of G-quadruplex-forming sequences (G4 motifs) in the context of cis-regulation, expression variation, regulatory networks, gene orthology and ontology. Genome-wide screening across seven pathogenic genomes using the G4Hunter tool revealed a significant prevalence of G4 motifs in cis-regulatory regions compared to intragenic regions. Significant conservation of G4 motifs was observed in the regulatory regions of 300 orthologous genes. Further analysis of published ChIP-Seq data (Minch et al., 2015) for 91 DNA-binding proteins of the M. tuberculosis genome revealed significant links between G4 motifs and target sites of transcriptional regulators. Interestingly, the transcription factors associated with virulence, specifically CsoR, Rv0081, DevR/DosR, and the TetR family, are found to have G4 motifs in their target regulatory regions. Overall, the current study applies positional-functional relationship computation to delve into the cis-regulation of G-quadruplex structures in the context of gene orthology in pathogenic bacteria.
|
28
|
Amino Acid Composition in Various Types of Nucleic Acid-Binding Proteins. Int J Mol Sci 2021; 22:ijms22020922. [PMID: 33477647 PMCID: PMC7831508 DOI: 10.3390/ijms22020922] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 12/29/2020] [Revised: 01/15/2021] [Accepted: 01/16/2021] [Indexed: 12/20/2022] Open
Abstract
Nucleic acid-binding proteins are traditionally divided into two categories: those with the ability to bind DNA and those with the ability to bind RNA. In the light of new knowledge, such categorization should be abandoned because a large proportion of proteins can bind both DNA and RNA. Another, even more important feature of nucleic acid-binding proteins is their so-called sequence or structure specificity. Proteins able to bind nucleic acids in a sequence-specific manner usually contain one or more of the well-defined structural motifs (zinc-fingers, leucine zipper, helix-turn-helix, or helix-loop-helix). In contrast, many proteins do not recognize nucleic acid sequence but rather local DNA or RNA structures (G-quadruplexes, i-motifs, triplexes, cruciforms, left-handed DNA/RNA form, and others). Finally, there are also proteins recognizing both sequence and local structural properties of nucleic acids (e.g., the famous tumor suppressor p53). In this mini-review, we aim to summarize current knowledge about the amino acid composition of various types of nucleic acid-binding proteins with a special focus on significant enrichment and/or depletion in each category.
|