1
|
Alternative splicing induced by bacterial pore-forming toxins sharpens CIRBP-mediated cell response to Listeria infection. Nucleic Acids Res 2023; 51:12459-12475. [PMID: 37941135 PMCID: PMC10711537 DOI: 10.1093/nar/gkad1033] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/22/2023] [Revised: 10/09/2023] [Accepted: 10/20/2023] [Indexed: 11/10/2023] Open
Abstract
Cell autonomous responses to intracellular bacteria largely depend on reorganization of gene expression. To gain isoform-level resolution of these modes of regulation, we combined long- and short-read transcriptomic analyses of the response of intestinal epithelial cells to infection by the foodborne pathogen Listeria monocytogenes. Among the most striking isoform-based types of regulation, expression of the cellular stress response regulator CIRBP (cold-inducible RNA-binding protein) and of several SRSFs (serine/arginine-rich splicing factors) switched from canonical transcripts to nonsense-mediated decay-sensitive isoforms by inclusion of 'poison exons'. We showed that damage to host cell membranes caused by bacterial pore-forming toxins (listeriolysin O, perfringolysin, streptolysin or aerolysin) led to the dephosphorylation of SRSFs via the inhibition of the kinase activity of CLK1, thereby driving CIRBP alternative splicing. CIRBP isoform usage was found to have consequences on infection, since selective repression of canonical CIRBP reduced intracellular bacterial load while that of the poison exon-containing isoform exacerbated it. Consistently, CIRBP-bound mRNAs were shifted towards stress-relevant transcripts in infected cells, with increased mRNA levels or reduced translation efficiency for some targets. Our results thus generalize the alternative splicing of CIRBP and SRSFs as a common response to biotic or abiotic stresses by extending its relevance to the context of bacterial infection.
|
2
|
Abstract
RSAT (Regulatory Sequence Analysis Tools) enables the detection and the analysis of cis-regulatory elements in genomic sequences. This software suite performs (i) de novo motif discovery (including from genome-wide datasets like ChIP-seq/ATAC-seq), (ii) scanning of genomic sequences with known motifs, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations and (v) comparative genomics. RSAT comprises 50 tools. Six public Web servers (including a teaching server) are offered to meet the needs of different biological communities. The RSAT philosophy and originality are: (i) multi-modal access depending on user needs, through web forms, the command line for local installations and programmatic web services, and (ii) support for virtually any genome (animals, bacteria, plants, totalling over 10 000 genomes directly accessible). Since the 2018 NAR Web Software Issue, we have developed a large REST API, extended the support for additional genomes and external motif collections, enhanced some tools and Web forms, and developed a novel tool that builds or refines gene regulatory networks using motif scanning (network-interactions). The RSAT website provides extensive documentation, tutorials and published protocols. RSAT code is under an open-source license and is now hosted on GitHub. RSAT is available at http://www.rsat.eu/.
|
3
|
Logical modelling of in vitro differentiation of human monocytes into dendritic cells unravels novel transcriptional regulatory interactions. Interface Focus 2021; 11:20200061. [PMID: 34123352 PMCID: PMC8193469 DOI: 10.1098/rsfs.2020.0061] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Accepted: 04/15/2021] [Indexed: 12/13/2022] Open
Abstract
Dendritic cells (DCs) are the major specialized antigen-presenting cells, thereby connecting innate and adaptive immunity. Because of their role in establishing adaptive immunity, they constitute promising targets for immunotherapy. Monocytes can differentiate into DCs in vitro in the presence of colony-stimulating factor 2 (CSF2) and interleukin 4 (IL4), activating four signalling pathways (MAPK, JAK/STAT, NFKB and PI3K). However, the downstream transcriptional programme responsible for DC differentiation from monocytes (moDCs) remains unknown. By analysing the scientific literature on moDC differentiation, we established a preliminary logical model that helped us identify missing information regarding the activation of genes responsible for this differentiation, including missing targets for key transcription factors (TFs). Using ChIP-seq and RNA-seq data from the Blueprint consortium, we defined active and inactive promoters, together with differentially expressed genes in monocytes, moDCs and macrophages, which correspond to an alternative cell fate. We then used this functional genomic information to predict novel targets for previously identified TFs. By integrating this information, we refined our model and recapitulated the main established facts regarding moDC differentiation. Prospectively, the resulting model should be useful to develop novel immunotherapies targeting moDCs.
|
4
|
Cis-acting variation is common across regulatory layers but is often buffered during embryonic development. Genome Res 2021; 31:211-224. [PMID: 33310749 PMCID: PMC7849415 DOI: 10.1101/gr.266338.120] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Received: 05/21/2020] [Accepted: 12/09/2020] [Indexed: 12/14/2022]
Abstract
Precise patterns of gene expression are driven by interactions between transcription factors, regulatory DNA sequences, and chromatin. How DNA mutations affecting any one of these regulatory "layers" are buffered or propagated to gene expression remains unclear. To address this, we quantified allele-specific changes in chromatin accessibility, histone modifications, and gene expression in F1 embryos generated from eight Drosophila crosses at three embryonic stages, yielding a comprehensive data set of 240 samples spanning multiple regulatory layers. Genetic variation (allelic imbalance) impacts gene expression more frequently than chromatin features, with metabolic and environmental response genes being most often affected. Allelic imbalance in cis-regulatory elements (enhancers) is common and highly heritable, yet its functional impact does not generally propagate to gene expression. When it does, genetic variation impacts RNA levels through two alternative mechanisms involving either H3K4me3 or chromatin accessibility and H3K27ac. Changes in RNA are more predictive of variation in H3K4me3 than vice versa, suggesting a role for H3K4me3 downstream from transcription. The impact of a substantial proportion of genetic variation is consistent across embryonic stages, with 50% of allelic imbalanced features at one stage being also imbalanced at subsequent developmental stages. Crucially, buffering, as well as the magnitude and evolutionary impact of genetic variants, is influenced by regulatory complexity (i.e., number of enhancers regulating a gene), with transcription factors being most robust to cis-acting, but most influenced by trans-acting, variation.
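As a rough illustration of what allelic imbalance means in practice, the sketch below tests whether reads from the two parental alleles at one feature depart from the 1:1 ratio expected in an F1 hybrid. The exact two-sided binomial test, the function names and the read counts are all assumptions made for illustration; the abstract does not state which statistical model the study used.

```python
from math import comb

def binomial_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value for k successes out of n trials."""
    probs = [comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(n + 1)]
    observed = probs[k]
    # Sum the probability of every outcome at least as unlikely as the observed one.
    tail = sum(q for q in probs if q <= observed * (1 + 1e-9))
    return min(1.0, tail)

def allelic_ratio(ref_reads, alt_reads):
    """Fraction of reads carrying the first parental allele."""
    return ref_reads / (ref_reads + alt_reads)

# Hypothetical read counts for one enhancer in one F1 cross (not from the paper).
ref_reads, alt_reads = 70, 30
ratio = allelic_ratio(ref_reads, alt_reads)
p_value = binomial_two_sided_p(ref_reads, ref_reads + alt_reads)
print(f"allelic ratio = {ratio:.2f}, p = {p_value:.2e}")
```

A genome-wide analysis would typically also need multiple-testing correction and a model that tolerates overdispersed counts; this sketch covers only the single-feature case.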
|
5
|
Deciphering and modelling the TGF-β signalling interplays specifying the dorsal-ventral axis of the sea urchin embryo. Development 2021; 148:dev.189944. [PMID: 33298464 DOI: 10.1242/dev.189944] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 02/26/2020] [Accepted: 11/16/2020] [Indexed: 11/20/2022]
Abstract
During sea urchin development, secretion of Nodal and BMP2/4 ligands and their antagonists Lefty and Chordin from a ventral organiser region specifies the ventral and dorsal territories. This process relies on a complex interplay between the Nodal and BMP pathways through numerous regulatory circuits. To decipher the interplay between these pathways, we used a combination of treatments with recombinant Nodal and BMP2/4 proteins and a computational modelling approach. We assembled a logical model focusing on cell responses to signalling inputs along the dorsal-ventral axis, which was extended to cover ligand diffusion and enable multicellular simulations. Our model simulations accurately recapitulate gene expression in wild-type embryos, accounting for the specification of ventral ectoderm, ciliary band and dorsal ectoderm. Our model simulations further recapitulate various morphant phenotypes, reveal a dominance of the BMP pathway over the Nodal pathway and stress the crucial impact of the rate of Smad activation in dorsal-ventral patterning. These results emphasise the key role of the mutual antagonism between the Nodal and BMP2/4 pathways in driving early dorsal-ventral patterning of the sea urchin embryo.
|
6
|
Computational Verification of Large Logical Models-Application to the Prediction of T Cell Response to Checkpoint Inhibitors. Front Physiol 2020; 11:558606. [PMID: 33101049 PMCID: PMC7554341 DOI: 10.3389/fphys.2020.558606] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 05/03/2020] [Accepted: 08/19/2020] [Indexed: 12/31/2022] Open
Abstract
At the crossroads between biology and mathematical modeling, computational systems biology can contribute to a mechanistic understanding of high-level biological phenomena. But as knowledge accumulates, the size and complexity of mathematical models increase, calling for the development of efficient dynamical analysis methods. Here, we propose the use of two approaches for the development and analysis of complex cellular network models. A first approach, called "model verification" and inspired by unit testing in software development, enables the formalization and automated verification of validation criteria for whole models or selected sub-parts. When combined with efficient analysis methods, this approach is suitable for continuous testing, thereby greatly facilitating model development. A second approach, called "value propagation," enables efficient analytical computation of the impact of specific environmental or genetic conditions on the dynamical behavior of some models. We apply these two approaches to the delineation and the analysis of a comprehensive model for T cell activation, taking into account CTLA4 and PD-1 checkpoint inhibitory pathways. While model verification greatly eases the delineation of logical rules complying with a set of dynamical specifications, propagation provides interesting insights into the different potential of CTLA4 and PD-1 immunotherapies. Both methods are implemented and made available in the all-inclusive CoLoMoTo Docker image, while the different steps of the model analysis are fully reported in two companion interactive Jupyter notebooks, thereby ensuring the reproduction of our results.
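The idea of testing a logical model the way software is tested can be sketched with a toy Boolean network. Everything below is hypothetical: the three nodes only echo names from the abstract, the rules are invented, and the code uses plain Python rather than any CoLoMoTo tool. Each `test_` function encodes one dynamical specification as an assertion over the model's stable states.

```python
from itertools import product

# Hypothetical toy network; the rules are invented for illustration.
RULES = {
    "TCR": lambda s: s["TCR"],   # input node, keeps its value
    "PD1": lambda s: s["PD1"],   # input node, keeps its value
    "Activation": lambda s: s["TCR"] and not s["PD1"],
}

def step(state):
    """Apply every rule once to the current state."""
    return {node: bool(rule(state)) for node, rule in RULES.items()}

def stable_states():
    """Enumerate all states and keep those left unchanged by an update."""
    nodes = list(RULES)
    states = (dict(zip(nodes, values))
              for values in product([False, True], repeat=len(nodes)))
    return [s for s in states if step(s) == s]

# Specifications written as unit tests over the stable states.
def test_activation_requires_tcr():
    assert all(s["TCR"] for s in stable_states() if s["Activation"])

def test_pd1_blocks_activation():
    assert not any(s["Activation"] for s in stable_states() if s["PD1"])

def test_tcr_alone_activates():
    assert any(s["Activation"] for s in stable_states()
               if s["TCR"] and not s["PD1"])

if __name__ == "__main__":
    for check in (test_activation_requires_tcr,
                  test_pd1_blocks_activation,
                  test_tcr_alone_activates):
        check()
    print("all specifications hold")
```

Exhaustive enumeration only works for very small networks; the efficient analysis methods mentioned in the abstract are what would make such checks practical on large models.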
|
7
|
The activation trajectory of plasmacytoid dendritic cells in vivo during a viral infection. Nat Immunol 2020; 21:983-997. [PMID: 32690951 PMCID: PMC7610367 DOI: 10.1038/s41590-020-0731-4] [Citation(s) in RCA: 46] [Impact Index Per Article: 11.5] [Received: 07/26/2019] [Accepted: 06/08/2020] [Indexed: 12/15/2022]
Abstract
Plasmacytoid dendritic cells (pDCs) are a major source of type I interferon (IFN-I). What other functions pDCs exert in vivo during viral infections is controversial, and more studies are needed to understand their orchestration. In the present study, we characterize in depth and link pDC activation states in animals infected by mouse cytomegalovirus by combining Ifnb1 reporter mice with flow cytometry, single-cell RNA sequencing, confocal microscopy and a cognate CD4 T cell activation assay. We show that IFN-I production and T cell activation were performed by the same pDC, but these occurred sequentially in time and in different micro-anatomical locations. In addition, we show that pDC commitment to IFN-I production was marked early on by their downregulation of leukemia inhibitory factor receptor and was promoted by cell-intrinsic tumor necrosis factor signaling. We propose a new model for how individual pDCs are endowed to exert different functions in vivo during a viral infection, in a manner tightly orchestrated in time and space.
|
8
|
IL-12 Signaling Contributes to the Reprogramming of Neonatal CD8+ T Cells. Front Immunol 2020; 11:1089. [PMID: 32582178 PMCID: PMC7292210 DOI: 10.3389/fimmu.2020.01089] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 01/28/2020] [Accepted: 05/05/2020] [Indexed: 01/26/2023] Open
Abstract
Neonates are highly susceptible to intracellular pathogens, leading to high morbidity and mortality rates. CD8+ T lymphocytes are responsible for the elimination of infected cells. Understanding the response of these cells to normal and high stimulatory conditions is important to propose better treatments and vaccine formulations for neonates. We have previously shown that human neonatal CD8+ T cells overexpress innate inflammatory genes and have a low expression of cytotoxic and cell signaling genes. To investigate the activation potential of these cells, we evaluated the transcriptome of human neonatal and adult naïve CD8+ T cells after TCR/CD28 signals ± IL-12. We found that in neonatal cells, IL-12 signals contribute to the adult-like expression of genes associated with cell-signaling, T-cell cytokines, metabolism, and cell division. Additionally, IL-12 signals contributed to the downregulation of the neutrophil signature transcription factor CEBPE and other immaturity related genes. To validate the transcriptome results, we evaluated the expression of a series of genes by RT-qPCR and the promoter methylation status on independent samples. We found that in agreement with the transcriptome, IL-12 signals contributed to the chromatin closure of neutrophil-like genes and the opening of cytotoxicity genes, suggesting that IL-12 signals contribute to the epigenetic reprogramming of neonatal lymphocytes. Furthermore, high expression of some inflammatory genes was observed in naïve and stimulated neonatal cells, in agreement with the high inflammatory profile of neonates to infections. Altogether our results point to an important contribution of IL-12 signals to the reprogramming of the neonatal CD8+ T cells.
|
9
|
RSAT variation-tools: An accessible and flexible framework to predict the impact of regulatory variants on transcription factor binding. Comput Struct Biotechnol J 2019; 17:1415-1428. [PMID: 31871587 PMCID: PMC6906655 DOI: 10.1016/j.csbj.2019.09.009] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 04/27/2019] [Revised: 09/22/2019] [Accepted: 09/25/2019] [Indexed: 02/06/2023] Open
Abstract
Gene regulatory regions contain short and degenerate DNA binding sites recognized by transcription factors (TFBS). When TFBS harbor SNPs, the DNA binding site may be affected, thereby altering the transcriptional regulation of the target genes. Such regulatory SNPs have been implicated as causal variants in genome-wide association studies (GWAS). In this study, we describe improved versions of the Variation-tools programs, designed to predict regulatory variants, and present four case studies to illustrate their usage and applications. In brief, Variation-tools facilitate i) obtaining variation information, ii) interconversion of variation file formats, iii) retrieval of sequences surrounding variants, and iv) calculating the change in predicted transcription factor affinity scores between alleles, using motif scanning approaches. Notably, the tools support the analysis of haplotypes. The tools are included within the well-maintained suite Regulatory Sequence Analysis Tools (RSAT, http://rsat.eu), and accessible through a web interface that currently enables analysis of five metazoa and ten plant genomes. Variation-tools can also be used from the command line with any locally installed Ensembl genome. Users can input personal collections of variants and motifs, providing flexibility in the analysis.
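The score comparison between alleles in step iv) can be pictured with a small sketch. It is not the RSAT implementation: the log-odds weighting, the pseudocount, the uniform background, the function names and the toy motif are all assumptions chosen for illustration. Each allele is placed between the same flanking sequences, every window overlapping the variant is scored against the matrix, and the best score per allele is compared.

```python
import math

BASES = "ACGT"

def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into log-odds weights."""
    matrix = []
    for column in counts:
        total = sum(column[b] for b in BASES) + 4 * pseudocount
        matrix.append({
            b: math.log(((column[b] + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def best_score(sequence, matrix):
    """Highest matrix score over all windows of the sequence."""
    width = len(matrix)
    return max(
        sum(matrix[i][sequence[start + i]] for i in range(width))
        for start in range(len(sequence) - width + 1)
    )

def allele_score_change(flank5, ref, alt, flank3, matrix):
    """Best score with the alternate allele minus best score with the reference."""
    ref_score = best_score(flank5 + ref + flank3, matrix)
    alt_score = best_score(flank5 + alt + flank3, matrix)
    return alt_score - ref_score

# Hypothetical 3-bp motif that strongly prefers "TGA" (toy counts, not a real TF).
toy_counts = [
    {"A": 0, "C": 0, "G": 0, "T": 10},
    {"A": 0, "C": 0, "G": 10, "T": 0},
    {"A": 10, "C": 0, "G": 0, "T": 0},
]
pssm = log_odds_matrix(toy_counts)

# Flanks of motif width minus one ensure every scored window overlaps the variant.
delta = allele_score_change("CT", "G", "C", "AC", pssm)
print(f"score change (alt - ref): {delta:.2f}")
```

A negative change suggests the alternate allele weakens the predicted site and a positive one suggests it strengthens it. A real analysis would also scan the reverse strand and assess the significance of each score.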
Key Words
- Binding motifs
- CEU, Northern Europeans from Utah
- CRM, Cis-Regulatory Module
- GWAS, Genome Wide Association Studies
- LD, Linkage Disequilibrium
- MPRA, Massively Parallel Reporter Assays
- PSSM, Position Specific Scoring Matrix
- Position specific scoring matrix
- ROC, Receiver Operating Characteristic
- RSAT, Regulatory Sequence Analysis Tools
- Regulatory variants
- SNP, Single Nucleotide Polymorphism
- SNPs
- SOIs, SNPs of Interest
- TF, Transcription Factor
- TFBS, Transcription Factor Binding Site
- Transcription factors
- eQTL, Expression Quantitative Trait Loci
- rsID, Reference SNP Identifier
|
10
|
RSAT 2018: regulatory sequence analysis tools 20th anniversary. Nucleic Acids Res 2018; 46:W209-W214. [PMID: 29722874 PMCID: PMC6030903 DOI: 10.1093/nar/gky317] [Citation(s) in RCA: 127] [Impact Index Per Article: 25.4] [Received: 01/31/2018] [Accepted: 04/23/2018] [Indexed: 12/27/2022] Open
Abstract
RSAT (Regulatory Sequence Analysis Tools) is a suite of modular tools for the detection and the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, including from genome-wide datasets like ChIP-seq/ATAC-seq, (ii) motif scanning, (iii) motif analysis (quality assessment, comparisons and clustering), (iv) analysis of regulatory variations, (v) comparative genomics. Six public servers jointly support 10 000 genomes from all kingdoms. Six novel or refactored programs have been added since the 2015 NAR Web Software Issue, including updated programs to analyse regulatory variants (retrieve-variation-seq, variation-scan, convert-variations), along with tools to extract sequences from a list of coordinates (retrieve-seq-bed), to select motifs from motif collections (retrieve-matrix), and to extract orthologs based on Ensembl Compara (get-orthologs-compara). Three use cases illustrate the integration of new and refactored tools to the suite. This Anniversary update gives a 20-year perspective on the software suite. RSAT is well-documented and available through Web sites, SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services, virtual machines and stand-alone programs at http://www.rsat.eu/.
|
11
|
Cooperation between T cell receptor and Toll-like receptor 5 signaling for CD4+ T cell activation. Sci Signal 2019; 12:12/577/eaar3641. [PMID: 30992399 DOI: 10.1126/scisignal.aar3641] [Citation(s) in RCA: 27] [Impact Index Per Article: 5.4] [Indexed: 12/21/2022]
Abstract
CD4+ T cells recognize antigens through their T cell receptors (TCRs); however, additional signals involving costimulatory receptors, for example, CD28, are required for proper T cell activation. Alternative costimulatory receptors have been proposed, including members of the Toll-like receptor (TLR) family, such as TLR5 and TLR2. To understand the molecular mechanism underlying a potential costimulatory role for TLR5, we generated detailed molecular maps and logical models for the TCR and TLR5 signaling pathways and a merged model for cross-interactions between the two pathways. Furthermore, we validated the resulting model by analyzing how T cells responded to the activation of these pathways alone or in combination, in terms of the activation of the transcriptional regulators CREB, AP-1 (c-Jun), and NF-κB (p65). Our merged model accurately predicted the experimental results, showing that the activation of TLR5 can play a similar role to that of CD28 activation with respect to AP-1, CREB, and NF-κB activation, thereby providing insights regarding the cross-regulation of these pathways in CD4+ T cells.
|
12
|
Synthetic STARR-seq reveals how DNA shape and sequence modulate transcriptional output and noise. PLoS Genet 2018; 14:e1007793. [PMID: 30427832 PMCID: PMC6261644 DOI: 10.1371/journal.pgen.1007793] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 06/14/2018] [Revised: 11/28/2018] [Accepted: 10/26/2018] [Indexed: 12/29/2022] Open
Abstract
The binding of transcription factors to short recognition sequences plays a pivotal role in controlling the expression of genes. The sequence and shape characteristics of binding sites influence DNA binding specificity and have also been implicated in modulating the activity of transcription factors downstream of binding. To quantitatively assess the transcriptional activity of tens of thousands of designed synthetic sites in parallel, we developed a synthetic version of STARR-seq (synSTARR-seq). We used the approach to systematically analyze how variations in the recognition sequence of the glucocorticoid receptor (GR) affect transcriptional regulation. Our approach resulted in the identification of a novel highly active functional GR binding sequence and revealed that sequence variation both within and flanking GR’s core binding site can modulate GR activity without apparent changes in DNA binding affinity. Notably, we found that the sequence composition of variants with similar activity profiles was highly diverse. In contrast, groups of variants with similar activity profiles showed specific DNA shape characteristics indicating that DNA shape may be a better predictor of activity than DNA sequence. Finally, using single cell experiments with individual enhancer variants, we obtained clues indicating that the architecture of the response element can independently tune expression mean and cell-to-cell variability in gene expression (noise). Together, our studies establish synSTARR as a powerful method to systematically study how DNA sequence and shape modulate transcriptional output and noise.
The expression level of genes is controlled by transcription factors, which are proteins that bind to genomic response elements that contain their recognition DNA sequence. Importantly, genes are not simply turned on but need to be expressed at the right level. This is, at least in part, assured by the sequence composition of genomic response elements. Here, we studied how the recognition DNA sequence influences gene regulation by a transcription factor called the glucocorticoid receptor. Specifically, we developed a method to test the activity of variants in a highly parallelized setting where everything is kept identical except for the sequence of the binding site. The systematic analysis of tens of thousands of sequence variants facilitated the identification of a previously unknown sequence variant with high activity. Moreover, we report how sequence variation of the response element influences cell-to-cell variability in expression levels. Finally, we observe similar activity profiles for distinct sequence variants that share similar three-dimensional DNA shape characteristics, arguing that the three-dimensional perception of DNA by the glucocorticoid receptor modulates its activity towards individual target genes.
|
13
|
RSAT matrix-clustering: dynamic exploration and redundancy reduction of transcription factor binding motif collections. Nucleic Acids Res 2017; 45:e119. [PMID: 28591841 PMCID: PMC5737723 DOI: 10.1093/nar/gkx314] [Citation(s) in RCA: 62] [Impact Index Per Article: 8.9] [Received: 07/19/2016] [Accepted: 06/04/2017] [Indexed: 01/08/2023] Open
Abstract
Transcription factor (TF) databases contain multitudes of binding motifs (TFBMs) from various sources, from which non-redundant collections are derived by manual curation. The advent of high-throughput methods stimulated the production of novel collections with increasing numbers of motifs. Meta-databases, built by merging these collections, contain redundant versions, because available tools are not suited to automatically identify and explore biologically relevant clusters among thousands of motifs. Motif discovery from genome-scale data sets (e.g. ChIP-seq) also produces redundant motifs, hampering the interpretation of results. We present matrix-clustering, a versatile tool that clusters similar TFBMs into multiple trees, and automatically creates non-redundant TFBM collections. A feature unique to matrix-clustering is its dynamic visualisation of aligned TFBMs, and its capability to simultaneously treat multiple collections from various sources. We demonstrate that matrix-clustering considerably simplifies the interpretation of combined results from multiple motif discovery tools, and highlights biologically relevant variations of similar motifs. We also ran a large-scale application to cluster ∼11 000 motifs from 24 entire databases, showing that matrix-clustering correctly groups motifs belonging to the same TF families, and drastically reduces motif redundancy. matrix-clustering is integrated within the RSAT suite (http://rsat.eu/), accessible through a user-friendly web interface or the command line for its integration in pipelines.
|
14
|
Role of the chromatin landscape and sequence in determining cell type-specific genomic glucocorticoid receptor binding and gene regulation. Nucleic Acids Res 2017; 45:1805-1819. [PMID: 27903902 PMCID: PMC5389550 DOI: 10.1093/nar/gkw1163] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Received: 09/27/2016] [Revised: 11/03/2016] [Accepted: 11/08/2016] [Indexed: 01/18/2023] Open
Abstract
The genomic loci bound by the glucocorticoid receptor (GR), a hormone-activated transcription factor, show little overlap between cell types. To study the role of chromatin and sequence in specifying where GR binds, we used Bayesian modeling within the universe of accessible chromatin. Taken together, our results uncovered that although GR preferentially binds accessible chromatin, its binding is biased against accessible chromatin located at promoter regions. This bias can only be explained partially by the presence of fewer GR recognition sequences, arguing for the existence of additional mechanisms that interfere with GR binding at promoters. Therefore, we tested the role of H3K9ac, the chromatin feature with the strongest negative association with GR binding, but found that this correlation does not reflect a causative link. Finally, we find a higher percentage of promoter-proximal GR binding for genes regulated by GR across cell types than for cell type-specific target genes. Given that GR almost exclusively binds accessible chromatin, we propose that cell type-specific regulation by GR preferentially occurs via distal enhancers, whose chromatin accessibility is typically cell type-specific, whereas ubiquitous target gene regulation is more likely to result from binding to promoter regions, which are often accessible regardless of cell type examined.
|
15
|
Corrigendum: Sequences flanking the core-binding site modulate glucocorticoid receptor structure and activity. Nat Commun 2016; 7:13784. [PMID: 27873998 PMCID: PMC5121421 DOI: 10.1038/ncomms13784] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 11/25/2022] Open
|
16
|
Sequences flanking the core-binding site modulate glucocorticoid receptor structure and activity. Nat Commun 2016; 7:12621. [PMID: 27581526 PMCID: PMC5025757 DOI: 10.1038/ncomms12621] [Citation(s) in RCA: 38] [Impact Index Per Article: 4.8] [Received: 04/11/2016] [Accepted: 07/18/2016] [Indexed: 02/07/2023] Open
Abstract
The glucocorticoid receptor (GR) binds as a homodimer to genomic response elements, which have particular sequence and shape characteristics. Here we show that the nucleotides directly flanking the core-binding site differ depending on the strength of GR-dependent activation of nearby genes. Our study indicates that these flanking nucleotides change the three-dimensional structure of the DNA-binding site, the DNA-binding domain of GR and the quaternary structure of the dimeric complex. Functional studies in a defined genomic context show that sequence-induced changes in GR activity cannot be explained by differences in GR occupancy. Rather, mutating the dimerization interface mitigates DNA-induced changes in both activity and structure, arguing for a role of DNA-induced structural changes in modulating GR activity. Together, our study shows that DNA sequence identity of genomic binding sites modulates GR activity downstream of binding, which may play a role in achieving regulatory specificity towards individual target genes.
|
17
|
Identification and characterization of DNA sequences that prevent glucocorticoid receptor binding to nearby response elements. Nucleic Acids Res 2016; 44:6142-56. [PMID: 27016732 PMCID: PMC5291246 DOI: 10.1093/nar/gkw203] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 11/06/2015] [Accepted: 03/16/2016] [Indexed: 01/13/2023] Open
Abstract
Out of the myriad of potential DNA binding sites of the glucocorticoid receptor (GR) found in the human genome, only a cell-type specific minority is actually bound, indicating that the presence of a recognition sequence alone is insufficient to specify where GR binds. Cooperative interactions with other transcription factors (TFs) are known to contribute to binding specificity. Here, we reasoned that sequence signals preventing GR recruitment to certain loci provide an alternative means to confer specificity. Motif analyses uncovered candidate Negative Regulatory Sequences (NRSs) that interfere with genomic GR binding. Subsequent functional analyses demonstrated that NRSs indeed prevent GR binding to nearby response elements. We show that NRS activity is conserved across species and found in most tissues, and that NRSs also interfere with the genomic binding of other TFs. Interestingly, the effects of NRSs appear not to be a simple consequence of changes in chromatin accessibility. Instead, we find that NRSs interact with proteins found at sub-nuclear structures called paraspeckles and that these proteins might mediate the repressive effects of NRSs. Together, our studies suggest that the joint influence of positive and negative sequence signals partitions the genome into regions where GR can bind and those where it cannot.
|
18
|
Histone Chaperone SSRP1 is Essential for Wnt Signaling Pathway Activity During Osteoblast Differentiation. Stem Cells 2016; 34:1369-76. [PMID: 27146025 DOI: 10.1002/stem.2287] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 07/15/2015] [Accepted: 11/12/2015] [Indexed: 12/21/2022]
Abstract
Cellular differentiation is accompanied by dramatic changes in chromatin structure which direct the activation of lineage-specific transcriptional programs. Structure-specific recognition protein-1 (SSRP1) is a histone chaperone which is important for chromatin-associated processes such as transcription, DNA replication and repair. Since the function of SSRP1 during cell differentiation remains unclear, we investigated its potential role in controlling lineage determination. Depletion of SSRP1 in human mesenchymal stem cells elicited lineage-specific effects by increasing expression of adipocyte-specific genes and decreasing the expression of osteoblast-specific genes. Consistent with a role in controlling lineage specification, transcriptome-wide RNA-sequencing following SSRP1 depletion and the induction of osteoblast differentiation revealed a specific decrease in the expression of genes involved in biological processes related to osteoblast differentiation. Importantly, we observed a specific downregulation of target genes of the canonical Wnt signaling pathway, which was accompanied by decreased nuclear localization of active β-catenin. Together, our data uncover a previously unknown role for SSRP1 in promoting Wnt signaling pathway activity during cellular differentiation. Stem Cells 2016;34:1369-1376.
|
19
|
RSAT 2015: Regulatory Sequence Analysis Tools. Nucleic Acids Res 2015; 43:W50-6. [PMID: 25904632 PMCID: PMC4489296 DOI: 10.1093/nar/gkv362] [Citation(s) in RCA: 188] [Impact Index Per Article: 20.9] [Received: 02/06/2015] [Accepted: 04/07/2015] [Indexed: 11/13/2022] Open
Abstract
RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), and a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/.
|
20
|
ChIP-exo signal associated with DNA-binding motifs provides insight into the genomic binding of the glucocorticoid receptor and cooperating transcription factors. Genome Res 2015; 25:825-35. [PMID: 25720775 PMCID: PMC4448679 DOI: 10.1101/gr.185157.114] [Citation(s) in RCA: 102] [Impact Index Per Article: 11.3] [Received: 10/02/2014] [Accepted: 02/23/2015] [Indexed: 12/22/2022]
Abstract
The classical DNA recognition sequence of the glucocorticoid receptor (GR) appears to be present at only a fraction of bound genomic regions. To identify sequences responsible for recruitment of this transcription factor (TF) to individual loci, we turned to the high-resolution ChIP-exo approach. We exploited this signal by determining footprint profiles of TF binding at single-base-pair resolution using ExoProfiler, a computational pipeline based on DNA binding motifs. When applied to our GR and the few available public ChIP-exo data sets, we find that ChIP-exo footprints are protein- and recognition sequence-specific signatures of genomic TF association. Furthermore, we show that ChIP-exo captures information about TFs other than the one directly targeted by the antibody in the ChIP procedure. Consequently, the shape of the ChIP-exo footprint can be used to discriminate between direct and indirect (tethering to other DNA-bound proteins) DNA association of GR. Together, our findings indicate that the absence of classical recognition sequences can be explained by direct GR binding to a broader spectrum of sequences than previously known, either as a homodimer or as a heterodimer binding together with a member of the ETS or TEAD families of TFs, or alternatively by indirect recruitment via FOX or STAT proteins. ChIP-exo footprints also bring structural insights and locate DNA:protein cross-link points that are compatible with crystal structures of the studied TFs. Overall, our generically applicable footprint-based approach uncovers new structural and functional insights into the diverse ways of genomic cooperation and association of TFs.
|
21
|
Abstract
Despite tremendous body form diversity in nature, bilaterian animals share common sets of developmental genes that display conserved expression patterns in the embryo. Among them are the Hox genes, which define different identities along the anterior–posterior axis. Hox proteins exert their function by interaction with TALE transcription factors. Hox and TALE members are also present in some but not all non-bilaterian phyla, raising the question of how Hox–TALE interactions evolved to provide positional information. By using proteins from unicellular and multicellular lineages, we showed that these networks emerged from an ancestral generic motif present in Hox and other related protein families. Interestingly, Hox-TALE networks experienced additional and extensive molecular innovations that were likely crucial for differentiating Hox functions along body plans. Together our results highlight how homeobox gene families evolved during eukaryote evolution to eventually constitute a major patterning system in Eumetazoans. DOI:http://dx.doi.org/10.7554/eLife.01939.001
Any animal with a body that is symmetric about an imaginary line that runs from its head to its tail is known as a bilaterian. Humans and most animals are bilateral, whereas jellyfish and starfish are not. Bilateral symmetry can take many forms—as demonstrated by the differences between flies, frogs and humans—but all bilaterians express many of the same genes during development. One of these groups of genes is known as the Hox family. The expression of specific Hox genes at specific times instructs cells in the developing embryo to adopt different fates according to their position along the anterior–posterior (head to tail) axis. The patterning function of Hox genes relies on the presence of two additional cofactors that belong to the so-called TALE family. 
Although both Hox and TALE proteins were present early on during animal evolution, it is unclear how and when the interactions between them first began to generate symmetrical body plans. Now, Hudry et al. have provided insights into the origin of the Hox-TALE network by analysing the expression and molecular properties of Hox and TALE proteins from various multicellular and unicellular organisms. These experiments revealed that Hox and TALE proteins of the sea anemone Nematostella, which belongs to a group of animals called cnidarians that have radial rather than bilateral symmetry, interact with one another in a similar manner to the interactions seen in bilaterians. Hudry et al. then showed that two Nematostella Hox genes were able to substitute for their bilaterian equivalents in fruit flies, and that a Nematostella TALE gene was able to take over neuronal functions of its equivalent in Xenopus frogs. This striking conservation of function between species suggests that Hox and TALE genes were already working together in the common ancestor of all bilaterian and cnidarian animals. By contrast, TALE members from a unicellular amoeba were unable to interact with Hox proteins, suggesting that Hox–TALE interactions first emerged in multicellular animals. In addition to increasing our knowledge of highly conserved Hox signalling, these data provide insight into the molecular mechanisms that gave rise to the symmetrical body plan that has been adopted, and adapted, by the majority of animals since. DOI:http://dx.doi.org/10.7554/eLife.01939.002
|
22
|
Abstract
ChIP-seq is increasingly used to characterize transcription factor binding and chromatin marks at a genomic scale. Various tools are now available to extract binding motifs from peak data sets. However, most approaches are only available as command-line programs, or via a website but with size restrictions. We present peak-motifs, a computational pipeline that discovers motifs in peak sequences, compares them with databases, exports putative binding sites for visualization in the UCSC genome browser and generates an extensive report suited for both naive and expert users. It relies on time- and memory-efficient algorithms enabling the treatment of several thousand peaks within minutes. Regarding time efficiency, peak-motifs outperforms all comparable tools by several orders of magnitude. We demonstrate its accuracy by analyzing data sets ranging from 4000 to 128 000 peaks for 12 embryonic stem cell-specific transcription factors. In all cases, the program finds the expected motifs and returns additional motifs potentially bound by cofactors. We further apply peak-motifs to discover tissue-specific motifs in peak collections for the p300 transcriptional co-activator. To our knowledge, peak-motifs is the only tool that performs a complete motif analysis and offers a user-friendly web interface without any restriction on sequence size or number of peaks.
|
23
|
Abstract
RSAT (Regulatory Sequence Analysis Tools) comprises a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. Thirteen new programs have been added to the 30 described in the 2008 NAR Web Software Issue, including an automated sequence retrieval from EnsEMBL (retrieve-ensembl-seq), two novel motif discovery algorithms (oligo-diff and info-gibbs), a 100-times faster version of matrix-scan enabling the scanning of genome-scale sequence sets, and a series of facilities for random model generation and statistical evaluation (random-genome-fragments, random-motifs, random-sites, implant-sites, sequence-probability, permute-matrix). Our most recent work also focused on motif comparison (compare-matrices) and evaluation of motif quality (matrix-quality) by combining theoretical and empirical measures to assess the predictive capability of position-specific scoring matrices. To process large collections of peak sequences obtained from ChIP-seq or related technologies, RSAT provides a new program (peak-motifs) that combines several efficient motif discovery algorithms to predict transcription factor binding motifs, match them against motif databases and predict their binding sites. Availability (web site, stand-alone programs and SOAP/WSDL (Simple Object Access Protocol/Web Services Description Language) web services): http://rsat.ulb.ac.be/rsat/.
|
24
|
Transcription factor binding predictions using TRAP for the analysis of ChIP-seq data and regulatory SNPs. Nat Protoc 2011; 6:1860-9. [PMID: 22051799 DOI: 10.1038/nprot.2011.409] [Citation(s) in RCA: 168] [Impact Index Per Article: 12.9] [Indexed: 11/09/2022]
Abstract
The transcription factor affinity prediction (TRAP) method calculates the affinity of transcription factors for DNA sequences on the basis of a biophysical model. This method has proven to be useful for several applications, including for determining the putative target genes of a given factor. This protocol covers two other applications: (i) determining which transcription factors have the highest affinity in a set of sequences (illustrated with chromatin immunoprecipitation-sequencing (ChIP-seq) peaks), and (ii) finding which factor is the most affected by a regulatory single-nucleotide polymorphism. The protocol describes how to use the TRAP web tools to address these questions, and it also presents a way to run TRAP on random control sequences to better estimate the significance of the results. All of the tools are fully available online and do not need any additional installation. The complete protocol takes about 45 min, but each individual tool runs in a few minutes.
|
25
|
Abstract
Position-specific scoring matrices (PSSMs) are routinely used to predict transcription factor (TF)-binding sites in genome sequences. However, their reliability to predict novel binding sites can be far from optimum, due to the use of a small number of training sites or the inappropriate choice of parameters when building the matrix or when scanning sequences with it. Measures of matrix quality such as E-value and information content rely on theoretical models, and may fail in the context of full genome sequences. We propose a method, implemented in the program ‘matrix-quality’, that combines theoretical and empirical score distributions to assess reliability of PSSMs for predicting TF-binding sites. We applied ‘matrix-quality’ to estimate the predictive capacity of matrices for bacterial, yeast and mouse TFs. The evaluation of matrices from RegulonDB revealed some poorly predictive motifs, and allowed us to quantify the improvements obtained by applying multi-genome motif discovery. Interestingly, the method reveals differences between global and specific regulators. It also highlights the enrichment of binding sites in sequence sets obtained from high-throughput ChIP-chip (bacterial and yeast TFs) and ChIP–seq experiments (mouse TFs). The method presented here has many applications, including: selecting reliable motifs before scanning sequences; improving motif collections in TFs databases; evaluating motifs discovered using high-throughput data sets.
|
26
|
A non-tree-based comprehensive study of metazoan Hox and ParaHox genes prompts new insights into their origin and evolution. BMC Evol Biol 2010; 10:73. [PMID: 20222951 PMCID: PMC2842273 DOI: 10.1186/1471-2148-10-73] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.6] [Received: 10/11/2009] [Accepted: 03/11/2010] [Indexed: 12/17/2022] Open
Abstract
BACKGROUND Hox and the closely-related ParaHox genes, which emerged prior to the divergence between cnidarians and bilaterians, are the most well-known members of the ancient genetic toolkit that controls embryonic development across all metazoans. Fundamental questions relative to their origin and evolutionary relationships remain, however, unresolved. We investigate here the evolution of metazoan Hox and ParaHox genes using the HoxPred program that allows the identification of Hox genes without the need of phylogenetic tree reconstructions. RESULTS We show that HoxPred provides an efficient and accurate classification of Hox and ParaHox genes in their respective homology groups, including Hox paralogous groups (PGs). We analyzed more than 10,000 sequences from 310 metazoan species, from 6 genome projects and the complete UniProtKB database. The HoxPred program and all results arranged in the Datab'Hox database are freely available at http://cege.vub.ac.be/hoxpred/. Results for the genome-scale studies are coherent with previous studies, and also bring knowledge on the Hox repertoire and clusters for newly-sequenced species. The unprecedented scale of this study and the use of a non-tree-based approach allow unresolved key questions about Hox and ParaHox genes evolution to be addressed. CONCLUSIONS Our analysis suggests that the presence of a single type of Posterior Hox genes (PG9-like) is ancestral to bilaterians, and that new Posterior PGs would have arisen in deuterostomes through independent gene duplications. Four types of Central genes would also be ancestral to bilaterians, with two of them, PG6- and PG7-like, giving rise in protostomes to the UbdA- and ftz/Antp/Lox5-type genes, respectively. A fifth type of Central genes (PG8) would have emerged in the vertebrate lineage. Our results also suggest the presence of Anterior (PG1 and PG3), Central and Posterior Hox genes in the cnidarians, supporting an ancestral four-gene Hox cluster. 
In addition, our data support the relationship of the bilaterian ParaHox genes Gsx and Xlox with PG3, and Cdx with the Central genes. Our study therefore indicates three possible models for the origin of Hox and ParaHox in early metazoans, a two-gene (Anterior/PG3--Central/Posterior), a three-gene (Anterior/PG1, Anterior/PG3 and Central/Posterior), or a four-gene (Anterior/PG1--Anterior/PG3--Central--Posterior) ProtoHox cluster.
|
27
|
Retrieve-ensembl-seq: user-friendly and large-scale retrieval of single or multi-genome sequences from Ensembl. Bioinformatics 2009; 25:2739-40. [PMID: 19720677 DOI: 10.1093/bioinformatics/btp519] [Citation(s) in RCA: 11] [Impact Index Per Article: 0.7] [Indexed: 11/13/2022] Open
Abstract
The preparation of an appropriate sequence dataset is the starting point of all genomic analyses. We present retrieve-ensembl-seq, an application that considerably eases the retrieval of sequences from the Ensembl database, via our user-friendly web site or web services. The user provides Ensembl identifiers or gene names, and the program returns corresponding upstream, downstream, intronic, exonic, UTR or whole gene sequences. retrieve-ensembl-seq also offers a multiple organism mode to retrieve sequences from homologous genes at any taxonomical level. We also introduce various original filters such as the masking of coding fragments and the avoidance of sequence redundancy for genes with multiple transcripts. retrieve-ensembl-seq is included in the software suite regulatory sequence analysis tools (RSAT), allowing instant submission of retrieved sequences to further analysis tools. AVAILABILITY: retrieve-ensembl-seq is integrated in the RSAT suite: http://rsat.ulb.ac.be/rsat. Web site: http://rsat.ulb.ac.be/rsat/retrieve-ensembl-seq_form.cgi. Web services: http://rsat.ulb.ac.be/rsat/web_services/RSATWS.wsdl. Stand-alone distribution: freely available under an academic licence to download from the RSAT web site. The complete manual, a convenient tutorial and demos are available from the RSAT website. Additional help can be found on the RSAT public forum.
|
28
|
Using RSAT to scan genome sequences for transcription factor binding sites and cis-regulatory modules. Nat Protoc 2008; 3:1578-88. [PMID: 18802439 DOI: 10.1038/nprot.2008.97] [Citation(s) in RCA: 197] [Impact Index Per Article: 12.3] [Indexed: 02/01/2023]
Abstract
This protocol shows how to detect putative cis-regulatory elements and regions enriched in such elements with the regulatory sequence analysis tools (RSAT) web server (http://rsat.ulb.ac.be/rsat/). The approach applies to known transcription factors, whose binding specificity is represented by position-specific scoring matrices, using the program matrix-scan. The detection of individual binding sites is known to return many false predictions. However, results can be strongly improved by estimating P value, and by searching for combinations of sites (homotypic and heterotypic models). We illustrate the detection of sites and enriched regions with a study case, the upstream sequence of the Drosophila melanogaster gene even-skipped. This protocol is also tested on random control sequences to evaluate the reliability of the predictions. Each task requires a few minutes of computation time on the server. The complete protocol can be executed in about one hour.
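As a rough illustration of what scanning a sequence with a position-specific scoring matrix involves, the sketch below scores every window of a sequence with log-odds weights against a Bernoulli background. It is not the matrix-scan implementation: the count matrix, the uniform background, the pseudocount and the score threshold are all hypothetical values chosen for the example, and the sketch uses a plain score cutoff rather than the P value estimation the protocol recommends.

```python
import math

# Hypothetical count matrix (one dict per motif position); illustrative only.
COUNTS = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 9, "G": 0, "T": 1},
    {"A": 1, "C": 0, "G": 9, "T": 0},
    {"A": 0, "C": 1, "G": 0, "T": 9},
]
# Assumed Bernoulli (position-independent) background with equal base frequencies.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def build_pssm(counts, background, pseudo=1.0):
    """Convert counts to natural-log odds weights.

    The pseudocount is distributed across bases in proportion to the background.
    """
    pssm = []
    for column in counts:
        total = sum(column.values()) + pseudo
        pssm.append({
            base: math.log(
                (column[base] + pseudo * background[base]) / total / background[base]
            )
            for base in "ACGT"
        })
    return pssm


def scan(sequence, pssm, threshold):
    """Return (start, site, score) for each window scoring at or above the threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        site = sequence[start:start + width]
        if any(base not in "ACGT" for base in site):
            continue  # skip windows containing ambiguous characters such as N
        score = sum(pssm[i][base] for i, base in enumerate(site))
        if score >= threshold:
            hits.append((start, site, score))
    return hits


pssm = build_pssm(COUNTS, BACKGROUND)
hits = scan("TTACGTGGACGTAA", pssm, threshold=3.0)
for start, site, score in hits:
    print(start, site, round(score, 2))
```

Only the forward strand is scanned here; a fuller sketch would also score the reverse complement of each window.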
|
29
|
Abstract
The regulatory sequence analysis tools (RSAT, http://rsat.ulb.ac.be/rsat/) is a software suite that integrates a wide collection of modular tools for the detection of cis-regulatory elements in genome sequences. The suite includes programs for sequence retrieval, pattern discovery, phylogenetic footprint detection, pattern matching, genome scanning and feature map drawing. Random controls can be performed with random gene selections or by generating random sequences according to a variety of background models (Bernoulli, Markov). Beyond the original word-based pattern-discovery tools (oligo-analysis and dyad-analysis), we recently added a battery of tools for matrix-based detection of cis-acting elements, with some original features (adaptive background models, Markov-chain estimation of P-values) that do not exist in other matrix-based scanning tools. The web server offers an intuitive interface, where each program can be accessed either separately or connected to the other tools. In addition, the tools are now available as web services, enabling their integration in programmatic workflows. Genomes are regularly updated from various genome repositories (NCBI and EnsEMBL) and 682 organisms are currently supported. Since 1998, the tools have been used by several hundreds of researchers from all over the world. Several predictions made with RSAT were validated experimentally and published.
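To make the idea of random controls drawn from a Markov background model concrete, here is a minimal sketch that estimates first-order transition probabilities from training sequences and then generates a random control sequence. The function names, training sequences and pseudocount are hypothetical and chosen for illustration; this is not the code of any RSAT program.

```python
import random


def train_markov1(sequences, pseudo=1.0):
    """Estimate first-order transition probabilities P(next | previous).

    Assumes sequences contain only A, C, G and T; a pseudocount avoids zero probabilities.
    """
    counts = {prev: {nxt: pseudo for nxt in "ACGT"} for prev in "ACGT"}
    for seq in sequences:
        for prev, nxt in zip(seq, seq[1:]):
            counts[prev][nxt] += 1
    return {
        prev: {nxt: c / sum(row.values()) for nxt, c in row.items()}
        for prev, row in counts.items()
    }


def random_sequence(model, length, rng):
    """Generate a sequence of the given length (>= 1).

    The first base is drawn uniformly; each later base follows the Markov transitions.
    """
    seq = [rng.choice("ACGT")]
    while len(seq) < length:
        row = model[seq[-1]]
        seq.append(rng.choices("ACGT", weights=[row[b] for b in "ACGT"])[0])
    return "".join(seq)


# Hypothetical training sequences standing in for genomic background sequences.
model = train_markov1(["ACGTACGTTTGACA", "GGGTACCA"])
control = random_sequence(model, 50, random.Random(0))
print(control)
```

A Bernoulli model is the order-0 special case, in which each base is drawn independently of the preceding one.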
|
30
|
Comparative phylogenomic analyses of teleost fish Hox gene clusters: lessons from the cichlid fish Astatotilapia burtoni: comment. BMC Genomics 2008; 9:35. [PMID: 18218066 PMCID: PMC2246111 DOI: 10.1186/1471-2164-9-35] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.6] [Received: 10/09/2007] [Accepted: 01/24/2008] [Indexed: 11/10/2022] Open
Abstract
A reanalysis of the sequences reported by Hoegg et al. has highlighted the presence of a putative HoxC1a gene in Astatotilapia burtoni. We discuss the evolutionary history of the HoxC1a gene in the teleost fish lineages and suggest that the HoxC1a gene was lost twice independently in the Neoteleosts. This comment points out that combining several gene-finding methods and a Hox-dedicated program can improve the identification of Hox genes.
|