1
Masuda LHP, Sabino AU, Reinitz J, Ramos AF, Machado-Lima A, Andrioli LP. Global repression by tailless during segmentation. Dev Biol 2024; 505:11-23. [PMID: 37879494 PMCID: PMC10949167 DOI: 10.1016/j.ydbio.2023.09.014] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/30/2022] [Revised: 09/26/2023] [Accepted: 09/28/2023] [Indexed: 10/27/2023]
Abstract
The orphan nuclear receptor Tailless (Tll) exhibits conserved roles in brain formation and maintenance that are shared, for example, with vertebrate orthologous forms (Tlx). However, the early expression of tll in two gap domains in the segmentation cascade of Drosophila is unusual even for most other insects. Here we investigate tll regulation on pair-rule stripes. With ectopic misexpression of tll we detected unexpected repression of almost all pair-rule stripes of hairy (h), even-skipped (eve), runt (run), and fushi-tarazu (ftz). Examining Tll embryonic ChIP-chip data with regions mapped as Cis-Regulatory Modules (CRMs) of pair-rule stripes, we verified Tll interactions with these regions. With the ChIP-chip data we also verified Tll interactions with the CRMs of gap domains and, in the misexpression assay, Tll-mediated repression of Kruppel (Kr), knirps (kni) and giant (gt) according to their differential sensitivity to Tll. These results with gap genes confirmed previous data from the literature and argue against indirect repression roles of Tll in the striped pattern. Moreover, the prediction of Tll binding sites in the CRMs of eve stripes and the mathematical modeling of their removal using an experimentally validated theoretical framework show effects on eve stripes compatible with the absence of a repressor binding to the CRMs. In addition, modeling increased tll levels in the embryo results in the differential repression of eve stripes, agreeing well with the results of the misexpression assay. In genetic assays we investigated eve 5, which is strongly repressed by the ectopic domain and representative of more central stripes not previously implicated as being under direct regulation of tll. While this stripe is little affected in tll-, its posterior border is expanded in gt- but detected with even greater expansion in gt-;tll-. We conclude by discussing key roles for tll in combinatorial repression mechanisms that contain the expression of medial patterns of the segmentation cascade in the extremities of the embryo.
Affiliation(s)
- Alan Utsuni Sabino
- Departamento de Radiologia e Oncologia, Instituto do Câncer do Estado de São Paulo, Hospital das Clínicas, Faculdade de Medicina, Universidade de São Paulo, São Paulo, Brazil
- John Reinitz
- Departments of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL, USA
- Ariane Machado-Lima
- Escola de Artes, Ciências e Humanidades da Universidade de São Paulo, São Paulo, Brazil
- Luiz Paulo Andrioli
- Escola de Artes, Ciências e Humanidades da Universidade de São Paulo, São Paulo, Brazil.
2
Ling L, Mühling B, Jaenichen R, Gompel N. Increased chromatin accessibility promotes the evolution of a transcriptional silencer in Drosophila. Science Advances 2023; 9:eade6529. [PMID: 36800429 PMCID: PMC9937571 DOI: 10.1126/sciadv.ade6529] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/30/2022] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
The loss of discrete morphological traits, the most common evolutionary transition, is typically driven by changes in developmental gene expression. Mutations accumulating in regulatory elements of these genes can disrupt DNA binding sites for transcription factors patterning their spatial expression, or delete entire enhancers. Regulatory elements, however, may be silenced through changes in chromatin accessibility or the emergence of repressive elements. Here, we show that increased chromatin accessibility at the gene yellow, combined with the gain of a repressor site, underlies the loss of a wing spot pigmentation pattern in a Drosophila species. The gain of accessibility of this repressive element is regulated by E93, a transcription factor governing the progress of metamorphosis. This convoluted evolutionary scenario contrasts with the parsimonious mutational paths generally envisioned and often documented for morphological losses. It illustrates how evolutionary changes in chromatin accessibility may directly contribute to morphological diversification.
3
REDfly: An Integrated Knowledgebase for Insect Regulatory Genomics. Insects 2022; 13:618. [PMID: 35886794 PMCID: PMC9323752 DOI: 10.3390/insects13070618] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 06/16/2022] [Revised: 07/01/2022] [Accepted: 07/06/2022] [Indexed: 11/29/2022]
Abstract
Simple Summary: Understanding how genes are regulated is a vital area of current biological research and a crucial adjunct to ongoing efforts to sequence entire genomes. Knowing the DNA sequences responsible for gene regulation, namely transcriptional cis-regulatory modules (CRMs, e.g., “enhancers”) and transcription factor binding sites (TFBSs), is important for many areas of research including interpretation and validation of data developed by large-scale genomics projects, providing training data for machine-learning CRM-discovery methods, genome annotation, modeling gene-regulatory networks, studying the evolution of gene regulation, and numerous aspects of the basic biology of transcriptional regulation. Knowledge of insect CRMs is also an important step in developing biotechnology methods for control of insect disease vectors and for eliminating pathogen transmission. The REDfly (Regulatory Element Database for Fly) database integrates all of the available insect cis-regulatory information from multiple sources to provide a comprehensive collection of known regulatory elements. In this paper, we describe REDfly’s basic contents and data model, emphasizing recently added features, and provide illustrated walk-throughs of some common search scenarios.
Abstract: We provide here an updated description of the REDfly (Regulatory Element Database for Fly) database of transcriptional regulatory elements, a unique resource that provides regulatory annotation for the genome of Drosophila and other insects. The genomic sequences regulating insect gene expression, namely transcriptional cis-regulatory modules (CRMs, e.g., “enhancers”) and transcription factor binding sites (TFBSs), are not currently curated by any other major database resources. However, knowledge of such sequences is important, as CRMs play critical roles with respect to disease as well as normal development, phenotypic variation, and evolution. Characterized CRMs also provide useful tools for both basic and applied research, including developing methods for insect control. REDfly, which is the most detailed existing platform for metazoan regulatory-element annotation, includes over 40,000 experimentally verified CRMs and TFBSs along with their DNA sequences, their associated genes, and the expression patterns they direct. Here, we briefly describe REDfly’s contents and data model, with an emphasis on the new features implemented since 2020. We then provide an illustrated walk-through of several common REDfly search use cases.
4
Cai X, Rondeel I, Baumgartner S. Modulating the bicoid gradient in space and time. Hereditas 2021; 158:29. [PMID: 34404481 PMCID: PMC8371787 DOI: 10.1186/s41065-021-00192-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/28/2021] [Accepted: 07/19/2021] [Indexed: 11/15/2022]
Abstract
Background: The formation of the Bicoid (Bcd) gradient in the early Drosophila embryo is one of the most fascinating observations in biology and serves as a paradigm for gradient formation, yet its mechanism is still not fully understood. Two distinct models were proposed in the past, the SDD and the ARTS model. Results: We define novel cis- and trans-acting factors that are indispensable for gradient formation. The first one is the poly A tail length of the bcd mRNA, which we demonstrate changes not only in time, but also in space. We show that posterior bcd mRNAs possess a longer poly A tail than anterior ones and that this elongation is likely mediated by wispy (wisp), a poly A polymerase. Consequently, modulating the activity of Wisp results in changes of the Bcd gradient, in the control of downstream targets such as the gap and pair-rule genes, and in the cuticular pattern. Attempts to modulate the Bcd gradient by subjecting the egg to an extra nuclear cycle, i.e. a 15th nuclear cycle by means of the maternal haploid (mh) mutation, showed no effect, neither on the appearance of the gradient nor on the control of downstream targets. This suggests that the segmental anlagen are determined during the first 14 nuclear cycles. Finally, we identify the Cyclin B (CycB) gene as a trans-acting factor that modulates the movement of Bcd such that Bcd is allowed to move through the interior of the egg. Conclusions: Our analysis demonstrates that Bcd gradient formation is far more complex than previously thought, requiring a revision of the models of how the gradient is formed.
Affiliation(s)
- Xiaoli Cai
- Department of Experimental Medical Sciences, Lund University, BMC D10, 22184, Lund, Sweden
- Inge Rondeel
- Department of Experimental Medical Sciences, Lund University, BMC D10, 22184, Lund, Sweden; present address: Hubrecht Institute, 3584 CT, Utrecht, The Netherlands
- Stefan Baumgartner
- Department of Experimental Medical Sciences, Lund University, BMC D10, 22184, Lund, Sweden; Department of Biology, University of Konstanz, 78457, Konstanz, Germany.
5
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
6
Jindal GA, Farley EK. Enhancer grammar in development, evolution, and disease: dependencies and interplay. Dev Cell 2021; 56:575-587. [PMID: 33689769 PMCID: PMC8462829 DOI: 10.1016/j.devcel.2021.02.016] [Citation(s) in RCA: 47] [Impact Index Per Article: 15.7] [Received: 10/13/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 12/19/2022]
Abstract
Each language has standard books describing that language's grammatical rules. Biologists have searched for similar, albeit more complex, principles relating enhancer sequence to gene expression. Here, we review the literature on enhancer grammar. We introduce dependency grammar, a model where enhancers encode information based on dependencies between enhancer features shaped by mechanistic, evolutionary, and biological constraints. Classifying enhancers based on the types of dependencies may identify unifying principles relating enhancer sequence to gene expression. Such rules would allow us to read the instructions for development within genomes and pinpoint causal enhancer variants underlying disease and evolutionary changes.
Affiliation(s)
- Granton A Jindal
- Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
- Emma K Farley
- Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA.
7
Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, Fropf R, McAnany C, Gagneur J, Kundaje A, Zeitlinger J. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet 2021; 53:354-366. [PMID: 33603233 PMCID: PMC8812996 DOI: 10.1038/s41588-021-00782-6] [Citation(s) in RCA: 225] [Impact Index Per Article: 75.0] [Received: 07/19/2020] [Accepted: 01/07/2021] [Indexed: 01/30/2023]
Abstract
The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
Affiliation(s)
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Garching, Germany; Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Munich, Germany; currently at DeepMind, London, UK
- Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Avanti Shrikumar
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Sabrina Krueger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Amr Alexandari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Khyati Dalal
- Stowers Institute for Medical Research, Kansas City, MO, USA; The University of Kansas Medical Center, Kansas City, KS, USA
- Robin Fropf
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Charles McAnany
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA; Department of Genetics, Stanford University, Stanford, CA, USA
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA; The University of Kansas Medical Center, Kansas City, KS, USA
8
Makashov AA, Myasnikova EM, Spirov AV. Fuzzy Linguistic Modeling of the Regulation of Drosophila Segmentation Genes. Biophysics (Nagoya-shi) 2021. [DOI: 10.1134/s0006350921010073] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
9
Veil M, Yampolsky LY, Grüning B, Onichtchouk D. Pou5f3, SoxB1, and Nanog remodel chromatin on high nucleosome affinity regions at zygotic genome activation. Genome Res 2019; 29:383-395. [PMID: 30674556 PMCID: PMC6396415 DOI: 10.1101/gr.240572.118] [Citation(s) in RCA: 29] [Impact Index Per Article: 5.8] [Received: 06/11/2018] [Accepted: 01/16/2019] [Indexed: 12/16/2022]
Abstract
The zebrafish embryo is transcriptionally mostly quiescent during the first 10 cell cycles, until the main wave of zygotic genome activation (ZGA) occurs, accompanied by fast chromatin remodeling. At ZGA, homologs of the mammalian stem cell transcription factors (TFs) Pou5f3, Nanog, and Sox19b bind to thousands of developmental enhancers to initiate transcription. So far, how these TFs influence chromatin dynamics at ZGA has remained unresolved. To address this question, we analyzed nucleosome positions in wild-type and maternal-zygotic (MZ) mutants for pou5f3 and nanog by MNase-seq. We show that Nanog, Sox19b, and Pou5f3 bind to the high nucleosome affinity regions (HNARs). HNARs are spanning over 600 bp, featuring high in vivo and predicted in vitro nucleosome occupancy and high predicted propeller twist DNA shape value. We suggest a two-step nucleosome destabilization-depletion model, in which the same intrinsic DNA properties of HNAR promote both high nucleosome occupancy and differential binding of TFs. In the first step, already before ZGA, Pou5f3 and Nanog destabilize nucleosomes at HNAR centers genome-wide. In the second step, post-ZGA, Nanog, Pou5f3, and SoxB1 maintain open chromatin state on the subset of HNARs, acting synergistically. Nanog binds to the HNAR center, whereas the Pou5f3 stabilizes the flanks. The HNAR model will provide a useful tool for genome regulatory studies in a variety of biological systems.
Affiliation(s)
- Marina Veil
- Department of Developmental Biology, Institute of Biology I, Faculty of Biology, Albert Ludwigs University of Freiburg, 79104, Freiburg, Germany
- Lev Y Yampolsky
- Department of Biological Sciences, East Tennessee State University, Johnson City, Tennessee 37614-1710, USA; Zoological Institute, Basel University, Basel, CH-4051 Switzerland
- Björn Grüning
- Department of Computer Science, Albert Ludwigs University of Freiburg, 79110, Freiburg, Germany; Center for Biological Systems Analysis (ZBSA), University of Freiburg, 79104, Freiburg, Germany
- Daria Onichtchouk
- Department of Developmental Biology, Institute of Biology I, Faculty of Biology, Albert Ludwigs University of Freiburg, 79104, Freiburg, Germany; Signalling Research centers BIOSS and CIBSS, 79104, Freiburg, Germany; Institute of Developmental Biology RAS, 119991 Moscow, Russia
10
Li L, Wunderlich Z. An Enhancer's Length and Composition Are Shaped by Its Regulatory Task. Front Genet 2017; 8:63. [PMID: 28588608 PMCID: PMC5440464 DOI: 10.3389/fgene.2017.00063] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 03/17/2017] [Accepted: 05/08/2017] [Indexed: 12/02/2022]
Abstract
Enhancers drive the gene expression patterns required for virtually every process in metazoans. We propose that enhancer length and transcription factor (TF) binding site composition (the number and identity of TF binding sites) reflect the complexity of the enhancer's regulatory task. In development, we define regulatory task complexity as the number of fates specified in a set of cells at once. We hypothesize that enhancers with more complex regulatory tasks will be longer, with more, but less specific, TF binding sites. Larger numbers of binding sites can be arranged in more ways, allowing enhancers to drive many distinct expression patterns, and therefore cell fates, using a finite number of TF inputs. We compare ~100 enhancers patterning the more complex anterior-posterior (AP) axis and the simpler dorsal-ventral (DV) axis in Drosophila and find that the AP enhancers are longer, with more, but less specific, binding sites than the DV enhancers. Using a set of ~3,500 enhancers, we find enhancer length and TF binding site number again increase with increasing regulatory task complexity. Therefore, to be broadly applicable, computational tools to study enhancers must account for differences in regulatory task.
Affiliation(s)
- Lily Li
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, United States
- Zeba Wunderlich
- Department of Developmental and Cell Biology, University of California, Irvine, Irvine, CA, United States
11
Crocker J, Tsai A, Stern DL. A Fully Synthetic Transcriptional Platform for a Multicellular Eukaryote. Cell Rep 2017; 18:287-296. [DOI: 10.1016/j.celrep.2016.12.025] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.0] [Received: 11/09/2015] [Revised: 12/14/2015] [Accepted: 12/07/2016] [Indexed: 01/12/2023]
12
Spirov AV, Myasnikova EM, Holloway DM. Sequential construction of a model for modular gene expression control, applied to spatial patterning of the Drosophila gene hunchback. J Bioinform Comput Biol 2016; 14:1641005. [DOI: 10.1142/s0219720016410055] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Indexed: 11/18/2022]
Abstract
Gene network simulations are increasingly used to quantify mutual gene regulation in biological tissues. These are generally based on linear interactions between single-entity regulatory and target genes. Biological genes, by contrast, commonly have multiple, partially independent, cis-regulatory modules (CRMs) for regulator binding, and can produce variant transcription and translation products. We present a modeling framework to address some of the gene regulatory dynamics implied by this biological complexity. Spatial patterning of the hunchback (hb) gene in Drosophila development involves control by three CRMs producing two distinct mRNA transcripts. We use this example to develop a differential equations model for transcription which takes into account the cis-regulatory architecture of the gene. Potential regulatory interactions are screened by a genetic algorithms (GAs) approach and compared to biological expression data.
Affiliation(s)
- Alexander V. Spirov
- Computer Science and CEWIT, SUNY Stony Brook, 1500 Stony Brook Road, Stony Brook, NY 11794, USA
- Lab Modeling of Evolution, I. M. Sechenov Institute of Evolutionary Physiology and Biochemistry, Russian Academy of Sciences, pr. Torez 44, St. Petersburg 194223, Russia
- Ekaterina M. Myasnikova
- Center for Advanced Studies, Peter the Great St. Petersburg Polytechnical University, 29 Polytechnicheskaya St. Petersburg 195251, Russia
- Department of Bioinformatics, Moscow Institute of Physics and Technology, 9 Institutskiy per., Dolgoprudny, Moscow 141700, Russia
- David M. Holloway
- Mathematics Department, British Columbia Institute of Technology, 3700 Willingdon Avenue, Burnaby, BC V5G 3H2, Canada
- Department of Biology, University of Victoria, Victoria, BC V8W 2Y2, Canada
13
Clifford J, Adami C. Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively. Phys Biol 2015; 12:056004. [PMID: 26331781 DOI: 10.1088/1478-3975/12/5/056004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
Transcription factor binding to the surface of DNA regulatory regions is one of the primary causes of regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
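As an illustration of the positional-independence assumption described in this abstract, the following minimal Python sketch builds a plain PWM from aligned sites and scores a sequence against a uniform background. It is not the conditional PWM model of the paper: the site sequences, the pseudocount, and the function names `build_pwm` and `log_odds` are all hypothetical and chosen only for illustration.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from equal-length aligned sites."""
    length = len(sites[0])
    pwm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases keep nonzero probability.
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in BASES})
    return pwm


def log_odds(pwm, seq, background=0.25):
    """Log2 likelihood ratio of seq under the PWM versus a uniform background.

    Positional independence means the joint probability is a product over
    positions, so the per-position log scores simply add.
    """
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(seq))


# Hypothetical toy alignment, not data from the paper.
toy_sites = ["GGAA", "GGAA", "GGAT", "GGTA"]
pwm = build_pwm(toy_sites)
print(log_odds(pwm, "GGAA"), log_odds(pwm, "CCCC"))
```

A conditional variant in the spirit of the abstract would, under the same assumptions, build one such matrix per pool of sites that share a flanking-sequence feature (for example, presence or absence of a nearby partner site) and score each candidate with the matrix for its pool.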
Affiliation(s)
- Jacob Clifford
- Department of Physics and Astronomy, Michigan State University, East Lansing, MI, USA; BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA
14
Grice J, Noyvert B, Doglio L, Elgar G. A Simple Predictive Enhancer Syntax for Hindbrain Patterning Is Conserved in Vertebrate Genomes. PLoS One 2015; 10:e0130413. [PMID: 26131856 PMCID: PMC4489388 DOI: 10.1371/journal.pone.0130413] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.1] [Received: 04/02/2015] [Accepted: 05/19/2015] [Indexed: 12/17/2022]
Abstract
Background: Determining the function of regulatory elements is fundamental for our understanding of development, disease and evolution. However, the sequence features that mediate these functions are often unclear and the prediction of tissue-specific expression patterns from sequence alone is non-trivial. Previous functional studies have demonstrated a link between PBX-HOX and MEIS/PREP binding interactions and hindbrain enhancer activity, but the defining grammar of these sites, if any exists, has remained elusive. Results: Here, we identify a shared sequence signature (syntax) within a heterogeneous set of conserved vertebrate hindbrain enhancers composed of spatially co-occurring PBX-HOX and MEIS/PREP transcription factor binding motifs. We use this syntax to accurately predict hindbrain enhancers in 89% of cases (67/75 predicted elements) from a set of conserved non-coding elements (CNEs). Furthermore, mutagenesis of the sites abolishes activity or generates ectopic expression, demonstrating their requirement for segmentally restricted enhancer activity in the hindbrain. We refine and use our syntax to predict over 3,000 hindbrain enhancers across the human genome. These sequences tend to be located near developmental transcription factors and are enriched in known hindbrain activating elements, demonstrating the predictive power of this simple model. Conclusion: Our findings support the theory that hundreds of CNEs, and perhaps thousands of regions across the human genome, function to coordinate gene expression in the developing hindbrain. We speculate that deeply conserved sequences of this kind contributed to the co-option of new genes into the hindbrain gene regulatory network during early vertebrate evolution by linking patterns of hox expression to downstream genes involved in segmentation and patterning, and evolutionarily newer instances may have continued to contribute to lineage-specific elaboration of the hindbrain.
Affiliation(s)
- Joseph Grice
- The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, NW7 1AA, United Kingdom
- Boris Noyvert
- The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, NW7 1AA, United Kingdom
- Laura Doglio
- The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, NW7 1AA, United Kingdom
- Greg Elgar
- The Francis Crick Institute Mill Hill Laboratory, The Ridgeway, Mill Hill, London, NW7 1AA, United Kingdom
15
Gordon KL, Arthur RK, Ruvinsky I. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence. PLoS Genet 2015; 11:e1005268. [PMID: 26020930 PMCID: PMC4447282 DOI: 10.1371/journal.pgen.1005268] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 10/15/2014] [Accepted: 05/09/2015] [Indexed: 11/28/2022]
Abstract
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.
To explore the phylogenetic limits of conservation of cis-regulatory elements, we used transgenesis to test the functions of enhancers of four genes from several species spanning the phylum Nematoda. While we found a striking degree of functional conservation among the examined cis elements, their DNA sequences lacked apparent conservation with the C. elegans orthologs. In fact, sequence similarity between C. elegans and the distantly related nematodes was no greater than would be expected by chance. Short motifs, similar to known regulatory sequences in C. elegans, can be detected in most of the cis elements. When tested, some of these sites appear to mediate regulatory function. However, they seem to have originated through motif turnover, rather than to have been preserved from a common ancestor. Our results suggest that gene regulatory networks are broadly conserved in the phylum Nematoda, but this conservation persists despite substantial reorganization of regulatory elements and could not be detected using naïve comparisons of sequence similarity.
Affiliation(s)
- Kacy L. Gordon
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
  - * E-mail: (KLG); (IR)
- Robert K. Arthur
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
  - Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
  - * E-mail: (KLG); (IR)
|
16
|
Abstract
Instructions for when, where and to what level each gene should be expressed are encoded within regulatory sequences. The importance of motifs recognized by DNA-binding regulators has long been known, but their extensive characterization afforded by recent technologies only partly accounts for how regulatory instructions are encoded in the genome. Here, we review recent advances in our understanding of regulatory sequences that influence transcription and go beyond the description of motifs. We discuss how understanding different aspects of the sequence-encoded regulation can help to unravel the genotype-phenotype relationship, which would lead to a more accurate and mechanistic interpretation of personal genome sequences.
Affiliation(s)
- Michal Levo
  - Department of Molecular Cell Biology, and Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
- Eran Segal
  - Department of Molecular Cell Biology, and Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
|
17
|
A comparison of midline and tracheal gene regulation during Drosophila development. PLoS One 2014; 9:e85518. [PMID: 24465586 PMCID: PMC3896416 DOI: 10.1371/journal.pone.0085518] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Received: 09/15/2013] [Accepted: 11/28/2013] [Indexed: 11/19/2022]
Abstract
Within the Drosophila embryo, two related bHLH-PAS proteins, Single-minded and Trachealess, control development of the central nervous system midline and the trachea, respectively. These two transcription factors independently form heterodimers with another bHLH-PAS protein, Tango. During early embryogenesis, expression of Single-minded is restricted to the midline and Trachealess to the trachea and salivary glands, whereas Tango is ubiquitously expressed. Both Single-minded/Tango and Trachealess/Tango heterodimers bind to the same DNA sequence, called the CNS midline element (CME), within cis-regulatory sequences of downstream target genes. While Single-minded/Tango and Trachealess/Tango activate some of the same genes in their respective tissues during embryogenesis, they also activate a number of different genes restricted to only certain tissues. The goal of this research is to understand how these two related heterodimers bind different enhancers to activate different genes, thereby regulating the development of functionally diverse tissues. Existing data indicate that Single-minded and Trachealess may bind to different co-factors restricted to various tissues, causing them to interact with the CME only within certain sequence contexts. This would lead to the activation of different target genes in different cell types. To understand how the context surrounding the CME is recognized by different bHLH-PAS heterodimers and their co-factors, we identified and analyzed novel enhancers that drive midline and/or tracheal expression and compared them to previously characterized enhancers. In addition, we tested expression of synthetic reporter genes containing the CME flanked by different sequences. Taken together, these experiments identify elements overrepresented within midline and tracheal enhancers and suggest that sequences immediately surrounding a CME help dictate whether a gene is expressed in the midline or trachea.
|
18
|
Jiang P, Singh M. CCAT: Combinatorial Code Analysis Tool for transcriptional regulation. Nucleic Acids Res 2013; 42:2833-47. [PMID: 24366875 PMCID: PMC3950699 DOI: 10.1093/nar/gkt1302] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.8] [Indexed: 01/08/2023]
Abstract
Combinatorial interplay among transcription factors (TFs) is an important mechanism by which transcriptional regulatory specificity is achieved. However, despite the increasing number of TFs for which either binding specificities or genome-wide occupancy data are known, knowledge about cooperativity between TFs remains limited. To address this, we developed a computational framework for predicting genome-wide co-binding between TFs (CCAT, Combinatorial Code Analysis Tool), and applied it to Drosophila melanogaster to uncover cooperativity among TFs during embryo development. Using publicly available TF binding specificity data and DNaseI chromatin accessibility data, we first predicted genome-wide binding sites for 324 TFs across five stages of D. melanogaster embryo development. We then applied CCAT in each of these developmental stages, and identified from 19 to 58 pairs of TFs in each stage whose predicted binding sites are significantly co-localized. We found that nearby binding sites for pairs of TFs predicted to cooperate were enriched in regions bound in relevant ChIP experiments, and were more evolutionarily conserved than other pairs. Further, we found that TFs tend to be co-localized with other TFs in a dynamic manner across developmental stages. All generated data as well as source code for our front-to-end pipeline are available at http://cat.princeton.edu.
Affiliation(s)
- Peng Jiang
  - Department of Computer Science, Princeton University, Princeton, 08540 NJ, USA and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544 NJ, USA
|
19
|
Duque T, Samee MAH, Kazemian M, Pham HN, Brodsky MH, Sinha S. Simulations of enhancer evolution provide mechanistic insights into gene regulation. Mol Biol Evol 2013; 31:184-200. [PMID: 24097306 PMCID: PMC3879441 DOI: 10.1093/molbev/mst170] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.3] [Indexed: 01/14/2023]
Abstract
There is growing interest in models of regulatory sequence evolution. However, existing models specifically designed for regulatory sequences consider the independent evolution of individual transcription factor (TF)-binding sites, ignoring that the function and evolution of a binding site depends on its context, typically the cis-regulatory module (CRM) in which the site is located. Moreover, existing models do not account for the gene-specific roles of TF-binding sites, primarily because their roles often are not well understood. We introduce two models of regulatory sequence evolution that address some of the shortcomings of existing models and implement simulation frameworks based on them. One model simulates the evolution of an individual binding site in the context of a CRM, while the other evolves an entire CRM. Both models use a state-of-the-art sequence-to-expression model to predict the effects of mutations on the regulatory output of the CRM and determine the strength of selection. We use the new framework to simulate the evolution of TF-binding sites in 37 well-studied CRMs belonging to the anterior-posterior patterning system in Drosophila embryos. We show that these simulations provide accurate fits to evolutionary data from 12 Drosophila genomes, which include statistics of binding site conservation on relatively short evolutionary scales and site loss across larger divergence times. The new framework allows us, for the first time, to test hypotheses regarding the underlying cis-regulatory code by directly comparing the evolutionary implications of the hypothesis with the observed evolutionary dynamics of binding sites. Using this capability, we find that explicitly modeling self-cooperative DNA binding by the TF Caudal (CAD) provides significantly better fits than an otherwise identical evolutionary simulation that lacks this mechanistic aspect. 
This hypothesis is further supported by a statistical analysis of the distribution of intersite spacing between adjacent CAD sites. Experimental tests confirm direct homodimeric interaction between CAD molecules as well as self-cooperative DNA binding by CAD. We note that computational modeling of the D. melanogaster CRMs alone did not yield significant evidence to support CAD self-cooperativity. We thus demonstrate how specific mechanistic details encoded in CRMs can be revealed by modeling their evolution and fitting such models to multispecies data.
Affiliation(s)
- Thyago Duque
  - Department of Computer Science, University of Illinois at Urbana-Champaign
|
20
|
Menoret D, Santolini M, Fernandes I, Spokony R, Zanet J, Gonzalez I, Latapie Y, Ferrer P, Rouault H, White KP, Besse P, Hakim V, Aerts S, Payre F, Plaza S. Genome-wide analyses of Shavenbaby target genes reveals distinct features of enhancer organization. Genome Biol 2013; 14:R86. [PMID: 23972280 PMCID: PMC4053989 DOI: 10.1186/gb-2013-14-8-r86] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 03/07/2013] [Accepted: 08/23/2013] [Indexed: 12/17/2022]
Abstract
Background: Developmental programs are implemented by regulatory interactions between Transcription Factors (TFs) and their target genes, which remain poorly understood. While recent studies have focused on regulatory cascades of TFs that govern early development, little is known about how the ultimate effectors of cell differentiation are selected and controlled. We addressed this question during late Drosophila embryogenesis, when the finely tuned expression of the TF Ovo/Shavenbaby (Svb) triggers the morphological differentiation of epidermal trichomes.
Results: We defined a sizeable set of genes downstream of Svb and used in vivo assays to delineate 14 enhancers driving their specific expression in trichome cells. Coupling computational modeling to functional dissection, we investigated the regulatory logic of these enhancers. Extending the repertoire of epidermal effectors using genome-wide approaches showed that the regulatory models learned from this first sample are representative of the whole set of trichome enhancers. These enhancers harbor remarkable features with respect to their functional architectures, including a weak or non-existent clustering of Svb binding sites. The in vivo function of each site relies on its intimate context, notably the flanking nucleotides. Two additional cis-regulatory motifs, present in a broad diversity of composition and positioning among trichome enhancers, critically contribute to enhancer activity.
Conclusions: Our results show that Svb directly regulates a large set of terminal effectors of the remodeling of epidermal cells. Further, these data reveal that trichome formation is underpinned by unexpectedly diverse modes of regulation, providing fresh insights into the functional architecture of enhancers governing a terminal differentiation program.
|
21
|
Kazemian M, Pham H, Wolfe SA, Brodsky MH, Sinha S. Widespread evidence of cooperative DNA binding by transcription factors in Drosophila development. Nucleic Acids Res 2013; 41:8237-52. [PMID: 23847101 PMCID: PMC3783179 DOI: 10.1093/nar/gkt598] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.9] [Indexed: 01/04/2023]
Abstract
Regulation of eukaryotic gene transcription is often combinatorial in nature, with multiple transcription factors (TFs) regulating common target genes, often through direct or indirect mutual interactions. Many individual examples of cooperative binding by directly interacting TFs have been identified, but it remains unclear how pervasive this mechanism is during animal development. Cooperative TF binding should be manifest in genomic sequences as biased arrangements of TF-binding sites. Here, we explore the extent and diversity of such arrangements related to gene regulation during Drosophila embryogenesis. We used the DNA-binding specificities of 322 TFs along with chromatin accessibility information to identify enriched spacing and orientation patterns of TF-binding site pairs. We developed a new statistical approach for this task, specifically designed to accurately assess inter-site spacing biases while accounting for the phenomenon of homotypic site clustering commonly observed in developmental regulatory regions. We observed a large number of short-range distance preferences between TF-binding site pairs, including examples where the preference depends on the relative orientation of the binding sites. To test whether these binding site patterns reflect physical interactions between the corresponding TFs, we analyzed 27 TF pairs whose binding sites exhibited short distance preferences. In vitro protein–protein binding experiments revealed that >65% of these TF pairs can directly interact with each other. For five pairs, we further demonstrate that they bind cooperatively to DNA if both sites are present with the preferred spacing. This study demonstrates how DNA-binding motifs can be used to produce a comprehensive map of sequence signatures for different mechanisms of combinatorial TF action.
Affiliation(s)
- Majid Kazemian
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA, Laboratory of Molecular Immunology and Immunology Center, National Heart Lung and Blood Institute, National Institutes of Health, MD, USA, Program in Gene Function and Expression, University of Massachusetts Medical School, MA, USA, Department of Biochemistry and Molecular Pharmacology University of Massachusetts Medical School, MA, USA, Department of Molecular Medicine, University of Massachusetts Medical School, MA, USA and Institute of Genomic Biology, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
22
|
Nelson AC, Wardle FC. Conserved non-coding elements and cis regulation: actions speak louder than words. Development 2013; 140:1385-95. [PMID: 23482485 DOI: 10.1242/dev.084459] [Citation(s) in RCA: 49] [Impact Index Per Article: 4.5] [Indexed: 01/09/2023]
Abstract
It is a truth (almost) universally acknowledged that conserved non-coding genomic sequences function in the cis regulation of neighbouring genes. But is this a misconception? The literature is strewn with examples of conserved non-coding sequences being able to drive reporter expression, but the extent to which such sequences are actually used endogenously in vivo is only now being rigorously explored using unbiased genome-scale approaches. Here, we review the emerging picture, examining the extent to which conserved non-coding sequences equivalently regulate gene expression in different species, or at different developmental stages, and how genomics approaches are revealing the relationship between sequence conservation and functional use of cis-regulatory elements.
Affiliation(s)
- Andrew C Nelson
  - Randall Division of Cell and Molecular Biophysics, New Hunt's House, King's College London, Guy's Campus, London SE1 1UL, UK.
|
23
|
Samee AH, Sinha S. Evaluating thermodynamic models of enhancer activity on cellular resolution gene expression data. Methods 2013; 62:79-90. [PMID: 23624421 DOI: 10.1016/j.ymeth.2013.03.005] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.5] [Received: 05/10/2012] [Accepted: 03/04/2013] [Indexed: 11/18/2022]
Abstract
With the advent of high-throughput sequencing and high-resolution transcriptomic technologies, there exists today an unprecedented opportunity to understand gene regulation at a quantitative level. State-of-the-art models of the relationship between regulatory sequence and gene expression have shown great promise, but also suffer from some major shortcomings. In this paper, we identify and address methodological challenges pertaining to quantitative modeling of gene expression from sequence, and test our models on the anterior-posterior patterning system in the Drosophila embryo. We first develop a framework to process cellular resolution three-dimensional gene expression data from the Drosophila embryo and create data sets on which quantitative models can be trained. Next we propose a new score, called 'weighted pattern generating potential' (w-PGP), to evaluate model predictions, and show its advantages over the two most common scoring schemes in use today. The model building exercise uses w-PGP as the evaluation score and adopts a systematic strategy to increase a model's complexity while guarding against over-fitting. Our model identifies three transcription factors (ZELDA, SLOPPY-PAIRED, and NUBBIN) that have not been previously incorporated in quantitative models of this system as having significant regulatory influence. Finally, we show how fitting quantitative models on data sets comprising a handful of enhancers, as reported in earlier work, may lead to unreliable models.
Affiliation(s)
- Abul Hassan Samee
  - Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, IL, USA
|
24
|
Cis-regulatory complexity within a large non-coding region in the Drosophila genome. PLoS One 2013; 8:e60137. [PMID: 23613719 PMCID: PMC3632565 DOI: 10.1371/journal.pone.0060137] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 11/29/2012] [Accepted: 02/21/2013] [Indexed: 11/22/2022]
Abstract
Analysis of cis-regulatory enhancers has revealed that they consist of clustered blocks of highly conserved sequences. Although most characterized enhancers reside near their target genes, a growing number of studies have shown that enhancers located over 50 kb from their minimal promoter(s) are required for appropriate gene expression and many of these ‘long-range’ enhancers are found in genomic regions that are devoid of identified exons. To gain insight into the complexity of Drosophila cis-regulatory sequences within exon-poor regions, we have undertaken an evolutionary analysis of 39 of these regions located throughout the genome. This survey revealed that within these genomic expanses, clusters of conserved sequence blocks (CSBs) are positioned once every 1.1 kb, on average, and that a typical cluster contains multiple (5 to 30 or more) CSBs that have been maintained for at least 190 My of evolutionary divergence. As an initial step toward assessing the cis-regulatory activity of conserved clusters within gene-free genomic expanses, we have tested the in-vivo enhancer activity of 19 consecutive CSB clusters located in the middle of a 115 kb gene-poor region on the 3rd chromosome. Our studies revealed that each cluster functions independently as a specific spatial/temporal enhancer. In total, the enhancers possess a diversity of regulatory functions, including dynamically activating expression in defined patterns within subsets of cells in discrete regions of the embryo, larvae and/or adult. We also observed that many of the enhancers are multifunctional–that is, they activate expression during multiple developmental stages. By extending these results to the rest of the Drosophila genome, which contains over 70,000 non-coding CSB clusters, we suggest that most function as enhancers.
|
25
|
Teif VB, Erdel F, Beshnova DA, Vainshtein Y, Mallm JP, Rippe K. Taking into account nucleosomes for predicting gene expression. Methods 2013; 62:26-38. [PMID: 23523656 DOI: 10.1016/j.ymeth.2013.03.011] [Citation(s) in RCA: 18] [Impact Index Per Article: 1.6] [Received: 05/14/2012] [Accepted: 03/10/2013] [Indexed: 01/10/2023]
Abstract
The eukaryotic genome is organized in a chain of nucleosomes that consist of 145-147 bp of DNA wrapped around a histone octamer protein core. Binding of transcription factors (TF) to nucleosomal DNA is frequently impeded, which makes it a challenging task to calculate TF occupancy at a given regulatory genomic site for predicting gene expression. Here, we review methods to calculate TF binding to DNA in the presence of nucleosomes. The main theoretical problems are (i) the computation speed that is becoming a bottleneck when partial unwrapping of DNA from the nucleosome is considered, (ii) the perturbation of the binding equilibrium by the activity of ATP-dependent chromatin remodelers, which translocate nucleosomes along the DNA, and (iii) the model parameterization from high-throughput sequencing data and fluorescence microscopy experiments in living cells. We discuss strategies that address these issues to efficiently compute transcription factor binding in chromatin.
Affiliation(s)
- Vladimir B Teif
  - Research Group Genome Organization & Function, Deutsches Krebsforschungszentrum-DKFZ & BioQuant, Im Neuenheimer Feld 280, 69120 Heidelberg, Germany.
|
26
|
Ha N, Polychronidou M, Lohmann I. COPS: detecting co-occurrence and spatial arrangement of transcription factor binding motifs in genome-wide datasets. PLoS One 2012; 7:e52055. [PMID: 23272209 PMCID: PMC3525548 DOI: 10.1371/journal.pone.0052055] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.1] [Received: 09/19/2012] [Accepted: 11/12/2012] [Indexed: 11/18/2022]
Abstract
In multi-cellular organisms, spatiotemporal activity of cis-regulatory DNA elements depends on their occupancy by different transcription factors (TFs). In recent years, genome-wide ChIP-on-Chip, ChIP-Seq and DamID assays have been extensively used to unravel the combinatorial interaction of TFs with cis-regulatory modules (CRMs) in the genome. Even though genome-wide binding profiles are increasingly becoming available for different TFs, single TF binding profiles are in most cases not sufficient for dissecting complex regulatory networks. Thus, potent computational tools detecting statistically significant and biologically relevant TF-motif co-occurrences in genome-wide datasets are essential for analyzing context-dependent transcriptional regulation. We have developed COPS (Co-Occurrence Pattern Search), a new bioinformatics tool based on a combination of association rules and Markov chain models, which detects co-occurring TF binding sites (BSs) on genomic regions of interest. COPS scans DNA sequences for frequent motif patterns using a Frequent-Pattern tree based data mining approach, which allows efficient performance of the software with respect to both data structure and implementation speed, in particular when mining large datasets. Since transcriptional gene regulation very often relies on the formation of regulatory protein complexes mediated by closely adjoining TF binding sites on CRMs, COPS additionally detects preferred short distance between co-occurring TF motifs. The performance of our software with respect to biological significance was evaluated using three published datasets containing genomic regions that are independently bound by several TFs involved in a defined biological process. In sum, COPS is a fast, efficient and user-friendly tool mining statistically and biologically significant TFBS co-occurrences and therefore allows the identification of TFs that combinatorially regulate gene expression.
Affiliation(s)
- Nati Ha
  - Centre for Organismal Studies (COS) Heidelberg, University of Heidelberg, Heidelberg and CellNetworks – Cluster of Excellence Germany, Heidelberg, Germany
- Maria Polychronidou
  - Centre for Organismal Studies (COS) Heidelberg, University of Heidelberg, Heidelberg and CellNetworks – Cluster of Excellence Germany, Heidelberg, Germany
- Ingrid Lohmann
  - Centre for Organismal Studies (COS) Heidelberg, University of Heidelberg, Heidelberg and CellNetworks – Cluster of Excellence Germany, Heidelberg, Germany
  - * E-mail:
|
27
|
Zondag L, Dearden PK, Wilson MJ. Deep sequencing and expression of microRNAs from early honeybee (Apis mellifera) embryos reveals a role in regulating early embryonic patterning. BMC Evol Biol 2012; 12:211. [PMID: 23121997 PMCID: PMC3562263 DOI: 10.1186/1471-2148-12-211] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 08/07/2012] [Accepted: 10/22/2012] [Indexed: 01/23/2023]
Abstract
Background: Recent evidence supports the proposal that the observed diversity of animal body plans has been produced through alterations to the complexity of the regulatory genome rather than increases in the protein-coding content of a genome. One significant form of gene regulation is the contribution made by the non-coding content of the genome. Non-coding RNAs play roles in embryonic development of animals and these functions might be expected to evolve rapidly. Using next-generation sequencing and in situ hybridization, we have examined the miRNA content of early honeybee embryos.
Results: Through small RNA sequencing we found that 28% of known miRNAs are expressed in the early embryo. We also identified developmentally expressed microRNAs that are unique to the Apoidea clade. Examination of expression patterns implied these miRNAs have roles in patterning the anterior-posterior and dorso-ventral axes as well as the extraembryonic membranes. Knockdown of Dicer, a key component of miRNA processing, confirmed that miRNAs are likely to have a role in patterning these tissues.
Conclusions: Examination of the expression patterns of novel miRNAs, some unique to the Apis group, indicated that they are likely to play a role in early honeybee development. Known miRNAs that are deeply conserved in animal phyla display differences in expression pattern between honeybee and Drosophila, particularly at early stages of development. This may indicate miRNAs play a rapidly evolving role in regulating developmental pathways, most likely through changes to the way their expression is regulated.
Affiliation(s)
- Lisa Zondag
  - Laboratory for Evolution and Development, Genetics Otago and National Research Centre for Growth and Development, Department of Biochemistry, University of Otago, PO Box 56, Dunedin 9054, New Zealand
|
28
|
Holmqvist PH, Boija A, Philip P, Crona F, Stenberg P, Mannervik M. Preferential genome targeting of the CBP co-activator by Rel and Smad proteins in early Drosophila melanogaster embryos. PLoS Genet 2012; 8:e1002769. [PMID: 22737084 PMCID: PMC3380834 DOI: 10.1371/journal.pgen.1002769] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 02/09/2012] [Accepted: 05/02/2012] [Indexed: 11/18/2022]
Abstract
CBP and the related p300 protein are widely used transcriptional co-activators in metazoans that interact with multiple transcription factors. Whether CBP/p300 occupies the genome equally with all factors or preferentially binds together with some factors is not known. We therefore compared Drosophila melanogaster CBP (nejire) ChIP-seq peaks with regions bound by 40 different transcription factors in early embryos, and we found high co-occupancy with the Rel-family protein Dorsal. Dorsal is required for CBP occupancy in the embryo, but only at regions where few other factors are present. CBP peaks in mutant embryos lacking nuclear Dorsal are best correlated with TGF-β/Dpp-signaling and Smad-protein binding. Differences in CBP occupancy in mutant embryos reflect gene expression changes genome-wide, but CBP also occupies some non-expressed genes. The presence of CBP at silent genes does not result in histone acetylation. We find that Polycomb-repressed H3K27me3 chromatin does not preclude CBP binding, but restricts histone acetylation at CBP-bound genomic sites. We conclude that CBP occupancy in Drosophila embryos preferentially overlaps factors controlling dorso-ventral patterning and that CBP binds silent genes without causing histone hyperacetylation.
Affiliation(s)
- Per-Henrik Holmqvist
  - The Wenner-Gren Institute, Developmental Biology, Stockholm University, Stockholm, Sweden
- Ann Boija
  - The Wenner-Gren Institute, Developmental Biology, Stockholm University, Stockholm, Sweden
- Philge Philip
  - Department of Molecular Biology, Umeå University, Umeå, Sweden
  - Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
- Filip Crona
  - The Wenner-Gren Institute, Developmental Biology, Stockholm University, Stockholm, Sweden
- Per Stenberg
  - Department of Molecular Biology, Umeå University, Umeå, Sweden
  - Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden
  - * E-mail: (MM); (PS)
- Mattias Mannervik
  - The Wenner-Gren Institute, Developmental Biology, Stockholm University, Stockholm, Sweden
  - * E-mail: (MM); (PS)
|
29
|
Shvartsman SY, Baker RE. Mathematical models of morphogen gradients and their effects on gene expression. Wiley Interdiscip Rev Dev Biol 2012; 1:715-30. [DOI: 10.1002/wdev.55] [Citation(s) in RCA: 32] [Impact Index Per Article: 2.7] [Indexed: 01/12/2023]
|
30
|
Nikulova AA, Favorov AV, Sutormin RA, Makeev VJ, Mironov AA. CORECLUST: identification of the conserved CRM grammar together with prediction of gene regulation. Nucleic Acids Res 2012; 40:e93. [PMID: 22422836 PMCID: PMC3384346 DOI: 10.1093/nar/gks235] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Indexed: 11/16/2022]
Abstract
Identification of transcriptional regulatory regions and tracing their internal organization are important for understanding the eukaryotic cell machinery. Cis-regulatory modules (CRMs) of higher eukaryotes are believed to possess a regulatory ‘grammar’, or preferred arrangement of binding sites, that is crucial for proper regulation and thus tends to be evolutionarily conserved. Here, we present a method CORECLUST (COnservative REgulatory CLUster STructure) that predicts CRMs based on a set of positional weight matrices. Given regulatory regions of orthologous and/or co-regulated genes, CORECLUST constructs a CRM model by revealing the conserved rules that describe the relative location of binding sites. The constructed model may be consequently used for the genome-wide prediction of similar CRMs, and thus detection of co-regulated genes, and for the investigation of the regulatory grammar of the system. Compared with related methods, CORECLUST shows better performance at identification of CRMs conferring muscle-specific gene expression in vertebrates and early-developmental CRMs in Drosophila.
Affiliation(s)
- Anna A Nikulova
- Faculty of Bioengineering and Bioinformatics, Lomonosov Moscow State University, 1-73 Leninskie Gory, Moscow 119991, Russia.
|
31
|
Andrioli LP, Digiampietri LA, de Barros LP, Machado-Lima A. Huckebein is part of a combinatorial repression code in the anterior blastoderm. Dev Biol 2011; 361:177-85. [PMID: 22027434 DOI: 10.1016/j.ydbio.2011.10.016] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.5] [Received: 12/20/2010] [Revised: 08/01/2011] [Accepted: 10/07/2011] [Indexed: 01/03/2023]
Abstract
The hierarchy of the segmentation cascade responsible for establishing the Drosophila body plan is composed of gap, pair-rule and segment polarity genes. However, no pair-rule stripes are formed in the anterior regions of the embryo. This lack of stripe formation, as well as other evidence from the literature that is further investigated here, led us to the hypothesis that anterior gap genes might be involved in a combinatorial mechanism responsible for repressing the cis-regulatory modules (CRMs) of hairy (h), even-skipped (eve), runt (run), and fushi-tarazu (ftz) anterior-most stripes. In this study, we investigated huckebein (hkb), which has a gap expression domain at the anterior tip of the embryo. Using genetic methods we were able to detect deviations from the wild-type patterns of the anterior-most pair-rule stripes in different genetic backgrounds, which were consistent with Hkb-mediated repression. Moreover, we developed an image processing tool that, for the most part, confirmed our assumptions. Using an hkb misexpression system, we further detected specific repression on anterior stripes. Furthermore, bioinformatics analysis predicted an increased significance of binding site clusters in the CRMs of h 1, eve 1, run 1 and ftz 1 when Hkb was incorporated in the analysis, indicating that Hkb plays a direct role in these CRMs. We further discuss that Hkb and Slp1, which is the other previously identified common repressor of anterior stripes, might participate in a combinatorial repression mechanism controlling stripe CRMs in the anterior parts of the embryo and define the borders of these anterior stripes.
Affiliation(s)
- Luiz Paulo Andrioli
- Departamento de Genética e Biologia Evolutiva, Instituto de Biociências, Universidade São Paulo, R. do Matão, 277, Cidade Universitária, 05508-000, São Paulo, SP, Brazil.
|
32
|
Brody T, Yavatkar AS, Kuzin A, Kundu M, Tyson LJ, Ross J, Lin TY, Lee CH, Awasaki T, Lee T, Odenwald WF. Use of a Drosophila genome-wide conserved sequence database to identify functionally related cis-regulatory enhancers. Dev Dyn 2011; 241:169-89. [PMID: 22174086 PMCID: PMC3243966 DOI: 10.1002/dvdy.22728] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.5] [Accepted: 08/09/2011] [Indexed: 12/05/2022]
Abstract
Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc.
Affiliation(s)
- Thomas Brody
- Neural Cell-Fate Determinants Section, NINDS, NIH, Bethesda, Maryland 20892, USA.
|
33
|
Kim Y, Andreu MJ, Lim B, Chung K, Terayama M, Jiménez G, Berg CA, Lu H, Shvartsman SY. Gene regulation by MAPK substrate competition. Dev Cell 2011; 20:880-7. [PMID: 21664584 DOI: 10.1016/j.devcel.2011.05.009] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Received: 10/20/2010] [Revised: 03/19/2011] [Accepted: 05/12/2011] [Indexed: 10/18/2022]
Abstract
Developing tissues are patterned by coordinated activities of signaling systems, which can be integrated by a regulatory region of a gene that binds multiple transcription factors or by a transcription factor that is modified by multiple enzymes. Based on a combination of genetic and imaging experiments in the early Drosophila embryo, we describe a signal integration mechanism that cannot be reduced to a single gene regulatory element or a single transcription factor. This mechanism relies on an enzymatic network formed by mitogen-activated protein kinase (MAPK) and its substrates. Specifically, anteriorly localized MAPK substrates, such as Bicoid, antagonize MAPK-dependent downregulation of Capicua, a repressor that is involved in gene regulation along the dorsoventral axis of the embryo. MAPK substrate competition provides a basis for ternary interaction of the anterior, dorsoventral, and terminal patterning systems. A mathematical model of this interaction can explain gene expression patterns with both anteroposterior and dorsoventral polarities.
Affiliation(s)
- Yoosik Kim
- Department of Chemical and Biological Engineering, Lewis-Sigler Institute for Integrative Genomics, Princeton University, NJ 08544, USA
|
34
|
Swanson CI, Schwimmer DB, Barolo S. Rapid evolutionary rewiring of a structurally constrained eye enhancer. Curr Biol 2011; 21:1186-96. [PMID: 21737276 DOI: 10.1016/j.cub.2011.05.056] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 12/30/2010] [Revised: 04/18/2011] [Accepted: 05/27/2011] [Indexed: 12/20/2022]
Abstract
BACKGROUND Enhancers are genomic cis-regulatory sequences that integrate spatiotemporal signals to control gene expression. Enhancer activity depends on the combination of bound transcription factors as well as, in some cases, the arrangement and spacing of binding sites for these factors. Here, we examine evolutionary changes to the sequence and structure of sparkling, a Notch/EGFR/Runx-regulated enhancer that activates the dPax2 gene in cone cells of the developing Drosophila eye. RESULTS Despite functional and structural constraints on its sequence, sparkling has undergone major reorganization in its recent evolutionary history. Our data suggest that the relative strengths of the various regulatory inputs into sparkling change rapidly over evolutionary time, such that reduced input from some factors is compensated by increased input from different regulators. These gains and losses are at least partly responsible for the changes in enhancer structure that we observe. Furthermore, stereotypical spatial relationships between certain binding sites ("grammar elements") can be identified in all sparkling orthologs, although the sites themselves are often recently derived. We also find that low binding affinity for the Notch-regulated transcription factor Su(H), a conserved property of sparkling, is required to prevent ectopic responses to Notch in noncone cells. CONCLUSIONS Rapid DNA sequence turnover does not imply either the absence of critical cis-regulatory information or the absence of structural rules. Our findings demonstrate that even a severely constrained cis-regulatory sequence can be significantly rewired over a short evolutionary timescale.
Affiliation(s)
- Christina I Swanson
- Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI 48109-2200, USA
|
35
|
Papatsenko D, Levine M. The Drosophila gap gene network is composed of two parallel toggle switches. PLoS One 2011; 6:e21145. [PMID: 21747931 PMCID: PMC3128594 DOI: 10.1371/journal.pone.0021145] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 04/11/2011] [Accepted: 05/20/2011] [Indexed: 11/30/2022]
Abstract
Drosophila “gap” genes provide the first response to maternal gradients in the early fly embryo. Gap genes are expressed in a series of broad bands across the embryo during the first hours of development. The gene network controlling the gap gene expression patterns includes inputs from maternal gradients and mutual repression between the gap genes themselves. In this study we propose a modular design for the gap gene network, involving two relatively independent network domains. The core of each network domain includes a toggle switch corresponding to a pair of mutually repressive gap genes, operated in space by maternal inputs. The toggle switches present in the gap network are evocative of the phage lambda switch, but they are operated positionally (in space) by the maternal gradients, so the synthesis rates for the competing components change along the embryo anterior-posterior axis. A dynamic model, constructed on the basis of the proposed principle and including elements of fractional site occupancy, required 5–7 parameters to fit quantitative spatial expression data for gap gradients. The identified model solutions (parameter combinations) reproduced major dynamic features of the gap gradient system and explained gap expression in a variety of segmentation mutants.
Affiliation(s)
- Dmitri Papatsenko
- Department of Gene and Cell Medicine, Mount Sinai School of Medicine, Black Family Stem Cell Institute, New York, New York, United States of America.
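The positionally operated toggle switch described in the abstract above can be illustrated with a minimal sketch. This is not the authors' model (which uses fractional site occupancy and 5–7 fitted parameters); it is a generic two-gene mutual-repression system in which the synthesis rates vary linearly along the anterior-posterior axis. The function names, the Hill coefficient, the rate values, and the linear gradients are all assumptions chosen for illustration.

```python
def toggle_steady_state(alpha_a, alpha_b, hill=2.0, dt=0.01, steps=5000):
    """Integrate two mutually repressing genes A and B with forward Euler.

    dA/dt = alpha_a / (1 + B**hill) - A
    dB/dt = alpha_b / (1 + A**hill) - B
    All parameter values are illustrative assumptions, not fitted data.
    """
    a, b = 0.1, 0.1  # identical low starting levels for both genes
    for _ in range(steps):
        da = alpha_a / (1.0 + b ** hill) - a
        db = alpha_b / (1.0 + a ** hill) - b
        a += dt * da
        b += dt * db
    return a, b


def profile(n_positions=11, max_rate=4.0, min_rate=1.0):
    """Run the switch at evenly spaced positions x in [0, 1].

    The synthesis rate of A falls linearly from anterior (x=0) to
    posterior (x=1) and that of B rises, standing in for maternal inputs.
    Returns a list of (x, A, B) tuples.
    """
    results = []
    for i in range(n_positions):
        x = i / (n_positions - 1)
        alpha_a = max_rate - (max_rate - min_rate) * x
        alpha_b = min_rate + (max_rate - min_rate) * x
        a, b = toggle_steady_state(alpha_a, alpha_b)
        results.append((x, a, b))
    return results


if __name__ == "__main__":
    for x, a, b in profile():
        print(f"x={x:.1f}  A={a:.3f}  B={b:.3f}")
```

Under these assumed rates, gene A dominates at the anterior end and gene B at the posterior end, with the switch flipping near the middle of the axis where the two synthesis rates are equal.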
|
36
|
Teif VB, Rippe K. Nucleosome mediated crosstalk between transcription factors at eukaryotic enhancers. Phys Biol 2011; 8:044001. [PMID: 21666293 DOI: 10.1088/1478-3975/8/4/044001] [Citation(s) in RCA: 22] [Impact Index Per Article: 1.7] [Indexed: 11/11/2022]
Abstract
A recent study of transcription regulation in Drosophila embryonic development revealed a complex non-monotonic dependence of gene expression on the distance between binding sites of repressor and activator proteins at the corresponding enhancer cis-regulatory modules (Fakhouri et al 2010 Mol. Syst. Biol. 6 341). The repressor efficiency was high at small separations, low around 30 bp, reached a maximum at 50-60 bp, and decreased at larger distances to the activator binding sites. Here, we propose a straightforward explanation for the distance dependence of repressor activity by considering the effect of the presence of a nucleosome. Using a method that considers partial unwrapping of nucleosomal DNA from the histone octamer core, we calculated the dependence of activator binding on the repressor-activator distance and found a quantitative agreement with the distance dependence reported for the Drosophila enhancer element. In addition, the proposed model offers explanations for other distance-dependent effects at eukaryotic enhancers.
Affiliation(s)
- Vladimir B Teif
- BioQuant and German Cancer Research Center, Im Neuenheimer Feld 267, 69120 Heidelberg, Germany.
|
37
|
Kuzin A, Kundu M, Brody T, Odenwald WF. Functional analysis of conserved sequences within a temporally restricted neural precursor cell enhancer. Mech Dev 2011; 128:165-77. [PMID: 21315151 DOI: 10.1016/j.mod.2011.02.001] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.7] [Received: 10/06/2010] [Revised: 01/28/2011] [Accepted: 02/02/2011] [Indexed: 11/18/2022]
Abstract
Many of the key regulators of Drosophila CNS neural identity are expressed in defined temporal orders during neuroblast (NB) lineage development. To begin to understand the structural and functional complexity of enhancers that regulate ordered NB gene expression programs, we have undertaken the mutational analysis of the temporally restricted nerfin-1 NB enhancer. Our previous studies have localized the enhancer to a region just proximal to the nerfin-1 transcription start site. Analysis of this enhancer, using the phylogenetic footprint program EvoPrinter, reveals the presence of multiple sequence blocks that are conserved among drosophilids. cis-Decoder alignments of these conserved sequence blocks (CSBs) have identified shorter elements that are conserved in other Drosophila NB enhancers. Mutagenesis of the enhancer reveals that although each CSB is required for wild-type expression, neither position nor orientation of the CSBs within the enhancer is crucial for enhancer function; removal of less-conserved or non-conserved sequences flanking CSB clusters also does not significantly alter enhancer activity. While all three conserved E-box transcription factor (TF) binding sites (CAGCTG) are required for full function, adding an additional site at different locations within non-conserved sequences interferes with enhancer activity. Of particular note, none of the mutations resulted in ectopic reporter expression outside of the early NB expression window, suggesting that the temporally restricted pattern is defined by transcriptional activators and not by direct DNA binding repressors. Our work also points to an unexpectedly large number of TFs required for optimal enhancer function: mutant TF analysis has identified at least four that are required for full enhancer regulation.
Affiliation(s)
- Alexander Kuzin
- Neural Cell-Fate Determinants Section, NINDS, NIH Bethesda, MD, USA.
|
38
|
Abstract
Gap genes are involved in segment determination during the early development of the fruit fly Drosophila melanogaster as well as in other insects. This review attempts to synthesize the current knowledge of the gap gene network through a comprehensive survey of the experimental literature. I focus on genetic and molecular evidence, which provides us with an almost-complete picture of the regulatory interactions responsible for trunk gap gene expression. I discuss the regulatory mechanisms involved, and highlight the remaining ambiguities and gaps in the evidence. This is followed by a brief discussion of molecular regulatory mechanisms for transcriptional regulation, as well as precision and size-regulation provided by the system. Finally, I discuss evidence on the evolution of gap gene expression from species other than Drosophila. My survey concludes that studies of the gap gene system continue to reveal interesting and important new insights into the role of gene regulatory networks in development and evolution.
Affiliation(s)
- Johannes Jaeger
- Centre de Regulació Genòmica, Universitat Pompeu Fabra, Barcelona, Spain.
|
39
|
Robert-Moreno À, Naranjo S, de la Calle-Mustienes E, Gómez-Skarmeta JL, Alsina B. Characterization of new otic enhancers of the pou3f4 gene reveal distinct signaling pathway regulation and spatio-temporal patterns. PLoS One 2010; 5:e15907. [PMID: 21209840 PMCID: PMC3013142 DOI: 10.1371/journal.pone.0015907] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 09/07/2010] [Accepted: 11/26/2010] [Indexed: 02/01/2023]
Abstract
POU3F4 is a member of the POU-homeodomain transcription factor family with a prominent role in inner ear development. Mutations in the human POU3F4 coding unit lead to X-linked deafness type 3 (DFN3), characterized by conductive hearing loss and progressive sensorineural deafness. Microdeletions found 1 Mb 5' upstream of the coding region also displayed the same phenotype, suggesting that cis-regulatory elements might be present in that region. Indeed, we and others have recently identified several enhancers at the 1 Mb 5' upstream interval of the pou3f4 locus. Here we characterize the spatio-temporal patterns of these regulatory elements in zebrafish transgenic lines. We show that the most distal enhancer (HCNR 81675) is activated earlier and drives GFP reporter expression initially to a broad ear domain that progressively restricts to the sensory patches. The proximal enhancer (HCNR 82478) is switched on later during development and promotes expression, among other tissues, in sensory patches from its onset. The third enhancer (HCNR 81728) is also active at later stages in the otic mesenchyme and in the otic epithelium. We also characterize the signaling pathways regulating these enhancers. While HCNR 81675 is regulated by very early signals of retinoic acid, HCNR 82478 is regulated by Fgf activity at a later stage and the HCNR 81728 enhancer is under the control of Hh signaling. Finally, we show that Sox2 and Pax2 transcription factors are bound to the HCNR 81675 genomic region during otic development and that specific mutations to these transcription factor binding sites abrogate HCNR 81675 enhancer activity. Altogether, our results suggest that pou3f4 expression in the inner ear might be under the control of distinct regulatory elements that fine-tune the spatio-temporal activity of this gene, and provide novel data on the signaling mechanisms controlling pou3f4 function.
Affiliation(s)
- Àlex Robert-Moreno
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra/Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
- Silvia Naranjo
- Centro Andaluz de Biología del Desarrollo, CSIC/Universidad Pablo de Olavide, Sevilla, Spain
- Berta Alsina
- Department of Experimental and Health Sciences, Universitat Pompeu Fabra/Parc de Recerca Biomèdica de Barcelona, Barcelona, Spain
|
40
|
When needles look like hay: how to find tissue-specific enhancers in model organism genomes. Dev Biol 2010; 350:239-54. [PMID: 21130761 DOI: 10.1016/j.ydbio.2010.11.026] [Citation(s) in RCA: 25] [Impact Index Per Article: 1.8] [Received: 04/14/2010] [Revised: 11/11/2010] [Accepted: 11/22/2010] [Indexed: 01/22/2023]
Abstract
A major prerequisite for the investigation of tissue-specific processes is the identification of cis-regulatory elements. No generally applicable technique is available to distinguish them from any other type of genomic non-coding sequence. Therefore, researchers often have to identify these elements by elaborate in vivo screens, testing individual regions until the right one is found. Here, based on many examples from the literature, we summarize how functional enhancers have been isolated from other elements in the genome and how they have been characterized in transgenic animals. Covering computational and experimental studies, we provide an overview of the global properties of cis-regulatory elements, like their specific interactions with promoters and target gene distances. We describe conserved non-coding elements (CNEs) and their internal structure, nucleotide composition, binding site clustering and overlap, with a special focus on developmental enhancers. Conflicting data and unresolved questions on the nature of these elements are highlighted. Our comprehensive overview of the experimental shortcuts that have been found in the different model organism communities and the new field of high-throughput assays should help during the preparation phase of a screen for enhancers. The review is accompanied by a list of general guidelines for such a project.
|
41
|
Su J, Teichmann SA, Down TA. Assessing computational methods of cis-regulatory module prediction. PLoS Comput Biol 2010; 6:e1001020. [PMID: 21152003 PMCID: PMC2996316 DOI: 10.1371/journal.pcbi.1001020] [Citation(s) in RCA: 63] [Impact Index Per Article: 4.5] [Received: 03/25/2010] [Accepted: 10/29/2010] [Indexed: 01/02/2023]
Abstract
Computational methods attempting to identify instances of cis-regulatory modules (CRMs) in the genome face a challenging problem of searching for potentially interacting transcription factor binding sites while knowledge of the specific interactions involved remains limited. Without a comprehensive comparison of their performance, the reliability and accuracy of these tools remains unclear. Faced with a large number of different tools that address this problem, we summarized and categorized them based on search strategy and input data requirements. Twelve representative methods were chosen and applied to predict CRMs from the Drosophila CRM database REDfly, and across the human ENCODE regions. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. When discriminating CRMs from non-coding regions, those methods considering evolutionary conservation have a stronger predictive power than methods designed to be run on a single genome. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. For example, some favour homotypical clusters of binding sites, while others perform best on short CRMs. Furthermore, most methods appear to be sensitive to the composition and structure of the genome to which they are applied. We analyze the principal features that distinguish the methods that performed well, identify weaknesses leading to poor performance, and provide a guide for users. We also propose key considerations for the development and evaluation of future CRM-prediction methods. Transcriptional regulation involves multiple transcription factors binding to DNA sequences. A limited repertoire of transcription factors performs this complex regulatory step through various spatial and temporal interactions between themselves and their binding sites. 
These transcription factor binding interactions are clustered as distinct modules: cis-regulatory modules (CRMs). Computational methods attempting to identify instances of CRMs in the genome face a challenging problem because a majority of these interactions between transcription factors remain unknown. To investigate the reliability and accuracy of these methods, we chose twelve representative methods and applied them to predict CRMs on both the fly and human genomes. Our results show that the optimal choice of method varies depending on species and composition of the sequences in question. Different CRM representations and search strategies rely on different CRM properties, and different methods can complement one another. We provide a guide for users and key considerations for developers. We also expect that, along with new technology generating new types of genomic data, future CRM prediction methods will be able to reveal transcription binding interactions in three-dimensional space.
Affiliation(s)
- Jing Su
- MRC Laboratory of Molecular Biology, Cambridge, United Kingdom
|
42
|
Ribeiro TC, Ventrice G, Machado-Lima A, Andrioli LP. Investigating giant (Gt) repression in the formation of partially overlapping pair-rule stripes. Dev Dyn 2010; 239:2989-99. [DOI: 10.1002/dvdy.22434] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/20/2022]
|
43
|
He F, Wen Y, Cheung D, Deng J, Lu LJ, Jiao R, Ma J. Distance measurements via the morphogen gradient of Bicoid in Drosophila embryos. BMC Dev Biol 2010; 10:80. [PMID: 20678215 PMCID: PMC2919471 DOI: 10.1186/1471-213x-10-80] [Citation(s) in RCA: 26] [Impact Index Per Article: 1.9] [Received: 04/13/2010] [Accepted: 08/02/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Patterning along the anterior-posterior (A-P) axis in Drosophila embryos is instructed by the morphogen gradient of Bicoid (Bcd). Despite extensive studies of this morphogen, how embryo geometry may affect gradient formation and target responses has not been investigated experimentally. RESULTS In this report, we systematically compare the Bcd gradient profiles and its target expression patterns on the dorsal and ventral sides of the embryo. Our results support a hypothesis that proper distance measurement and the encoded positional information of the Bcd gradient are along the perimeter of the embryo. Our results also reveal that the dorsal and ventral sides of the embryo have a fundamentally similar relationship between Bcd and its target Hunchback (Hb), suggesting that Hb expression properties on the two sides of the embryo can be directly traced to Bcd gradient properties. Our 3-D simulation studies show that a curvature difference between the two sides of an embryo is sufficient to generate Bcd gradient properties that are consistent with experimental observations. CONCLUSIONS The findings described in this report provide a first quantitative, experimental evaluation of embryo geometry on Bcd gradient formation and target responses. They demonstrate that the physical features of an embryo, such as its shape, are integral to how pattern is formed.
Affiliation(s)
- Feng He
- State Key Laboratory of Brain and Cognitive Science, Institute of Biophysics, Chinese Academy of Sciences, 15 Datun Road, Beijing 100101, China
|
44
|
Lusk RW, Eisen MB. Evolutionary mirages: selection on binding site composition creates the illusion of conserved grammars in Drosophila enhancers. PLoS Genet 2010; 6:e1000829. [PMID: 20107516 PMCID: PMC2809757 DOI: 10.1371/journal.pgen.1000829] [Citation(s) in RCA: 67] [Impact Index Per Article: 4.8] [Received: 08/27/2009] [Accepted: 12/22/2009] [Indexed: 01/05/2023]
Abstract
The clustering of transcription factor binding sites in developmental enhancers and the apparent preferential conservation of clustered sites have been widely interpreted as proof that spatially constrained physical interactions between transcription factors are required for regulatory function. However, we show here that selection on the composition of enhancers alone, and not their internal structure, leads to the accumulation of clustered sites with evolutionary dynamics that suggest they are preferentially conserved. We simulated the evolution of idealized enhancers from Drosophila melanogaster constrained to contain only a minimum number of binding sites for one or more factors. Under this constraint, mutations that destroy an existing binding site are tolerated only if a compensating site has emerged elsewhere in the enhancer. Overlapping sites, such as those frequently observed for the activator Bicoid and repressor Krüppel, had significantly longer evolutionary half-lives than isolated sites for the same factors. This leads to a substantially higher density of overlapping sites than expected by chance and the appearance that such sites are preferentially conserved. Because D. melanogaster (like many other species) has a bias for deletions over insertions, sites tended to become closer together over time, leading to an overall clustering of sites in the absence of any selection for clustered sites. Since this effect is strongest for the oldest sites, clustered sites also incorrectly appear to be preferentially conserved. Following speciation, sites tend to be closer together in all descendent species than in their common ancestors, violating the common assumption that shared features of species' genomes reflect their ancestral state. Finally, we show that selection on binding site composition alone recapitulates the observed number of overlapping and closely neighboring sites in real D. melanogaster enhancers. 
Thus, this study calls into question the common practice of inferring "cis-regulatory grammars" from the organization and evolutionary dynamics of developmental enhancers.
Affiliation(s)
- Richard W. Lusk
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Michael B. Eisen
- Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, California, United States of America
- Genomics Division, Ernest Orlando Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- California Institute of Quantitative Biosciences, University of California Berkeley, Berkeley, California, United States of America
- Howard Hughes Medical Institute, University of California Berkeley, Berkeley, California, United States of America
|
45
|
Hu Z, Gallo SM. Identification of interacting transcription factors regulating tissue gene expression in human. BMC Genomics 2010; 11:49. [PMID: 20085649 PMCID: PMC2822763 DOI: 10.1186/1471-2164-11-49] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.0] [Received: 09/16/2009] [Accepted: 01/19/2010] [Indexed: 11/10/2022]
Abstract
BACKGROUND Tissue gene expression is generally regulated by multiple transcription factors (TFs). A major first step toward understanding how tissues achieve their specificity is to identify, at the genome scale, interacting TFs regulating gene expression in different tissues. Despite previous discoveries, the mechanisms that control tissue gene expression are not fully understood. RESULTS We have integrated a function conservation approach, which is based on evolutionary conservation of biological function, and genes with highest expression level in human tissues to predict TF pairs controlling tissue gene expression. To this end, we have identified 2549 TF pairs associated with a certain tissue. To find interacting TFs controlling tissue gene expression in a broad spatial and temporal manner, we looked for TF pairs common to the same type of tissues and identified 379 such TF pairs, based on which TF-TF interaction networks were further built. We also found that tissue-specific TFs may play an important role in recruiting non-tissue-specific TFs to the TF-TF interaction network, offering the potential for coordinating and controlling tissue gene expression across a variety of conditions. CONCLUSION The findings from this study indicate that tissue gene expression is regulated by large sets of interacting TFs either on the same promoter of a gene or through TF-TF interaction networks.
Affiliation(s)
- Zihua Hu
- Center for Computational Research, New York State Center of Excellence in Bioinformatics & Life Sciences, Department of Biostatistics, Department of Medicine, State University of New York (SUNY), Buffalo, NY 14260, USA
- Steven M Gallo
- Center for Computational Research, New York State Center of Excellence in Bioinformatics & Life Sciences, State University of New York (SUNY), Buffalo, NY 14260, USA
|