1
Kapil S, Sobti RC, Kaur T. Prediction and analysis of cis-regulatory elements in Dorsal and Ventral patterning genes of Tribolium castaneum and its comparison with Drosophila melanogaster. Mol Cell Biochem 2024; 479:109-125. [PMID: 37004638 DOI: 10.1007/s11010-023-04712-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2022] [Accepted: 03/15/2023] [Indexed: 04/04/2023]
Abstract
Insect embryonic development and morphology are characterized by their anterior-posterior and dorsal-ventral (DV) patterning. In Drosophila embryos, DV patterning is mediated by a dorsal protein gradient which activates twist and snail proteins, the important regulators of DV patterning. To activate or repress gene expression, some regulatory proteins bind in clusters to their target gene at sites known as cis-regulatory elements or enhancers. To understand how variations in gene expression in different lineages might lead to different phenotypes, it is necessary to understand enhancers and their evolution. Drosophila melanogaster has been widely studied to understand the interactions between transcription factors and the transcription factor binding sites. Tribolium castaneum is an emerging model organism that is attracting the interest of biologists, but research on enhancer mechanisms in the insect's axis patterning is still in its infancy. Therefore, the current study was designed to compare the enhancers of DV patterning in the two insect species. The sequences of ten proteins involved in DV patterning of D. melanogaster were obtained from FlyBase. The protein sequences of T. castaneum orthologous to those obtained from D. melanogaster were acquired from NCBI BLAST, and these were then converted to DNA sequences which were modified by adding 20 kb sequences both upstream and downstream of the gene. These modified sequences were used for further analysis. Bioinformatics tools (Cluster-Buster and MCAST) were used to search for clusters of binding sites (enhancers) in the modified DV genes. The results obtained showed that the transcription factors in Drosophila melanogaster and Tribolium castaneum are nearly identical; however, the number of binding sites varies between the two species, indicating transcription factor binding site evolution, as predicted by two different computational tools. It was observed that dorsal, twist, snail, zelda, and Suppressor of Hairless are the transcription factors responsible for the regulation of DV patterning in the two insect species.
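As a rough illustration of what "searching for clusters of binding sites" means, the idea can be sketched as counting motif matches that fall within a short window. This is a toy sketch only, not the Cluster-Buster or MCAST algorithm (both rely on probabilistic motif models rather than exact string matches). The motif strings, the `window` and `min_hits` parameters, and the function names are all hypothetical.

```python
import re


def find_hits(seq, motifs):
    """Return sorted (position, motif_name) pairs for exact motif matches.

    `motifs` maps a name to a plain DNA string (hypothetical toy patterns).
    A lookahead is used so that overlapping matches are also reported.
    """
    hits = []
    for name, pattern in motifs.items():
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), name))
    return sorted(hits)


def find_clusters(hits, window=100, min_hits=3):
    """Group hits whose start positions lie within `window` bp of the first hit.

    Returns (first_start, last_start, hit_count) for every group that
    contains at least `min_hits` matches.
    """
    clusters = []
    i = 0
    while i < len(hits):
        j = i
        while j + 1 < len(hits) and hits[j + 1][0] - hits[i][0] <= window:
            j += 1
        count = j - i + 1
        if count >= min_hits:
            clusters.append((hits[i][0], hits[j][0], count))
            i = j + 1  # continue after the reported cluster
        else:
            i += 1
    return clusters


# Toy input: three nearby matches followed by one isolated match.
toy_motifs = {"motif_a": "GGGAAAACC", "motif_b": "CACATG"}
toy_seq = "GGGAAAACC" + "T" * 10 + "CACATG" + "T" * 10 + "GGGAAAACC" + "T" * 300 + "CACATG"
toy_clusters = find_clusters(find_hits(toy_seq, toy_motifs))
```

Comparing the number and spacing of such clusters around orthologous genes is the kind of between-species comparison the abstract describes, although the published tools score clusters statistically rather than by a fixed threshold.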
Affiliation(s)
- Subham Kapil
- Department of Zoology, DAV University, Jalandhar, India
- Tejinder Kaur
- Department of Zoology, DAV University, Jalandhar, India
2
The Genetic Mechanisms Underlying the Concerted Expression of the yellow and tan Genes in Complex Patterns on the Abdomen and Wings of Drosophila guttifera. Genes (Basel) 2023; 14:304. [PMID: 36833231 PMCID: PMC9957387 DOI: 10.3390/genes14020304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/13/2022] [Revised: 01/12/2023] [Accepted: 01/21/2023] [Indexed: 01/26/2023] Open
Abstract
How complex morphological patterns form is an intriguing question in developmental biology. However, the mechanisms that generate complex patterns remain largely unknown. Here, we sought to identify the genetic mechanisms that regulate the tan (t) gene in a multi-spotted pigmentation pattern on the abdomen and wings of Drosophila guttifera. Previously, we showed that yellow (y) gene expression completely prefigures the abdominal and wing pigment patterns of this species. In the current study, we demonstrate that the t gene is co-expressed with the y gene in nearly identical patterns, both transcripts foreshadowing the adult abdominal and wing melanin spot patterns. We identified cis-regulatory modules (CRMs) of t, one of which drives reporter expression in six longitudinal rows of spots on the developing pupal abdomen, while the second CRM activates the reporter gene in a spotted wing pattern. Comparing the abdominal spot CRMs of y and t, we found a similar composition of putative transcription factor binding sites that are thought to regulate the complex expression patterns of both terminal pigmentation genes y and t. In contrast, the y and t wing spots appear to be regulated by distinct upstream factors. Our results suggest that the D. guttifera abdominal and wing melanin spot patterns have been established through the co-regulation of y and t, shedding light on how complex morphological traits may be regulated through the parallel coordination of downstream target genes.
3
Cross-species enhancer prediction using machine learning. Genomics 2022; 114:110454. [PMID: 36030022 DOI: 10.1016/j.ygeno.2022.110454] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/30/2022] [Revised: 07/28/2022] [Accepted: 08/16/2022] [Indexed: 11/21/2022]
Abstract
Cis-regulatory elements (CREs) are non-coding parts of the genome that play a critical role in gene expression regulation. Enhancers, as an important example of CREs, interact with genes to influence complex traits like disease, heat tolerance and growth rate. Much of what is known about enhancers comes from studies of humans and a few model organisms like mouse, with little known about other mammalian species. Previous studies have attempted to identify enhancers in less studied mammals using comparative genomics but with limited success. Recently, Machine Learning (ML) techniques have shown promising results to predict enhancer regions. Here, we investigated the ability of ML methods to identify enhancers in three non-model mammalian species (cattle, pig and dog) using human and mouse enhancer data from VISTA and publicly available ChIP-seq. We tested nine models, using four different representations of the DNA sequences in cross-species prediction using both the VISTA dataset and species-specific ChIP-seq data. We identified between 809,399 and 877,278 enhancer-like regions (ELRs) in the study species (11.6-13.7% of each genome). These predictions were close to the ~8% proportion of ELRs that covered the human genome. We propose that our ML methods have predictive ability for identifying enhancers in non-model mammalian species. We have provided a list of high confidence enhancers at https://github.com/DaviesCentreInformatics/Cross-species-enhancer-prediction and believe these enhancers will be of great use to the community.
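The abstract does not say which four sequence representations were tested. As one commonly used option, a normalized k-mer frequency vector can be sketched as follows. This is a minimal illustration under that assumption; the function name `kmer_vector` and the default `k=3` are hypothetical and are not taken from the paper or its repository.

```python
from collections import Counter
from itertools import product


def kmer_vector(seq, k=3):
    """Return normalized k-mer frequencies in a fixed lexicographic order.

    Windows containing characters other than A, C, G or T (for example N)
    are ignored, so the vector always has 4**k entries.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[kmer] for kmer in kmers)
    return [counts[kmer] / total if total else 0.0 for kmer in kmers]


# A fixed-length vector like this can be fed to most standard classifiers.
example = kmer_vector("ACGTACGTNNACGT", k=2)
```

Because the vector length depends only on `k`, sequences of different lengths from different species map to the same feature space, which is what makes this kind of representation convenient for cross-species training.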
4
MAPK-mediated transcription factor GATAd contributes to Cry1Ac resistance in diamondback moth by reducing PxmALP expression. PLoS Genet 2022; 18:e1010037. [PMID: 35113858 PMCID: PMC8846524 DOI: 10.1371/journal.pgen.1010037] [Citation(s) in RCA: 29] [Impact Index Per Article: 9.7] [Received: 11/17/2021] [Revised: 02/15/2022] [Accepted: 01/12/2022] [Indexed: 12/12/2022] Open
Abstract
The benefits of biopesticides and transgenic crops based on the insecticidal Cry-toxins from Bacillus thuringiensis (Bt) are considerably threatened by insect resistance evolution, thus, deciphering the molecular mechanisms underlying insect resistance to Bt products is of great significance to their sustainable utilization. Previously, we have demonstrated that the down-regulation of PxmALP in a strain of Plutella xylostella (L.) highly resistant to the Bt Cry1Ac toxin was due to a hormone-activated MAPK signaling pathway and contributed to the resistance phenotype. However, the underlying transcriptional regulatory mechanism remains enigmatic. Here, we report that the PxGATAd transcription factor (TF) is responsible for the differential expression of PxmALP observed between the Cry1Ac susceptible and resistant strains. We identified that PxGATAd directly activates PxmALP expression via interacting with a non-canonical but specific GATA-like cis-response element (CRE) located in the PxmALP promoter region. A six-nucleotide insertion mutation in this cis-acting element of the PxmALP promoter from the resistant strain resulted in repression of transcriptional activity, affecting the regulatory performance of PxGATAd. Furthermore, silencing of PxGATAd in susceptible larvae reduced the expression of PxmALP and susceptibility to Cry1Ac toxin. Suppressing PxMAP4K4 expression in the resistant larvae transiently recovered both the expression of PxGATAd and PxmALP, indicating that the PxGATAd is a positive responsive factor involved in the activation of PxmALP promoter and negatively regulated by the MAPK signaling pathway. Overall, this study deciphers an intricate regulatory mechanism of PxmALP gene expression and highlights the concurrent involvement of both trans-regulatory factors and cis-acting elements in Cry1Ac resistance development in lepidopteran insects. Gene expression and regulation are associated with adaptive evolution in living organisms. 
The rapid evolution of insect resistance to Bt insecticidal Cry toxins is frequently associated with reduced expression of diverse midgut genes that code for Cry-toxin receptors. Nonetheless, our current knowledge about the regulation of gene expression of these pivotal receptor genes in insects is limited. Membrane-bound alkaline phosphatase (mALP) is a known receptor for Cry1Ac toxin in diverse insects and here, we report the transcriptional regulatory mechanism of the PxmALP gene related to Cry1Ac resistance in P. xylostella. We identified a MAPK signaling pathway that negatively regulates the PxGATAd transcriptional factor which is involved in the differential expression of PxmALP via interacting with the PxmALP promoter. Furthermore, a cis-acting element mutation repressing the regulatory activity of PxGATAd for PxmALP expression in the Cry1Ac resistant strain was identified. Our study provides an insight into the precise transcriptional regulatory mechanism that regulates PxmALP expression and is involved in the evolution of Bt Cry1Ac resistance in P. xylostella, which provides a paradigm for decoding the regulation landscape of midgut Cry-toxin receptor genes in insects.
5
Schember I, Halfon MS. Identification of new Anopheles gambiae transcriptional enhancers using a cross-species prediction approach. Insect Mol Biol 2021; 30:410-419. [PMID: 33866636 PMCID: PMC8266755 DOI: 10.1111/imb.12705] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 11/05/2020] [Revised: 02/09/2021] [Accepted: 03/31/2021] [Indexed: 06/12/2023]
Abstract
The success of transgenic mosquito vector control approaches relies on well-targeted gene expression, requiring the identification and characterization of a diverse set of mosquito promoters and transcriptional enhancers. However, few enhancers have been characterized in Anopheles gambiae to date. Here, we employ the SCRMshaw method we previously developed to predict enhancers in the A. gambiae genome, preferentially targeting vector-relevant tissues such as the salivary glands, midgut and nervous system. We demonstrate a high overall success rate, with at least 8 of 11 (73%) tested sequences validating as enhancers in an in vivo xenotransgenic assay. Four tested sequences drive expression in either the salivary gland or the midgut, making them directly useful for probing the biology of these infection-relevant tissues. The success of our study suggests that computational enhancer prediction should serve as an effective means for identifying A. gambiae enhancers with activity in tissues involved in malaria propagation and transmission.
Affiliation(s)
- Isabella Schember
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
- Marc S. Halfon
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203
- Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203
- Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203
- NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203
- Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Comprehensive Cancer Center, Buffalo, NY 14263
6
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
- Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
7
Jindal GA, Farley EK. Enhancer grammar in development, evolution, and disease: dependencies and interplay. Dev Cell 2021; 56:575-587. [PMID: 33689769 PMCID: PMC8462829 DOI: 10.1016/j.devcel.2021.02.016] [Citation(s) in RCA: 72] [Impact Index Per Article: 18.0] [Received: 10/13/2020] [Revised: 02/15/2021] [Accepted: 02/16/2021] [Indexed: 12/19/2022]
Abstract
Each language has standard books describing that language's grammatical rules. Biologists have searched for similar, albeit more complex, principles relating enhancer sequence to gene expression. Here, we review the literature on enhancer grammar. We introduce dependency grammar, a model where enhancers encode information based on dependencies between enhancer features shaped by mechanistic, evolutionary, and biological constraints. Classifying enhancers based on the types of dependencies may identify unifying principles relating enhancer sequence to gene expression. Such rules would allow us to read the instructions for development within genomes and pinpoint causal enhancer variants underlying disease and evolutionary changes.
Affiliation(s)
- Granton A Jindal
- Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
- Emma K Farley
- Division of Cardiology, Department of Medicine, University of California San Diego, La Jolla, CA 92093, USA; Division of Biological Sciences, Section of Molecular Biology, University of California San Diego, La Jolla, CA 92093, USA
8
Avsec Ž, Weilert M, Shrikumar A, Krueger S, Alexandari A, Dalal K, Fropf R, McAnany C, Gagneur J, Kundaje A, Zeitlinger J. Base-resolution models of transcription-factor binding reveal soft motif syntax. Nat Genet 2021; 53:354-366. [PMID: 33603233 PMCID: PMC8812996 DOI: 10.1038/s41588-021-00782-6] [Citation(s) in RCA: 314] [Impact Index Per Article: 78.5] [Received: 07/19/2020] [Accepted: 01/07/2021] [Indexed: 01/30/2023]
Abstract
The arrangement (syntax) of transcription factor (TF) binding motifs is an important part of the cis-regulatory code, yet remains elusive. We introduce a deep learning model, BPNet, that uses DNA sequence to predict base-resolution chromatin immunoprecipitation (ChIP)-nexus binding profiles of pluripotency TFs. We develop interpretation tools to learn predictive motif representations and identify soft syntax rules for cooperative TF binding interactions. Strikingly, Nanog preferentially binds with helical periodicity, and TFs often cooperate in a directional manner, which we validate using clustered regularly interspaced short palindromic repeat (CRISPR)-induced point mutations. Our model represents a powerful general approach to uncover the motifs and syntax of cis-regulatory sequences in genomics data.
Affiliation(s)
- Žiga Avsec
- Department of Informatics, Technical University of Munich, Garching, Germany; Graduate School of Quantitative Biosciences (QBM), Ludwig-Maximilians-Universität München, Munich, Germany; currently at DeepMind, London, UK
- Melanie Weilert
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Avanti Shrikumar
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Sabrina Krueger
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Amr Alexandari
- Department of Computer Science, Stanford University, Stanford, CA, USA
- Khyati Dalal
- Stowers Institute for Medical Research, Kansas City, MO, USA; The University of Kansas Medical Center, Kansas City, KS, USA
- Robin Fropf
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Charles McAnany
- Stowers Institute for Medical Research, Kansas City, MO, USA
- Julien Gagneur
- Department of Informatics, Technical University of Munich, Garching, Germany
- Anshul Kundaje
- Department of Computer Science, Stanford University, Stanford, CA, USA; Department of Genetics, Stanford University, Stanford, CA, USA (corresponding author)
- Julia Zeitlinger
- Stowers Institute for Medical Research, Kansas City, MO, USA; The University of Kansas Medical Center, Kansas City, KS, USA (corresponding author)
9
Chen L, Capra JA. Learning and interpreting the gene regulatory grammar in a deep learning framework. PLoS Comput Biol 2020; 16:e1008334. [PMID: 33137083 PMCID: PMC7660921 DOI: 10.1371/journal.pcbi.1008334] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 12/16/2019] [Revised: 11/12/2020] [Accepted: 09/12/2020] [Indexed: 12/12/2022] Open
Abstract
Deep neural networks (DNNs) have achieved state-of-the-art performance in identifying gene regulatory sequences, but they have provided limited insight into the biology of regulatory elements due to the difficulty of interpreting the complex features they learn. Several models of how combinatorial binding of transcription factors, i.e. the regulatory grammar, drives enhancer activity have been proposed, ranging from the flexible TF billboard model to the stringent enhanceosome model. However, there is limited knowledge of the prevalence of these (or other) sequence architectures across enhancers. Here we perform several hypothesis-driven analyses to explore the ability of DNNs to learn the regulatory grammar of enhancers. We created synthetic datasets based on existing hypotheses about combinatorial transcription factor binding site (TFBS) patterns, including homotypic clusters, heterotypic clusters, and enhanceosomes, from real TF binding motifs from diverse TF families. We then trained deep residual neural networks (ResNets) to model the sequences under a range of scenarios that reflect real-world multi-label regulatory sequence prediction tasks. We developed a gradient-based unsupervised clustering method to extract the patterns learned by the ResNet models. We demonstrated that simulated regulatory grammars are best learned in the penultimate layer of the ResNets, and the proposed method can accurately retrieve the regulatory grammar even when there is heterogeneity in the enhancer categories and a large fraction of TFBS outside of the regulatory grammar. However, we also identify common scenarios where ResNets fail to learn simulated regulatory grammars. Finally, we applied the proposed method to mouse developmental enhancers and were able to identify the components of a known heterotypic TF cluster. 
Our results provide a framework for interpreting the regulatory rules learned by ResNets, and they demonstrate that the ability and efficiency of ResNets in learning the regulatory grammar depends on the nature of the prediction task.
Affiliation(s)
- Ling Chen
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
- John A. Capra
- Department of Biological Sciences, Vanderbilt University, Nashville, TN, United States of America
- Vanderbilt Genetics Institute and Department of Biomedical Informatics, Vanderbilt University Medical Center, Nashville, TN, United States of America
- Department of Computer Science, Vanderbilt University, Nashville, TN, United States of America
10
|
|
11
Sandler JE, Stathopoulos A. Stepwise Progression of Embryonic Patterning. Trends Genet 2016; 32:432-443. [PMID: 27230753 DOI: 10.1016/j.tig.2016.04.004] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Received: 03/18/2016] [Revised: 04/20/2016] [Accepted: 04/21/2016] [Indexed: 01/23/2023]
Abstract
It is long established that the graded distribution of Dorsal transcription factor influences spatial domains of gene expression along the dorsoventral (DV) axis of Drosophila melanogaster embryos. However, the more recent realization that Dorsal levels also change with time raises the question of whether these dynamics are instructive. An overview of DV axis patterning is provided, focusing on new insights identified through quantitative analysis of temporal changes in Dorsal target gene expression from one nuclear cycle to the next ('steps'). Possible roles for the stepwise progression of this patterning program are discussed including (i) tight temporal regulation of signaling pathway activation, (ii) control of gene expression cohorts, and (iii) ensuring the irreversibility of the patterning and cell fate specification process.
Affiliation(s)
- Jeremy E Sandler
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
- Angelike Stathopoulos
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
12
Yao Y, Minor PJ, Zhao YT, Jeong Y, Pani AM, King AN, Symmons O, Gan L, Cardoso WV, Spitz F, Lowe CJ, Epstein DJ. Cis-regulatory architecture of a brain signaling center predates the origin of chordates. Nat Genet 2016; 48:575-80. [PMID: 27064252 PMCID: PMC4848136 DOI: 10.1038/ng.3542] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Received: 01/28/2015] [Accepted: 03/11/2016] [Indexed: 12/13/2022]
Abstract
Genomic approaches have predicted hundreds of thousands of tissue-specific cis-regulatory sequences, but the determinants critical to their function and evolutionary history are mostly unknown. Here we systematically decode a set of brain enhancers active in the zona limitans intrathalamica (zli), a signaling center essential for vertebrate forebrain development via the secreted morphogen Sonic hedgehog (Shh). We apply a de novo motif analysis tool to identify six position-independent sequence motifs together with their cognate transcription factors that are essential for zli enhancer activity and Shh expression in the mouse embryo. Using knowledge of this regulatory lexicon, we discover new Shh zli enhancers in mice and a functionally equivalent element in hemichordates, indicating an ancient origin of the Shh zli regulatory network that predates the chordate phylum. These findings support a strategy for delineating functionally conserved enhancers in the absence of overt sequence homologies and over extensive evolutionary distances.
Affiliation(s)
- Yao Yao
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Clinical Research Building 470, Philadelphia, PA 19104, USA
- Paul J. Minor
- Hopkins Marine Station, Department of Biology, Stanford University, 120 Oceanview Blvd. Pacific Grove, CA 93950, USA
- Ying-Tao Zhao
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Clinical Research Building 470, Philadelphia, PA 19104, USA
- Yongsu Jeong
- Department of Genetic Engineering, College of Life Sciences and Graduate School of Biotechnology, Kyung Hee University, Yongin-si 446-701, Republic of Korea
- Ariel M. Pani
- Hopkins Marine Station, Department of Biology, Stanford University, 120 Oceanview Blvd. Pacific Grove, CA 93950, USA
- Anna N. King
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Clinical Research Building 470, Philadelphia, PA 19104, USA
- Orsolya Symmons
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Lin Gan
- Department of Ophthalmology, University of Rochester Medical Center, Rochester, NY 14642, USA
- Wellington V. Cardoso
- Columbia Center for Human Development, Department of Medicine, Pulmonary Allergy Critical Care, Columbia University Medical Center, New York, NY 10032, USA
- François Spitz
- Developmental Biology Unit, European Molecular Biology Laboratory, Heidelberg, Germany
- Christopher J. Lowe
- Hopkins Marine Station, Department of Biology, Stanford University, 120 Oceanview Blvd. Pacific Grove, CA 93950, USA
- Douglas J. Epstein
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, 415 Curie Blvd, Clinical Research Building 470, Philadelphia, PA 19104, USA
13
Integration of Orthogonal Signaling by the Notch and Dpp Pathways in Drosophila. Genetics 2016; 203:219-40. [PMID: 26975664 PMCID: PMC4858776 DOI: 10.1534/genetics.116.186791] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 01/04/2016] [Accepted: 03/08/2016] [Indexed: 11/18/2022] Open
Abstract
The transcription factor Suppressor of Hairless and its coactivator, the Notch intracellular domain, are polyglutamine (pQ)-rich factors that target enhancer elements and interact with other locally bound pQ-rich factors. To understand the functional repertoire of such enhancers, we identify conserved regulatory belts with binding sites for the pQ-rich effectors of both Notch and BMP/Dpp signaling, and the pQ-deficient tissue selectors Apterous (Ap), Scalloped (Sd), and Vestigial (Vg). We find that the densest such binding site cluster in the genome is located in the BMP-inducible nab locus, a homolog of the vertebrate transcriptional cofactors NAB1/NAB2. We report three major findings. First, we find that this nab regulatory belt is a novel enhancer driving dorsal wing margin expression in regions of peak phosphorylated Mad in wing imaginal discs. Second, we show that Ap is developmentally required to license the nab dorsal wing margin enhancer (DWME) to read out Notch and Dpp signaling in the dorsal compartment. Third, we find that the nab DWME is embedded in a complex of intronic enhancers, including a wing quadrant enhancer, a proximal wing disc enhancer, and a larval brain enhancer. This enhancer complex coordinates global nab expression via both tissue-specific activation and interenhancer silencing. We suggest that DWME integration of BMP signaling maintains nab expression in proliferating margin descendants that have divided away from Notch-Delta boundary signaling. As such, uniform expression of genes like nab and vestigial in proliferating compartments would typically require both boundary and nonboundary lineage-specific enhancers.
14
Quantitatively predictable control of Drosophila transcriptional enhancers in vivo with engineered transcription factors. Nat Genet 2016; 48:292-8. [DOI: 10.1038/ng.3509] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.9] [Received: 08/30/2015] [Accepted: 01/15/2016] [Indexed: 12/13/2022]
15
Clifford J, Adami C. Discovery and information-theoretic characterization of transcription factor binding sites that act cooperatively. Phys Biol 2015; 12:056004. [PMID: 26331781 DOI: 10.1088/1478-3975/12/5/056004] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 01/28/2023]
Abstract
Transcription factor binding to DNA regulatory regions is one of the primary mechanisms regulating gene expression levels. A probabilistic approach to model protein-DNA interactions at the sequence level is through position weight matrices (PWMs) that estimate the joint probability of a DNA binding site sequence by assuming positional independence within the DNA sequence. Here we construct conditional PWMs that depend on the motif signatures in the flanking DNA sequence, by conditioning known binding site loci on the presence or absence of additional binding sites in the flanking sequence of each site's locus. Pooling known sites with similar flanking sequence patterns allows for the estimation of the conditional distribution function over the binding site sequences. We apply our model to the Dorsal transcription factor binding sites active in patterning the Dorsal-Ventral axis of Drosophila development. We find that those binding sites that cooperate with nearby Twist sites on average contain about 0.5 bits of information about the presence of Twist transcription factor binding sites in the flanking sequence. We also find that Dorsal binding site detectors conditioned on flanking sequence information make better predictions about what is a Dorsal site relative to background DNA than detection without information about flanking sequence features.
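The PWM idea in this abstract can be made concrete with a small sketch. Under positional independence the log-odds score of a site is a sum over positions, and a conditional PWM is simply a PWM fitted to the subset of sites that share a flanking context. The example is illustrative only: the toy sites are made up (they are not curated Dorsal sites), the uniform background of 0.25 is an assumption, and the function names are hypothetical rather than the authors' implementation.

```python
import math

BASES = "ACGT"


def build_pwm(sites, pseudocount=1.0):
    """Estimate per-position base probabilities from aligned, equal-length sites."""
    if not sites:
        raise ValueError("at least one site is required")
    pwm = []
    for pos in range(len(sites[0])):
        counts = {base: pseudocount for base in BASES}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pwm.append({base: counts[base] / total for base in BASES})
    return pwm


def log_odds_score(pwm, seq, background=0.25):
    """Sum per-position log2 ratios; the sum reflects positional independence."""
    return sum(math.log2(pwm[i][base] / background) for i, base in enumerate(seq))


def information_content(pwm):
    """Total information in bits relative to a uniform background (2 bits max per column)."""
    return sum(
        2.0 + sum(p * math.log2(p) for p in column.values() if p > 0)
        for column in pwm
    )


def conditional_pwms(sites, has_partner, pseudocount=1.0):
    """Fit one PWM to sites with a flanking partner site and one to sites without."""
    with_partner = [s for s, flag in zip(sites, has_partner) if flag]
    without_partner = [s for s, flag in zip(sites, has_partner) if not flag]
    return build_pwm(with_partner, pseudocount), build_pwm(without_partner, pseudocount)


# Hypothetical toy data: four aligned 9-bp sites, two flagged as having a nearby partner.
toy_sites = ["GGGAAAACC", "GGGAAATCC", "GGGTTTTCC", "GGGAATTCC"]
toy_flags = [True, True, False, False]
pwm_with, pwm_without = conditional_pwms(toy_sites, toy_flags)
```

Comparing how the two conditional matrices score the same candidate sequence is one simple way to see whether the flanking context carries information about the site, which is the question the abstract quantifies in bits.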
Affiliation(s)
- Jacob Clifford
- Department of Physics and Astronomy, Michigan State University, East Lansing, MI, USA. BEACON Center for the Study of Evolution in Action, Michigan State University, East Lansing, MI, USA
| | | |
|
16
|
Kazemian M, Suryamohan K, Chen JY, Zhang Y, Samee MAH, Halfon MS, Sinha S. Evidence for deep regulatory similarities in early developmental programs across highly diverged insects. Genome Biol Evol 2014; 6:2301-20. [PMID: 25173756 PMCID: PMC4217690 DOI: 10.1093/gbe/evu184] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.7] [Indexed: 12/16/2022] Open
Abstract
Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution.
Affiliation(s)
- Majid Kazemian
- Department of Computer Science, University of Illinois at Urbana-Champaign Laboratory of Molecular Immunology, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, Maryland
| | - Kushal Suryamohan
- Department of Biochemistry, University at Buffalo-State University of New York NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, New York
| | - Jia-Yu Chen
- Department of Computer Science, University of Illinois at Urbana-Champaign
| | - Yinan Zhang
- Department of Computer Science, University of Illinois at Urbana-Champaign
| | | | - Marc S Halfon
- Department of Biochemistry, University at Buffalo-State University of New York NY State Center of Excellence in Bioinformatics and Life Sciences, Buffalo, New York Department of Biological Sciences, University at Buffalo-State University of New York Molecular and Cellular Biology Department and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, New York
| | - Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign Institute of Genomic Biology, University of Illinois at Urbana-Champaign
| |
|
17
|
Gordon KL, Arthur RK, Ruvinsky I. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence. PLoS Genet 2015; 11:e1005268. [PMID: 26020930 PMCID: PMC4447282 DOI: 10.1371/journal.pgen.1005268] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 10/15/2014] [Accepted: 05/09/2015] [Indexed: 11/28/2022] Open
Abstract
Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.
To explore the phylogenetic limits of conservation of cis-regulatory elements, we used transgenesis to test the functions of enhancers of four genes from several species spanning the phylum Nematoda. While we found a striking degree of functional conservation among the examined cis elements, their DNA sequences lacked apparent conservation with the C. elegans orthologs. In fact, sequence similarity between C. elegans and the distantly related nematodes was no greater than would be expected by chance. Short motifs, similar to known regulatory sequences in C. elegans, can be detected in most of the cis elements. When tested, some of these sites appear to mediate regulatory function. However, they seem to have originated through motif turnover, rather than to have been preserved from a common ancestor. Our results suggest that gene regulatory networks are broadly conserved in the phylum Nematoda, but this conservation persists despite substantial reorganization of regulatory elements and could not be detected using naïve comparisons of sequence similarity.
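The remark that short motifs cannot be distinguished from background similarity can be made concrete with a back-of-the-envelope calculation. The sketch below is an illustration under stated assumptions (independent, equally frequent bases; an exact-match motif), not a method from the paper, and the function name `expected_chance_matches` is hypothetical.

```python
def expected_chance_matches(motif_len, region_len, both_strands=True):
    """Expected exact matches of one fixed motif in a random region.

    Assumes independent bases at frequency 0.25 each. Counting both strands
    doubles the expectation, which overcounts for palindromic motifs.
    """
    if motif_len <= 0 or region_len < motif_len:
        return 0.0
    windows = region_len - motif_len + 1      # start positions on one strand
    per_window = 0.25 ** motif_len            # chance that a window matches
    strands = 2 if both_strands else 1
    return windows * per_window * strands


if __name__ == "__main__":
    # Compare a short motif with a longer one in a 1 kb region.
    for k in (6, 12):
        print(k, expected_chance_matches(k, 1000))
```

Under these assumptions a 6-mer is expected to occur by chance roughly once in every few kilobases, so finding one in a candidate element is weak evidence by itself, whereas a 12-mer match would be far less likely to arise at random.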
Affiliation(s)
- Kacy L. Gordon
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
| | - Robert K. Arthur
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
| | - Ilya Ruvinsky
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Department of Ecology and Evolution, The University of Chicago, Chicago, Illinois, United States of America
| |
|
18
|
Camino EM, Butts JC, Ordway A, Vellky JE, Rebeiz M, Williams TM. The evolutionary origination and diversification of a dimorphic gene regulatory network through parallel innovations in cis and trans. PLoS Genet 2015; 11:e1005136. [PMID: 25835988 PMCID: PMC4383587 DOI: 10.1371/journal.pgen.1005136] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.4] [Received: 10/10/2014] [Accepted: 03/10/2015] [Indexed: 01/15/2023] Open
Abstract
The origination and diversification of morphological characteristics represents a key problem in understanding the evolution of development. Morphological traits result from gene regulatory networks (GRNs) that form a web of transcription factors, which regulate multiple cis-regulatory element (CRE) sequences to control the coordinated expression of differentiation genes. The formation and modification of GRNs must ultimately be understood at the level of individual regulatory linkages (i.e., transcription factor binding sites within CREs) that constitute the network. Here, we investigate how elements within a network originated and diversified to generate a broad range of abdominal pigmentation phenotypes among Sophophora fruit flies. Our data indicates that the coordinated expression of two melanin synthesis enzymes, Yellow and Tan, recently evolved through novel CRE activities that respond to the spatial patterning inputs of Hox proteins and the sex-specific input of Bric-à-brac transcription factors. Once established, it seems that these newly evolved activities were repeatedly modified by evolutionary changes in the network’s trans-regulators to generate large-scale changes in pigment pattern. By elucidating how yellow and tan are connected to the web of abdominal trans-regulators, we discovered that the yellow and tan abdominal CREs are composed of distinct regulatory inputs that exhibit contrasting responses to the same Hox proteins and Hox cofactors. These results provide an example in which CRE origination underlies a recently evolved novel trait, and highlights how coordinated expression patterns can evolve in parallel through the generation of unique regulatory linkages. The genomic content of regulatory genes such as transcription factors is surprisingly conserved between diverse animal species, raising the paradox of how new traits emerge, and are subsequently modified and lost. 
In this study we make a connection between the developmental basis for the formation of a fruit fly trait and the evolutionary basis for that trait’s origin, diversification, and loss. We show how the origin of a novel pigmentation trait is associated with the evolution of two regulatory sequences that control the co-expression of two key pigmentation genes. These sequences interact in unique ways with evolutionarily conserved Hox transcription factors to drive gene co-expression. Once these unique connections evolved, the alteration of this trait appears to have proceeded through changes to regulatory genes rather than regulatory sequences of the pigmentation genes. Thus, our findings support a scenario where regulatory sequence evolution provided new functions to old transcription factors, how co-expression can emerge from different utilizations of the same transcription factors, and that trait diversity was surprisingly shaped by changes in some manner to the deeply conserved regulatory genes.
Affiliation(s)
- Eric M. Camino
- Department of Biology, University of Dayton, Dayton, Ohio, United States of America
| | - John C. Butts
- Department of Biology, University of Dayton, Dayton, Ohio, United States of America
| | - Alison Ordway
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Jordan E. Vellky
- Department of Biology, University of Dayton, Dayton, Ohio, United States of America
| | - Mark Rebeiz
- Department of Biological Sciences, University of Pittsburgh, Pittsburgh, Pennsylvania, United States of America
| | - Thomas M. Williams
- Department of Biology, University of Dayton, Dayton, Ohio, United States of America
- Center for Tissue Regeneration and Engineering at Dayton, University of Dayton, Dayton, Ohio, United States of America
| |
|
19
|
Slattery M, Ma L, Spokony RF, Arthur RK, Kheradpour P, Kundaje A, Nègre N, Crofts A, Ptashkin R, Zieba J, Ostapenko A, Suchy S, Victorsen A, Jameel N, Grundstad AJ, Gao W, Moran JR, Rehm EJ, Grossman RL, Kellis M, White KP. Diverse patterns of genomic targeting by transcriptional regulators in Drosophila melanogaster. Genome Res 2014; 24:1224-35. [PMID: 24985916 PMCID: PMC4079976 DOI: 10.1101/gr.168807.113] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Indexed: 12/20/2022]
Abstract
Annotation of regulatory elements and identification of the transcription-related factors (TRFs) targeting these elements are key steps in understanding how cells interpret their genetic blueprint and their environment during development, and how that process goes awry in the case of disease. One goal of the modENCODE (model organism ENCyclopedia of DNA Elements) Project is to survey a diverse sampling of TRFs, both DNA-binding and non-DNA-binding factors, to provide a framework for the subsequent study of the mechanisms by which transcriptional regulators target the genome. Here we provide an updated map of the Drosophila melanogaster regulatory genome based on the location of 84 TRFs at various stages of development. This regulatory map reveals a variety of genomic targeting patterns, including factors with strong preferences toward proximal promoter binding, factors that target intergenic and intronic DNA, and factors with distinct chromatin state preferences. The data also highlight the stringency of the Polycomb regulatory network, and show association of the Trithorax-like (Trl) protein with hotspots of DNA binding throughout development. Furthermore, the data identify more than 5800 instances in which TRFs target DNA regions with demonstrated enhancer activity. Regions of high TRF co-occupancy are more likely to be associated with open enhancers used across cell types, while lower TRF occupancy regions are associated with complex enhancers that are also regulated at the epigenetic level. Together these data serve as a resource for the research community in the continued effort to dissect transcriptional regulatory mechanisms directing Drosophila development.
Affiliation(s)
- Matthew Slattery
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Lijia Ma
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Rebecca F Spokony
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Robert K Arthur
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Pouya Kheradpour
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Anshul Kundaje
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA
| | - Nicolas Nègre
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA; Université de Montpellier II and INRA, UMR1333 DGIMI, F-34095 Montpellier, France
| | - Alex Crofts
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Ryan Ptashkin
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jennifer Zieba
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Alexander Ostapenko
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Sarah Suchy
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Alec Victorsen
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Nader Jameel
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - A Jason Grundstad
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Wenxuan Gao
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Jennifer R Moran
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - E Jay Rehm
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Robert L Grossman
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory (CSAIL), Massachusetts Institute of Technology (MIT), Cambridge, Massachusetts 02139, USA; Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA
| | - Kevin P White
- Institute for Genomics & Systems Biology, Department of Human Genetics, The University of Chicago, Chicago, Illinois 60637, USA
| |
|
20
|
Ozdemir A, Ma L, White KP, Stathopoulos A. Su(H)-mediated repression positions gene boundaries along the dorsal-ventral axis of Drosophila embryos. Dev Cell 2014; 31:100-13. [PMID: 25313963 DOI: 10.1016/j.devcel.2014.08.005] [Citation(s) in RCA: 29] [Impact Index Per Article: 2.9] [Received: 09/17/2013] [Revised: 06/10/2014] [Accepted: 08/05/2014] [Indexed: 12/22/2022]
Abstract
In Drosophila embryos, a nuclear gradient of the Dorsal (Dl) transcription factor directs differential gene expression along the dorsoventral (DV) axis, translating it into distinct domains that specify future mesodermal, neural, and ectodermal territories. However, the mechanisms used to differentially position gene expression boundaries along this axis are not fully understood. Here, using a combination of approaches, including mutant phenotype analyses and chromatin immunoprecipitation, we show that the transcription factor Suppressor of Hairless, Su(H), helps define dorsal boundaries for many genes expressed along the DV axis. Synthetic reporter constructs also provide molecular evidence that Su(H) binding sites support repression and act to counterbalance activation through Dl and the ubiquitous activator Zelda. Our study highlights a role for broadly expressed repressors, like Su(H), and organization of transcription factor binding sites within cis-regulatory modules as important elements controlling spatial domains of gene expression to facilitate flexible positioning of boundaries across the entire DV axis.
Affiliation(s)
- Anil Ozdemir
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA
| | - Lijia Ma
- Institute for Genomics and Systems Biology and Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Kevin P White
- Institute for Genomics and Systems Biology and Department of Human Genetics, University of Chicago, Chicago, IL 60637, USA
| | - Angelike Stathopoulos
- Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, CA 91125, USA.
| |
|
21
|
Slattery M, Zhou T, Yang L, Dantas Machado AC, Gordân R, Rohs R. Absence of a simple code: how transcription factors read the genome. Trends Biochem Sci 2014; 39:381-99. [PMID: 25129887 DOI: 10.1016/j.tibs.2014.07.002] [Citation(s) in RCA: 366] [Impact Index Per Article: 33.3] [Received: 06/03/2014] [Revised: 07/11/2014] [Accepted: 07/15/2014] [Indexed: 12/21/2022]
Abstract
Transcription factors (TFs) influence cell fate by interpreting the regulatory DNA within a genome. TFs recognize DNA in a specific manner; the mechanisms underlying this specificity have been identified for many TFs based on 3D structures of protein-DNA complexes. More recently, structural views have been complemented with data from high-throughput in vitro and in vivo explorations of the DNA-binding preferences of many TFs. Together, these approaches have greatly expanded our understanding of TF-DNA interactions. However, the mechanisms by which TFs select in vivo binding sites and alter gene expression remain unclear. Recent work has highlighted the many variables that influence TF-DNA binding, while demonstrating that a biophysical understanding of these many factors will be central to understanding TF function.
Affiliation(s)
- Matthew Slattery
- Department of Biomedical Sciences, University of Minnesota Medical School, Duluth, MN 55812, USA; Developmental Biology Center, University of Minnesota, Minneapolis, MN 55455, USA.
| | - Tianyin Zhou
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Lin Yang
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Ana Carolina Dantas Machado
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA
| | - Raluca Gordân
- Center for Genomic and Computational Biology, Departments of Biostatistics and Bioinformatics, Computer Science, and Molecular Genetics and Microbiology, Duke University, Durham, NC 27708, USA.
| | - Remo Rohs
- Molecular and Computational Biology Program, Departments of Biological Sciences, Chemistry, Physics, and Computer Science, University of Southern California, Los Angeles, CA 90089, USA.
| |
|
22
|
Brittain A, Stroebele E, Erives A. Microsatellite repeat instability fuels evolution of embryonic enhancers in Hawaiian Drosophila. PLoS One 2014; 9:e101177. [PMID: 24978198 PMCID: PMC4076327 DOI: 10.1371/journal.pone.0101177] [Citation(s) in RCA: 9] [Impact Index Per Article: 0.8] [Received: 04/02/2014] [Accepted: 06/03/2014] [Indexed: 12/16/2022] Open
Abstract
For ∼30 million years, the eggs of Hawaiian Drosophila were laid in ever-changing environments caused by high rates of island formation. The associated diversification of the size and developmental rate of the syncytial fly embryo would have altered morphogenic gradients, thus necessitating frequent evolutionary compensation of transcriptional responses. We investigate the consequences these radiations had on transcriptional enhancers patterning the embryo to see whether their pattern of molecular evolution is different from non-Hawaiian species. We identify and functionally assay in transgenic D. melanogaster the Neurogenic Ectoderm Enhancers from two different Hawaiian Drosophila groups: (i) the picture wing group, and (ii) the modified mouthparts group. We find that the binding sites in this set of well-characterized enhancers are footprinted by diverse microsatellite repeat (MSR) sequences. We further show that Hawaiian embryonic enhancers in general are enriched in MSR relative to both Hawaiian non-embryonic enhancers and non-Hawaiian embryonic enhancers. We propose embryonic enhancers are sensitive to Activator spacing because they often serve as assembly scaffolds for the aggregation of transcription factor activator complexes. Furthermore, as most indels are produced by microsatellite repeat slippage, enhancers from Hawaiian Drosophila lineages, which experience dynamic evolutionary pressures, would become grossly enriched in MSR content.
Affiliation(s)
- Andrew Brittain
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
| | - Elizabeth Stroebele
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
| | - Albert Erives
- Department of Biology, University of Iowa, Iowa City, Iowa, United States of America
| |
|
23
|
Barrière A, Ruvinsky I. Pervasive divergence of transcriptional gene regulation in Caenorhabditis nematodes. PLoS Genet 2014; 10:e1004435. [PMID: 24968346 PMCID: PMC4072541 DOI: 10.1371/journal.pgen.1004435] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.7] [Received: 12/02/2013] [Accepted: 04/28/2014] [Indexed: 12/18/2022] Open
Abstract
Because there is considerable variation in gene expression even between closely related species, it is clear that gene regulatory mechanisms evolve relatively rapidly. Because primary sequence conservation is an unreliable proxy for functional conservation of cis-regulatory elements, their assessment must be carried out in vivo. We conducted a survey of cis-regulatory conservation between C. elegans and closely related species C. briggsae, C. remanei, C. brenneri, and C. japonica. We tested enhancers of eight genes from these species by introducing them into C. elegans and analyzing the expression patterns they drove. Our results support several notable conclusions. Most exogenous cis elements direct expression in the same cells as their C. elegans orthologs, confirming gross conservation of regulatory mechanisms. However, the majority of exogenous elements, when placed in C. elegans, also directed expression in cells outside endogenous patterns, suggesting functional divergence. Recurrent ectopic expression of different promoters in the same C. elegans cells may reflect biases in the directions in which expression patterns can evolve due to shared regulatory logic of coexpressed genes. The fact that, despite differences between individual genes, several patterns repeatedly emerged from our survey, encourages us to think that general rules governing regulatory evolution may exist and be discoverable.
Affiliation(s)
- Antoine Barrière
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
| | - Ilya Ruvinsky
- Department of Ecology and Evolution and Institute for Genomics and Systems Biology, The University of Chicago, Chicago, Illinois, United States of America
- Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
| |
|
24
|
Abstract
Instructions for when, where and to what level each gene should be expressed are encoded within regulatory sequences. The importance of motifs recognized by DNA-binding regulators has long been known, but their extensive characterization afforded by recent technologies only partly accounts for how regulatory instructions are encoded in the genome. Here, we review recent advances in our understanding of regulatory sequences that influence transcription and go beyond the description of motifs. We discuss how understanding different aspects of the sequence-encoded regulation can help to unravel the genotype-phenotype relationship, which would lead to a more accurate and mechanistic interpretation of personal genome sequences.
Affiliation(s)
- Michal Levo
- Department of Molecular Cell Biology, and Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
| | - Eran Segal
- Department of Molecular Cell Biology, and Department of Computer Science and Applied Mathematics, Weizmann Institute of Science, Rehovot 76100, Israel
| |
|
25
|
Kvon EZ, Kazmar T, Stampfel G, Yáñez-Cuna JO, Pagani M, Schernhuber K, Dickson BJ, Stark A. Genome-scale functional characterization of Drosophila developmental enhancers in vivo. Nature 2014; 512:91-5. [PMID: 24896182 DOI: 10.1038/nature13395] [Citation(s) in RCA: 315] [Impact Index Per Article: 28.6] [Received: 01/07/2014] [Accepted: 04/17/2014] [Indexed: 01/31/2023]
Abstract
Transcriptional enhancers are crucial regulators of gene expression and animal development and the characterization of their genomic organization, spatiotemporal activities and sequence properties is a key goal in modern biology. Here we characterize the in vivo activity of 7,705 Drosophila melanogaster enhancer candidates covering 13.5% of the non-coding non-repetitive genome throughout embryogenesis. 3,557 (46%) candidates are active, suggesting a high density with 50,000 to 100,000 developmental enhancers genome-wide. The vast majority of enhancers display specific spatial patterns that are highly dynamic during development. Most appear to regulate their neighbouring genes, suggesting that the cis-regulatory genome is organized locally into domains, which are supported by chromosomal domains, insulator binding and genome evolution. However, 12 to 21 per cent of enhancers appear to skip non-expressed neighbours and regulate a more distal gene. Finally, we computationally identify cis-regulatory motifs that are predictive and required for enhancer activity, as we validate experimentally. This work provides global insights into the organization of an animal regulatory genome and the make-up of enhancer sequences and confirms and generalizes principles from previous studies. All enhancer patterns are annotated manually with a controlled vocabulary and all results are available through a web interface (http://enhancers.starklab.org), including the raw images of all microscopy slides for manual inspection at arbitrary zoom levels.
Affiliation(s)
- Evgeny Z Kvon
- [1] Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria [2] Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia 20147, USA (B.J.D.); Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA (E.Z.K.)
| | - Tomas Kazmar
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria
| | - Gerald Stampfel
- [1] Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria [2]
| | - J Omar Yáñez-Cuna
- [1] Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria [2]
| | - Michaela Pagani
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria
| | - Katharina Schernhuber
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria
| | - Barry J Dickson
- [1] Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria [2] Howard Hughes Medical Institute, Janelia Farm Research Campus, Ashburn, Virginia 20147, USA (B.J.D.); Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA (E.Z.K.)
| | - Alexander Stark
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter VBC, Dr Bohr-Gasse 7, 1030 Vienna, Austria
| |
Collapse
|
26
|
Enhancer diversity and the control of a simple pattern of Drosophila CNS midline cell expression. Dev Biol 2014; 392:466-82. [PMID: 24854999 DOI: 10.1016/j.ydbio.2014.05.011] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.9] [Received: 07/25/2013] [Revised: 05/06/2014] [Accepted: 05/13/2014] [Indexed: 01/13/2023]
Abstract
Transcriptional enhancers integrate information derived from transcription factor binding to control gene expression. One key question concerns the extent of trans- and cis-regulatory variation in how co-expressed genes are controlled. The Drosophila CNS midline cells constitute a group of neurons and glia in which expression changes can be readily characterized during specification and differentiation. Using a transgenic approach, we compare the cis-regulation of multiple genes expressed in the Drosophila CNS midline primordium cells, and show that while the expression patterns may appear alike, the target genes are not equivalent in how these common expression patterns are achieved. Some genes utilize a single enhancer that promotes expression in all midline cells, while others utilize multiple enhancers with distinct spatial, temporal, and quantitative contributions. Two regulators, Single-minded and Notch, play key roles in controlling early midline gene expression. While Single-minded is expected to control expression of most, if not all, midline primordium-expressed genes, the role of Notch in directly controlling midline transcription is unknown. Midline primordium expression of the rhomboid gene is dependent on cell signaling by the Notch signaling pathway. Mutational analysis of a rhomboid enhancer reveals at least 5 distinct types of functional cis-control elements, including a binding site for the Notch effector, Suppressor of Hairless. The results suggest a model in which Notch/Suppressor of Hairless levels are insufficient to activate rhomboid expression by themselves, but do so in conjunction with additional factors, some of which, including Single-minded, provide midline specificity to Notch activation. Similarly, a midline glial enhancer from the argos gene, which is dependent on EGF/Spitz signaling, is directly regulated by contributions from both Pointed, the EGF transcriptional effector, and Single-minded. In contrast, midline primordium expression of other genes shows a strong dependence on Single-minded and varying combinations of additional transcription factors. Thus, Single-minded directly regulates midline primordium-expressed genes, but in some cases plays a primary role in directing target gene midline expression, and in others provides midline specificity to cell signaling inputs.
|
27
|
Xu Z, Chen H, Ling J, Yu D, Struffi P, Small S. Impacts of the ubiquitous factor Zelda on Bicoid-dependent DNA binding and transcription in Drosophila. Genes Dev 2014; 28:608-21. [PMID: 24637116 PMCID: PMC3967049 DOI: 10.1101/gad.234534.113] [Citation(s) in RCA: 77] [Impact Index Per Article: 7.0] [Indexed: 01/01/2023]
Abstract
The Drosophila transcription factor Bicoid (Bcd) binds thousands of genomic sites during early embryogenesis, but it is unclear how many of these binding events are functionally important. Here, Small and colleagues test the role of the maternal factor Zelda (Zld) in Bcd-mediated binding and transcription. Embryos lacking Zld show enhanced Bcd binding to a subset of genomic locations, causing early activation of target genes normally silent until later stages. This study demonstrates a critical role for Zld in controlling Bcd binding and target gene activation in the early embryo. In vivo cross-linking studies suggest that the Drosophila transcription factor Bicoid (Bcd) binds to several thousand sites during early embryogenesis, but it is not clear how many of these binding events are functionally important. In contrast, reporter gene studies have identified >60 Bcd-dependent enhancers, all of which contain clusters of the consensus binding sequence TAATCC. These studies also identified clusters of TAATCC motifs (inactive fragments) that failed to drive Bcd-dependent activation. In general, active fragments showed higher levels of Bcd binding in vivo and were enriched in predicted binding sites for the ubiquitous maternal protein Zelda (Zld). Here we tested the role of Zld in Bcd-mediated binding and transcription. Removal of Zld function and mutations in Zld sites caused significant reductions in Bcd binding to known enhancers and variable effects on the activation and spatial positioning of Bcd-dependent expression patterns. Also, insertion of Zld sites converted one of six inactive fragments into a Bcd-responsive enhancer. Genome-wide binding experiments in zld mutants showed variable effects on Bcd-binding peaks, ranging from strong reductions to significantly enhanced levels of binding. 
Increases in Bcd binding caused the precocious Bcd-dependent activation of genes that are normally not expressed in early embryos, suggesting that Zld controls the genome-wide binding profile of Bcd at the qualitative level and is critical for selecting target genes for activation in the early embryo. These results underscore the importance of combinatorial binding in enhancer function and provide data that will help predict regulatory activities based on DNA sequence.
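As a rough illustration of what "clusters of the consensus binding sequence TAATCC" means computationally, the following minimal Python sketch scans a sequence for the motif on both strands and groups nearby hits. Only the TAATCC consensus comes from the abstract; the function names, the 500 bp window and the three-site minimum are hypothetical choices made for this example and are not parameters reported by the authors.

```python
import re

BCD_MOTIF = "TAATCC"  # consensus Bcd site quoted in the abstract


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def motif_positions(seq, motif=BCD_MOTIF):
    """Sorted 0-based start positions of the motif on either strand."""
    seq = seq.upper()
    hits = set()
    for pattern in (motif, revcomp(motif)):
        # A zero-width lookahead lets overlapping matches be reported.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.add(match.start())
    return sorted(hits)


def find_clusters(positions, window=500, min_sites=3):
    """Merge runs of >= min_sites hits whose starts fall within `window` bp.

    `window` and `min_sites` are illustrative assumptions.
    Returns (first_hit, last_hit) pairs.
    """
    clusters = []
    j = 0
    for i, start in enumerate(positions):
        j = max(j, i)
        while j + 1 < len(positions) and positions[j + 1] - start < window:
            j += 1
        if j - i + 1 >= min_sites:
            if clusters and start <= clusters[-1][1]:
                # This window overlaps the previous cluster: extend it.
                clusters[-1] = (clusters[-1][0], positions[j])
            else:
                clusters.append((start, positions[j]))
    return clusters
```

A scan of this kind only proposes candidate fragments; as the abstract stresses, some TAATCC clusters are inactive, so motif density alone does not predict Bcd-dependent activation.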
Affiliation(s)
- Zhe Xu
- Department of Biology, New York University, New York, New York 10003, USA
|
28
|
Lacin H, Rusch J, Yeh RT, Fujioka M, Wilson BA, Zhu Y, Robie AA, Mistry H, Wang T, Jaynes JB, Skeath JB. Genome-wide identification of Drosophila Hb9 targets reveals a pivotal role in directing the transcriptome within eight neuronal lineages, including activation of nitric oxide synthase and Fd59a/Fox-D. Dev Biol 2014; 388:117-33. [PMID: 24512689 DOI: 10.1016/j.ydbio.2014.01.029] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.0] [Received: 03/27/2013] [Revised: 01/11/2014] [Accepted: 01/31/2014] [Indexed: 11/25/2022]
Abstract
Hb9 is a homeodomain-containing transcription factor that acts in combination with Nkx6, Lim3, and Tail-up (Islet) to guide the stereotyped differentiation, connectivity, and function of a subset of neurons in Drosophila. The role of Hb9 in directing neuronal differentiation is well documented, but the lineage of Hb9(+) neurons is only partly characterized, its regulation is poorly understood, and most of the downstream genes through which it acts remain at large. Here, we complete the lineage tracing of all embryonic Hb9(+) neurons (to eight neuronal lineages) and provide evidence that hb9, lim3, and tail-up are coordinately regulated by a common set of upstream factors. Through the parallel use of micro-array gene expression profiling and the Dam-ID method, we searched for Hb9-regulated genes, uncovering transcription factors as the most over-represented class of genes regulated by Hb9 (and Nkx6) in the CNS. By a nearly ten-to-one ratio, Hb9 represses rather than activates transcription factors, highlighting transcriptional repression of other transcription factors as a core mechanism by which Hb9 governs neuronal determination. From the small set of genes activated by Hb9, we characterized the expression and function of two - fd59a/foxd, which encodes a transcription factor, and Nitric oxide synthase. Under standard lab conditions, both genes are dispensable for Drosophila development, but Nos appears to inhibit hyper-active behavior and fd59a appears to act in octopaminergic neurons to control egg-laying behavior. Together our data clarify the mechanisms through which Hb9 governs neuronal specification and differentiation and provide an initial characterization of the expression and function of Nos and fd59a in the Drosophila CNS.
Affiliation(s)
- Haluk Lacin
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Jannette Rusch
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Raymond T Yeh
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Miki Fujioka
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, 1020 Locust Street, Philadelphia, PA 19107, USA
- Beth A Wilson
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Yi Zhu
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Alice A Robie
- Howard Hughes Medical Institute, Janelia Farm Research Campus (HHMI JFRC), Ashburn, VA, USA
- Hemlata Mistry
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- Ting Wang
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
- James B Jaynes
- Department of Biochemistry and Molecular Biology, Thomas Jefferson University, 1020 Locust Street, Philadelphia, PA 19107, USA
- James B Skeath
- Department of Genetics, Washington University School of Medicine, St. Louis 4566, Scott Avenue, St. Louis, MO 63110, USA
|
29
|
Erceg J, Saunders TE, Girardot C, Devos DP, Hufnagel L, Furlong EEM. Subtle changes in motif positioning cause tissue-specific effects on robustness of an enhancer's activity. PLoS Genet 2014; 10:e1004060. [PMID: 24391522 PMCID: PMC3879207 DOI: 10.1371/journal.pgen.1004060] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Received: 03/19/2013] [Accepted: 11/11/2013] [Indexed: 12/14/2022] Open
Abstract
Deciphering the specific contribution of individual motifs within cis-regulatory modules (CRMs) is crucial to understanding how gene expression is regulated and how this process is affected by sequence variation. But despite vast improvements in the ability to identify where transcription factors (TFs) bind throughout the genome, we are limited in our ability to relate information on motif occupancy to function from sequence alone. Here, we engineered 63 synthetic CRMs to systematically assess the relationship between variation in the content and spacing of motifs within CRMs to CRM activity during development using Drosophila transgenic embryos. In over half the cases, very simple elements containing only one or two types of TF binding motifs were capable of driving specific spatio-temporal patterns during development. Different motif organizations provide different degrees of robustness to enhancer activity, ranging from binary on-off responses to more subtle effects including embryo-to-embryo and within-embryo variation. By quantifying the effects of subtle changes in motif organization, we were able to model biophysical rules that explain CRM behavior and may contribute to the spatial positioning of CRM activity in vivo. For the same enhancer, the effects of small differences in motif positions varied in developmentally related tissues, suggesting that gene expression may be more susceptible to sequence variation in one tissue compared to another. This result has important implications for human eQTL studies in which many associated mutations are found in cis-regulatory regions, though the mechanism for how they affect tissue-specific gene expression is often not understood.
Affiliation(s)
- Jelena Erceg
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Timothy E. Saunders
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Charles Girardot
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Damien P. Devos
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Lars Hufnagel
- Cell Biology and Biophysics Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
- Eileen E. M. Furlong
- Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Heidelberg, Germany
|
30
|
Jiang P, Singh M. CCAT: Combinatorial Code Analysis Tool for transcriptional regulation. Nucleic Acids Res 2013; 42:2833-47. [PMID: 24366875 PMCID: PMC3950699 DOI: 10.1093/nar/gkt1302] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Indexed: 01/08/2023] Open
Abstract
Combinatorial interplay among transcription factors (TFs) is an important mechanism by which transcriptional regulatory specificity is achieved. However, despite the increasing number of TFs for which either binding specificities or genome-wide occupancy data are known, knowledge about cooperativity between TFs remains limited. To address this, we developed a computational framework for predicting genome-wide co-binding between TFs (CCAT, Combinatorial Code Analysis Tool), and applied it to Drosophila melanogaster to uncover cooperativity among TFs during embryo development. Using publicly available TF binding specificity data and DNaseI chromatin accessibility data, we first predicted genome-wide binding sites for 324 TFs across five stages of D. melanogaster embryo development. We then applied CCAT in each of these developmental stages, and identified from 19 to 58 pairs of TFs in each stage whose predicted binding sites are significantly co-localized. We found that nearby binding sites for pairs of TFs predicted to cooperate were enriched in regions bound in relevant ChIP experiments, and were more evolutionarily conserved than other pairs. Further, we found that TFs tend to be co-localized with other TFs in a dynamic manner across developmental stages. All generated data as well as source code for our front-to-end pipeline are available at http://cat.princeton.edu.
Affiliation(s)
- Peng Jiang
- Department of Computer Science, Princeton University, Princeton, 08540 NJ, USA and Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, 08544 NJ, USA
|
31
|
Vadasz S, Marquez J, Tulloch M, Shylo NA, García-Castro MI. Pax7 is regulated by cMyb during early neural crest development through a novel enhancer. Development 2013; 140:3691-702. [PMID: 23942518 DOI: 10.1242/dev.088328] [Citation(s) in RCA: 17] [Impact Index Per Article: 1.4] [Indexed: 02/02/2023]
Abstract
The neural crest (NC) is a migratory population of cells unique to vertebrates that generates many diverse derivatives. NC cells arise during gastrulation at the neural plate border (NPB), which is later elevated as the neural folds (NFs) form and fuse in the dorsal region of the closed neural tube, from where NC cells emigrate. In chick embryos, Pax7 is an early marker, and necessary component of NC development. Unlike other early NPB markers, which are co-expressed in lateral ectoderm, medial neural plate or posterior-lateral mesoderm, Pax7 early expression seems more restricted to the NPB. However, the molecular mechanisms controlling early Pax7 expression remain poorly understood. Here, we identify a novel enhancer of Pax7 in avian embryos that replicates the expression of Pax7 associated with early NC development. Expression from this enhancer is found in early NPB, NFs and early emigrating NC, but unlike Pax7, which is also expressed in mesodermal derivatives, this enhancer is not active in somites. Further analysis demonstrates that cMyb is able to interact with this enhancer and modulates reporter and endogenous early Pax7 expression; thus, cMyb is identified as a novel regulator of Pax7 in early NC development.
Affiliation(s)
- Stephanie Vadasz
- Department of Molecular, Cellular, and Developmental Biology, Yale University, New Haven, CT 06520-8103, USA
|
32
|
Diermeier SD, Németh A, Rehli M, Grummt I, Längst G. Chromatin-specific regulation of mammalian rDNA transcription by clustered TTF-I binding sites. PLoS Genet 2013; 9:e1003786. [PMID: 24068958 PMCID: PMC3772059 DOI: 10.1371/journal.pgen.1003786] [Citation(s) in RCA: 20] [Impact Index Per Article: 1.7] [Received: 01/28/2013] [Accepted: 07/26/2013] [Indexed: 12/04/2022] Open
Abstract
Enhancers and promoters often contain multiple binding sites for the same transcription factor, suggesting that homotypic clustering of binding sites may serve a role in transcription regulation. Here we show that clustering of binding sites for the transcription termination factor TTF-I downstream of the pre-rRNA coding region specifies transcription termination, increases the efficiency of transcription initiation and affects the three-dimensional structure of rRNA genes. On chromatin templates, but not on free rDNA, clustered binding sites promote cooperative binding of TTF-I, loading TTF-I to the downstream terminators before it binds to the rDNA promoter. Interaction of TTF-I with target sites upstream and downstream of the rDNA transcription unit connects these distal DNA elements by forming a chromatin loop between the rDNA promoter and the terminators. The results imply that clustered binding sites increase the binding affinity of transcription factors in chromatin, thus influencing the timing and strength of DNA-dependent processes. The sequence-specific binding of proteins to regulatory regions controls gene expression. Binding sites for transcription factors are rather short and present several million times in large genomes. However, only a small number of these binding sites are functionally important. How proteins can discriminate and select their functional regions is not clear, to date. Regulatory loci like gene promoters and enhancers commonly comprise multiple binding sites for either one factor or a combination of several DNA binding proteins, allowing efficient factor recruitment. We studied the cluster of TTF-I binding sites downstream of the rRNA gene and identified that cooperative binding to the multimeric termination sites in combination with low-affinity binding of TTF-I to individual sites upstream of the gene serves multiple regulatory functions. 
Packaging of the clustered sites into chromatin is a prerequisite for high-affinity binding, coordinated activation of transcription and the formation of a chromatin loop between the promoter and the terminator.
Affiliation(s)
- Sarah D. Diermeier
- Biochemistry Centre Regensburg (BCR), University of Regensburg, Regensburg, Germany
- Attila Németh
- Biochemistry Centre Regensburg (BCR), University of Regensburg, Regensburg, Germany
- Michael Rehli
- Department of Hematology, University Hospital Regensburg, Regensburg, Germany
- Ingrid Grummt
- Molecular Biology of the Cell II, German Cancer Research Centre (DKFZ), Heidelberg, Germany
- Gernot Längst
- Biochemistry Centre Regensburg (BCR), University of Regensburg, Regensburg, Germany
|
33
|
Menoret D, Santolini M, Fernandes I, Spokony R, Zanet J, Gonzalez I, Latapie Y, Ferrer P, Rouault H, White KP, Besse P, Hakim V, Aerts S, Payre F, Plaza S. Genome-wide analyses of Shavenbaby target genes reveals distinct features of enhancer organization. Genome Biol 2013; 14:R86. [PMID: 23972280 PMCID: PMC4053989 DOI: 10.1186/gb-2013-14-8-r86] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.4] [Received: 03/07/2013] [Accepted: 08/23/2013] [Indexed: 12/17/2022] Open
Abstract
Background: Developmental programs are implemented by regulatory interactions between Transcription Factors (TFs) and their target genes, which remain poorly understood. While recent studies have focused on regulatory cascades of TFs that govern early development, little is known about how the ultimate effectors of cell differentiation are selected and controlled. We addressed this question during late Drosophila embryogenesis, when the finely tuned expression of the TF Ovo/Shavenbaby (Svb) triggers the morphological differentiation of epidermal trichomes. Results: We defined a sizeable set of genes downstream of Svb and used in vivo assays to delineate 14 enhancers driving their specific expression in trichome cells. Coupling computational modeling to functional dissection, we investigated the regulatory logic of these enhancers. Extending the repertoire of epidermal effectors using genome-wide approaches showed that the regulatory models learned from this first sample are representative of the whole set of trichome enhancers. These enhancers harbor remarkable features with respect to their functional architectures, including a weak or non-existent clustering of Svb binding sites. The in vivo function of each site relies on its intimate context, notably the flanking nucleotides. Two additional cis-regulatory motifs, present in a broad diversity of composition and positioning among trichome enhancers, critically contribute to enhancer activity. Conclusions: Our results show that Svb directly regulates a large set of terminal effectors of the remodeling of epidermal cells. Further, these data reveal that trichome formation is underpinned by unexpectedly diverse modes of regulation, providing fresh insights into the functional architecture of enhancers governing a terminal differentiation program.
|
34
|
Kenigsberg E, Tanay A. Drosophila functional elements are embedded in structurally constrained sequences. PLoS Genet 2013; 9:e1003512. [PMID: 23750124 PMCID: PMC3671938 DOI: 10.1371/journal.pgen.1003512] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 07/31/2012] [Accepted: 03/04/2013] [Indexed: 12/22/2022] Open
Abstract
Modern functional genomics uncovered numerous functional elements in metazoan genomes. Nevertheless, only a small fraction of the typical non-exonic genome contains elements that code for function directly. On the other hand, a much larger fraction of the genome is associated with significant evolutionary constraints, suggesting that much of the non-exonic genome is weakly functional. Here we show that in flies, local (30–70 bp) conserved sequence elements that are associated with multiple regulatory functions serve as focal points to a pattern of punctuated regional increase in G/C nucleotide frequencies. We show that this pattern, which covers a region tenfold larger than the conserved elements themselves, is an evolutionary consequence of a shift in the balance between gain and loss of G/C nucleotides and that it is correlated with nucleosome occupancy across multiple classes of epigenetic state. Evidence for compensatory evolution and analysis of SNP allele frequencies show that the evolutionary regime underlying this balance shift is likely to be non-neutral. These data suggest that current gaps in our understanding of genome function and evolutionary dynamics are explicable by a model of sparse sequence elements directly encoding for function, embedded into structural sequences that help to define the local and global epigenomic context of such functional elements. A key challenge in functional genomics is to predict evolutionary dynamics from functional annotation of the genome and vice versa. Modern epigenomic studies helped assign function to numerous new sequence elements, but left most of the genome essentially uncharacterized. Evolutionary genomics, on the other hand, consistently suggests that a much larger fraction of the un-annotated genome evolves under selective pressure. 
We hypothesize that this function-selection gap can be attributed to sequences that facilitate the physical organization of functional elements, such as transcription factor binding sites, within chromosomes. We exemplify this by studying in detail the sequences embedding small conserved elements (CEs) in Drosophila. We show that, while CEs have typically high AT content, high GC content levels around them are maintained by a non-neutral evolutionary balance between gain and loss of GC nucleotides. This non-uniform pattern is highly correlated with nucleosome organization around CEs, potentially imposing an evolutionary constraint on as much as one quarter of the genome. We suggest this can at least partly explain the above function-selection gap. Weak evolutionary constraints on “structural” sequences (at scales ranging from one nucleosome to recently described multi-megabase topological domains) may affect genome evolution just like structural motifs shape protein evolution.
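The contrast described above, AT-rich conserved elements flanked by GC-rich sequence, can be pictured with a simple binned GC profile around an element's midpoint. The Python sketch below is illustrative only: the function names, the bin size and the number of bins are assumptions made for this example, and it does not reproduce the evolutionary gain/loss analysis reported in the abstract.

```python
def gc_fraction(seq):
    """Fraction of G/C among unambiguous bases; None if there are none."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return None
    return (seq.count("G") + seq.count("C")) / counted


def flanking_gc_profile(seq, center, bin_size=50, n_bins=4):
    """GC fraction in consecutive bins on both sides of `center`.

    `bin_size` and `n_bins` are illustrative assumptions.
    Returns (offset_from_center, gc_fraction) pairs; bins that run
    past either end of the sequence are skipped.
    """
    profile = []
    for k in range(-n_bins, n_bins):
        start = center + k * bin_size
        end = start + bin_size
        if start < 0 or end > len(seq):
            continue
        profile.append((k * bin_size, gc_fraction(seq[start:end])))
    return profile
```

Applied to a sequence with an AT-rich core and GC-rich flanks, the profile dips near offset 0 and rises with distance, which is the qualitative shape the abstract describes.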
Affiliation(s)
- Ephraim Kenigsberg
- Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
- Amos Tanay
- Department of Computer Science and Applied Mathematics and Department of Biological Regulation, Weizmann Institute, Rehovot, Israel
|
35
|
Crocker J, Erives A. A Schnurri/Mad/Medea complex attenuates the dorsal-twist gradient readout at vnd. Dev Biol 2013; 378:64-72. [PMID: 23499655 DOI: 10.1016/j.ydbio.2013.03.002] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 11/24/2012] [Revised: 02/13/2013] [Accepted: 03/04/2013] [Indexed: 12/28/2022]
Abstract
Morphogen gradients are used in developing embryos, where they subdivide a field of cells into territories characterized by distinct cell fate potentials. Such systems require both a spatially-graded distribution of the morphogen, and an ability to encode different responses at different target genes. However, the potential for different temporal responses is also present because morphogen gradients typically provide temporal cues, which may be a potential source of conflict. Thus, a low threshold response adapted for an early temporal onset may be inappropriate when the desired spatial response is a spatially-limited, high-threshold expression pattern. Here, we identify such a case with the Drosophila vnd locus, which is a target of the dorsal (dl) nuclear concentration gradient that patterns the dorsal/ventral (D/V) axis of the embryo. The vnd gene plays a critical role in the "ventral dominance" hierarchy of vnd, ind, and msh, which individually specify distinct D/V neural columnar fates in increasingly dorsal ectodermal compartments. The role of vnd in this regulatory hierarchy requires early temporal expression, which is characteristic of low-threshold responses, but its specification of ventral neurogenic ectoderm demands a relatively high-threshold response to dl. We show that the Neurogenic Ectoderm Enhancer (NEE) at vnd takes additional input from the complementary Dpp gradient via a conserved Schnurri/Mad/Medea silencer element (SSE) unlike NEEs at brk, sog, rho, and vn. These results show how requirements for conflicting temporal and spatial responses to the same gradient can be solved by additional inputs from complementary gradients.
Affiliation(s)
- Justin Crocker
- Janelia Farm Research Campus, Howard Hughes Medical Institute, 19700 Helix Drive, Ashburn, VA 20147, USA
|
36
|
Dpp-induced Egfr signaling triggers postembryonic wing development in Drosophila. Proc Natl Acad Sci U S A 2013; 110:5058-63. [PMID: 23479629 DOI: 10.1073/pnas.1217538110] [Citation(s) in RCA: 35] [Impact Index Per Article: 2.9] [Indexed: 11/18/2022] Open
Abstract
The acquisition of flight contributed to the success of insects and winged forms are present in most orders. Key to understanding the origin of wings will be knowledge of the earliest postembryonic events promoting wing outgrowth. The Drosophila melanogaster wing is intensely studied as a model appendage, and yet little is known about the beginning of wing outgrowth. Vein (Vn) is a neuregulin-like ligand for the EGF receptor (Egfr), which is necessary for global development of the early Drosophila wing disc. vn is not expressed in the embryonic wing primordium and thus has to be induced de novo in the nascent larval wing disc. We find that Decapentaplegic (Dpp), a Bone Morphogenetic Protein (BMP) family member, provides the instructive signal for initiating vn expression. The signaling involves paracrine communication between two epithelia in the early disc. Once initiated, vn expression is amplified and maintained by autocrine signaling mediated by the E-twenty six (ETS)-factor PointedP2 (PntP2). This interplay of paracrine and autocrine signaling underlies the spatial and temporal pattern of induction of Vn/Egfr target genes and explains both body wall development and wing outgrowth. It is possible this gene regulatory network governing expression of an EGF ligand is conserved and reflects a common origin of insect wings.
|
37
|
Kim AR, Martinez C, Ionides J, Ramos AF, Ludwig MZ, Ogawa N, Sharp DH, Reinitz J. Rearrangements of 2.5 kilobases of noncoding DNA from the Drosophila even-skipped locus define predictive rules of genomic cis-regulatory logic. PLoS Genet 2013; 9:e1003243. [PMID: 23468638 PMCID: PMC3585115 DOI: 10.1371/journal.pgen.1003243] [Citation(s) in RCA: 48] [Impact Index Per Article: 4.0] [Received: 03/15/2012] [Accepted: 11/30/2012] [Indexed: 01/19/2023] Open
Abstract
Rearrangements of about 2.5 kilobases of regulatory DNA located 5' of the transcription start site of the Drosophila even-skipped locus generate large-scale changes in the expression of even-skipped stripes 2, 3, and 7. The most radical effects are generated by juxtaposing the minimal stripe enhancers MSE2 and MSE3 for stripes 2 and 3 with and without small "spacer" segments less than 360 bp in length. We placed these fusion constructs in a targeted transformation site and obtained quantitative expression data for these transformants together with their controlling transcription factors at cellular resolution. These data demonstrated that the rearrangements can alter expression levels in stripe 2 and the 2-3 interstripe by a factor of more than 10. We reasoned that this behavior would place tight constraints on possible rules of genomic cis-regulatory logic. To find these constraints, we confronted our new expression data together with previously obtained data on other constructs with a computational model. The model contained representations of thermodynamic protein-DNA interactions including steric interference and cooperative binding, short-range repression, direct repression, activation, and coactivation. The model was highly constrained by the training data, which it described within the limits of experimental error. The model, so constrained, was able to correctly predict expression patterns driven by enhancers for other Drosophila genes; even-skipped enhancers not included in the training set; stripe 2, 3, and 7 enhancers from various Drosophilid and Sepsid species; and long segments of even-skipped regulatory DNA that contain multiple enhancers. The model further demonstrated that elevated expression driven by a fusion of MSE2 and MSE3 was a consequence of the recruitment of a portion of MSE3 to become a functional component of MSE2, demonstrating that cis-regulatory "elements" are not elementary objects.
Affiliation(s)
- Ah-Ram Kim
  - Department of Ecology and Evolution, Chicago Center for Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Biochemistry and Cell Biology, Stony Brook University, Stony Brook, New York, United States of America
- Carlos Martinez
  - Department of Ecology and Evolution, Chicago Center for Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- John Ionides
  - Department of Biochemistry, University of Cambridge, Cambridge, United Kingdom
- Alexandre F. Ramos
  - Escola de Artes, Ciências e Humanidades, Universidade de São Paulo, São Paulo, Brazil
- Michael Z. Ludwig
  - Department of Ecology and Evolution, Chicago Center for Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- Nobuo Ogawa
  - Genomics Division, Lawrence Berkeley National Laboratory, Berkeley, California, United States of America
- David H. Sharp
  - Theoretical Division, Los Alamos National Laboratory, Los Alamos, New Mexico, United States of America
- John Reinitz
  - Department of Ecology and Evolution, Chicago Center for Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Statistics, Department of Molecular Genetics and Cell Biology, and Institute of Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
|
38
|
Deciphering the transcriptional cis-regulatory code. Trends Genet 2012; 29:11-22. [PMID: 23102583 DOI: 10.1016/j.tig.2012.09.007] [Citation(s) in RCA: 89] [Impact Index Per Article: 6.8] [Received: 07/09/2012] [Revised: 09/24/2012] [Accepted: 09/25/2012] [Indexed: 02/07/2023]
Abstract
Information about developmental gene expression resides in defined regulatory elements, called enhancers, in the non-coding part of the genome. Although cells reliably utilize enhancers to orchestrate gene expression, a cis-regulatory code that would allow their interpretation has remained one of the greatest challenges of modern biology. In this review, we summarize studies from the past three decades that describe progress towards revealing the properties of enhancers and discuss how recent approaches are providing unprecedented insights into regulatory elements in animal genomes. Over the next years, we believe that the functional characterization of regulatory sequences in entire genomes, combined with recent computational methods, will provide a comprehensive view of genomic regulatory elements and their building blocks and will enable researchers to begin to understand the sequence basis of the cis-regulatory code.
|
39
|
Frankel N. Multiple layers of complexity in cis-regulatory regions of developmental genes. Dev Dyn 2012; 241:1857-66. [DOI: 10.1002/dvdy.23871] [Citation(s) in RCA: 27] [Impact Index Per Article: 2.1] [Accepted: 09/06/2012] [Indexed: 12/19/2022]
|
40
|
Wang S, Yin Y, Ma Q, Tang X, Hao D, Xu Y. Genome-scale identification of cell-wall related genes in Arabidopsis based on co-expression network analysis. BMC Plant Biology 2012; 12:138. [PMID: 22877077 PMCID: PMC3463447 DOI: 10.1186/1471-2229-12-138] [Citation(s) in RCA: 40] [Impact Index Per Article: 3.1] [Received: 05/14/2012] [Accepted: 07/30/2012] [Indexed: 05/21/2023]
Abstract
BACKGROUND: Identification of the novel genes relevant to plant cell-wall (PCW) synthesis represents a highly important and challenging problem. Although substantial efforts have been invested into studying this problem, the vast majority of the PCW related genes remain unknown. RESULTS: Here we present a computational study focused on identification of the novel PCW genes in Arabidopsis based on the co-expression analyses of transcriptomic data collected under 351 conditions, using a bi-clustering technique. Our analysis identified 217 highly co-expressed gene clusters (modules) under some experimental conditions, each containing at least one gene annotated as PCW related according to the Purdue Cell Wall Gene Families database. These co-expression modules cover 349 known/annotated PCW genes and 2,438 new candidates. For each candidate gene, we annotated the specific PCW synthesis stages in which it is involved and predicted the detailed function. In addition, for the co-expressed genes in each module, we predicted and analyzed their cis-regulatory motifs in the promoters using our motif discovery pipeline, providing strong evidence that the genes in each co-expression module are transcriptionally co-regulated. From all the co-expression modules, we infer that 108 modules are related to four major PCW synthesis components, using three complementary methods. CONCLUSIONS: We believe our approach and data presented here will be useful for further identification and characterization of PCW genes. All the predicted PCW genes, co-expression modules, motifs and their annotations are available at a web-based database: http://csbl.bmb.uga.edu/publications/materials/shanwang/CWRPdb/index.html.
Affiliation(s)
- Shan Wang
  - Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA
  - Key Lab for Molecular Enzymology and Engineering of the Ministry of Education, Jilin University, Changchun, China
  - Biotechnology Research Centre, Jilin Academy of Agricultural Sciences (JAAS), Changchun, China
- Yanbin Yin
  - Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA
  - BESC BioEnergy Science Center, University of Georgia, Athens, GA, USA
- Qin Ma
  - Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA
  - BESC BioEnergy Science Center, University of Georgia, Athens, GA, USA
- Xiaojia Tang
  - Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA
- Dongyun Hao
  - Key Lab for Molecular Enzymology and Engineering of the Ministry of Education, Jilin University, Changchun, China
  - Biotechnology Research Centre, Jilin Academy of Agricultural Sciences (JAAS), Changchun, China
- Ying Xu
  - Computational Systems Biology Laboratory, Department of Biochemistry and Molecular Biology, and Institute of Bioinformatics, Athens, GA, USA
  - BESC BioEnergy Science Center, University of Georgia, Athens, GA, USA
  - College of Computer Science and Technology, Jilin University, Changchun, China
|
41
|
Busser BW, Taher L, Kim Y, Tansey T, Bloom MJ, Ovcharenko I, Michelson AM. A machine learning approach for identifying novel cell type-specific transcriptional regulators of myogenesis. PLoS Genet 2012; 8:e1002531. [PMID: 22412381 PMCID: PMC3297574 DOI: 10.1371/journal.pgen.1002531] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.6] [Received: 03/30/2011] [Accepted: 12/23/2011] [Indexed: 12/22/2022]
Abstract
Transcriptional enhancers integrate the contributions of multiple classes of transcription factors (TFs) to orchestrate the myriad spatio-temporal gene expression programs that occur during development. A molecular understanding of enhancers with similar activities requires the identification of both their unique and their shared sequence features. To address this problem, we combined phylogenetic profiling with a DNA-based enhancer sequence classifier that analyzes the TF binding sites (TFBSs) governing the transcription of a co-expressed gene set. We first assembled a small number of enhancers that are active in Drosophila melanogaster muscle founder cells (FCs) and other mesodermal cell types. Using phylogenetic profiling, we increased the number of enhancers by incorporating orthologous but divergent sequences from other Drosophila species. Functional assays revealed that the diverged enhancer orthologs were active in largely similar patterns as their D. melanogaster counterparts, although there was extensive evolutionary shuffling of known TFBSs. We then built and trained a classifier using this enhancer set and identified additional related enhancers based on the presence or absence of known and putative TFBSs. Predicted FC enhancers were over-represented in proximity to known FC genes; and many of the TFBSs learned by the classifier were found to be critical for enhancer activity, including POU homeodomain, Myb, Ets, Forkhead, and T-box motifs. Empirical testing also revealed that the T-box TF encoded by org-1 is a previously uncharacterized regulator of muscle cell identity. Finally, we found extensive diversity in the composition of TFBSs within known FC enhancers, suggesting that motif combinatorics plays an essential role in the cellular specificity exhibited by such enhancers. 
In summary, machine learning combined with evolutionary sequence analysis is useful for recognizing novel TFBSs and for facilitating the identification of cognate TFs that coordinate cell type-specific developmental gene expression patterns.
Affiliation(s)
- Brian W. Busser
  - Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Leila Taher
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Yongsok Kim
  - Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Terese Tansey
  - Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Molly J. Bloom
  - Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
- Ivan Ovcharenko
  - Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland, United States of America
- Alan M. Michelson
  - Laboratory of Developmental Systems Biology, National Heart, Lung, and Blood Institute, National Institutes of Health, Bethesda, Maryland, United States of America
|
42
|
Starr MO, Ho MCW, Gunther EJM, Tu YK, Shur AS, Goetz SE, Borok MJ, Kang V, Drewell RA. Molecular dissection of cis-regulatory modules at the Drosophila bithorax complex reveals critical transcription factor signature motifs. Dev Biol 2011; 359:290-302. [PMID: 21821017 PMCID: PMC3202680 DOI: 10.1016/j.ydbio.2011.07.028] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.4] [Received: 02/02/2011] [Revised: 07/17/2011] [Accepted: 07/19/2011] [Indexed: 11/17/2022]
Abstract
At the Drosophila melanogaster bithorax complex (BX-C), over 330 kb of intergenic DNA is responsible for directing the transcription of just three homeotic (Hox) genes during embryonic development. A number of distinct enhancer cis-regulatory modules (CRMs) are responsible for controlling the specific expression patterns of the Hox genes in the BX-C. While it has proven possible to identify orthologs of known BX-C CRMs in different Drosophila species using overall sequence conservation, this approach has not proven sufficiently effective for identifying novel CRMs or defining the key functional sequences within enhancer CRMs. Here we demonstrate that the specific spatial clustering of transcription factor (TF) binding sites is important for BX-C enhancer activity. A bioinformatic search for combinations of putative TF binding sites in the BX-C suggests that simple clustering of binding sites is frequently not indicative of enhancer activity. However, through molecular dissection and evolutionary comparison across the Drosophila genus we discovered that specific TF binding site clustering patterns are an important feature of three known BX-C enhancers. Sub-regions of the defined IAB5 and IAB7b enhancers were both found to contain an evolutionarily conserved signature motif of clustered TF binding sites which is critical for the functional activity of the enhancers. Together, these results indicate that the spatial organization of specific activator and repressor binding sites within BX-C enhancers is of greater importance than overall sequence conservation and is indicative of enhancer functional activity.
Affiliation(s)
- Yen-Kuei Tu
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Andrey S. Shur
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Sara E. Goetz
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Matthew J. Borok
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Victoria Kang
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
- Robert A. Drewell
  - Biology Department, Harvey Mudd College, 301 Platt Boulevard, Claremont, CA 91711, USA
|
43
|
Ludwig MZ, Manu, Kittler R, White KP, Kreitman M. Consequences of eukaryotic enhancer architecture for gene expression dynamics, development, and fitness. PLoS Genet 2011; 7:e1002364. [PMID: 22102826 PMCID: PMC3213169 DOI: 10.1371/journal.pgen.1002364] [Citation(s) in RCA: 53] [Impact Index Per Article: 3.8] [Received: 07/11/2011] [Accepted: 09/14/2011] [Indexed: 12/13/2022]
Abstract
The regulatory logic of time- and tissue-specific gene expression has mostly been dissected in the context of the smallest DNA fragments that, when isolated, recapitulate native expression in reporter assays. It is not known if the genomic sequences surrounding such fragments, often evolutionarily conserved, have any biological function or not. Using an enhancer of the even-skipped gene of Drosophila as a model, we investigate the functional significance of the genomic sequences surrounding empirically identified enhancers. A 480 bp long "minimal stripe element" is able to drive even-skipped expression in the second of seven stripes but is embedded in a larger region of 800 bp containing evolutionarily conserved binding sites for required transcription factors. To assess the overall fitness contribution made by these binding sites in the native genomic context, we employed a gene-replacement strategy in which whole-locus transgenes, capable of rescuing even-skipped(-) lethality to adulthood, were substituted for the native gene. The molecular phenotypes were characterized by tagging Even-skipped with a fluorescent protein and monitoring gene expression dynamics in living embryos. We used recombineering to excise the sequences surrounding the minimal enhancer and site-specific transgenesis to create co-isogenic strains differing only in their stripe 2 sequences. Remarkably, the flanking sequences were dispensable for viability, proving the sufficiency of the minimal element for biological function under normal conditions. These sequences are required for robustness to genetic and environmental perturbation instead. The mutant enhancers had measurable sex- and dose-dependent effects on viability. At the molecular level, the mutants showed a destabilization of stripe placement and improper activation of downstream genes. Finally, we demonstrate through live measurements that the peripheral sequences are required for temperature compensation. 
These results imply that seemingly redundant regulatory sequences beyond the minimal enhancer are necessary for robust gene expression and that "robustness" itself must be an evolved characteristic of the wild-type enhancer.
Affiliation(s)
- Michael Z. Ludwig
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
- Manu
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
- Ralf Kittler
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Kevin P. White
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
  - Department of Human Genetics, University of Chicago, Chicago, Illinois, United States of America
- Martin Kreitman
  - Department of Ecology and Evolution, University of Chicago, Chicago, Illinois, United States of America
  - Institute for Genomics and Systems Biology, University of Chicago, Chicago, Illinois, United States of America
|
44
|
Abstract
Accurately predicting regulatory sequences and enhancers in entire genomes is an important but difficult problem, especially in large vertebrate genomes. With the advent of ChIP-seq technology, experimental detection of genome-wide EP300/CREBBP bound regions provides a powerful platform to develop predictive tools for regulatory sequences and to study their sequence properties. Here, we develop a support vector machine (SVM) framework which can accurately identify EP300-bound enhancers using only genomic sequence and an unbiased set of general sequence features. Moreover, we find that the predictive sequence features identified by the SVM classifier reveal biologically relevant sequence elements enriched in the enhancers, but we also identify other features that are significantly depleted in enhancers. The predictive sequence features are evolutionarily conserved and spatially clustered, providing further support of their functional significance. Although our SVM is trained on experimental data, we also predict novel enhancers and show that these putative enhancers are significantly enriched in both ChIP-seq signal and DNase I hypersensitivity signal in the mouse brain and are located near relevant genes. Finally, we present results of comparisons between other EP300/CREBBP data sets using our SVM and uncover sequence elements enriched and/or depleted in the different classes of enhancers. Many of these sequence features play a role in specifying tissue-specific or developmental-stage-specific enhancer activity, but our results indicate that some features operate in a general or tissue-independent manner. In addition to providing a high confidence list of enhancer targets for subsequent experimental investigation, these results contribute to our understanding of the general sequence structure of vertebrate enhancers.
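The abstract above does not say which "general sequence features" feed the SVM. As a minimal Python sketch, assuming (purely for illustration) that the features are normalized k-mer frequencies, a sequence can be turned into a fixed-length vector as follows. The function name `kmer_features` and the default `k=3` are hypothetical choices, not taken from the paper.

```python
from collections import Counter
from itertools import product


def kmer_features(seq, k=3):
    """Return the frequency of every possible DNA k-mer in seq.

    Illustrative assumption: k-mer frequencies stand in for the
    "general sequence features" mentioned in the abstract.
    """
    seq = seq.upper()
    # Count every overlapping window of length k.
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    # Fixed feature order: all 4**k k-mers over A, C, G, T.
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    # Windows containing other symbols (e.g. N) are ignored.
    total = sum(counts[km] for km in kmers) or 1
    return [counts[km] / total for km in kmers]


features = kmer_features("ACGTACGT", k=2)
print(len(features))  # 16 features for k = 2
```

Vectors of this kind, computed for bound and unbound regions, could then be passed to any standard SVM implementation.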
|
45
|
Swanson CI, Schwimmer DB, Barolo S. Rapid evolutionary rewiring of a structurally constrained eye enhancer. Curr Biol 2011; 21:1186-96. [PMID: 21737276 DOI: 10.1016/j.cub.2011.05.056] [Citation(s) in RCA: 65] [Impact Index Per Article: 4.6] [Received: 12/30/2010] [Revised: 04/18/2011] [Accepted: 05/27/2011] [Indexed: 12/20/2022]
Abstract
BACKGROUND: Enhancers are genomic cis-regulatory sequences that integrate spatiotemporal signals to control gene expression. Enhancer activity depends on the combination of bound transcription factors as well as, in some cases, the arrangement and spacing of binding sites for these factors. Here, we examine evolutionary changes to the sequence and structure of sparkling, a Notch/EGFR/Runx-regulated enhancer that activates the dPax2 gene in cone cells of the developing Drosophila eye. RESULTS: Despite functional and structural constraints on its sequence, sparkling has undergone major reorganization in its recent evolutionary history. Our data suggest that the relative strengths of the various regulatory inputs into sparkling change rapidly over evolutionary time, such that reduced input from some factors is compensated by increased input from different regulators. These gains and losses are at least partly responsible for the changes in enhancer structure that we observe. Furthermore, stereotypical spatial relationships between certain binding sites ("grammar elements") can be identified in all sparkling orthologs, although the sites themselves are often recently derived. We also find that low binding affinity for the Notch-regulated transcription factor Su(H), a conserved property of sparkling, is required to prevent ectopic responses to Notch in noncone cells. CONCLUSIONS: Rapid DNA sequence turnover does not imply either the absence of critical cis-regulatory information or the absence of structural rules. Our findings demonstrate that even a severely constrained cis-regulatory sequence can be significantly rewired over a short evolutionary timescale.
Affiliation(s)
- Christina I Swanson
  - Department of Cell and Developmental Biology, University of Michigan Medical School, Ann Arbor, MI 48109-2200, USA
|
46
|
Barrière A, Gordon KL, Ruvinsky I. Distinct functional constraints partition sequence conservation in a cis-regulatory element. PLoS Genet 2011; 7:e1002095. [PMID: 21655084 PMCID: PMC3107193 DOI: 10.1371/journal.pgen.1002095] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Received: 10/25/2010] [Accepted: 04/07/2011] [Indexed: 11/25/2022]
Abstract
Different functional constraints contribute to different evolutionary rates across genomes. To understand why some sequences evolve faster than others in a single cis-regulatory locus, we investigated function and evolutionary dynamics of the promoter of the Caenorhabditis elegans unc-47 gene. We found that this promoter consists of two distinct domains. The proximal promoter is conserved and is largely sufficient to direct appropriate spatial expression. The distal promoter displays little if any conservation between several closely related nematodes. Despite this divergence, sequences from all species confer robustness of expression, arguing that this function does not require substantial sequence conservation. We showed that even unrelated sequences have the ability to promote robust expression. A prominent feature shared by all of these robustness-promoting sequences is an AT-enriched nucleotide composition consistent with nucleosome depletion. Because general sequence composition can be maintained despite sequence turnover, our results explain how different functional constraints can lead to vastly disparate rates of sequence divergence within a promoter. Comparison between genome sequences of different species is a powerful tool in modern biology because important features are maintained by natural selection and are therefore conserved. However, some important sequences within genomes evolve considerably faster than others. One possible explanation is that they encode little or no function. Alternatively, they may evolve under different constraints that permit sequence turnover while maintaining function. Here we report that the promoter of the unc-47 gene of C. elegans contains two discrete elements. One has a highly conserved sequence that determines the spatial expression pattern. Another shows no sequence conservation, but it makes expression of the gene robust, that is, consistent between individuals and resilient to environmental challenges. 
Remarkably, multiple unrelated sequences are capable of promoting robust expression. Nucleotide composition of these sequences suggests that open chromatin may play a role in conferring robustness of gene expression. Because general sequence composition and therefore expression robustness can be maintained despite sequence turnover, our results offer an explanation of how rapidly diverging promoter elements can nevertheless remain functionally conserved.
Affiliation(s)
- Antoine Barrière
  - Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America
- Kacy L. Gordon
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
- Ilya Ruvinsky
  - Department of Ecology and Evolution and Institute for Genomics and Systems Biology, Chicago, Illinois, United States of America
  - Department of Organismal Biology and Anatomy, The University of Chicago, Chicago, Illinois, United States of America
|
47
|
Kim TM, Park PJ. Advances in analysis of transcriptional regulatory networks. Wiley Interdisciplinary Reviews: Systems Biology and Medicine 2011; 3:21-35. [PMID: 21069662 DOI: 10.1002/wsbm.105] [Citation(s) in RCA: 24] [Impact Index Per Article: 1.7] [Indexed: 02/05/2023]
Abstract
A transcriptional regulatory network represents a molecular framework in which developmental or environmental cues are transformed into differential expression of genes. Transcriptional regulation is mediated by the combinatorial interplay between cis-regulatory DNA elements and trans-acting transcription factors, and is perhaps the most important mechanism for controlling gene expression. Recent innovations, most notably the method for detecting protein-DNA interactions genome-wide, can help provide a comprehensive catalog of cis-regulatory elements and their interaction with given trans-acting factors in a given condition. A transcriptional regulatory network that integrates such information can lead to a systems-level understanding of regulatory mechanisms. In this review, we will highlight the key aspects of current knowledge on eukaryotic transcriptional regulation, especially on known transcription factors and their interacting regulatory elements. Then we will review some recent technical advances for genome-wide mapping of DNA-protein interactions based on high-throughput sequencing. Finally, we will discuss the types of biological insights that can be obtained from a network-level understanding of transcription regulation as well as future challenges in the field.
Affiliation(s)
- Tae-Min Kim
  - Center for Biomedical Informatics, Harvard Medical School, Boston, MA, USA
|
48
|
High resolution mapping of Twist to DNA in Drosophila embryos: Efficient functional analysis and evolutionary conservation. Genome Res 2011; 21:566-77. [PMID: 21383317 DOI: 10.1101/gr.104018.109] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.0] [Indexed: 12/29/2022]
Abstract
Cis-regulatory modules (CRMs) function by binding sequence specific transcription factors, but the relationship between in vivo physical binding and the regulatory capacity of factor-bound DNA elements remains uncertain. We investigate this relationship for the well-studied Twist factor in Drosophila melanogaster embryos by analyzing genome-wide factor occupancy and testing the functional significance of Twist occupied regions and motifs within regions. Twist ChIP-seq data efficiently identified previously studied Twist-dependent CRMs and robustly predicted new CRM activity in transgenesis, with newly identified Twist-occupied regions supporting diverse spatiotemporal patterns (>74% positive, n = 31). Some, but not all, candidate CRMs require Twist for proper expression in the embryo. The Twist motifs most favored in genome ChIP data (in vivo) differed from those most favored by Systematic Evolution of Ligands by EXponential enrichment (SELEX) (in vitro). Furthermore, the majority of ChIP-seq signals could be parsimoniously explained by a CABVTG motif located within 50 bp of the ChIP summit and, of these, CACATG was most prevalent. Mutagenesis experiments demonstrated that different Twist E-box motif types are not fully interchangeable, suggesting that the ChIP-derived consensus (CABVTG) includes sites having distinct regulatory outputs. Further analysis of position, frequency of occurrence, and sequence conservation revealed significant enrichment and conservation of CABVTG E-box motifs near Twist ChIP-seq signal summits, preferential conservation of ±150 bp surrounding Twist occupied summits, and enrichment of GA- and CA-repeat sequences near Twist occupied summits. Our results show that high resolution in vivo occupancy data can be used to drive efficient discovery and dissection of global and local cis-regulatory logic.
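To make the consensus notation in the abstract above concrete, the following Python sketch expands the IUPAC string CABVTG (B = C, G or T; V = A, C or G) into a regular expression and reports matches that lie entirely within a window around a peak summit. It is an illustration only, not the authors' pipeline: the function name `sites_near_summit`, the toy sequence, and the summit coordinate are hypothetical, and only the motif and the 50 bp window come from the abstract.

```python
import re

# Standard IUPAC nucleotide codes as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]",
    "B": "[CGT]", "D": "[AGT]", "H": "[ACT]", "V": "[ACG]",
    "N": "[ACGT]",
}


def sites_near_summit(seq, summit, consensus="CABVTG", window=50):
    """Return (start, site) for consensus matches within +/- window bp of summit."""
    core = "".join(IUPAC[c] for c in consensus.upper())
    # The lookahead lets overlapping matches be reported as well.
    pattern = re.compile("(?=(" + core + "))")
    lo = max(0, summit - window)
    hi = min(len(seq), summit + window + 1)
    region = seq.upper()[lo:hi]
    return [(lo + m.start(), m.group(1)) for m in pattern.finditer(region)]


# Hypothetical toy sequence; CAAATG does not fit CABVTG (third base must be C, G or T).
toy = "TTTTCACATGGGGGCAGCTGAAAACAAATG"
print(sites_near_summit(toy, summit=15))
```

Because the reverse complement of CABVTG is again CABVTG, scanning a single strand is sufficient for this particular motif.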
|
49
|
Abstract
The gene regulatory network (GRN) underpinning dorsal-ventral (DV) patterning of the Drosophila embryo is among the most thoroughly understood GRNs, making it an ideal system for comparative studies seeking to understand the evolution of development. With the emergence of widely applicable techniques for testing gene function, species with sequenced genomes, and multiple tractable species with diverse developmental modes, a phylogenetically broad and molecularly deep understanding of the evolution of DV axis formation in insects is feasible. Here, we review recent progress made in this field, compare our emerging molecular understanding to classical embryological experiments, and suggest future directions of inquiry.
Affiliation(s)
- Jeremy A. Lynch
  - Institute for Developmental Biology, University of Cologne, 50674 Cologne, Germany
- Siegfried Roth
  - Institute for Developmental Biology, University of Cologne, 50674 Cologne, Germany
|
50
|
Rebeiz M, Williams TM. Experimental approaches to evaluate the contributions of candidate cis-regulatory mutations to phenotypic evolution. Methods Mol Biol 2011; 772:351-375. [PMID: 22065449 DOI: 10.1007/978-1-61779-228-1_21] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Indexed: 05/31/2023]
Abstract
Elucidating the molecular bases by which phenotypic traits have evolved provides a glimpse into the past, allowing the characterization of genetic changes that cumulatively contribute to evolutionary innovations. Historically, much of the experimental attention has been focused on changes in protein-coding regions that can readily be identified by the genetic code for translating gene coding sequences into proteins. Resultantly, the role of noncoding sequences in trait evolution has remained more mysterious. In recent years, several studies have reached an unprecedented level of detail in describing how noncoding mutations in gene cis-regulatory elements contribute to morphological evolution. Based on these and other studies, we describe an experimental framework and some of the genetic and molecular methods to connect a particular cis-regulatory mutation to the evolution of any phenotypic trait.
Affiliation(s)
- Mark Rebeiz
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA, USA
|