1
|
A mammalian tripartite enhancer cluster controls hypothalamic Pomc expression, food intake, and body weight. Proc Natl Acad Sci U S A 2024; 121:e2322692121. [PMID: 38652744 PMCID: PMC11067048 DOI: 10.1073/pnas.2322692121] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/05/2024] [Accepted: 03/12/2024] [Indexed: 04/25/2024] Open
Abstract
Food intake and energy balance are tightly regulated by a group of hypothalamic arcuate neurons expressing the proopiomelanocortin (POMC) gene. In mammals, arcuate-specific POMC expression is driven by two cis-acting transcriptional enhancers known as nPE1 and nPE2. Because mutant mice lacking these two enhancers still showed hypothalamic Pomc mRNA, we searched for additional elements contributing to arcuate Pomc expression. By combining molecular evolution with reporter gene expression in transgenic zebrafish and mice, here, we identified a mammalian arcuate-specific Pomc enhancer that we named nPE3, carrying several binding sites also present in nPE1 and nPE2 for transcription factors known to activate neuronal Pomc expression, such as ISL1, NKX2.1, and ERα. We found that nPE3 originated in the lineage leading to placental mammals and remained under purifying selection in all mammalian orders, although it was lost in Simiiformes (monkeys, apes, and humans) following a unique segmental deletion event. Interestingly, ablation of nPE3 from the mouse genome led to a drastic reduction (>70%) in hypothalamic Pomc mRNA during development and only a moderate reduction (<33%) in adult mice. Comparison between double (nPE1 and nPE2) and triple (nPE1, nPE2, and nPE3) enhancer mutants revealed the relative contribution of nPE3 to hypothalamic Pomc expression and its importance in the control of food intake and adiposity in male and female mice. Altogether, these results demonstrate that nPE3 integrates a tripartite cluster of partially redundant enhancers that originated upon a triple convergent evolutionary process in mammals and that is critical for hypothalamic Pomc expression and body weight homeostasis.
|
2
|
An in vivo functional assay to characterize human STAT5B genetic variants during zebrafish development. Hum Mol Genet 2023; 32:2473-2484. [PMID: 37162340 DOI: 10.1093/hmg/ddad078] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/30/2023] [Revised: 04/19/2023] [Accepted: 05/07/2023] [Indexed: 05/11/2023] Open
Abstract
Growth hormone (GH) binding to GH receptor activates janus kinase 2 (JAK2)-signal transducer and activator of transcription 5b (STAT5b) pathway, which stimulates transcription of insulin-like growth factor-1 (IGF1), insulin-like growth factor binding protein 3 (IGFBP3) and insulin-like growth factor acid-labile subunit (IGFALS). Although STAT5B deficiency was established as an autosomal recessive disorder, heterozygous dominant-negative STAT5B variants have been reported in patients with less severe growth deficit and milder immune dysfunction. We developed an in vivo functional assay in zebrafish to characterize the pathogenicity of three human STAT5B variants (p.Ala630Pro, p.Gln474Arg and p.Lys632Asn). Overexpression of human wild-type (WT) STAT5B mRNA and its variants led to a significant reduction of body length together with developmental malformations in zebrafish embryos. Overexpression of p.Ala630Pro, p.Gln474Arg or p.Lys632Asn led to an increased number of embryos with pericardial edema, cyclopia and bent spine compared with WT STAT5B. Although co-injection of WT and p.Gln474Arg and WT and p.Lys632Asn STAT5B mRNA in zebrafish embryos partially or fully rescues the length and the developmental malformations in zebrafish embryos, co-injection of WT and p.Ala630Pro STAT5B mRNA leads to a greater number of embryos with developmental malformations and a reduction in body length of these embryos. These results suggest that these variants could interfere with endogenous stat5.1 signaling through different mechanisms. In situ hybridization of zebrafish embryos overexpressing p.Gln474Arg and p.Lys632Asn STAT5B mRNA shows a reduction in igf1 expression. In conclusion, our study reveals the pathogenicity of the STAT5B variants studied.
|
3
|
The GRN concept as a guide for evolutionary developmental biology. J Exp Zool B Mol Dev Evol 2023; 340:92-104. [PMID: 35344632 PMCID: PMC9515236 DOI: 10.1002/jez.b.23132] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 08/30/2021] [Revised: 03/08/2022] [Accepted: 03/11/2022] [Indexed: 12/13/2022]
Abstract
Organismal phenotypes result largely from inherited developmental programs, usually executed during embryonic and juvenile life stages. These programs are not blank slates onto which natural selection can draw arbitrary forms. Rather, the mechanisms of development play an integral role in shaping phenotypic diversity and help determine the evolutionary trajectories of species. Modern evolutionary biology must, therefore, account for these mechanisms in both theory and in practice. The gene regulatory network (GRN) concept represents a potent tool for achieving this goal whose utility has grown in tandem with advances in "omic" technologies and experimental techniques. However, while the GRN concept is widely utilized, it is often less clear what practical implications it has for conducting research in evolutionary developmental biology. In this Perspective, we attempt to provide clarity by discussing how experiments and projects can be designed in light of the GRN concept. We first map familiar biological notions onto the more abstract components of GRN models. We then review how diverse functional genomic approaches can be directed toward the goal of constructing such models and discuss current methods for functionally testing evolutionary hypotheses that arise from them. Finally, we show how the major steps of GRN model construction and experimental validation suggest generalizable workflows that can serve as a scaffold for project design. Taken together, the practical implications that we draw from the GRN concept provide a set of guideposts for studies aiming at unraveling the molecular basis of phenotypic diversity.
|
4
|
A hypothetical model of trans-acting R-loops-mediated promoter-enhancer interactions by Alu elements. J Genet Genomics 2021; 48:1007-1019. [PMID: 34531149 DOI: 10.1016/j.jgg.2021.07.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 05/27/2021] [Revised: 06/24/2021] [Accepted: 07/07/2021] [Indexed: 12/22/2022]
Abstract
Enhancers modulate gene expression by interacting with promoters. Models of enhancer-promoter interactions (EPIs) in the literature involve the activity of many components, including transcription factors and nucleic acid. However, the role that sequence similarity plays in EPIs remains largely unexplored. Herein, we report that Alu-derived sequences dominate sequence similarity between enhancers and promoters. After rejecting alternative DNA:DNA and DNA:RNA triplex models, we propose that enhancer-associated RNAs (eRNAs) may directly contact their targeted promoters by forming trans-acting R-loops at those Alu sequences. We show how the characteristic distribution of functional genomic data, such as RNA-DNA proximate ligation reads, binding of transcription factors, and RNA-binding proteins, all align with the Alu sequences of EPIs. We also show that these aligned Alu sequences may be subject to the constraint of coevolution, further implying the functional significance of these R-loop hybrids. Finally, our results imply that eRNA and Alu elements associate in a manner previously unrecognized in EPIs and the evolution of gene regulation networks in mammals.
|
5
|
The developmental hourglass model is applicable to the spinal cord based on single-cell transcriptomes and non-conserved cis-regulatory elements. Dev Growth Differ 2021; 63:372-391. [PMID: 34473348 PMCID: PMC9293469 DOI: 10.1111/dgd.12750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/05/2021] [Revised: 08/24/2021] [Accepted: 08/26/2021] [Indexed: 11/27/2022]
Abstract
The developmental hourglass model predicts that embryonic morphology is most conserved at the mid-embryonic stage and diverges at the early and late stages. To date, this model has been verified by examining anatomical features or gene expression profiles at the whole-embryo level. Here, using a data mining approach that combines multiple genomic and transcriptomic datasets from different species, together with experimental validation, we demonstrate that the hourglass model is also applicable to a reduced element, the spinal cord. In the middle of spinal cord development, dorsoventrally arrayed neuronal progenitor domains are established, which are conserved among vertebrates. By comparing the publicly available single-cell transcriptome datasets of mice and zebrafish, we found that ventral subpopulations of post-mitotic spinal neurons display divergent molecular profiles. We also detected the non-conservation of cis-regulatory elements located around the progenitor fate determinants, indicating that the cis-regulatory elements contributing to the progenitor specification are evolvable. These results demonstrate that, despite the conservation of the progenitor domains, the processes before and after the progenitor domain specification diverged. This study will be helpful to understand the molecular basis of the developmental hourglass model.
|
6
|
Expression of acid-labile subunit (ALS) in developing and adult zebrafish and its role in dorso-ventral patterning during development. Gen Comp Endocrinol 2020; 299:113591. [PMID: 32828812 DOI: 10.1016/j.ygcen.2020.113591] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/06/2020] [Revised: 07/28/2020] [Accepted: 08/18/2020] [Indexed: 11/24/2022]
Abstract
Mammalian acid-labile subunit (ALS) is a serum protein that binds binary complexes between Insulin-like growth factors (IGFs) and Insulin-like growth factor-binding proteins (IGFBPs) extending their half-life and keeping them in the vasculature. Human ALS deficiency (ACLSD), due to homozygous or compound heterozygous mutations in IGFALS, leads to moderate short stature with reduced levels of IGF-I and IGFBP-3. There is only one corresponding zebrafish ortholog gene and it has not yet been studied. In this study we elucidate the role of igfals during zebrafish development. In zebrafish embryos igfals mRNA is expressed throughout development, mainly in the brain and subsequently also in the gut and swimbladder. To determine its role during development, we knocked down igfals gene product using morpholinos (MOs). Igfals morphant embryos displayed dorsalization in different degrees of severity, including a shortened trunk and loss of tail. Furthermore, co-injection of human IGFALS (hIGFALS) mRNA was able to rescue the MO-induced phenotype. Finally, overexpression of either hIGFALS or zebrafish igfals (zigfals) mRNA leads to ventralization of embryos including a reduced head and enlarged tail. These findings suggest that als plays an important role in dorso-ventral patterning during zebrafish development.
|
7
|
Deep Convergence, Shared Ancestry, and Evolutionary Novelty in the Genetic Architecture of Heliconius Mimicry. Genetics 2020; 216:765-780. [PMID: 32883703 PMCID: PMC7648585 DOI: 10.1534/genetics.120.303611] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.0] [Received: 03/05/2020] [Accepted: 08/25/2020] [Indexed: 01/31/2023] Open
Abstract
Convergent evolution can occur through different genetic mechanisms in different species. It is now clear that convergence at the genetic level is also widespread, and can be caused by either (i) parallel genetic evolution, where independently evolved convergent mutations arise in different populations or species, or (ii) collateral evolution in which shared ancestry results from either ancestral polymorphism or introgression among taxa. The adaptive radiation of Heliconius butterflies shows color pattern variation within species, as well as mimetic convergence between species. Using comparisons from across multiple hybrid zones, we use signals of shared ancestry to identify and refine multiple putative regulatory elements in Heliconius melpomene and its comimics, Heliconius elevatus and Heliconius besckei, around three known major color patterning genes: optix, WntA, and cortex. While we find that convergence between H. melpomene and H. elevatus is caused by a complex history of collateral evolution via introgression in the Amazon, convergence between these species in the Guianas appears to have evolved independently. Thus, we find adaptive convergent genetic evolution to be a key driver of regulatory changes that lead to rapid phenotypic changes. Furthermore, we uncover evidence of parallel genetic evolution at some loci around optix and WntA in H. melpomene and its distant comimic Heliconius erato. Ultimately, we show that all three of convergence, conservation, and novelty underlie the modular architecture of Heliconius color pattern mimicry.
|
8
|
Transcriptional Enhancers in the FOXP2 Locus Underwent Accelerated Evolution in the Human Lineage. Mol Biol Evol 2019; 36:2432-2450. [PMID: 31359064 DOI: 10.1093/molbev/msz173] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 11/28/2018] [Revised: 04/26/2019] [Accepted: 07/16/2019] [Indexed: 12/11/2022] Open
Abstract
Unique human features such as complex language are the result of molecular evolutionary changes that modified developmental programs of our brain. The human-specific evolution of the forkhead box P2 (FOXP2) gene coding region has been linked to the emergence of speech and language in the human kind. However, little is known about how the expression of FOXP2 is regulated and if its regulatory machinery evolved in a lineage-specific manner in humans. In order to identify FOXP2 regulatory regions containing human-specific changes we used databases of human accelerated non-coding sequences or HARs. We found that the topologically associating domain (TAD) determined using developing human cerebral cortex containing the FOXP2 locus includes two clusters of 12 HARs, placing the locus occupied by FOXP2 among the top regions showing fast acceleration rates in non-coding regions in the human genome. Using in vivo enhancer assays in zebrafish, we found that at least five FOXP2-HARs behave as transcriptional enhancers throughout different developmental stages. In addition, we found that at least two FOXP2-HARs direct the expression of the reporter gene EGFP to foxP2 expressing regions and cells. Moreover, we uncovered two FOXP2-HARs showing reporter expression gain of function in the nervous system when compared with the chimpanzee ortholog sequences. Our results indicate that regulatory sequences in the FOXP2 locus underwent a human-specific evolutionary process suggesting that the transcriptional machinery controlling this gene could have also evolved differentially in the human lineage.
|
9
|
Modulating transcription factor activity: Interfering with protein-protein interaction networks. Semin Cell Dev Biol 2018; 99:12-19. [PMID: 30172762 DOI: 10.1016/j.semcdb.2018.07.019] [Citation(s) in RCA: 26] [Impact Index Per Article: 4.3] [Received: 11/04/2017] [Revised: 02/16/2018] [Accepted: 07/17/2018] [Indexed: 11/23/2022]
Abstract
The biophysical parameters that govern transcription factor activity are binding locations across the genome, dwell time at these regulatory elements, and specific protein-protein interactions. Most molecular strategies used to develop small compounds that block transcription factor activity have been based on biochemistry and cell biology methods that do not take these key biophysical features into consideration. Here, we review advances in the field of transcription factor biology and describe how their interactome and transcriptional regulation on a genome-wide scale have been deciphered. We suggest that this new knowledge has the potential to be used to implement innovative drug discovery research programs.
|
10
|
Towards a map of cis-regulatory sequences in the human genome. Nucleic Acids Res 2018; 46:5395-5409. [PMID: 29733395 PMCID: PMC6009671 DOI: 10.1093/nar/gky338] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.7] [Received: 01/17/2018] [Revised: 04/14/2018] [Accepted: 04/19/2018] [Indexed: 01/10/2023] Open
Abstract
Accumulating evidence indicates that transcription factor (TF) binding sites, or cis-regulatory elements (CREs), and their clusters termed cis-regulatory modules (CRMs) play a more important role than do gene-coding sequences in specifying complex traits in humans, including the susceptibility to common complex diseases. To fully characterize their roles in deriving the complex traits/diseases, it is necessary to annotate all CREs and CRMs encoded in the human genome. However, the current annotations of CREs and CRMs in the human genome are still very limited and mostly coarse-grained, as they often lack the detailed information of CREs in CRMs. Here, we integrated 620 TF ChIP-seq datasets produced by the ENCODE project for 168 TFs in 79 different cell/tissue types and predicted an unprecedentedly complete map of CREs in CRMs in the human genome at single-nucleotide resolution. The map includes 305 912 CRMs containing a total of 1 178 913 CREs belonging to 736 unique TF binding motifs. The predicted CREs and CRMs tend to be subject to either purifying selection or positive selection, and thus are likely to be functional. Based on the results, we also examined the status of available ChIP-seq datasets for predicting the entire regulatory genome of humans.
|
11
|
Dynamic evolution of regulatory element ensembles in primate CD4+ T cells. Nat Ecol Evol 2018; 2:537-548. [PMID: 29379187 DOI: 10.1038/s41559-017-0447-5] [Citation(s) in RCA: 43] [Impact Index Per Article: 7.2] [Received: 10/31/2017] [Accepted: 12/08/2017] [Indexed: 12/12/2022]
Abstract
How evolutionary changes at enhancers affect the transcription of target genes remains an important open question. Previous comparative studies of gene expression have largely measured the abundance of messenger RNA, which is affected by post-transcriptional regulatory processes, hence limiting inferences about the mechanisms underlying expression differences. Here, we directly measured nascent transcription in primate species, allowing us to separate transcription from post-transcriptional regulation. We used precision run-on and sequencing to map RNA polymerases in resting and activated CD4+ T cells in multiple human, chimpanzee and rhesus macaque individuals, with rodents as outgroups. We observed general conservation in coding and non-coding transcription, punctuated by numerous differences between species, particularly at distal enhancers and non-coding RNAs. Genes regulated by larger numbers of enhancers are more frequently transcribed at evolutionarily stable levels, despite reduced conservation at individual enhancers. Adaptive nucleotide substitutions are associated with lineage-specific transcription and at one locus, SGPP2, we predict and experimentally validate that multiple substitutions contribute to human-specific transcription. Collectively, our findings suggest a pervasive role for evolutionary compensation across ensembles of enhancers that jointly regulate target genes.
|
12
|
Conserved non-coding elements: developmental gene regulation meets genome organization. Nucleic Acids Res 2018; 45:12611-12624. [PMID: 29121339 PMCID: PMC5728398 DOI: 10.1093/nar/gkx1074] [Citation(s) in RCA: 57] [Impact Index Per Article: 9.5] [Received: 07/26/2017] [Accepted: 10/24/2017] [Indexed: 12/20/2022] Open
Abstract
Comparative genomics has revealed a class of non-protein-coding genomic sequences that display an extraordinary degree of conservation between two or more organisms, regularly exceeding that found within protein-coding exons. These elements, collectively referred to as conserved non-coding elements (CNEs), are non-randomly distributed across chromosomes and tend to cluster in the vicinity of genes with regulatory roles in multicellular development and differentiation. CNEs are organized into functional ensembles called genomic regulatory blocks–dense clusters of elements that collectively coordinate the expression of shared target genes, and whose span in many cases coincides with topologically associated domains. CNEs display sequence properties that set them apart from other sequences under constraint, and have recently been proposed as useful markers for the reconstruction of the evolutionary history of organisms. Disruption of several of these elements is known to contribute to diseases linked with development, and cancer. The emergence, evolutionary dynamics and functions of CNEs still remain poorly understood, and new approaches are required to enable comprehensive CNE identification and characterization. Here, we review current knowledge and identify challenges that need to be tackled to resolve the impasse in understanding extreme non-coding conservation.
|
13
|
Abstract
What made us human? Gene expression changes clearly played a significant part in human evolution, but pinpointing the causal regulatory mutations is hard. Comparative genomics enabled the identification of human accelerated regions (HARs) and other human-specific genome sequences. The major challenge in the past decade has been to link diverged sequences to uniquely human biology. This review discusses approaches to this problem, progress made at the molecular level, and prospects for moving towards genetic causes for uniquely human biology.
|
14
|
Molecular and functional genetics of the proopiomelanocortin gene, food intake regulation and obesity. FEBS Lett 2017; 591:2593-2606. [PMID: 28771698 PMCID: PMC9975356 DOI: 10.1002/1873-3468.12776] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.4] [Received: 06/30/2017] [Revised: 07/31/2017] [Accepted: 07/31/2017] [Indexed: 12/20/2022]
Abstract
A specter is haunting the world, the specter of obesity. During the last decade, this pandemia has skyrocketed threatening children, adolescents and lower income families worldwide. Although driven by an increase in the consumption of ultraprocessed edibles of poor nutritional value, the obesogenic changes in contemporary human lifestyle affect people differently, revealing that some individuals are more prone to develop increased adiposity. During the last years, we performed a variety of genetic, evolutionary, biochemical and behavioral experiments that allowed us to understand how a group of neurons present in the arcuate nucleus of the hypothalamus regulate the expression of the proopiomelanocortin (Pomc) gene and induce satiety. We disentangled the neuronal transcriptional code of Pomc by identifying the cis-acting regulatory elements and primary transcription factors controlling hypothalamic Pomc expression and determined their functional importance in the regulation of food intake and adiposity. Altogether, our studies reviewed here shed light on the power and limitations of the mammalian central satiety pathways and may contribute to the development of individual and collective strategies to reduce the debilitating effects of the self-induced obesity pandemia.
|
15
|
Abstract
Over the last decade, the noncoding part of the genome has been shown to harbour thousands of cis-regulatory elements, such as enhancers, that activate well-defined gene expression programs. Driven by the development of numerous techniques, many of these elements are now identified in multiple tissues and cell types, and their characteristics as well as importance in development and disease are becoming increasingly clear. Here, we provide an overview of the insights that were gained from the analysis of noncoding gene regulatory elements in the brain and describe their potential contribution to cell type specialization, brain function and neurodegenerative disease.
|
16
|
|
17
|
Comparative analyses of super-enhancers reveal conserved elements in vertebrate genomes. Genome Res 2016; 27:259-268. [PMID: 27965291 PMCID: PMC5287231 DOI: 10.1101/gr.203679.115] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 12/22/2015] [Accepted: 12/09/2016] [Indexed: 12/11/2022]
Abstract
Super-enhancers (SEs) are key transcriptional drivers of cellular, developmental, and disease states in mammals, yet the conservational and regulatory features of these enhancer elements in nonmammalian vertebrates are unknown. To define SEs in zebrafish and enable sequence and functional comparisons to mouse and human SEs, we used genome-wide histone H3 lysine 27 acetylation (H3K27ac) occupancy as a primary SE delineator. Our study determined the set of SEs in pluripotent state cells and adult zebrafish tissues and revealed both similarities and differences between zebrafish and mammalian SEs. Although the total number of SEs was proportional to the genome size, the genomic distribution of zebrafish SEs differed from that of the mammalian SEs. Despite the evolutionary distance separating zebrafish and mammals and the low overall SE sequence conservation, ∼42% of zebrafish SEs were located in close proximity to orthologs that also were associated with SEs in mouse and human. Compared to their nonassociated counterparts, higher sequence conservation was revealed for those SEs that have maintained orthologous gene associations. Functional dissection of two of these SEs identified conserved sequence elements and tissue-specific expression patterns, while chromatin accessibility analyses predicted transcription factors governing the function of pluripotent state zebrafish SEs. Our zebrafish annotations and comparative studies show the extent of SE usage and their conservation across vertebrates, permitting future gene regulatory studies in several tissues.
|
18
|
The Functionality and Evolution of Eukaryotic Transcriptional Enhancers. Adv Genet 2016; 96:143-206. [PMID: 27968730 DOI: 10.1016/bs.adgen.2016.08.004] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Indexed: 12/11/2022]
Abstract
Enhancers regulate precise spatial and temporal patterns of gene expression in eukaryotes and, moreover, evolutionary changes in these modular cis-regulatory elements may represent the predominant genetic basis for phenotypic evolution. Here, we review approaches to identify and functionally analyze enhancers and their transcription factor binding sites, including assay for transposase-accessible chromatin sequencing (ATAC-Seq) and clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9, respectively. We also explore enhancer functionality, including how transcription factor binding sites combine to regulate transcription, as well as research on shadow and super enhancers, and how enhancers can act over great distances and even in trans. Finally, we discuss recent theoretical and empirical data on how transcription factor binding sites and enhancers evolve. This includes how the function of enhancers is maintained despite the turnover of transcription factor binding sites, as well as a review of studies where mutations in enhancers have been shown to underlie morphological change.
|
19
|
|
20
|
Compensatory Drift and the Evolutionary Dynamics of Dosage-Sensitive Duplicate Genes. Genetics 2015; 202:765-74. [PMID: 26661114 DOI: 10.1534/genetics.115.178137] [Citation(s) in RCA: 30] [Impact Index Per Article: 3.3] [Received: 05/13/2015] [Accepted: 12/06/2015] [Indexed: 11/18/2022] Open
Abstract
Dosage-balance selection preserves functionally redundant duplicates (paralogs) at the optimum for their combined expression. Here we present a model of the dynamics of duplicate genes coevolving under dosage-balance selection. We call this the compensatory drift model. Results show that even when strong dosage-balance selection constrains total expression to the optimum, expression of each duplicate can diverge by drift from its original level. The rate of divergence slows as the strength of stabilizing selection, the size of the mutation effect, and/or the size of the population increases. We show that dosage-balance selection impedes neofunctionalization early after duplication but can later facilitate it. We fit this model to data from sodium channel duplicates in 10 families of teleost fish; these include two convergent lineages of electric fish in which one of the duplicates neofunctionalized. Using the model, we estimated the strength of dosage-balance selection for these genes. The results indicate that functionally redundant paralogs still may undergo radical functional changes after a prolonged period of compensatory drift.
|
21
|
Favorable genomic environments for cis-regulatory evolution: A novel theoretical framework. Semin Cell Dev Biol 2015; 57:2-10. [PMID: 26673387 DOI: 10.1016/j.semcdb.2015.12.003] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.4] [Received: 09/21/2015] [Revised: 12/02/2015] [Accepted: 12/05/2015] [Indexed: 12/22/2022]
Abstract
Cis-regulatory changes are arguably the primary evolutionary source of animal morphological diversity. With the recent explosion of genome-wide comparisons of the cis-regulatory content in different animal species, it is now possible to infer general principles underlying enhancer evolution. However, these studies have also revealed numerous discrepancies and paradoxes, suggesting that the mechanistic causes and modes of cis-regulatory evolution are still not well understood and are probably much more complex than generally appreciated. Here, we argue that the mutational mechanisms and genomic regions generating new regulatory activities must comply with the constraints imposed by the molecular properties of cis-regulatory elements (CREs) and the organizational features of long-range chromatin interactions. Accordingly, we propose a new integrative evolutionary framework for cis-regulatory evolution based on two major premises for the origin of novel enhancer activity: (i) an accessible chromatin environment and (ii) compatibility with the 3D structure and interactions of pre-existing CREs. Mechanisms and DNA sequences not fulfilling these premises will be less likely to have a measurable impact on gene expression and, as such, will make a minor contribution to the evolution of gene regulation. Finally, we discuss current comparative cis-regulatory data in the light of this new evolutionary model, and propose that the two most prominent mechanisms for the evolution of cis-regulatory changes are the overprinting of ancestral CREs and the exaptation of transposable elements.
|
22
|
Mechanisms of Evolutionary Innovation Point to Genetic Control Logic as the Key Difference Between Prokaryotes and Eukaryotes. J Mol Evol 2015. [PMID: 26208881 DOI: 10.1007/s00239-015-9688-6] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.8] [Indexed: 12/19/2022]
Abstract
The evolution of life from the simplest, original form to complex, intelligent animal life occurred through a number of key innovations. Here we present a new tool to analyze these key innovations by proposing that the process of evolutionary innovation may follow one of three underlying processes, namely a Random Walk, a Critical Path, or a Many Paths process, and in some instances may also constitute a "Pull-up the Ladder" event. Our analysis is based on the occurrence of function in modern biology, rather than specific structure or mechanism. A function in modern biology may be classified in this way either on the basis of its evolution or on the basis of its modern mechanism. Characterizing key innovations in this way helps identify the likelihood that an innovation could arise. In this paper, we describe the classification, and methods to classify functional features of modern organisms into these three classes based on the analysis of how a function is implemented in modern biology. We present the application of our categorization to the evolution of eukaryotic gene control. We use this approach to support the argument that there are few, and possibly no, basic chemical differences between the functional constituents of the machinery of gene control in eukaryotes, bacteria and archaea. This suggests that the difference between eukaryotes and prokaryotes that allows the former to develop the complex genetic architecture seen in animals and plants is something other than their chemistry. We tentatively identify the difference as a difference in control logic: prokaryotic genes are by default 'on' and eukaryotic genes are by default 'off.' The Many Paths evolutionary process suggests that, from a 'default off' starting point, the evolution of the genetic complexity of higher eukaryotes is a high probability event.
|
23
|
Evidence for deep regulatory similarities in early developmental programs across highly diverged insects. Genome Biol Evol 2015; 6:2301-20. [PMID: 25173756 PMCID: PMC4217690 DOI: 10.1093/gbe/evu184] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Indexed: 12/16/2022] Open
Abstract
Many genes familiar from Drosophila development, such as the so-called gap, pair-rule, and segment polarity genes, play important roles in the development of other insects and in many cases appear to be deployed in a similar fashion, despite the fact that Drosophila-like "long germband" development is highly derived and confined to a subset of insect families. Whether or not these similarities extend to the regulatory level is unknown. Identification of regulatory regions beyond the well-studied Drosophila has been challenging as even within the Diptera (flies, including mosquitoes) regulatory sequences have diverged past the point of recognition by standard alignment methods. Here, we demonstrate that methods we previously developed for computational cis-regulatory module (CRM) discovery in Drosophila can be used effectively in highly diverged (250-350 Myr) insect species including Anopheles gambiae, Tribolium castaneum, Apis mellifera, and Nasonia vitripennis. In Drosophila, we have successfully used small sets of known CRMs as "training data" to guide the search for other CRMs with related function. We show here that although species-specific CRM training data do not exist, training sets from Drosophila can facilitate CRM discovery in diverged insects. We validate in vivo over a dozen new CRMs, roughly doubling the number of known CRMs in the four non-Drosophila species. Given the growing wealth of Drosophila CRM annotation, these results suggest that extensive regulatory sequence annotation will be possible in newly sequenced insects without recourse to costly and labor-intensive genome-scale experiments. We develop a new method, Regulus, which computes a probabilistic score of similarity based on binding site composition (despite the absence of nucleotide-level sequence alignment), and demonstrate similarity between functionally related CRMs from orthologous loci. 
Our work represents an important step toward being able to trace the evolutionary history of gene regulatory networks and defining the mechanisms underlying insect evolution.
|
24
|
Unraveling the Tangled Skein: The Evolution of Transcriptional Regulatory Networks in Development. Annu Rev Genomics Hum Genet 2015; 16:103-31. [PMID: 26079281 DOI: 10.1146/annurev-genom-091212-153423] [Citation(s) in RCA: 63] [Impact Index Per Article: 7.0] [Indexed: 01/26/2023]
Abstract
The molecular and genetic basis for the evolution of anatomical diversity is a major question that has inspired evolutionary and developmental biologists for decades. Because morphology takes form during development, a true comprehension of how anatomical structures evolve requires an understanding of the evolutionary events that alter developmental genetic programs. Vast gene regulatory networks (GRNs) that connect transcription factors to their target regulatory sequences control gene expression in time and space and therefore determine the tissue-specific genetic programs that shape morphological structures. In recent years, many new examples have greatly advanced our understanding of the genetic alterations that modify GRNs to generate newly evolved morphologies. Here, we review several aspects of GRN evolution, including their deep preservation, their mechanisms of alteration, and how they originate to generate novel developmental programs.
|
25
|
Islet 1 specifies the identity of hypothalamic melanocortin neurons and is critical for normal food intake and adiposity in adulthood. Proc Natl Acad Sci U S A 2015; 112:E1861-70. [PMID: 25825735 DOI: 10.1073/pnas.1500672112] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.1] [Indexed: 12/31/2022] Open
Abstract
Food intake and body weight regulation depend on proper expression of the proopiomelanocortin gene (Pomc) in a group of neurons located in the mediobasal hypothalamus of all vertebrates. These neurons release POMC-encoded melanocortins, which are potent anorexigenic neuropeptides, and their absence from mice or humans leads to hyperphagia and severe obesity. Although the pathophysiology of hypothalamic POMC neurons is well understood, the genetic program that establishes the neuronal melanocortinergic phenotype and maintains a fully functional neuronal POMC phenotype throughout adulthood remains unknown. Here, we report that the early expression of the LIM-homeodomain transcription factor Islet 1 (ISL1) in the developing hypothalamus promotes the terminal differentiation of melanocortinergic neurons and is essential for hypothalamic Pomc expression from its initial onset and throughout the entire lifetime. We detected ISL1 in the prospective hypothalamus just before the onset of Pomc expression and, from then on, Pomc and Isl1 are coexpressed. ISL1 binds in vitro and in vivo to critical homeodomain binding DNA motifs present in the neuronal Pomc enhancers nPE1 and nPE2, and mutations of these sites completely disrupt the ability of these enhancers to drive reporter gene expression in hypothalamic POMC neurons in transgenic mice and zebrafish. ISL1 is necessary for hypothalamic Pomc expression during mouse and zebrafish embryogenesis. Furthermore, conditional Isl1 inactivation in POMC neurons impairs Pomc expression, leading to hyperphagia and obesity. Our results demonstrate that ISL1 specifies the identity of hypothalamic melanocortin neurons and is required for melanocortin-induced satiety and normal adiposity throughout the entire lifespan.
|
26
|
8.2% of the Human genome is constrained: variation in rates of turnover across functional element classes in the human lineage. PLoS Genet 2014; 10:e1004525. [PMID: 25057982 PMCID: PMC4109858 DOI: 10.1371/journal.pgen.1004525] [Citation(s) in RCA: 131] [Impact Index Per Article: 13.1] [Received: 12/04/2013] [Accepted: 06/05/2014] [Indexed: 01/27/2023] Open
Abstract
Ten years on from the finishing of the human reference genome sequence, it remains unclear what fraction of the human genome confers function, where this sequence resides, and how much is shared with other mammalian species. When addressing these questions, functional sequence has often been equated with pan-mammalian conserved sequence. However, functional elements that are short-lived, including those contributing to species-specific biology, will not leave a footprint of long-lasting negative selection. Here, we address these issues by identifying and characterising sequence that has been constrained with respect to insertions and deletions for pairs of eutherian genomes over a range of divergences. Within noncoding sequence, we find increasing amounts of mutually constrained sequence as species pairs become more closely related, indicating that noncoding constrained sequence turns over rapidly. We estimate that half of present-day noncoding constrained sequence has been gained or lost in approximately the last 130 million years (half-life in units of divergence time, d1/2 = 0.25–0.31). While enriched with ENCODE biochemical annotations, much of the short-lived constrained sequences we identify are not detected by models optimized for wider pan-mammalian conservation. Constrained DNase 1 hypersensitivity sites, promoters and untranslated regions have been more evolutionarily stable than long noncoding RNA loci which have turned over especially rapidly. By contrast, protein coding sequence has been highly stable, with an estimated half-life of over a billion years (d1/2 = 2.1–5.0). From extrapolations we estimate that 8.2% (7.1–9.2%) of the human genome is presently subject to negative selection and thus is likely to be functional, while only 2.2% has maintained constraint in both human and mouse since these species diverged. 
These results reveal that the evolutionary history of the human genome has been highly dynamic, particularly for its noncoding yet biologically functional fraction.
Nearly 99% of the human genome does not encode proteins, and while there recently has been extensive biochemical annotation of the remaining noncoding fraction, it remains unclear whether or not the bulk of these DNA sequences have important functional roles. By comparing the genome sequences of different species we identify genomic regions that have evolved unexpectedly slowly, a signature of natural selection upon functional sequence. Using a high-resolution evolutionary approach to find sequence showing evolutionary signatures of functionality, we estimate that a total of 8.2% (7.1–9.2%) of the human genome is presently functional, more than three times as much as is functional and shared between human and mouse. This implies that there is an abundance of sequences with short-lived lineage-specific functionality. As expected, most of the sequence involved in this functional “turnover” is noncoding, while protein coding sequence is stably preserved over longer evolutionary timescales. More generally, we find that the rate of functional turnover varies significantly across categories of functional noncoding elements. Our results provide a pan-mammalian and whole genome perspective on how rapidly different classes of sequence have gained and lost functionality down the human lineage.
|
27
|
Abstract
The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology.
|
28
|
Abstract
Deciphering the genetic bases that drive animal diversity is one of the major challenges of modern biology. Although four decades ago it was proposed that animal evolution was mainly driven by changes in cis-regulatory DNA elements controlling gene expression rather than in protein-coding sequences, only now are powerful bioinformatics and experimental approaches available to accelerate studies into how the evolution of transcriptional enhancers contributes to novel forms and functions. In the introduction to this Theme Issue, we start by defining the general properties of transcriptional enhancers, such as modularity and the coexistence of tight sequence conservation with transcription factor-binding site shuffling as different mechanisms that maintain the enhancer grammar over evolutionary time. We discuss past and current methods used to identify cell-type-specific enhancers and provide examples of how enhancers originate de novo, change and are lost in particular lineages. We then focus in the central part of this Theme Issue on analysing examples of how the molecular evolution of enhancers may change form and function. Throughout this introduction, we present the main findings of the articles, reviews and perspectives contributed to this Theme Issue that together illustrate some of the great advances and current frontiers in the field.
|