1
|
Removing unwanted variation between samples in Hi-C experiments. Brief Bioinform 2024; 25:bbae217. [PMID: 38711367 DOI: 10.1093/bib/bbae217] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/25/2023] [Revised: 01/26/2024] [Accepted: 04/24/2024] [Indexed: 05/08/2024]
Abstract
Hi-C data are commonly normalized using single sample processing methods, with focus on comparisons between regions within a given contact map. Here, we aim to compare contact maps across different samples. We demonstrate that unwanted variation, of likely technical origin, is present in Hi-C data with replicates from different individuals, and that properties of this unwanted variation change across the contact map. We present band-wise normalization and batch correction, a method for normalization and batch correction of Hi-C data and show that it substantially improves comparisons across samples, including in a quantitative trait loci analysis as well as differential enrichment across cell types.
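As a rough illustration of what "band-wise" processing can mean, the sketch below pulls one band (a fixed-offset diagonal) out of each sample's contact matrix and stacks the bands so that samples can be compared within that band. The toy matrices, the function names, and the mean-centering step are assumptions made for illustration only; the abstract does not specify the authors' actual normalization or batch-correction procedure.

```python
from statistics import mean

def band(matrix, k):
    """Return band k of a square contact matrix: entries (i, i + k)."""
    n = len(matrix)
    return [matrix[i][i + k] for i in range(n - k)]

def band_matrix(samples, k):
    """Stack band k from every sample: one row per locus pair, one column per sample."""
    bands = [band(m, k) for m in samples]
    return [list(row) for row in zip(*bands)]

def center_band(rows):
    """Toy adjustment (an assumption, not the published method):
    subtract each sample's mean within this band."""
    n_samples = len(rows[0])
    col_means = [mean(r[j] for r in rows) for j in range(n_samples)]
    return [[r[j] - col_means[j] for j in range(n_samples)] for r in rows]

# Two hypothetical 3x3 contact maps; sample_b is sample_a at twice the depth.
sample_a = [[10, 4, 1], [4, 12, 5], [1, 5, 9]]
sample_b = [[20, 8, 2], [8, 24, 10], [2, 10, 18]]

rows = band_matrix([sample_a, sample_b], 1)
print(rows)               # [[4, 8], [5, 10]]
print(center_band(rows))
```

Working one band at a time keeps contacts at the same genomic distance together, which is the setting in which the abstract says the properties of unwanted variation change across the contact map.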
|
2
|
Mapping genetic effects on cell type-specific chromatin accessibility and annotating complex immune trait variants using single nucleus ATAC-seq in peripheral blood. PLoS Genet 2023; 19:e1010759. [PMID: 37289818 DOI: 10.1371/journal.pgen.1010759] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/17/2022] [Accepted: 04/25/2023] [Indexed: 06/10/2023]
Abstract
Gene regulation is highly cell type-specific and understanding the function of non-coding genetic variants associated with complex traits requires molecular phenotyping at cell type resolution. In this study we performed single nucleus ATAC-seq (snATAC-seq) and genotyping in peripheral blood mononuclear cells from 13 individuals. Clustering chromatin accessibility profiles of 96,002 total nuclei identified 17 immune cell types and sub-types. We mapped chromatin accessibility QTLs (caQTLs) in each immune cell type and sub-type using individuals of European ancestry which identified 6,901 caQTLs at FDR < .10 and 4,220 caQTLs at FDR < .05, including those obscured from assays of bulk tissue such as with divergent effects on different cell types. For 3,941 caQTLs we further annotated putative target genes of variant activity using single cell co-accessibility, and caQTL variants were significantly correlated with the accessibility level of linked gene promoters. We fine-mapped loci associated with 16 complex immune traits and identified immune cell caQTLs at 622 candidate causal variants, including those with cell type-specific effects. At the 6q15 locus associated with type 1 diabetes, in line with previous reports, variant rs72928038 was a naïve CD4+ T cell caQTL linked to BACH2 and we validated the allelic effects of this variant on regulatory activity in Jurkat T cells. These results highlight the utility of snATAC-seq for mapping genetic effects on accessible chromatin in specific cell types.
|
3
|
The EN-TEx resource of multi-tissue personal epigenomes & variant-impact models. Cell 2023; 186:1493-1511.e40. [PMID: 37001506 PMCID: PMC10074325 DOI: 10.1016/j.cell.2023.02.018] [Citation(s) in RCA: 12] [Impact Index Per Article: 12.0] [Received: 04/03/2022] [Revised: 10/16/2022] [Accepted: 02/10/2023] [Indexed: 04/03/2023]
Abstract
Understanding how genetic variants impact molecular phenotypes is a key goal of functional genomics, currently hindered by reliance on a single haploid reference genome. Here, we present the EN-TEx resource of 1,635 open-access datasets from four donors (∼30 tissues × ∼15 assays). The datasets are mapped to matched, diploid genomes with long-read phasing and structural variants, instantiating a catalog of >1 million allele-specific loci. These loci exhibit coordinated activity along haplotypes and are less conserved than corresponding, non-allele-specific ones. Surprisingly, a deep-learning transformer model can predict the allele-specific activity based only on local nucleotide-sequence context, highlighting the importance of transcription-factor-binding motifs particularly sensitive to variants. Furthermore, combining EN-TEx with existing genome annotations reveals strong associations between allele-specific and GWAS loci. It also enables models for transferring known eQTLs to difficult-to-profile tissues (e.g., from skin to heart). Overall, EN-TEx provides rich data and generalizable models for more accurate personal functional genomics.
|
4
|
Rapid changes in chromatin structure during dedifferentiation of primary hepatocytes in vitro. Genomics 2022; 114:110330. [PMID: 35278615 DOI: 10.1016/j.ygeno.2022.110330] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/30/2021] [Revised: 02/22/2022] [Accepted: 03/06/2022] [Indexed: 01/14/2023]
Abstract
Primary hepatocytes are widely used in the pharmaceutical industry to screen drug candidates for hepatotoxicity, but hepatocytes quickly dedifferentiate and lose their mature metabolic function in culture. Attempts have been made to better recapitulate the in vivo liver environment in culture, but the full spectrum of signals required to maintain hepatocyte function ex vivo remains elusive. To elucidate molecular changes that accompany, and may contribute to dedifferentiation of hepatocytes ex vivo, we performed lineage tracing and comprehensive profiling of alterations in their gene expression profiles and chromatin landscape during culture. First, using genetically tagged hepatocytes we demonstrate that expression of the fetal gene alpha-fetoprotein in cultured hepatocytes comes from cells that previously expressed the mature gene albumin, and not from a population of albumin-negative precursor cells, proving mature hepatocytes undergo true dedifferentiation in culture. Next we studied the dedifferentiation process in detail through bulk RNA-sequencing of hepatocytes cultured over an extended period. We identified three distinct phases of dedifferentiation: an early phase, where mature hepatocyte genes are rapidly downregulated in a matter of hours; a middle phase, where fetal genes are activated; and a late phase, where initially rare contaminating non-parenchymal cells proliferate, taking over the culture. Lastly, to better understand the signaling events that result in the rapid downregulation of mature genes in hepatocytes, we examined changes in chromatin accessibility in these cells during the first 24 h of culture using Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq). We find that drastic and rapid changes in chromatin accessibility occur immediately upon the start of culture. 
Using binding motif analysis of the areas of open chromatin sharing similar temporal profiles, we identify several candidate transcription factors potentially involved in the dedifferentiation of primary hepatocytes in culture.
|
5
|
Author Correction: Expanded encyclopaedias of DNA elements in the human and mouse genomes. Nature 2022; 605:E3. [PMID: 35474001 PMCID: PMC9095460 DOI: 10.1038/s41586-021-04226-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022]
|
6
|
Coding and noncoding variants in EBF3 are involved in HADDS and simplex autism. Hum Genomics 2021; 15:44. [PMID: 34256850 PMCID: PMC8278787 DOI: 10.1186/s40246-021-00342-3] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 05/03/2021] [Accepted: 06/17/2021] [Indexed: 01/01/2023]
Abstract
BACKGROUND: Previous research in autism and other neurodevelopmental disorders (NDDs) has indicated an important contribution of protein-coding (coding) de novo variants (DNVs) within specific genes. The role of de novo noncoding variation has been observable as a general increase in genetic burden but has yet to be resolved to individual functional elements. In this study, we assessed whole-genome sequencing data in 2671 families with autism (discovery cohort of 516 families, replication cohort of 2155 families). We focused on DNVs in enhancers with characterized in vivo activity in the brain and identified an excess of DNVs in an enhancer named hs737. RESULTS: We adapted the fitDNM statistical model to work in noncoding regions and tested enhancers for excess of DNVs in families with autism. We found only one enhancer (hs737) with nominal significance in the discovery (p = 0.0172), replication (p = 2.5 × 10⁻³), and combined dataset (p = 1.1 × 10⁻⁴). Each individual with a DNV in hs737 had shared phenotypes including being male, intact cognitive function, and hypotonia or motor delay. Our in vitro assessment of the DNVs showed they all reduce enhancer activity in a neuronal cell line. By epigenomic analyses, we found that hs737 is brain-specific and targets the transcription factor gene EBF3 in human fetal brain. EBF3 is genome-wide significant for coding DNVs in NDDs (missense p = 8.12 × 10⁻³⁵, loss-of-function p = 2.26 × 10⁻¹³) and is widely expressed in the body. Through characterization of promoters bound by EBF3 in neuronal cells, we saw enrichment for binding to NDD genes (p = 7.43 × 10⁻⁶, OR = 1.87) involved in gene regulation. Individuals with coding DNVs have greater phenotypic severity (hypotonia, ataxia, and delayed development syndrome [HADDS]) in comparison to individuals with noncoding DNVs that have autism and hypotonia. CONCLUSIONS: In this study, we identify DNVs in the hs737 enhancer in individuals with autism.
Through multiple approaches, we find hs737 targets the gene EBF3 that is genome-wide significant in NDDs. By assessment of noncoding variation and the genes they affect, we are beginning to understand their impact on gene regulatory networks in NDDs.
|
7
|
Interpreting type 1 diabetes risk with genetics and single-cell epigenomics. Nature 2021; 594:398-402. [PMID: 34012112 PMCID: PMC10560508 DOI: 10.1038/s41586-021-03552-w] [Citation(s) in RCA: 122] [Impact Index Per Article: 40.7] [Received: 06/21/2020] [Accepted: 04/14/2021] [Indexed: 02/04/2023]
Abstract
Genetic risk variants that have been identified in genome-wide association studies of complex diseases are primarily non-coding¹. Translating these risk variants into mechanistic insights requires detailed maps of gene regulation in disease-relevant cell types². Here we combined two approaches: a genome-wide association study of type 1 diabetes (T1D) using 520,580 samples, and the identification of candidate cis-regulatory elements (cCREs) in pancreas and peripheral blood mononuclear cells using single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) of 131,554 nuclei. Risk variants for T1D were enriched in cCREs that were active in T cells and other cell types, including acinar and ductal cells of the exocrine pancreas. Risk variants at multiple T1D signals overlapped with exocrine-specific cCREs that were linked to genes with exocrine-specific expression. At the CFTR locus, the T1D risk variant rs7795896 mapped to a ductal-specific cCRE that regulated CFTR; the risk allele reduced transcription factor binding, enhancer activity and CFTR expression in ductal cells. These findings support a role for the exocrine pancreas in the pathogenesis of T1D and highlight the power of large-scale genome-wide association studies and single-cell epigenomics for understanding the cellular origins of complex disease.
|
8
|
Abstract
A Correction to this paper has been published: https://doi.org/10.1038/s41586-020-03089-4.
|
9
|
Author Correction: An atlas of dynamic chromatin landscapes in mouse fetal development. Nature 2020; 586:E31. [PMID: 33037424 PMCID: PMC7962567 DOI: 10.1038/s41586-020-2841-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/09/2022]
|
10
|
Spatiotemporal DNA methylome dynamics of the developing mouse fetus. Nature 2020; 583:752-759. [PMID: 32728242 PMCID: PMC7398276 DOI: 10.1038/s41586-020-2119-x] [Citation(s) in RCA: 67] [Impact Index Per Article: 16.8] [Received: 08/09/2017] [Accepted: 06/11/2019] [Indexed: 01/10/2023]
Abstract
Cytosine DNA methylation is essential for mammalian development but understanding of its spatiotemporal distribution in the developing embryo remains limited¹,². Here, as part of the mouse Encyclopedia of DNA Elements (ENCODE) project, we profiled 168 methylomes from 12 mouse tissues or organs at 9 developmental stages from embryogenesis to adulthood. We identified 1,808,810 genomic regions that showed variations in CG methylation by comparing the methylomes of different tissues or organs from different developmental stages. These DNA elements predominantly lose CG methylation during fetal development, whereas the trend is reversed after birth. During late stages of fetal development, non-CG methylation accumulated within the bodies of key developmental transcription factor genes, coinciding with their transcriptional repression. Integration of genome-wide DNA methylation, histone modification and chromatin accessibility data enabled us to predict 461,141 putative developmental tissue-specific enhancers, the human orthologues of which were enriched for disease-associated genetic variants. These spatiotemporal epigenome maps provide a resource for studies of gene regulation during tissue or organ progression, and a starting point for investigating regulatory elements that are involved in human developmental disorders. Analysis of 168 methylomes from 12 mouse tissues at 9 developmental stages sheds light on the epigenetic and regulatory landscape during mammalian fetal development.
|
11
|
Abstract
The Encyclopedia of DNA Elements (ENCODE) project has established a genomic resource for mammalian development, profiling a diverse panel of mouse tissues at 8 developmental stages from 10.5 days after conception until birth, including transcriptomes, methylomes and chromatin states. Here we systematically examined the state and accessibility of chromatin in the developing mouse fetus. In total we performed 1,128 chromatin immunoprecipitation with sequencing (ChIP-seq) assays for histone modifications and 132 assay for transposase-accessible chromatin using sequencing (ATAC-seq) assays for chromatin accessibility across 72 distinct tissue-stages. We used integrative analysis to develop a unified set of chromatin state annotations, infer the identities of dynamic enhancers and key transcriptional regulators, and characterize the relationship between chromatin state and accessibility during developmental gene regulation. We also leveraged these data to link enhancers to putative target genes and demonstrate tissue-specific enrichments of sequence variants associated with disease in humans. The mouse ENCODE data sets provide a compendium of resources for biomedical researchers and achieve, to our knowledge, the most comprehensive view of chromatin dynamics during mammalian fetal development to date.
|
12
|
Abstract
The human and mouse genomes contain instructions that specify RNAs and proteins and govern the timing, magnitude, and cellular context of their production. To better delineate these elements, phase III of the Encyclopedia of DNA Elements (ENCODE) Project has expanded analysis of the cell and tissue repertoires of RNA transcription, chromatin structure and modification, DNA methylation, chromatin looping, and occupancy by transcription factors and RNA-binding proteins. Here we summarize these efforts, which have produced 5,992 new experimental datasets, including systematic determinations across mouse fetal development. All data are available through the ENCODE data portal (https://www.encodeproject.org), including phase II ENCODE¹ and Roadmap Epigenomics² data. We have developed a registry of 926,535 human and 339,815 mouse candidate cis-regulatory elements, covering 7.9% and 3.4% of their respective genomes, by integrating selected datatypes associated with gene regulation, and constructed a web-based server (SCREEN; http://screen.encodeproject.org) to provide flexible, user-defined access to this resource. Collectively, the ENCODE data and registry provide an expansive resource for the scientific community to build a better understanding of the organization and function of the human and mouse genomes.
|
13
|
Common DNA sequence variation influences 3-dimensional conformation of the human genome. Genome Biol 2019; 20:255. [PMID: 31779666 PMCID: PMC6883528 DOI: 10.1186/s13059-019-1855-4] [Citation(s) in RCA: 49] [Impact Index Per Article: 9.8] [Received: 04/04/2019] [Accepted: 10/10/2019] [Indexed: 12/22/2022]
Abstract
BACKGROUND The 3-dimensional (3D) conformation of chromatin inside the nucleus is integral to a variety of nuclear processes including transcriptional regulation, DNA replication, and DNA damage repair. Aberrations in 3D chromatin conformation have been implicated in developmental abnormalities and cancer. Despite the importance of 3D chromatin conformation to cellular function and human health, little is known about how 3D chromatin conformation varies in the human population, or whether DNA sequence variation between individuals influences 3D chromatin conformation. RESULTS To address these questions, we perform Hi-C on lymphoblastoid cell lines from 20 individuals. We identify thousands of regions across the genome where 3D chromatin conformation varies between individuals and find that this variation is often accompanied by variation in gene expression, histone modifications, and transcription factor binding. Moreover, we find that DNA sequence variation influences several features of 3D chromatin conformation including loop strength, contact insulation, contact directionality, and density of local cis contacts. We map hundreds of quantitative trait loci associated with 3D chromatin features and find evidence that some of these same variants are associated at modest levels with other molecular phenotypes as well as complex disease risk. CONCLUSION Our results demonstrate that common DNA sequence variants can influence 3D chromatin conformation, pointing to a more pervasive role for 3D chromatin conformation in human phenotypic variation than previously recognized.
|
14
|
N6-methyladenine DNA Modification in Glioblastoma. Cell 2018; 175:1228-1243.e20. [PMID: 30392959 PMCID: PMC6433469 DOI: 10.1016/j.cell.2018.10.006] [Citation(s) in RCA: 195] [Impact Index Per Article: 32.5] [Received: 03/21/2018] [Revised: 07/26/2018] [Accepted: 10/01/2018] [Indexed: 02/07/2023]
Abstract
Genetic drivers of cancer can be dysregulated through epigenetic modifications of DNA. Although the critical role of DNA 5-methylcytosine (5mC) in the regulation of transcription is recognized, the functions of other non-canonical DNA modifications remain obscure. Here, we report the identification of novel N6-methyladenine (N6-mA) DNA modifications in human tissues and implicate this epigenetic mark in human disease, specifically the highly malignant brain cancer glioblastoma. Glioblastoma markedly upregulated N6-mA levels, which co-localized with heterochromatic histone modifications, predominantly H3K9me3. N6-mA levels were dynamically regulated by the DNA demethylase ALKBH1, depletion of which led to transcriptional silencing of oncogenic pathways through decreasing chromatin accessibility. Targeting the N6-mA regulator ALKBH1 in patient-derived human glioblastoma models inhibited tumor cell proliferation and extended the survival of tumor-bearing mice, supporting this novel DNA modification as a potential therapeutic target for glioblastoma. Collectively, our results uncover a novel epigenetic node in cancer through the DNA modification N6-mA.
|
15
|
Abstract
How eukaryotic chromosomes fold inside the nucleus is an age-old question that remains unanswered today. Early biochemical and microscopic studies revealed the existence of chromatin domains and loops as a pervasive feature of interphase chromosomes, but the biological implications of such organizational features were obscure. Genome-wide analysis of pair-wise chromatin interactions using chromatin conformation capture (3C)-based techniques has shed new light on the organization of chromosomes in interphase nuclei. Particularly, the finding of cell-type invariant, evolutionarily conserved topologically associating domains (TADs) in a broad spectrum of cell types has provided a new molecular framework for the study of animal development and human diseases. Here, we review recent progress in characterization of such chromatin domains and delineation of mechanisms of their formation in animal cells.
|
16
|
Genome-wide compendium and functional assessment of in vivo heart enhancers. Nat Commun 2016; 7:12923. [PMID: 27703156 PMCID: PMC5059478 DOI: 10.1038/ncomms12923] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Received: 05/19/2016] [Accepted: 08/16/2016] [Indexed: 12/04/2022]
Abstract
Whole-genome sequencing is identifying growing numbers of non-coding variants in human disease studies, but the lack of accurate functional annotations prevents their interpretation. We describe the genome-wide landscape of distant-acting enhancers active in the developing and adult human heart, an organ whose impairment is a predominant cause of mortality and morbidity. Using integrative analysis of >35 epigenomic data sets from mouse and human pre- and postnatal hearts we created a comprehensive reference of >80,000 putative human heart enhancers. To illustrate the importance of enhancers in the regulation of genes involved in heart disease, we deleted the mouse orthologs of two human enhancers near cardiac myosin genes. In both cases, we observe in vivo expression changes and cardiac phenotypes consistent with human heart disease. Our study provides a comprehensive catalogue of human heart enhancers for use in clinical whole-genome sequencing studies and highlights the importance of enhancers for cardiac function. Identification of non-coding variants has outstripped our ability to annotate and interpret them. Dickel et al. present a compendium of over 80,000 putative human heart enhancers and demonstrate that two conserved enhancers are required for proper cardiac function in mice.
|
17
|
CRISPR Inversion of CTCF Sites Alters Genome Topology and Enhancer/Promoter Function. Cell 2015; 162:900-10. [PMID: 26276636 PMCID: PMC4642453 DOI: 10.1016/j.cell.2015.07.038] [Citation(s) in RCA: 634] [Impact Index Per Article: 70.4] [Received: 03/17/2015] [Revised: 05/30/2015] [Accepted: 07/22/2015] [Indexed: 01/27/2023]
Abstract
CTCF and the associated cohesin complex play a central role in insulator function and higher-order chromatin organization of mammalian genomes. Recent studies identified a correlation between the orientation of CTCF-binding sites (CBSs) and chromatin loops. To test the functional significance of this observation, we combined CRISPR/Cas9-based genomic-DNA-fragment editing with chromosome-conformation-capture experiments to show that the location and relative orientations of CBSs determine the specificity of long-range chromatin looping in mammalian genomes, using protocadherin (Pcdh) and β-globin as model genes. Inversion of CBS elements within the Pcdh enhancer reconfigures the topology of chromatin loops between the distal enhancer and target promoters and alters gene-expression patterns. Thus, although enhancers can function in an orientation-independent manner in reporter assays, in the native chromosome context, the orientation of at least some enhancers carrying CBSs can determine both the architecture of topological chromatin domains and enhancer/promoter specificity. These findings reveal how 3D chromosome architecture can be encoded by linear genome sequences.
|
18
|
Genomic analysis reveals distinct mechanisms and functional classes of SOX10-regulated genes in melanocytes. Hum Mol Genet 2015. [PMID: 26206884 DOI: 10.1093/hmg/ddv267] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Indexed: 12/31/2022]
Abstract
SOX10 is required for melanocyte development and maintenance, and has been linked to melanoma initiation and progression. However, the molecular mechanisms by which SOX10 guides the appropriate gene expression programs necessary to promote the melanocyte lineage are not fully understood. Here we employ genetic and epigenomic analysis approaches to uncover novel genomic targets and previously unappreciated molecular roles of SOX10 in melanocytes. Through global analysis of SOX10-binding sites and epigenetic characteristics of chromatin states, we uncover an extensive catalog of SOX10 targets genome-wide. Our findings reveal that SOX10 predominantly engages 'open' chromatin regions and binds to distal regulatory elements, including novel and previously known melanocyte enhancers. Integrated chromatin occupancy and transcriptome analysis suggest a role for SOX10 in both transcriptional activation and repression to regulate functionally distinct classes of genes. We demonstrate that distinct epigenetic signatures and cis-regulatory sequence motifs predicted to bind putative co-regulatory transcription factors define SOX10-activated and SOX10-repressed target genes. Collectively, these findings uncover a central role of SOX10 as a global regulator of gene expression in the melanocyte lineage by targeting diverse regulatory pathways.
|
19
|
Abstract
It can be convenient to think of the genome as simply a string of nucleotides, the linear order of which encodes an organism's genetic blueprint. However, the genome does not exist as a linear entity within cells where this blueprint is actually utilized. Inside the nucleus, the genome is organized in three-dimensional (3D) space, and lineage-specific transcriptional programs that direct stem cell fate are implemented in this native 3D context. Here, we review principles of 3D genome organization in mammalian cells. We focus on the emerging relationship between genome organization and lineage-specific transcriptional regulation, which we argue are inextricably linked.
|
20
|
A polymorphism in IRF4 affects human pigmentation through a tyrosinase-dependent MITF/TFAP2A pathway. Cell 2014; 155:1022-33. [PMID: 24267888 DOI: 10.1016/j.cell.2013.10.022] [Citation(s) in RCA: 153] [Impact Index Per Article: 15.3] [Received: 03/11/2013] [Revised: 08/19/2013] [Accepted: 10/01/2013] [Indexed: 10/26/2022]
Abstract
Sequence polymorphisms linked to human diseases and phenotypes in genome-wide association studies often affect noncoding regions. A SNP within an intron of the gene encoding Interferon Regulatory Factor 4 (IRF4), a transcription factor with no known role in melanocyte biology, is strongly associated with sensitivity of skin to sun exposure, freckles, blue eyes, and brown hair color. Here, we demonstrate that this SNP lies within an enhancer of IRF4 transcription in melanocytes. The allele associated with this pigmentation phenotype impairs binding of the TFAP2A transcription factor that, together with the melanocyte master regulator MITF, regulates activity of the enhancer. Assays in zebrafish and mice reveal that IRF4 cooperates with MITF to activate expression of Tyrosinase (TYR), an essential enzyme in melanin synthesis. Our findings provide a clear example of a noncoding polymorphism that affects a phenotype by modulating a developmental gene regulatory network.
|
21
|
Integration of ChIP-seq and machine learning reveals enhancers and a predictive regulatory sequence vocabulary in melanocytes. Genome Res 2012; 22:2290-301. [PMID: 23019145 PMCID: PMC3483558 DOI: 10.1101/gr.139360.112] [Citation(s) in RCA: 56] [Impact Index Per Article: 4.7] [Indexed: 01/08/2023]
Abstract
We take a comprehensive approach to the study of regulatory control of gene expression in melanocytes that proceeds from large-scale enhancer discovery facilitated by ChIP-seq; to rigorous validation in silico, in vitro, and in vivo; and finally to the use of machine learning to elucidate a regulatory vocabulary with genome-wide predictive power. We identify 2489 putative melanocyte enhancer loci in the mouse genome by ChIP-seq for EP300 and H3K4me1. We demonstrate that these putative enhancers are evolutionarily constrained, enriched for sequence motifs predicted to bind key melanocyte transcription factors, located near genes relevant to melanocyte biology, and capable of driving reporter gene expression in melanocytes in culture (86%; 43/50) and in transgenic zebrafish (70%; 7/10). Next, using the sequences of these putative enhancers as a training set for a supervised machine learning algorithm, we develop a vocabulary of 6-mers predictive of melanocyte enhancer function. Lastly, we demonstrate that this vocabulary has genome-wide predictive power in both the mouse and human genomes. This study provides deep insight into the regulation of gene expression in melanocytes and demonstrates a powerful approach to the investigation of regulatory sequences that can be applied to other cell types.
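To make the idea of a 6-mer vocabulary concrete, the sketch below turns a DNA sequence into 6-mer counts and then into a feature vector over a chosen vocabulary, which is the kind of input a supervised classifier could be trained on. The sequence, the vocabulary, and the function names are made up for illustration; the abstract does not say which learning algorithm or feature encoding the authors used.

```python
from collections import Counter

def kmer_counts(seq, k=6):
    """Count every overlapping k-mer in seq, skipping windows with non-ACGT characters."""
    seq = seq.upper()
    counts = Counter()
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if set(kmer) <= set("ACGT"):
            counts[kmer] += 1
    return counts

def feature_vector(seq, vocabulary, k=6):
    """Represent seq as counts of each k-mer in a fixed, ordered vocabulary."""
    counts = kmer_counts(seq, k)
    return [counts[kmer] for kmer in vocabulary]

# Hypothetical 10-bp sequence and a three-word vocabulary.
toy_seq = "ACGTACGTAC"
toy_vocab = ["ACGTAC", "CGTACG", "TTTTTT"]
print(feature_vector(toy_seq, toy_vocab))  # [2, 1, 0]
```

In practice each putative enhancer and each background sequence would get one such vector, and the weights a classifier assigns to individual 6-mers would form the predictive vocabulary.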
|
22
|
SOX10 directly modulates ERBB3 transcription via an intronic neural crest enhancer. BMC Dev Biol 2011; 11:40. [PMID: 21672228 PMCID: PMC3124416 DOI: 10.1186/1471-213x-11-40] [Citation(s) in RCA: 46] [Impact Index Per Article: 3.5] [Received: 05/25/2011] [Accepted: 06/14/2011] [Indexed: 12/21/2022]
Abstract
BACKGROUND: The ERBB3 gene is essential for the proper development of the neural crest (NC) and its derivative populations such as Schwann cells. As with all cell fate decisions, transcriptional regulatory control plays a significant role in the progressive restriction and specification of NC-derived lineages during development. However, little is known about the sequences mediating transcriptional regulation of ERBB3 or the factors that bind them. RESULTS: In this study we identified three transcriptional enhancers at the ERBB3 locus and evaluated their regulatory potential in vitro in NC-derived cell types and in vivo in transgenic zebrafish. One enhancer, termed ERBB3_MCS6, which lies within the first intron of ERBB3, directs the highest reporter expression in vitro and also demonstrates epigenetic marks consistent with enhancer activity. We identify a consensus SOX10 binding site within ERBB3_MCS6 and demonstrate, in vitro, its necessity and sufficiency for the activity of this enhancer. Additionally, we demonstrate that transcription from the endogenous Erbb3 locus is dependent on Sox10. Further, we demonstrate in vitro that Sox10 physically interacts with ERBB3_MCS6. Consistent with its in vitro activity, we also show that ERBB3_MCS6 drives reporter expression in NC cells and a subset of their derivative lineages in vivo in zebrafish in a manner consistent with erbb3b expression. We also demonstrate, using morpholino analysis, that Sox10 is necessary for ERBB3_MCS6 expression in vivo in zebrafish. CONCLUSIONS: Taken collectively, our data suggest that ERBB3 may be directly regulated by SOX10, and that this control may in part be facilitated by ERBB3_MCS6.
|
23
|
Oligodendroglial and pan-neural crest expression of Cre recombinase directed by Sox10 enhancer. Genesis 2010; 47:765-70. [PMID: 19830815 DOI: 10.1002/dvg.20559] [Citation(s) in RCA: 21] [Impact Index Per Article: 1.5] [Indexed: 01/01/2023]
Abstract
Utilizing a recently identified Sox10 distal enhancer directing Cre expression, we report S4F:Cre, a transgenic mouse line capable of inducing recombination in oligodendroglia and all examined neural crest derived tissues. When assayed using R26R:LacZ reporter mice, expression was detected in neural crest derived tissues including the forming facial skeleton, dorsal root ganglia, sympathetic ganglia, enteric nervous system, aortae, and melanoblasts, consistent with Sox10 expression. LacZ reporter expression was also detected in non-neural crest derived tissues including the oligodendrocytes and the ventral neural tube. This line provides appreciable differences in Cre expression pattern from other transgenic mouse lines that mark neural crest populations, including additional populations defined by the expression of other SoxE proteins. The S4F:Cre transgenic line will thus serve as a powerful tool for lineage tracing, gene function characterization, and genome manipulation in these populations.
|