1
Ertl HA, Bayala EX, Siddiq MA, Wittkopp PJ. Divergence of Grainy head affects chromatin accessibility, gene expression, and embryonic viability in Drosophila melanogaster. bioRxiv 2024:2024.04.07.588430. [PMID: 38645200 PMCID: PMC11030446 DOI: 10.1101/2024.04.07.588430] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/23/2024]
Abstract
Pioneer factors are critical for gene regulation and development because they bind chromatin and make DNA more accessible for binding by other transcription factors. The pioneer factor Grainy head (Grh) is present across metazoans and has been shown to retain a role in epithelium development in fruit flies, nematodes, and mice despite extensive divergence in both amino acid sequence and length. Here, we investigate the evolution of Grh function by comparing the effects of the fly (Drosophila melanogaster) and worm (Caenorhabditis elegans) Grh orthologs on chromatin accessibility, gene expression, embryonic development, and viability in transgenic D. melanogaster. We found that the Caenorhabditis elegans ortholog rescued cuticle development but not full embryonic viability in Drosophila melanogaster grh null mutants. At the molecular level, the C. elegans ortholog only partially rescued chromatin accessibility and gene expression. Divergence in the disordered N-terminus of the Grh protein contributes to these differences in embryonic viability and molecular phenotypes. These data show how pioneer factors can diverge in sequence and function at the molecular level while retaining conserved developmental functions at the organismal level.
Affiliation(s)
- Henry A. Ertl
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Erick X. Bayala
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Mohammad A. Siddiq
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Patricia J. Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109, USA
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109, USA
2
Shaw DE, Naftaly AS, White MA. Positive Selection Drives cis-regulatory Evolution Across the Threespine Stickleback Y Chromosome. Mol Biol Evol 2024; 41:msae020. [PMID: 38306314 PMCID: PMC10899008 DOI: 10.1093/molbev/msae020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 12/31/2023] [Accepted: 01/24/2024] [Indexed: 02/04/2024]
Abstract
Allele-specific gene expression evolves rapidly on heteromorphic sex chromosomes. Over time, the accumulation of mutations on the Y chromosome leads to widespread loss of gametolog expression, relative to the X chromosome. It remains unclear if expression evolution on degrading Y chromosomes is primarily driven by mutations that accumulate through processes of selective interference, or if positive selection can also favor the down-regulation of coding regions on the Y chromosome that contain deleterious mutations. Identifying the relative rates of cis-regulatory sequence evolution across Y chromosomes has been challenging due to the limited number of reference assemblies. The threespine stickleback (Gasterosteus aculeatus) Y chromosome is an excellent model to identify how regulatory mutations accumulate on Y chromosomes due to its intermediate state of divergence from the X chromosome. A large number of Y-linked gametologs still exist across 3 differently aged evolutionary strata to test these hypotheses. We found that putative enhancer regions on the Y chromosome exhibited elevated substitution rates and decreased polymorphism when compared to nonfunctional sites, like intergenic regions and synonymous sites. This suggests that many cis-regulatory regions are under positive selection on the Y chromosome. This divergence was correlated with X-biased gametolog expression, indicating the loss of expression from the Y chromosome may be favored by selection. Our findings provide evidence that Y-linked cis-regulatory regions exhibit signs of positive selection quickly after the suppression of recombination and allow comparisons with recent theoretical models that suggest the rapid divergence of regulatory regions may be favored to mask deleterious mutations on the Y chromosome.
Affiliation(s)
- Daniel E Shaw
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
- Michael A White
- Department of Genetics, University of Georgia, Athens, GA 30602, USA
3
Devens HR, Davidson PL, Byrne M, Wray GA. Hybrid Epigenomes Reveal Extensive Local Genetic Changes to Chromatin Accessibility Contribute to Divergence in Embryonic Gene Expression Between Species. Mol Biol Evol 2023; 40:msad222. [PMID: 37823438 PMCID: PMC10638671 DOI: 10.1093/molbev/msad222] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/06/2023] [Revised: 06/14/2023] [Accepted: 07/27/2023] [Indexed: 10/13/2023]
Abstract
Chromatin accessibility plays an important role in shaping gene expression, yet little is known about the genetic and molecular mechanisms that influence the evolution of chromatin configuration. Both local (cis) and distant (trans) genetic influences can in principle influence chromatin accessibility and are based on distinct molecular mechanisms. We, therefore, sought to characterize the role that each of these plays in altering chromatin accessibility in 2 closely related sea urchin species. Using hybrids of Heliocidaris erythrogramma and Heliocidaris tuberculata, and adapting a statistical framework previously developed for the analysis of cis and trans influences on the transcriptome, we examined how these mechanisms shape the regulatory landscape at 3 important developmental stages, and compared our results to similar analyses of the transcriptome. We found extensive cis- and trans-based influences on evolutionary changes in chromatin, with cis effects generally larger in effect. Evolutionary changes in accessibility and gene expression are correlated, especially when expression has a local genetic basis. Maternal influences appear to have more of an effect on chromatin accessibility than on gene expression, persisting well past the maternal-to-zygotic transition. Chromatin accessibility near gene regulatory network genes appears to be distinctly regulated, with trans factors appearing to play an outsized role in the configuration of chromatin near these genes. Together, our results represent the first attempt to quantify cis and trans influences on evolutionary divergence in chromatin configuration in an outbred natural study system and suggest that chromatin regulation is more genetically complex than was previously appreciated.
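As a rough illustration of the kind of cis/trans framework the abstract refers to, the sketch below uses one common formulation from transcriptome studies: the between-species ratio measured in the parents reflects total divergence, the allelic ratio within the hybrid reflects the cis component, and the difference is attributed to trans. This is an assumption for illustration only; the function name, the pseudocount, and the read counts are hypothetical and are not taken from the paper.

```python
import math


def cis_trans_effects(parent_a, parent_b, hybrid_a, hybrid_b, pseudocount=1.0):
    """Split divergence at one accessible site into cis and trans parts.

    parent_a / parent_b: read counts at the site in each parental species.
    hybrid_a / hybrid_b: allele-specific read counts in the F1 hybrid.
    The pseudocount (a hypothetical choice) avoids division by zero.
    """
    # Total divergence: log2 ratio between the two parental species.
    total = math.log2((parent_a + pseudocount) / (parent_b + pseudocount))
    # cis component: both alleles share one trans environment in the hybrid,
    # so any remaining allelic imbalance is attributed to local sequence.
    cis = math.log2((hybrid_a + pseudocount) / (hybrid_b + pseudocount))
    # trans component: whatever the hybrid allelic ratio does not explain.
    trans = total - cis
    return {"total": total, "cis": cis, "trans": trans}


# Hypothetical counts for a single peak.
effects = cis_trans_effects(parent_a=7, parent_b=1, hybrid_a=3, hybrid_b=1)
print(effects)
```

In practice such point estimates would be paired with statistical tests across replicates before a site is classified as cis, trans, or both; that step is omitted here.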
Affiliation(s)
- Maria Byrne
- School of Medical Science, The University of Sydney, Sydney, New South Wales, Australia
- School of Life and Environmental Science, The University of Sydney, Sydney, New South Wales, Australia
- Gregory A Wray
- Department of Biology, Duke University, Durham, NC, USA
- Center for Genomic and Computational Biology, Duke University, Durham, NC, USA
4
Ling L, Mühling B, Jaenichen R, Gompel N. Increased chromatin accessibility promotes the evolution of a transcriptional silencer in Drosophila. Science Advances 2023; 9:eade6529. [PMID: 36800429 PMCID: PMC9937571 DOI: 10.1126/sciadv.ade6529] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 08/30/2022] [Accepted: 01/17/2023] [Indexed: 06/18/2023]
Abstract
The loss of discrete morphological traits, the most common evolutionary transition, is typically driven by changes in developmental gene expression. Mutations accumulating in regulatory elements of these genes can disrupt DNA binding sites for transcription factors patterning their spatial expression, or delete entire enhancers. Regulatory elements, however, may be silenced through changes in chromatin accessibility or the emergence of repressive elements. Here, we show that increased chromatin accessibility at the gene yellow, combined with the gain of a repressor site, underlies the loss of a wing spot pigmentation pattern in a Drosophila species. The gain of accessibility of this repressive element is regulated by E93, a transcription factor governing the progress of metamorphosis. This convoluted evolutionary scenario contrasts with the parsimonious mutational paths generally envisioned and often documented for morphological losses. It illustrates how evolutionary changes in chromatin accessibility may directly contribute to morphological diversification.
5
Devens HR, Davidson PL, Byrne M, Wray GA. Hybrid epigenomes reveal extensive local genetic changes to chromatin accessibility contribute to divergence in embryonic gene expression between species. bioRxiv 2023:2023.01.04.522781. [PMID: 36711588 PMCID: PMC9881966 DOI: 10.1101/2023.01.04.522781] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/06/2023]
Abstract
Chromatin accessibility plays an important role in shaping gene expression patterns across development and evolution; however, little is known about the genetic and molecular mechanisms that influence chromatin configuration itself. Because cis and trans influences can both theoretically influence the accessibility of the epigenome, we sought to better characterize the role that both of these mechanisms play in altering chromatin accessibility in two closely related sea urchin species. Using hybrids of the two species, and adapting a statistical framework previously developed for the analysis of cis and trans influences on the transcriptome, we examined how these mechanisms shape the regulatory landscape at three important developmental stages, and compared our results to similar patterns in the transcriptome. We found extensive cis- and trans-based influences on evolutionary changes in chromatin, with cis effects slightly more numerous and larger in effect. Genetic mechanisms influencing gene expression and chromatin configuration are correlated, but differ in several important ways. Maternal influences also appear to have more of an effect on chromatin accessibility than on gene expression, persisting well past the maternal-to-zygotic transition. Furthermore, chromatin accessibility near GRN genes appears to be regulated differently than the rest of the epigenome, and indicates that trans factors may play an outsized role in the configuration of chromatin near these genes. Together, our results represent the first attempt to quantify cis and trans influences on evolutionary divergence in chromatin configuration in an outbred natural study system, and suggest that the regulation of chromatin is more genetically complex than was previously appreciated.
Affiliation(s)
- Maria Byrne
- School of Medical Science, The University of Sydney, NSW 2006, Australia
- School of Life and Environmental Science, The University of Sydney, NSW 2006, Australia
- Gregory A. Wray
- Department of Biology, Duke University, Durham, NC 27708, USA
- Center for Genomic and Computational Biology, Duke University, Durham, NC 27708, USA
6
Ertl HA, Hill MS, Wittkopp PJ. Differential Grainy head binding correlates with variation in chromatin structure and gene expression in Drosophila melanogaster. BMC Genomics 2022; 23:854. [PMID: 36575386 PMCID: PMC9795675 DOI: 10.1186/s12864-022-09082-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/24/2022] [Accepted: 12/14/2022] [Indexed: 12/28/2022]
Abstract
Phenotypic evolution is often caused by variation in gene expression resulting from altered gene regulatory mechanisms. Genetic variation affecting chromatin remodeling has been identified as a potential source of variable gene expression; however, the roles of specific chromatin remodeling factors remain unclear. Here, we address this knowledge gap by examining the relationship between variation in gene expression, variation in chromatin structure, and variation in binding of the pioneer factor Grainy head between imaginal wing discs of two divergent strains of Drosophila melanogaster and their F1 hybrid. We find that (1) variation in Grainy head binding is mostly due to sequence changes that act in cis but are located outside of the canonical Grainy head binding motif, (2) variation in Grainy head binding correlates with changes in chromatin accessibility, and (3) this variation in chromatin accessibility, coupled with variation in Grainy head binding, correlates with variation in gene expression in some cases but not others. Interactions among these three molecular layers are complex, but these results suggest that genetic variation affecting the binding of pioneer factors contributes to variation in chromatin remodeling and the evolution of gene expression.
Affiliation(s)
- Henry A. Ertl
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109 USA
- Mark S. Hill
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109 USA
- Present address: Cancer Evolution and Genome Instability Laboratory, University College London Cancer Institute and The Francis Crick Institute, London, UK
- Patricia J. Wittkopp
- Department of Ecology and Evolutionary Biology, University of Michigan, Ann Arbor, MI 48109 USA
- Department of Molecular, Cellular, and Developmental Biology, University of Michigan, Ann Arbor, MI 48109 USA
7
Davidson PL, Byrne M, Wray GA. Evolutionary Changes in the Chromatin Landscape Contribute to Reorganization of a Developmental Gene Network During Rapid Life History Evolution in Sea Urchins. Mol Biol Evol 2022; 39:msac172. [PMID: 35946348 PMCID: PMC9435058 DOI: 10.1093/molbev/msac172] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Indexed: 11/27/2022]
Abstract
Chromatin configuration is highly dynamic during embryonic development in animals, exerting an important point of control in transcriptional regulation. Yet there exists remarkably little information about the role of evolutionary changes in chromatin configuration to the evolution of gene expression and organismal traits. Genome-wide assays of chromatin configuration, coupled with whole-genome alignments, can help address this gap in knowledge in several ways. In this study we present a comparative analysis of regulatory element sequences and accessibility throughout embryogenesis in three sea urchin species with divergent life histories: a lecithotroph Heliocidaris erythrogramma, a closely related planktotroph H. tuberculata, and a distantly related planktotroph Lytechinus variegatus. We identified distinct epigenetic and mutational signatures of evolutionary modifications to the function of putative cis-regulatory elements in H. erythrogramma that have accumulated nonuniformly throughout the genome, suggesting selection, rather than drift, underlies many modifications associated with the derived life history. Specifically, regulatory elements composing the sea urchin developmental gene regulatory network are enriched for signatures of positive selection and accessibility changes which may function to alter binding affinity and access of developmental transcription factors to these sites. Furthermore, regulatory element changes often correlate with divergent expression patterns of genes involved in cell type specification, morphogenesis, and development of other derived traits, suggesting these evolutionary modifications have been consequential for phenotypic evolution in H. erythrogramma. Collectively, our results demonstrate that selective pressures imposed by changes in developmental life history rapidly reshape the cis-regulatory landscape of core developmental genes to generate novel traits and embryonic programs.
Affiliation(s)
- Maria Byrne
- School of Life and Environmental Sciences, University of Sydney, Sydney, NSW, Australia
8
Naftaly AS, Pau S, White MA. Long-read RNA sequencing reveals widespread sex-specific alternative splicing in threespine stickleback fish. Genome Res 2021; 31:1486-1497. [PMID: 34131005 PMCID: PMC8327910 DOI: 10.1101/gr.274282.120] [Citation(s) in RCA: 19] [Impact Index Per Article: 4.8] [Received: 11/11/2020] [Accepted: 06/15/2021] [Indexed: 01/07/2023]
Abstract
Alternate isoforms are important contributors to phenotypic diversity across eukaryotes. Although short-read RNA-sequencing has increased our understanding of isoform diversity, it is challenging to accurately detect full-length transcripts, preventing the identification of many alternate isoforms. Long-read sequencing technologies have made it possible to sequence full-length alternative transcripts, accurately characterizing alternative splicing events, alternate transcription start and end sites, and differences in UTR regions. Here, we use Pacific Biosciences (PacBio) long-read RNA-sequencing (Iso-Seq) to examine the transcriptomes of five organs in threespine stickleback fish (Gasterosteus aculeatus), a widely used genetic model species. The threespine stickleback fish has a refined genome assembly in which gene annotations are based on short-read RNA sequencing and predictions from coding sequence of other species. This suggests some of the existing annotations may be inaccurate or alternative transcripts may not be fully characterized. Using Iso-Seq we detected thousands of novel isoforms, indicating many isoforms are absent in the current Ensembl gene annotations. In addition, we refined many of the existing annotations within the genome. We noted many improperly positioned transcription start sites that were refined with long-read sequencing. The Iso-Seq-predicted transcription start sites were more accurate and verified through ATAC-seq. We also detected many alternative splicing events between sexes and across organs. We found a substantial number of genes in both somatic and gonadal samples that had sex-specific isoforms. Our study highlights the power of long-read sequencing to study the complexity of transcriptomes, greatly improving genomic resources for the threespine stickleback fish.
Affiliation(s)
- Alice S Naftaly
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
- Shana Pau
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
- Department of Biology, University of Texas Arlington, Arlington, Texas 76019, USA
- Michael A White
- Department of Genetics, University of Georgia, Athens, Georgia 30602, USA
9
Peng PC, Khoueiry P, Girardot C, Reddington JP, Garfield DA, Furlong EEM, Sinha S. The Role of Chromatin Accessibility in cis-Regulatory Evolution. Genome Biol Evol 2020; 11:1813-1828. [PMID: 31114856 PMCID: PMC6601868 DOI: 10.1093/gbe/evz103] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Accepted: 05/13/2019] [Indexed: 02/07/2023]
Abstract
Transcription factor (TF) binding is determined by sequence as well as chromatin accessibility. Although the role of accessibility in shaping TF-binding landscapes is well recorded, its role in evolutionary divergence of TF binding, which in turn can alter cis-regulatory activities, is not well understood. In this work, we studied the evolution of genome-wide binding landscapes of five major TFs in the core network of mesoderm specification, between Drosophila melanogaster and Drosophila virilis, and examined its relationship to accessibility and sequence-level changes. We generated chromatin accessibility data from three important stages of embryogenesis in both Drosophila melanogaster and Drosophila virilis and recorded conservation and divergence patterns. We then used multivariable models to correlate accessibility and sequence changes to TF-binding divergence. We found that accessibility changes can in some cases, for example, for the master regulator Twist and for earlier developmental stages, more accurately predict binding change than is possible using TF-binding motif changes between orthologous enhancers. Accessibility changes also explain a significant portion of the codivergence of TF pairs. We noted that accessibility and motif changes offer complementary views of the evolution of TF binding and developed a combined model that captures the evolutionary data much more accurately than either view alone. Finally, we trained machine learning models to predict enhancer activity from TF binding and used these functional models to argue that motif and accessibility-based predictors of TF-binding change can substitute for experimentally measured binding change, for the purpose of predicting evolutionary changes in enhancer activity.
Affiliation(s)
- Pei-Chen Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign
- Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA
- Pierre Khoueiry
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- American University of Beirut (AUB), Department of Biochemistry and Molecular Genetics, Beirut, Lebanon
- Charles Girardot
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- James P Reddington
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- David A Garfield
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- IRI-Life Sciences, Humboldt Universität zu Berlin, Berlin, Germany
- Eileen E M Furlong
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign
- Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign
10
Umeyama T, Ito T. DMS-Seq for In Vivo Genome-wide Mapping of Protein-DNA Interactions and Nucleosome Centers. Cell Rep 2018; 21:289-300. [PMID: 28978481 DOI: 10.1016/j.celrep.2017.09.035] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 04/24/2017] [Revised: 07/31/2017] [Accepted: 09/08/2017] [Indexed: 01/05/2023]
Abstract
Protein-DNA interactions provide the basis for chromatin structure and gene regulation. Comprehensive identification of protein-occupied sites is thus vital to an in-depth understanding of genome function. Dimethyl sulfate (DMS) is a chemical probe that has long been used to detect footprints of DNA-bound proteins in vitro and in vivo. Here, we describe a genomic footprinting method, dimethyl sulfate sequencing (DMS-seq), which exploits the cell-permeable nature of DMS to obviate the need for nuclear isolation. This feature makes DMS-seq simple in practice and removes the potential risk of protein re-localization during nuclear isolation. DMS-seq successfully detects transcription factors bound to cis-regulatory elements and non-canonical chromatin particles in nucleosome-free regions. Furthermore, an unexpected preference of DMS confers on DMS-seq a unique potential to directly detect nucleosome centers without using genetic manipulation. We expect that DMS-seq will serve as a characteristic method for genome-wide interrogation of in vivo protein-DNA interactions.
Affiliation(s)
- Taichi Umeyama
- Department of Biochemistry, Kyushu University Graduate School of Medical Sciences, Fukuoka 812-8582, Japan; Core Research for Evolutional Science and Technology (CREST), Japan Agency for Medical Research and Development (AMED), Tokyo 100-0004, Japan; Laboratory for Microbiome Sciences, RIKEN Center for Integrative Medical Sciences, Yokohama 230-0045, Japan
- Takashi Ito
- Department of Biochemistry, Kyushu University Graduate School of Medical Sciences, Fukuoka 812-8582, Japan; Core Research for Evolutional Science and Technology (CREST), Japan Agency for Medical Research and Development (AMED), Tokyo 100-0004, Japan.
11
Segorbe D, Wilkinson D, Mizeranschi A, Hughes T, Aaløkken R, Váchová L, Palková Z, Gilfillan GD. An optimized FAIRE procedure for low cell numbers in yeast. Yeast 2018; 35:507-512. [PMID: 29577419 PMCID: PMC6099244 DOI: 10.1002/yea.3316] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 12/05/2017] [Revised: 02/24/2018] [Accepted: 03/17/2018] [Indexed: 11/22/2022]
Abstract
We report an optimized low‐input FAIRE‐seq (Formaldehyde‐Assisted Isolation of Regulatory Elements‐sequencing) procedure to assay chromatin accessibility from limited amounts of yeast cells. We demonstrate that the method performs well on as little as 4 mg of cells scraped directly from a few colonies. Sensitivity, specificity and reproducibility of the scaled‐down method are comparable with those of regular, higher input amounts, and allow the use of 100‐fold fewer cells than existing procedures. The method enables epigenetic analysis of chromatin structure without the need for cell multiplication of exponentially growing cells in liquid culture, thus opening the possibility of studying colony cell subpopulations, or those that can be isolated directly from environmental samples.
Affiliation(s)
- David Segorbe
- Faculty of Science, Charles University, BIOCEV, 252 50 Vestec, Czech Republic
- Derek Wilkinson
- Faculty of Science, Charles University, BIOCEV, 252 50 Vestec, Czech Republic
- Timothy Hughes
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0450, Oslo, Norway
- Ragnhild Aaløkken
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0450, Oslo, Norway
- Libuše Váchová
- Institute of Microbiology of the Czech Academy of Sciences, BIOCEV, 252 50 Vestec, Czech Republic
- Zdena Palková
- Faculty of Science, Charles University, BIOCEV, 252 50 Vestec, Czech Republic
- Gregor D Gilfillan
- Department of Medical Genetics, Oslo University Hospital and University of Oslo, 0450, Oslo, Norway
12
Abstract
BACKGROUND: Gene expression is a key intermediate level through which genotypes lead to a particular trait. Gene expression is affected by various factors, including genotypes of genetic variants. With the aim of delineating the genetic impact on gene expression, we build a deep auto-encoder model to assess how well genetic variants contribute to gene expression changes. This new deep learning model is a regression-based predictive model based on the MultiLayer Perceptron and Stacked Denoising Auto-encoder (MLP-SAE). The model is trained using a stacked denoising auto-encoder for feature selection and a multilayer perceptron framework for backpropagation. We further improve the model by introducing dropout to prevent overfitting and improve performance. RESULTS: To demonstrate the usage of this model, we apply MLP-SAE to a real genomic dataset with genotypes and gene expression profiles measured in yeast. Our results show that the MLP-SAE model with dropout outperforms other models, including Lasso, Random Forests, and the MLP-SAE model without dropout. Using the MLP-SAE model with dropout, we show that gene expression quantifications predicted by the model solely from genotypes align well with true gene expression patterns. CONCLUSION: We provide a deep auto-encoder model for predicting gene expression from SNP genotypes. This study demonstrates that deep learning is appropriate for tackling another genomic problem, i.e., building predictive models to understand genotypes' contribution to gene expression. With the emerging availability of richer genomic data, we anticipate that deep learning models will play a bigger role in modeling and interpreting genomics.
Affiliation(s)
- Rui Xie
- Department of Computer Science, University of Missouri at Columbia, Columbia, MO USA
- Jia Wen
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
- Andrew Quitadamo
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
- Jianlin Cheng
- Department of Computer Science, University of Missouri at Columbia, Columbia, MO USA
- Xinghua Shi
- Department of Bioinformatics and Genomics, College of Computing and Informatics, University of North Carolina at Charlotte, University City Blvd, Charlotte, NC USA
13
Sarda S, Hannenhalli S. High-Throughput Identification of Cis-Regulatory Rewiring Events in Yeast. Mol Biol Evol 2015; 32:3047-63. [PMID: 26399482 DOI: 10.1093/molbev/msv203] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 12/13/2022]
Abstract
A coregulated module of genes ("regulon") can have evolutionarily conserved expression patterns and yet have diverged upstream regulators across species. For instance, the ribosomal genes regulon is regulated by the transcription factor (TF) TBF1 in Candida albicans, while in Saccharomyces cerevisiae it is regulated by RAP1. Only a handful of such rewiring events have been established, and the prevalence or conditions conducive to such events are not well known. Here, we develop a novel probabilistic scoring method to comprehensively screen for regulatory rewiring within regulons across 23 yeast species. Investigation of 1,713 regulons and 176 TFs yielded 5,353 significant rewiring events at 5% false discovery rate (FDR). Besides successfully recapitulating known rewiring events, our analyses also suggest TF candidates for certain processes reported to be under distinct regulatory controls in S. cerevisiae and C. albicans, for which the implied regulators are not known: 1) Oxidative stress response (Sc-MSN2 to Ca-FKH2) and 2) nutrient modulation (Sc-RTG1 to Ca-GCN4/Ca-UME6). Furthermore, a stringent screen to detect TF rewiring at individual genes identified 1,446 events at 10% FDR. Overall, these events are supported by strong coexpression between the predicted regulator and its target gene(s) in a species-specific fashion (>50-fold). Independent functional analyses of rewiring TF pairs revealed greater functional interactions and shared biological processes between them (P = 1 × 10⁻³). Our study represents the first comprehensive assessment of regulatory rewiring; with a novel approach that has generated a unique high-confidence resource of several specific events, suggesting that evolutionary rewiring is relatively frequent and may be a significant mechanism of regulatory innovation.
Affiliation(s)
- Shrutii Sarda
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park
- Sridhar Hannenhalli
- Center for Bioinformatics and Computational Biology, University of Maryland, College Park
|
14
|
Ramachandran P, Palidwor GA, Perkins TJ. BIDCHIPS: bias decomposition and removal from ChIP-seq data clarifies true binding signal and its functional correlates. Epigenetics Chromatin 2015; 8:33. [PMID: 26388941 PMCID: PMC4574076 DOI: 10.1186/s13072-015-0028-2] [Received: 06/24/2015] [Accepted: 09/07/2015]
Abstract
Background: Unraveling transcriptional regulatory networks is a central problem in molecular biology and, in this quest, chromatin immunoprecipitation and sequencing (ChIP-seq) technology has given us the unprecedented ability to identify sites of protein-DNA binding and histone modification genome wide. However, multiple systemic and procedural biases hinder harnessing the full potential of this technology. Previous studies have addressed this problem, but a thorough characterization of different, interacting biases on ChIP-seq signals is still lacking.
Results: Here, we present a novel framework where the genome-wide ChIP-seq signal is viewed as being quantifiably influenced by different, measurable sources of bias, which can then be computationally subtracted away. We use a compendium of 123 human ENCODE ChIP-seq datasets to build regression models that tell us how much of a ChIP-seq signal can be attributed to mappability, GC-content, chromatin accessibility, and factors represented in input DNA and IgG controls. When we use the model to separate out these non-binding influences from the ChIP-seq signal, we obtain a purified signal that associates better to TF-DNA-binding motifs than do other measures of peak significance. We also carry out a multiscale analysis that reveals how ChIP-seq signal biases differ across different scales. Finally, we investigate previously reported associations between gene expression and ChIP-seq signals at transcription start sites. We show that our model can be used to discriminate ChIP-seq signals that are truly related to gene expression from those that are merely correlated by virtue of bias, in particular chromatin accessibility bias, which shows up in ChIP-seq signals and also relates to gene expression.
Conclusions: Our study provides new insights into the behavior of ChIP-seq signal biases and proposes a novel mitigation framework that improves results compared to existing techniques. With ChIP-seq now being the central technology for studying transcriptional regulation, it is most crucial to accurately characterize, quantify, and adjust for the genome-wide effects of biases affecting ChIP-seq. Our study also emphasizes that properly accounting for confounders in ChIP-seq data is of paramount importance for obtaining biologically accurate insights into the workings of the complex regulatory mechanisms in living organisms. R and MATLAB packages implementing the framework can be obtained from http://www.perkinslab.ca/Software.html.
Electronic supplementary material: The online version of this article (doi:10.1186/s13072-015-0028-2) contains supplementary material, which is available to authorized users.
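The idea of regressing a ChIP-seq signal on measurable bias covariates and subtracting the fitted contribution can be illustrated with a minimal sketch. This is not the BIDCHIPS implementation: it assumes a plain ordinary-least-squares fit, and the function names (`fit_bias_model`, `remove_bias`) and the toy covariates are hypothetical, chosen only for illustration.

```python
from typing import List, Sequence


def solve_linear_system(a: Sequence[Sequence[float]], b: Sequence[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Swap in the row with the largest pivot for numerical stability.
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    # Back-substitution on the upper-triangular system.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def fit_bias_model(covariates: Sequence[Sequence[float]], signal: Sequence[float]) -> List[float]:
    """Hypothetical helper: OLS fit of signal ~ intercept + bias covariates.

    Returns [intercept, coef_1, ..., coef_p] from the normal equations.
    """
    design = [[1.0] + list(row) for row in covariates]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(design, signal)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def remove_bias(covariates: Sequence[Sequence[float]], signal: Sequence[float],
                coefs: Sequence[float]) -> List[float]:
    """Hypothetical helper: subtract the fitted bias, leaving the residual signal."""
    residuals = []
    for row, y in zip(covariates, signal):
        predicted = coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))
        residuals.append(y - predicted)
    return residuals


# Toy data (made up): each row is (GC content, mappability) for one genomic bin.
toy_covariates = [[0.1, 1.0], [0.4, 0.8], [0.5, 0.2], [0.9, 0.6]]
toy_signal = [2.0 + 3.0 * gc + 0.5 * mp for gc, mp in toy_covariates]
toy_coefs = fit_bias_model(toy_covariates, toy_signal)
toy_purified = remove_bias(toy_covariates, toy_signal, toy_coefs)
```

In this toy case the signal is built entirely from the two covariates, so the residual is essentially zero; with real data, whatever remains after subtraction would be the candidate binding signal.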
Affiliation(s)
- Parameswaran Ramachandran
- Regenerative Medicine Program, Ottawa Hospital Research Institute, K1H 8L6 Ottawa, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, K1H 8M5 Ottawa, Canada
- Gareth A Palidwor
- Regenerative Medicine Program, Ottawa Hospital Research Institute, K1H 8L6 Ottawa, Canada
- Theodore J Perkins
- Regenerative Medicine Program, Ottawa Hospital Research Institute, K1H 8L6 Ottawa, Canada; Department of Biochemistry, Microbiology and Immunology, Faculty of Medicine, University of Ottawa, K1H 8M5 Ottawa, Canada
|
15
|
Naval-Sánchez M, Potier D, Hulselmans G, Christiaens V, Aerts S. Identification of Lineage-Specific Cis-Regulatory Modules Associated with Variation in Transcription Factor Binding and Chromatin Activity Using Ornstein-Uhlenbeck Models. Mol Biol Evol 2015; 32:2441-55. [PMID: 25944915 PMCID: PMC4540964 DOI: 10.1093/molbev/msv107]
Abstract
Scoring the impact of noncoding variation on the function of cis-regulatory regions, on their chromatin state, and on the qualitative and quantitative expression levels of target genes is a fundamental problem in evolutionary genomics. A particular challenge is how to model the divergence of quantitative traits and to identify relationships between the changes across the different levels of the genome, the chromatin activity landscape, and the transcriptome. Here, we examine the use of the Ornstein-Uhlenbeck (OU) model to infer selection at the level of predicted cis-regulatory modules (CRMs), and link these with changes in transcription factor binding and chromatin activity. Using publicly available cross-species ChIP-Seq and STARR-Seq data we show how OU can be applied genome-wide to identify candidate transcription factors for which binding site and CRM turnover is correlated with changes in regulatory activity. Next, we profile open chromatin in the developing eye across three Drosophila species. We identify the recognition motifs of the chromatin remodelers, Trithorax-like and Grainyhead as mostly correlating with species-specific changes in open chromatin. In conclusion, we show in this study that CRM scores can be used as quantitative traits and that motif discovery approaches can be extended towards more complex models of divergence.
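For orientation, the textbook form of the Ornstein-Uhlenbeck process is sketched below. It treats a quantitative trait $X_t$ (here, by assumption, a CRM score) as drifting toward an optimum $\theta$ with strength $\alpha$ while being perturbed by noise of magnitude $\sigma$. The symbols are the conventional ones and are not taken from the abstract; the parameterization the authors actually fit may differ.

```latex
dX_t = \alpha\,(\theta - X_t)\,dt + \sigma\,dW_t,
\qquad
\operatorname{Var}(X_\infty) = \frac{\sigma^{2}}{2\alpha}
```

A large $\alpha$ keeps the trait close to $\theta$ (small stationary variance), whereas $\alpha \to 0$ recovers unconstrained Brownian motion.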
Affiliation(s)
- Marina Naval-Sánchez
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Delphine Potier
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Gert Hulselmans
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Valerie Christiaens
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
- Stein Aerts
- Laboratory of Computational Biology, Department of Human Genetics, University of Leuven, Leuven, Belgium
|
16
|
Abstract
Gene expression levels are determined by the balance between rates of mRNA transcription and decay, and genetic variation in either of these processes can result in heritable differences in transcript abundance. Although the genetics of gene expression has been a subject of intense interest, the contribution of heritable variation in mRNA decay rates to gene expression variation has received far less attention. To this end, we developed a novel statistical framework and measured allele-specific differences in mRNA decay rates in a diploid yeast hybrid created by mating two genetically diverse parental strains. We estimate that 31% of genes exhibit allelic differences in mRNA decay rates, of which 350 can be identified at a false discovery rate of 10%. Genes with significant allele-specific differences in mRNA decay rates have higher levels of polymorphism compared to other genes, with all gene regions contributing to allelic differences in mRNA decay rates. Strikingly, we find widespread evidence for compensatory evolution, such that variants influencing transcriptional initiation and decay have opposite effects, suggesting that steady-state gene expression levels are subject to pervasive stabilizing selection. Our results demonstrate that heritable differences in mRNA decay rates are widespread and are an important target for natural selection to maintain or fine-tune steady-state gene expression levels.
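The balance between transcription and decay described in this abstract can be written out as a short worked equation. This is a sketch under the common assumption of a constant synthesis rate $k_s$ and first-order decay with rate constant $k_d$; the symbols are illustrative and this is not the statistical framework the authors developed.

```latex
\frac{dm}{dt} = k_s - k_d\,m
\quad\Longrightarrow\quad
m^{*} = \frac{k_s}{k_d},
\qquad
\frac{m_1^{*}}{m_2^{*}} = \frac{k_{s,1}/k_{s,2}}{k_{d,1}/k_{d,2}}
```

Under this model, an allele with both a higher synthesis rate and a proportionally higher decay rate has a steady-state ratio $m_1^{*}/m_2^{*}$ close to 1, which is one way to picture compensatory changes that leave steady-state expression unchanged.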
Affiliation(s)
- Jennifer M Andrie
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
- Jon Wakefield
- Department of Statistics, University of Washington, Seattle, Washington 98195, USA
- Joshua M Akey
- Department of Genome Sciences, University of Washington, Seattle, Washington 98195, USA
|