1
Dresch JM, Nourie LL, Conrad RD, Carlson LT, Tchantouridze EI, Tesfaye B, Verhagen E, Gupta M, Borges-Rivera D, Drewell RA. Two coacting shadow enhancers regulate twin of eyeless expression during early Drosophila development. Genetics 2025; 229:1-43. [PMID: 39607769 PMCID: PMC11708921 DOI: 10.1093/genetics/iyae176] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/07/2024] [Accepted: 10/21/2024] [Indexed: 11/30/2024] Open
Abstract
The Drosophila PAX6 homolog twin of eyeless (toy) sits at the pinnacle of the genetic pathway controlling eye development, the retinal determination network. Expression of toy in the embryo is first detectable at cellular blastoderm stage 5 in an anterior-dorsal band in the presumptive procephalic neuroectoderm, which gives rise to the primordia of the visual system and brain. Although several maternal and gap transcription factors that generate positional information in the embryo have been implicated in controlling toy, the regulation of toy expression in the early embryo is currently not well characterized. In this study, we adopt an integrated experimental approach utilizing bioinformatics, molecular genetic testing of putative enhancers in transgenic reporter gene assays and quantitative analysis of expression patterns in the early embryo, to identify 2 novel coacting enhancers at the toy gene. In addition, we apply mathematical modeling to dissect the regulatory landscape for toy. We demonstrate that relatively simple thermodynamic-based models, incorporating only 5 TF binding sites, can accurately predict gene expression from the 2 coacting enhancers and that the HUNCHBACK TF plays a critical regulatory role through a dual-modality function as an activator and repressor. Our analysis also reveals that the molecular architecture of the 2 enhancers is very different, indicating that the underlying regulatory logic they employ is distinct.
Affiliation(s)
- Jacqueline M Dresch
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Luke L Nourie
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Regan D Conrad
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Lindsay T Carlson
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Biruck Tesfaye
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Eleanor Verhagen
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Mahima Gupta
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Diego Borges-Rivera
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
- Robert A Drewell
  - Biology Department, Clark University, 950 Main Street, Worcester, MA 01610, USA
2
Berrocal A, Lammers NC, Garcia HG, Eisen MB. Unified bursting strategies in ectopic and endogenous even-skipped expression patterns. eLife 2024; 12:RP88671. [PMID: 39651963 PMCID: PMC11627552 DOI: 10.7554/elife.88671] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/12/2024] Open
Abstract
Transcription often occurs in bursts as gene promoters switch stochastically between active and inactive states. Enhancers can dictate transcriptional activity in animal development through the modulation of burst frequency, duration, or amplitude. Previous studies observed that different enhancers can achieve a wide range of transcriptional outputs through the same strategies of bursting control. For example, in Berrocal et al., 2020, we showed that despite responding to different transcription factors, all even-skipped enhancers increase transcription by upregulating burst frequency and amplitude while burst duration remains largely constant. These shared bursting strategies suggest that a unified molecular mechanism constrains how enhancers modulate transcriptional output. Alternatively, different enhancers could have converged on the same bursting control strategy because of natural selection favoring one of these particular strategies. To distinguish between these two scenarios, we compared transcriptional bursting between endogenous and ectopic gene expression patterns. Because enhancers act under different regulatory inputs in ectopic patterns, dissimilar bursting control strategies between endogenous and ectopic patterns would suggest that enhancers adapted their bursting strategies to their trans-regulatory environment. Here, we generated ectopic even-skipped transcription patterns in fruit fly embryos and discovered that bursting strategies remain consistent in endogenous and ectopic even-skipped expression. These results provide evidence for a unified molecular mechanism shaping even-skipped bursting strategies and serve as a starting point to uncover the realm of strategies employed by other enhancers.
Affiliation(s)
- Augusto Berrocal
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, United States
- Nicholas C Lammers
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, United States
- Hernan G Garcia
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, United States
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, United States
  - Department of Physics, University of California at Berkeley, Berkeley, United States
  - California Institute for Quantitative Biosciences (QB3), University of California at Berkeley, Berkeley, United States
  - Chan Zuckerberg Biohub–San Francisco, San Francisco, United States
- Michael B Eisen
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, United States
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, United States
  - California Institute for Quantitative Biosciences (QB3), University of California at Berkeley, Berkeley, United States
  - Howard Hughes Medical Institute, University of California at Berkeley, Berkeley, United States
3
LeBlanc C, Stefani J, Soriano M, Lam A, Zintel MA, Kotha SR, Chase E, Pimentel-Solorio G, Vunnum A, Flug K, Fultineer A, Hummel N, Staller MV. Conservation of function without conservation of amino acid sequence in intrinsically disordered transcriptional activation domains. bioRxiv 2024:2024.12.03.626510. [PMID: 39677729 PMCID: PMC11642888 DOI: 10.1101/2024.12.03.626510] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/17/2024]
Abstract
Protein function is canonically believed to be more conserved than amino acid sequence, but this idea is only well supported in folded domains, where highly diverged sequences can fold into equivalent 3D structures. In contrast, intrinsically disordered protein regions (IDRs) do not fold into a stable 3D structure, thus it remains unknown when and how function is conserved for IDRs that experience rapid amino acid sequence divergence. As a model system for studying the evolution of IDRs, we examined transcriptional activation domains, the regions of transcription factors that bind to coactivator complexes. We systematically identified activation domains on 502 orthologs of the transcriptional activator Gcn4 spanning 600 MY of fungal evolution. We find that the central activation domain shows strong conservation of function without conservation of sequence. This conservation of function without conservation of sequence is facilitated by evolutionary turnover (gain and loss) of key acidic and aromatic residues, the positions most important for function. This high sequence flexibility of functional orthologs mirrors the physical flexibility of the activation domain coactivator interaction interface, suggesting that physical flexibility enables evolutionary plasticity. We propose that turnover of short functional elements, sometimes individual amino acids, is a general mechanism for conservation of function without conservation of sequence during IDR evolution.
Affiliation(s)
- Claire LeBlanc
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Jordan Stefani
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Melvin Soriano
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Angelica Lam
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Marissa A. Zintel
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Sanjana R. Kotha
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Emily Chase
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Giovani Pimentel-Solorio
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
- Aditya Vunnum
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Katherine Flug
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
- Aaron Fultineer
  - Department of Physics, University of California Berkeley, Berkeley, 94720
- Niklas Hummel
  - Department of Biology, Technische Universität Darmstadt, Darmstadt, Germany
- Max V. Staller
  - Department of Molecular and Cell Biology, University of California Berkeley, Berkeley, 94720
  - Center for Computational Biology, University of California Berkeley, Berkeley, 94720
  - Chan Zuckerberg Biohub–San Francisco, San Francisco, CA 94158
4
Berrocal A, Lammers NC, Garcia HG, Eisen MB. Unified bursting strategies in ectopic and endogenous even-skipped expression patterns. bioRxiv 2024:2023.02.09.527927. [PMID: 36798351 PMCID: PMC9934701 DOI: 10.1101/2023.02.09.527927] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 02/12/2023]
Abstract
Transcription often occurs in bursts as gene promoters switch stochastically between active and inactive states. Enhancers can dictate transcriptional activity in animal development through the modulation of burst frequency, duration, or amplitude. Previous studies observed that different enhancers can achieve a wide range of transcriptional outputs through the same strategies of bursting control. For example, despite responding to different transcription factors, all even-skipped enhancers increase transcription by upregulating burst frequency and amplitude while burst duration remains largely constant. These shared bursting strategies suggest that a unified molecular mechanism constrains how enhancers modulate transcriptional output. Alternatively, different enhancers could have converged on the same bursting control strategy because of natural selection favoring one of these particular strategies. To distinguish between these two scenarios, we compared transcriptional bursting between endogenous and ectopic gene expression patterns. Because enhancers act under different regulatory inputs in ectopic patterns, dissimilar bursting control strategies between endogenous and ectopic patterns would suggest that enhancers adapted their bursting strategies to their trans-regulatory environment. Here, we generated ectopic even-skipped transcription patterns in fruit fly embryos and discovered that bursting strategies remain consistent in endogenous and ectopic even-skipped expression. These results provide evidence for a unified molecular mechanism shaping even-skipped bursting strategies and serve as a starting point to uncover the realm of strategies employed by other enhancers.
Affiliation(s)
- Augusto Berrocal
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, CA, United States
  - Current Address: Department of Pharmaceutical Chemistry, University of California at San Francisco, San Francisco, CA, United States
- Nicholas C Lammers
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States
  - Current Address: Department of Genome Sciences, University of Washington, Seattle, WA, United States
- Hernan G Garcia
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, CA, United States
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States
  - Department of Physics, University of California at Berkeley, Berkeley, CA, United States
  - California Institute for Quantitative Biosciences (QB3), University of California at Berkeley, Berkeley, CA, United States
  - Chan Zuckerberg Biohub–San Francisco, San Francisco, CA, United States
- Michael B Eisen
  - Department of Molecular & Cell Biology, University of California at Berkeley, Berkeley, CA, United States
  - Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States
  - California Institute for Quantitative Biosciences (QB3), University of California at Berkeley, Berkeley, CA, United States
  - Howard Hughes Medical Institute, University of California at Berkeley, Berkeley, CA, United States
5
Karollus A, Hingerl J, Gankin D, Grosshauser M, Klemon K, Gagneur J. Species-aware DNA language models capture regulatory elements and their evolution. Genome Biol 2024; 25:83. [PMID: 38566111 PMCID: PMC10985990 DOI: 10.1186/s13059-024-03221-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/14/2023] [Accepted: 03/20/2024] [Indexed: 04/04/2024] Open
Abstract
BACKGROUND: The rise of large-scale multi-species genome sequencing projects promises to shed new light on how genomes encode gene regulatory instructions. To this end, new algorithms are needed that can leverage conservation to capture regulatory elements while accounting for their evolution. RESULTS: Here, we introduce species-aware DNA language models, which we trained on more than 800 species spanning over 500 million years of evolution. Investigating their ability to predict masked nucleotides from context, we show that DNA language models distinguish transcription factor and RNA-binding protein motifs from background non-coding sequence. Owing to their flexibility, DNA language models capture conserved regulatory elements over much further evolutionary distances than sequence alignment would allow. Remarkably, DNA language models reconstruct motif instances bound in vivo better than unbound ones and account for the evolution of motif sequences and their positional constraints, showing that these models capture functional high-order sequence and evolutionary context. We further show that species-aware training yields improved sequence representations for endogenous and MPRA-based gene expression prediction, as well as motif discovery. CONCLUSIONS: Collectively, these results demonstrate that species-aware DNA language models are a powerful, flexible, and scalable tool to integrate information from large compendia of highly diverged genomes.
Affiliation(s)
- Alexander Karollus
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
  - Munich Center for Machine Learning, Munich, Germany
- Johannes Hingerl
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Dennis Gankin
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Martin Grosshauser
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Kristian Klemon
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
- Julien Gagneur
  - School of Computation, Information and Technology, Technical University of Munich, Garching, Germany
  - Munich Center for Machine Learning, Munich, Germany
  - Institute of Human Genetics, School of Medicine and Health, Technical University of Munich, Munich, Germany
  - Computational Health Center, Helmholtz Center Munich, Neuherberg, Germany
  - Munich Data Science Institute, Technical University of Munich, Garching, Germany
6
Ciren D, Zebell S, Lippman ZB. Extreme restructuring of cis-regulatory regions controlling a deeply conserved plant stem cell regulator. PLoS Genet 2024; 20:e1011174. [PMID: 38437180 PMCID: PMC10911594 DOI: 10.1371/journal.pgen.1011174] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/19/2023] [Accepted: 02/07/2024] [Indexed: 03/06/2024] Open
Abstract
A striking paradox is that genes with conserved protein sequence, function and expression pattern over deep time often exhibit extremely divergent cis-regulatory sequences. It remains unclear how such drastic cis-regulatory evolution across species allows preservation of gene function, and to what extent these differences influence how cis-regulatory variation arising within species impacts phenotypic change. Here, we investigated these questions using a plant stem cell regulator conserved in expression pattern and function over ~125 million years. Using in-vivo genome editing in two distantly related models, Arabidopsis thaliana (Arabidopsis) and Solanum lycopersicum (tomato), we generated over 70 deletion alleles in the upstream and downstream regions of the stem cell repressor gene CLAVATA3 (CLV3) and compared their individual and combined effects on a shared phenotype, the number of carpels that make fruits. We found that sequences upstream of tomato CLV3 are highly sensitive to even small perturbations compared to its downstream region. In contrast, Arabidopsis CLV3 function is tolerant to severe disruptions both upstream and downstream of the coding sequence. Combining upstream and downstream deletions also revealed a different regulatory outcome. Whereas phenotypic enhancement from adding downstream mutations was predominantly weak and additive in tomato, mutating both regions of Arabidopsis CLV3 caused substantial and synergistic effects, demonstrating distinct distribution and redundancy of functional cis-regulatory sequences. Our results demonstrate remarkable malleability in cis-regulatory structural organization of a deeply conserved plant stem cell regulator and suggest that major reconfiguration of cis-regulatory sequence space is a common yet cryptic evolutionary force altering genotype-to-phenotype relationships from regulatory variation in conserved genes. 
Finally, our findings underscore the need for lineage-specific dissection of the spatial architecture of cis-regulation to effectively engineer trait variation from conserved productivity genes in crops.
Affiliation(s)
- Danielle Ciren
  - Cold Spring Harbor Laboratory, School of Biological Sciences, Cold Spring Harbor, New York, United States of America
- Sophia Zebell
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
  - Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
- Zachary B. Lippman
  - Cold Spring Harbor Laboratory, School of Biological Sciences, Cold Spring Harbor, New York, United States of America
  - Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
  - Howard Hughes Medical Institute, Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, United States of America
7
Taskiran II, Spanier KI, Dickmänken H, Kempynck N, Pančíková A, Ekşi EC, Hulselmans G, Ismail JN, Theunis K, Vandepoel R, Christiaens V, Mauduit D, Aerts S. Cell-type-directed design of synthetic enhancers. Nature 2024; 626:212-220. [PMID: 38086419 PMCID: PMC10830415 DOI: 10.1038/s41586-023-06936-2] [Citation(s) in RCA: 38] [Impact Index Per Article: 38.0] [Received: 07/06/2022] [Accepted: 12/05/2023] [Indexed: 01/19/2024]
Abstract
Transcriptional enhancers act as docking stations for combinations of transcription factors and thereby regulate spatiotemporal activation of their target genes. It has been a long-standing goal in the field to decode the regulatory logic of an enhancer and to understand the details of how spatiotemporal gene expression is encoded in an enhancer sequence. Here we show that deep learning models can be used to efficiently design synthetic, cell-type-specific enhancers, starting from random sequences, and that this optimization process allows detailed tracing of enhancer features at single-nucleotide resolution. We evaluate the function of fully synthetic enhancers to specifically target Kenyon cells or glial cells in the fruit fly brain using transgenic animals. We further exploit enhancer design to create 'dual-code' enhancers that target two cell types and minimal enhancers smaller than 50 base pairs that are fully functional. By examining the state space searches towards local optima, we characterize enhancer codes through the strength, combination and arrangement of transcription factor activator and transcription factor repressor motifs. Finally, we apply the same strategies to successfully design human enhancers, which adhere to enhancer rules similar to those of Drosophila enhancers. Enhancer design guided by deep learning leads to better understanding of how enhancers work and shows that their code can be exploited to manipulate cell states.
Affiliation(s)
- Ibrahim I Taskiran
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Katina I Spanier
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Hannah Dickmänken
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Niklas Kempynck
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Alexandra Pančíková
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
  - VIB-KULeuven Center for Cancer Biology, Leuven, Belgium
- Eren Can Ekşi
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Gert Hulselmans
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Joy N Ismail
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
  - UK Dementia Research Institute at Imperial College London, London, UK
- Koen Theunis
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Roel Vandepoel
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Valerie Christiaens
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- David Mauduit
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
- Stein Aerts
  - Laboratory of Computational Biology, VIB Center for AI & Computational Biology (VIB.AI), Leuven, Belgium
  - VIB-KULeuven Center for Brain & Disease Research, Leuven, Belgium
  - Department of Human Genetics, KU Leuven, Leuven, Belgium
8
Mañes-García J, Marco-Ferreres R, Beccari L. Shaping gene expression and its evolution by chromatin architecture and enhancer activity. Curr Top Dev Biol 2024; 159:406-437. [PMID: 38729683 DOI: 10.1016/bs.ctdb.2024.01.001] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/12/2024]
Abstract
Transcriptional regulation plays a pivotal role in orchestrating the intricate genetic programs governing embryonic development. The expression of developmental genes relies on the combined activity of several cis-regulatory elements (CREs), such as enhancers and silencers, which can be located at long linear distances from the genes that they regulate and that interact with them through the establishment of chromatin loops. Mutations affecting their activity or interaction with their target genes can lead to developmental disorders and are thought to have contributed substantially to the evolution of the animal body plan. The advent of next-generation sequencing approaches has allowed the identification of over a million sequences with putative regulatory potential in the human genome. Characterizing their function and establishing gene-CRE maps is essential to decode the logic governing developmental gene expression and is one of the major challenges of the post-genomic era. Chromatin 3D organization plays an essential role in determining how CREs specifically contact their target genes while avoiding deleterious off-target interactions. Our understanding of these aspects has greatly advanced with the advent of chromatin conformation capture techniques and fluorescence microscopy approaches to visualize the organization of DNA elements in the nucleus. Here we will summarize relevant aspects of how the interplay between CRE activity and chromatin 3D organization regulates developmental gene expression and how it relates to pathological conditions and the evolution of the animal body plan.
Affiliation(s)
- Leonardo Beccari
  - Centro de Biología Molecular Severo Ochoa, CSIC-UAM, Madrid, Spain
9
Ciren D, Zebell S, Lippman ZB. Extreme restructuring of cis-regulatory regions controlling a deeply conserved plant stem cell regulator. bioRxiv 2023:2023.12.20.572550. [PMID: 38187729 PMCID: PMC10769289 DOI: 10.1101/2023.12.20.572550] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Indexed: 01/09/2024]
Abstract
A striking paradox is that genes with conserved protein sequence, function and expression pattern over deep time often exhibit extremely divergent cis-regulatory sequences. It remains unclear how such drastic cis-regulatory evolution across species allows preservation of gene function, and to what extent these differences influence how cis-regulatory variation arising within species impacts phenotypic change. Here, we investigated these questions using a plant stem cell regulator conserved in expression pattern and function over ∼125 million years. Using in-vivo genome editing in two distantly related models, Arabidopsis thaliana (Arabidopsis) and Solanum lycopersicum (tomato), we generated over 70 deletion alleles in the upstream and downstream regions of the stem cell repressor gene CLAVATA3 (CLV3) and compared their individual and combined effects on a shared phenotype, the number of carpels that make fruits. We found that sequences upstream of tomato CLV3 are highly sensitive to even small perturbations compared to its downstream region. In contrast, Arabidopsis CLV3 function is tolerant to severe disruptions both upstream and downstream of the coding sequence. Combining upstream and downstream deletions also revealed a different regulatory outcome. Whereas phenotypic enhancement from adding downstream mutations was predominantly weak and additive in tomato, mutating both regions of Arabidopsis CLV3 caused substantial and synergistic effects, demonstrating distinct distribution and redundancy of functional cis-regulatory sequences. Our results demonstrate remarkable malleability in cis-regulatory structural organization of a deeply conserved plant stem cell regulator and suggest that major reconfiguration of cis-regulatory sequence space is a common yet cryptic evolutionary force altering genotype-to-phenotype relationships from regulatory variation in conserved genes.
Finally, our findings underscore the need for lineage-specific dissection of the spatial architecture of cis-regulation to effectively engineer trait variation from conserved productivity genes in crops.
Author summary
We investigated the evolution of cis-regulatory elements (CREs) and their interactions in the regulation of a plant stem cell regulator gene, CLAVATA3 (CLV3), in Arabidopsis and tomato. Despite diverging ∼125 million years ago, the function and expression of CLV3 are conserved in these species; however, cis-regulatory sequences upstream and downstream have drastically diverged, preventing identification of conserved non-coding sequences between them. We used CRISPR-Cas9 to engineer dozens of mutations within the cis-regulatory regions of Arabidopsis and tomato CLV3. Our results show that tomato CLV3 function primarily relies on interactions among CREs in the 5' non-coding region, unlike Arabidopsis CLV3, which depends on a more balanced distribution of functional CREs between the 5' and 3' regions. Therefore, despite a high degree of functional conservation, our study demonstrates divergent regulatory strategies between two distantly related CLV3 orthologs, with substantial alterations in regulatory sequences, their spatial arrangement, and their relative effects on CLV3 regulation. These results suggest that regulatory regions are not only extremely robust to mutagenesis, but also that the sequences underlying this robustness can be lineage-specific for conserved genes, due to the complex and often redundant interactions among CREs that ensure proper gene function amidst large-scale sequence turnover.
|
10
|
Lupo O, Kumar DK, Livne R, Chappleboim M, Levy I, Barkai N. The architecture of binding cooperativity between densely bound transcription factors. Cell Syst 2023; 14:732-745.e5. [PMID: 37527656 DOI: 10.1016/j.cels.2023.06.010] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/25/2023] [Revised: 05/23/2023] [Accepted: 06/27/2023] [Indexed: 08/03/2023]
Abstract
The binding of transcription factors (TFs) along genomes is restricted to a subset of sites containing their preferred motifs. TF-binding specificity is often attributed to the co-binding of interacting TFs; however, apart from specific examples, this model remains untested. Here, we define dependencies among budding yeast TFs that localize to overlapping promoters by profiling the genome-wide consequences of co-depleting multiple TFs. We describe unidirectional interactions, revealing Msn2 as a central factor allowing TF binding at its target promoters. By contrast, no case of mutual cooperation was observed. Particularly, Msn2 retained binding at its preferred promoters upon co-depletion of fourteen similarly bound TFs. Overall, the consequences of TF co-depletions were moderate, limited to a subset of promoters, and failed to explain the role of regions outside the DNA-binding domain in directing TF-binding preferences. Our results call for re-evaluating the role of cooperative interactions in directing TF-binding preferences.
Affiliation(s)
- Offir Lupo
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Divya Krishna Kumar
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Rotem Livne
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Michal Chappleboim
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Idan Levy
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
|
11
|
Cutter AD. Speciation and development. Evol Dev 2023; 25:289-327. [PMID: 37545126 DOI: 10.1111/ede.12454] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/07/2023] [Revised: 06/13/2023] [Accepted: 07/20/2023] [Indexed: 08/08/2023]
Abstract
Understanding general principles about the origin of species remains one of the foundational challenges in evolutionary biology. The genomic divergence between groups of individuals can spawn hybrid inviability and hybrid sterility, which presents a tantalizing developmental problem. Divergent developmental programs may yield either conserved or divergent phenotypes relative to ancestral traits, both of which can be responsible for reproductive isolation during the speciation process. The genetic mechanisms of developmental evolution involve cis- and trans-acting gene regulatory change, protein-protein interactions, genetic network structures, dosage, and epigenetic regulation, all of which also have roots in population genetic and molecular evolutionary processes. Toward the goal of demystifying Darwin's "mystery of mysteries," this review integrates microevolutionary concepts of genetic change with principles of organismal development, establishing explicit links between population genetic process and developmental mechanisms in the production of macroevolutionary pattern. This integration aims to establish a more unified view of speciation that binds process and mechanism.
Affiliation(s)
- Asher D Cutter
  - Department of Ecology & Evolutionary Biology, University of Toronto, Toronto, Ontario, Canada
|
12
|
Smith GD, Ching WH, Cornejo-Páramo P, Wong ES. Decoding enhancer complexity with machine learning and high-throughput discovery. Genome Biol 2023; 24:116. [PMID: 37173718 PMCID: PMC10176946 DOI: 10.1186/s13059-023-02955-4] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/18/2022] [Accepted: 04/28/2023] [Indexed: 05/15/2023] Open
Abstract
Enhancers are genomic DNA elements controlling spatiotemporal gene expression. Their flexible organization and functional redundancies make deciphering their sequence-function relationships challenging. This article provides an overview of the current understanding of enhancer organization and evolution, with an emphasis on factors that influence these relationships. Technological advancements, particularly in machine learning and synthetic biology, are discussed in light of how they provide new ways to understand this complexity. Exciting opportunities lie ahead as we continue to unravel the intricacies of enhancer function.
Affiliation(s)
- Gabrielle D Smith
  - Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Wan Hern Ching
  - Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
- Paola Cornejo-Páramo
  - Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
- Emily S Wong
  - Victor Chang Cardiac Research Institute, 405 Liverpool Street, Darlinghurst, NSW, Australia
  - School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Kensington, NSW, Australia
|
13
|
Harden TT, Vincent BJ, DePace AH. Transcriptional activators in the early Drosophila embryo perform different kinetic roles. Cell Syst 2023; 14:258-272.e4. [PMID: 37080162 PMCID: PMC10473017 DOI: 10.1016/j.cels.2023.03.006] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 03/08/2021] [Revised: 06/26/2022] [Accepted: 03/21/2023] [Indexed: 04/22/2023]
Abstract
Combinatorial regulation of gene expression by transcription factors (TFs) may in part arise from kinetic synergy, wherein TFs regulate different steps in the transcription cycle. Kinetic synergy requires that TFs play distinguishable kinetic roles. Here, we used live imaging to determine the kinetic roles of three TFs that activate transcription in the Drosophila embryo (Zelda, Bicoid, and Stat92E) by introducing their binding sites into the even-skipped stripe 2 enhancer. These TFs influence different sets of kinetic parameters, and their influence can change over time. All three TFs increased the fraction of transcriptionally active nuclei; Zelda also shortened the first-passage time into transcription and regulated the interval between transcription events. Stat92E also increased the lifetimes of active transcription. Different TFs can therefore play distinct kinetic roles in activating transcription. This has consequences for understanding the composition and flexibility of regulatory DNA sequences and the biochemical function of TFs. A record of this paper's transparent peer review process is included in the supplemental information.
Affiliation(s)
- Timothy T Harden
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Ben J Vincent
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Angela H DePace
  - Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
|
14
|
Galupa R, Alvarez-Canales G, Borst NO, Fuqua T, Gandara L, Misunou N, Richter K, Alves MRP, Karumbi E, Perkins ML, Kocijan T, Rushlow CA, Crocker J. Enhancer architecture and chromatin accessibility constrain phenotypic space during Drosophila development. Dev Cell 2023; 58:51-62.e4. [PMID: 36626871 PMCID: PMC9860173 DOI: 10.1016/j.devcel.2022.12.003] [Citation(s) in RCA: 15] [Impact Index Per Article: 7.5] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/04/2022] [Revised: 10/18/2022] [Accepted: 12/07/2022] [Indexed: 01/11/2023]
Abstract
Developmental enhancers bind transcription factors and dictate patterns of gene expression during development. Their molecular evolution can underlie phenotypical evolution, but the contributions of the evolutionary pathways involved remain little understood. Here, using mutation libraries in Drosophila melanogaster embryos, we observed that most point mutations in developmental enhancers led to changes in gene expression levels but rarely resulted in novel expression outside of the native pattern. In contrast, random sequences, often acting as developmental enhancers, drove expression across a range of cell types; random sequences including motifs for transcription factors with pioneer activity acted as enhancers even more frequently. Our findings suggest that the phenotypic landscapes of developmental enhancers are constrained by enhancer architecture and chromatin accessibility. We propose that the evolution of existing enhancers is limited in its capacity to generate novel phenotypes, whereas the activity of de novo elements is a primary source of phenotypic novelty.
Affiliation(s)
- Rafael Galupa
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Timothy Fuqua
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Lautaro Gandara
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Natalia Misunou
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Kerstin Richter
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Esther Karumbi
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Tin Kocijan
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
- Justin Crocker
  - European Molecular Biology Laboratory, 69117 Heidelberg, Germany
|
15
|
Kumar Mishra S, Bhattacherjee A. Understanding the Target Search by Multiple Transcription Factors on Nucleosomal DNA. Chemphyschem 2023; 24:e202200644. [PMID: 36602094 DOI: 10.1002/cphc.202200644] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/26/2022] [Revised: 01/04/2023] [Accepted: 01/05/2023] [Indexed: 01/06/2023]
Abstract
The association of multiple Transcription Factors (TFs) in the cis-regulatory region is imperative for developmental changes in eukaryotes. The underlying process is exceedingly complex, and it is not at all clear what orchestrates the overall search process by multiple TFs. In this study, by developing a theoretical model based on a discrete-state stochastic approach, we investigated the target search mechanism of multiple TFs on nucleosomal DNA. Experimental kinetic rate constants of different TFs are taken as input to estimate the Mean-First-Passage time to recognize the binding motifs by two TFs on a dynamic nucleosome model. The theory systematically analyzes when the TFs search their binding motifs hierarchically and when simultaneously by proceeding via the formation of a protein-protein complex. Our results, validated by extensive Monte Carlo simulations, elucidate the molecular basis of the complex target search phenomenon of multiple TFs on nucleosomal DNA.
Affiliation(s)
- Sujeet Kumar Mishra
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
- Arnab Bhattacherjee
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University, New Delhi, India
|
16
|
Staller MV. Transcription factors perform a 2-step search of the nucleus. Genetics 2022; 222:iyac111. [PMID: 35939561 PMCID: PMC9526044 DOI: 10.1093/genetics/iyac111] [Citation(s) in RCA: 20] [Impact Index Per Article: 6.7] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/10/2022] [Accepted: 07/14/2022] [Indexed: 01/02/2023] Open
Abstract
Transcription factors regulate gene expression by binding to regulatory DNA and recruiting regulatory protein complexes. The DNA-binding and protein-binding functions of transcription factors are traditionally described as independent functions performed by modular protein domains. Here, I argue that genome binding can be a 2-part process with both DNA-binding and protein-binding steps, enabling transcription factors to perform a 2-step search of the nucleus to find their appropriate binding sites in a eukaryotic genome. I support this hypothesis with new and old results in the literature, discuss how this hypothesis parsimoniously resolves outstanding problems, and present testable predictions.
Affiliation(s)
- Max Valentín Staller
  - Center for Computational Biology, University of California, Berkeley, Berkeley, CA 94720, USA
|
17
|
Cross-species enhancer prediction using machine learning. Genomics 2022; 114:110454. [PMID: 36030022 DOI: 10.1016/j.ygeno.2022.110454] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/30/2022] [Revised: 07/28/2022] [Accepted: 08/16/2022] [Indexed: 11/21/2022]
Abstract
Cis-regulatory elements (CREs) are non-coding parts of the genome that play a critical role in gene expression regulation. Enhancers, as an important example of CREs, interact with genes to influence complex traits like disease, heat tolerance and growth rate. Much of what is known about enhancers comes from studies of humans and a few model organisms like mouse, with little known about other mammalian species. Previous studies have attempted to identify enhancers in less studied mammals using comparative genomics but with limited success. Recently, Machine Learning (ML) techniques have shown promising results in predicting enhancer regions. Here, we investigated the ability of ML methods to identify enhancers in three non-model mammalian species (cattle, pig and dog) using human and mouse enhancer data from VISTA and publicly available ChIP-seq. We tested nine models, using four different representations of the DNA sequences in cross-species prediction using both the VISTA dataset and species-specific ChIP-seq data. We identified between 809,399 and 877,278 enhancer-like regions (ELRs) in the study species (11.6-13.7% of each genome). These predictions were close to the ~8% proportion of ELRs that covered the human genome. We propose that our ML methods have predictive ability for identifying enhancers in non-model mammalian species. We have provided a list of high confidence enhancers at https://github.com/DaviesCentreInformatics/Cross-species-enhancer-prediction and believe these enhancers will be of great use to the community.
|
18
|
Stadler PF, Will S. Bi-alignments with affine gaps costs. Algorithms Mol Biol 2022; 17:10. [PMID: 35578255 PMCID: PMC9109335 DOI: 10.1186/s13015-022-00219-7] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/18/2021] [Accepted: 05/01/2022] [Indexed: 12/02/2022] Open
Abstract
Background: Commonly, sequence and structure elements are assumed to evolve congruently, such that homologous sequence positions correspond to homologous structural features. Assuming congruent evolution, alignments based on sequence and structure similarity can therefore optimize both similarities at the same time in a single alignment. To model incongruent evolution, where sequence and structural features diverge positionally, we recently introduced bi-alignments. This generalization of sequence and structure-based alignments is best understood as alignments of two distinct pairwise alignments of the same entities: one modeling sequence similarity, the other structural similarity. Results: Optimal bi-alignments with affine gap costs (or affine shift cost) for two constituent alignments can be computed exactly in quartic space and time. Even bi-alignments with affine shift and gap cost, as well as bi-alignment with sub-additive gap cost are optimized efficiently. Affine gap-cost bi-alignment of large proteins (∼930 aa) can be computed. Conclusion: Affine cost bi-alignments are of practical interest to study shifts of protein sequences and protein structures relative to each other. Availability: The affine cost bi-alignment algorithm has been implemented in Python 3 and Cython. It is available as free software from https://github.com/s-will/BiAlign/releases/tag/v0.3 and as bioconda package bialign. Supplementary Information: The online version contains supplementary material available at 10.1186/s13015-022-00219-7.
|
19
|
Hu L, Zhao X, Li P, Zeng Y, Zhang Y, Shen Y, Wang Y, Sun X, Lai B, Zhong C. Proximal and Distal Regions of Pathogenic Th17 Related Chromatin Loci Are Sequentially Accessible During Pathogenicity of Th17. Front Immunol 2022; 13:864314. [PMID: 35514969 PMCID: PMC9062102 DOI: 10.3389/fimmu.2022.864314] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 01/28/2022] [Accepted: 03/17/2022] [Indexed: 11/13/2022] Open
Abstract
Pathogenic Th17 cells, characterized by their production of pro-inflammatory cytokines, are considered a key player in most autoimmune diseases. Their transcriptome is clearly distinct from that of conventional regulatory Th17. However, the chromatin accessibility of the two Th17 groups has not yet been comprehensively compared. Here, we found that their chromatin-accessible regions (ChARs) significantly correlated with the expression of related genes, indicating that they might engage in the regulation of these genes. Indeed, pathogenic Th17 specific ChARs (patho-ChARs) exhibited a significant distribution preference in the TSS-proximal region. We further filtered the patho-ChARs based on their conservation among mammals or their concordance with the expression of their related genes. In either situation, the filtered patho-ChARs also showed a preference for the TSS-proximal region. Enrichment of expression-concordant patho-ChARs related genes suggested that they might be involved in the pathogenicity of Th17. Thus, we also examined all ChARs of patho-ChARs related genes, and defined an opening ChAR set according to their changes in the Th17 to Th1 conversion. Interestingly, these opening ChARs displayed a sequential accessibility change from the TSS-proximal region to the TSS-distal region. Meanwhile, a group of patho-TFs (transcription factors) were identified based on the appearance of their binding motifs in the opening ChARs. Consistently, some of them also displayed a similar preference for binding the TSS-proximal region. Single-cell transcriptome analysis further confirmed that these patho-TFs were involved in the generation of pathogenic Th17. Therefore, our results shed light on a new regulatory mechanism underlying the generation of pathogenic Th17, which is worth considering for autoimmune disease therapy.
Affiliation(s)
- Luni Hu
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Xingyu Zhao
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Peng Li
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Yanyu Zeng
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Yime Zhang
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Yang Shen
  - School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Yukai Wang
  - School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
- Xiaolin Sun
  - Department of Rheumatology and Immunology, Peking University People's Hospital, Beijing, China
  - Beijing Key Laboratory for Rheumatism Mechanism and Immune Diagnosis (BZ0135), Peking University People's Hospital, Beijing, China
- Binbin Lai
  - Biomedical Engineering Department, Peking University, Beijing, China
  - Institute of Medical Technology, Peking University Health Science Center, Beijing, China
  - Department of Dermatology and Venereology, Peking University First Hospital, Beijing, China
- Chao Zhong
  - Beijing Key Laboratory of Tumor Systems Biology, Institute of Systems Biomedicine, School of Basic Medical Sciences, Peking University Health Science Center, Beijing, China
  - National Health Commission (NHC) Key Laboratory of Medical Immunology, Peking University, Beijing, China
  - Key Laboratory of Molecular Immunology, Chinese Academy of Medical Sciences, Beijing, China
|
20
|
Perumalsamy NK, Hemalatha C. Cis-regulatory elements (CREs) in spinal solitary fibrous tumours. Meta Gene 2022. [DOI: 10.1016/j.mgene.2022.101025] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 10/18/2022] Open
|
21
|
Abstract
Even if a species' phenotype does not change over evolutionary time, the underlying mechanism may change, as distinct molecular pathways can realize identical phenotypes. Here we use linear system theory to explore the consequences of this idea, describing how a gene network underlying a conserved phenotype evolves, as the genetic drift of small changes to these molecular pathways causes a population to explore the set of mechanisms with identical phenotypes. To do this, we model an organism's internal state as a linear system of differential equations for which the environment provides input and the phenotype is the output, in which context there exists an exact characterization of the set of all mechanisms that give the same input-output relationship. This characterization implies that selectively neutral directions in genotype space should be common and that the evolutionary exploration of these distinct but equivalent mechanisms can lead to the reproductive incompatibility of independently evolving populations. This evolutionary exploration, or system drift, is expected to proceed at a rate proportional to the amount of intrapopulation genetic variation divided by the effective population size ($N_e$). At biologically reasonable parameter values this could lead to substantial interpopulation incompatibility, and thus speciation, on a time scale of $N_e$ generations. This model also naturally predicts Haldane's rule, thus providing a concrete explanation of why heterogametic hybrids tend to be disrupted more often than homogametes during the early stages of speciation.
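The idea that distinct mechanisms can share one input-output relationship can be illustrated with a small sketch. It is not taken from the paper: it assumes the standard state-space form dx/dt = Ax + Bu, y = Cx and shows only one well-known family of equivalent systems, a change of internal coordinates V, which may be narrower than the characterization the abstract refers to. The matrices and the helper names (`matmul`, `inverse_2x2`, `markov_parameters`, `change_coordinates`) are hypothetical and chosen for illustration.

```python
from fractions import Fraction as F

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse_2x2(M):
    """Invert a 2x2 matrix of Fractions."""
    (a, b), (c, d) = M
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def markov_parameters(A, B, C, count):
    """Return [CB, CAB, CA^2B, ...]: Taylor coefficients of the impulse response C exp(At) B."""
    params = []
    AkB = B
    for _ in range(count):
        params.append(matmul(C, AkB)[0][0])
        AkB = matmul(A, AkB)
    return params

def change_coordinates(A, B, C, V):
    """Re-express the system in internal coordinates z = Vx (illustrative assumption)."""
    V_inv = inverse_2x2(V)
    return matmul(matmul(V, A), V_inv), matmul(V, B), matmul(C, V_inv)

# Hypothetical two-component "mechanism": A = internal dynamics, B = input, C = readout.
A = [[F(-1), F(2)], [F(0), F(-3)]]
B = [[F(1)], [F(1)]]
C = [[F(1), F(0)]]
V = [[F(2), F(1)], [F(1), F(1)]]  # any invertible matrix works here

A2, B2, C2 = change_coordinates(A, B, C, V)
original = markov_parameters(A, B, C, 6)
transformed = markov_parameters(A2, B2, C2, 6)

print("internal dynamics differ:", A2 != A)
print("input-output behaviour matches:", original == transformed)
```

Exact rational arithmetic is used so that the comparison does not depend on floating-point tolerance. Two systems whose Markov parameters all agree have the same impulse response, so the internal wiring has changed while the phenotype, in this toy sense, has not.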
Affiliation(s)
- Joshua S. Schiffman
  - New York Genome Center, New York, New York 10013
  - Weill Cornell Medicine, New York, New York 10065
  - Department of Molecular and Computational Biology, University of Southern California, Los Angeles, California 90089
- Peter L. Ralph
  - Department of Molecular and Computational Biology, University of Southern California, Los Angeles, California 90089
  - Department of Mathematics, Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
  - Department of Biology, Institute of Ecology and Evolution, University of Oregon, Eugene, Oregon 97403
|
22
|
Ray-Jones H, Spivakov M. Transcriptional enhancers and their communication with gene promoters. Cell Mol Life Sci 2021; 78:6453-6485. [PMID: 34414474 PMCID: PMC8558291 DOI: 10.1007/s00018-021-03903-w] [Citation(s) in RCA: 32] [Impact Index Per Article: 8.0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/12/2021] [Revised: 07/08/2021] [Accepted: 07/19/2021] [Indexed: 12/13/2022]
Abstract
Transcriptional enhancers play a key role in the initiation and maintenance of gene expression programmes, particularly in metazoa. How these elements control their target genes in the right place and time is one of the most pertinent questions in functional genomics, with wide implications for most areas of biology. Here, we synthesise classic and recent evidence on the regulatory logic of enhancers, including the principles of enhancer organisation, factors that facilitate and delimit enhancer-promoter communication, and the joint effects of multiple enhancers. We show how modern approaches building on classic insights have begun to unravel the complexity of enhancer-promoter relationships, paving the way towards a quantitative understanding of gene control.
Affiliation(s)
- Helen Ray-Jones
  - MRC London Institute of Medical Sciences, London, W12 0NN, UK
  - Institute of Clinical Sciences, Faculty of Medicine, Imperial College, London, W12 0NN, UK
- Mikhail Spivakov
  - MRC London Institute of Medical Sciences, London, W12 0NN, UK
  - Institute of Clinical Sciences, Faculty of Medicine, Imperial College, London, W12 0NN, UK
|
23
|
Schwope R, Magris G, Miculan M, Paparelli E, Celii M, Tocci A, Marroni F, Fornasiero A, De Paoli E, Morgante M. Open chromatin in grapevine marks candidate CREs and with other chromatin features correlates with gene expression. Plant J 2021; 107:1631-1647. [PMID: 34219317 PMCID: PMC8518642 DOI: 10.1111/tpj.15404] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 08/07/2020] [Revised: 06/24/2021] [Accepted: 06/25/2021] [Indexed: 05/14/2023]
Abstract
Vitis vinifera is an economically important crop and a useful model in which to study chromatin dynamics. In contrast to the small and relatively simple genome of Arabidopsis thaliana, grapevine contains a complex genome of 487 Mb that exhibits extensive colonization by transposable elements. We used Hi-C, ChIP-seq and ATAC-seq to measure how chromatin features correlate to the expression of 31 845 grapevine genes. ATAC-seq revealed the presence of more than 16 000 open chromatin regions, of which we characterize nearly 5000 as possible distal enhancer candidates that occur in intergenic space > 2 kb from the nearest transcription start site (TSS). A motif search identified more than 480 transcription factor (TF) binding sites in these regions, with those for TCP family proteins in greatest abundance. These open chromatin regions are typically within 15 kb from their nearest promoter, and a gene ontology analysis indicated that their nearest genes are significantly enriched for TF activity. The presence of a candidate cis-regulatory element (cCRE) > 2 kb upstream of the TSS, location in the active nuclear compartment as determined by Hi-C, and the enrichment of H3K4me3, H3K4me1 and H3K27ac at the gene are correlated with gene expression. Taken together, these results suggest that regions of intergenic open chromatin identified by ATAC-seq can be considered potential candidates for cis-regulatory regions in V. vinifera. Our findings enhance the characterization of a valuable agricultural crop, and help to clarify the understanding of unique plant biology.
Affiliation(s)
- Rachel Schwope
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
- Gabriele Magris
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
- Mara Miculan
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
  - Present address: Institute of Life Sciences, Scuola Superiore Sant'Anna Pisa, Pisa, 56127, Italy
- Eleonora Paparelli
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
  - Present address: IGA Technology Services, Udine, I-33100, Italy
- Mirko Celii
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
  - Present address: Center for Desert Agriculture, Biological and Environmental Sciences & Engineering Division (BESE), KAUST, Thuwal, Makkah, Saudi Arabia
- Aldo Tocci
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
  - Scuola Internazionale Superiore di Studi Avanzati, Trieste, Friuli-Venezia Giulia, Italy
- Fabio Marroni
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
- Alice Fornasiero
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
  - Present address: Center for Desert Agriculture, Biological and Environmental Sciences & Engineering Division (BESE), KAUST, Thuwal, Makkah, Saudi Arabia
- Emanuele De Paoli
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
- Michele Morgante
  - Dipartimento di Scienze Agroalimentari, Ambientali e Animali (DI4A), Udine, I-33100, Italy
  - Istituto di Genomica Applicata, Udine, I-33100, Italy
|
24
|
Mukaigasa K, Sakuma C, Yaginuma H. The developmental hourglass model is applicable to the spinal cord based on single-cell transcriptomes and non-conserved cis-regulatory elements. Dev Growth Differ 2021; 63:372-391. [PMID: 34473348 PMCID: PMC9293469 DOI: 10.1111/dgd.12750] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/05/2021] [Revised: 08/24/2021] [Accepted: 08/26/2021] [Indexed: 11/27/2022]
Abstract
The developmental hourglass model predicts that embryonic morphology is most conserved at the mid-embryonic stage and diverges at the early and late stages. To date, this model has been verified by examining the anatomical features or gene expression profiles at the whole embryonic level. Here, using a data-mining approach that combines multiple genomic and transcriptomic datasets from different species, together with experimental validation, we demonstrate that the hourglass model is also applicable to a reduced element, the spinal cord. In the middle of spinal cord development, dorsoventrally arrayed neuronal progenitor domains are established, which are conserved among vertebrates. By comparing the publicly available single-cell transcriptome datasets of mice and zebrafish, we found that ventral subpopulations of post-mitotic spinal neurons display divergent molecular profiles. We also detected the non-conservation of cis-regulatory elements located around the progenitor fate determinants, indicating that the cis-regulatory elements contributing to the progenitor specification are evolvable. These results demonstrate that, despite the conservation of the progenitor domains, the processes before and after the progenitor domain specification diverged. This study will be helpful in understanding the molecular basis of the developmental hourglass model.
Affiliation(s)
- Katsuki Mukaigasa
  - Department of Neuroanatomy and Embryology, School of Medicine, Fukushima Medical University, Fukushima, Japan
- Chie Sakuma
  - Department of Neuroanatomy and Embryology, School of Medicine, Fukushima Medical University, Fukushima, Japan
- Hiroyuki Yaginuma
  - Department of Neuroanatomy and Embryology, School of Medicine, Fukushima Medical University, Fukushima, Japan
|
25
|
Shih CH, Fay J. Cis-regulatory variants affect gene expression dynamics in yeast. eLife 2021; 10:e68469. [PMID: 34369376 PMCID: PMC8367379 DOI: 10.7554/elife.68469] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 03/17/2021] [Accepted: 08/06/2021] [Indexed: 12/14/2022] Open
Abstract
Evolution of cis-regulatory sequences depends on how they affect gene expression and motivates both the identification and prediction of cis-regulatory variants responsible for expression differences within and between species. While much progress has been made in relating cis-regulatory variants to expression levels, the timing of gene activation and repression may also be important to the evolution of cis-regulatory sequences. We investigated allele-specific expression (ASE) dynamics within and between Saccharomyces species during the diauxic shift and found appreciable cis-acting variation in gene expression dynamics. Within-species ASE is associated with intergenic variants, and ASE dynamics are more strongly associated with insertions and deletions than ASE levels. To refine these associations, we used a high-throughput reporter assay to test promoter regions and individual variants. Within the subset of regions that recapitulated endogenous expression, we identified and characterized cis-regulatory variants that affect expression dynamics. Between species, chimeric promoter regions generate novel patterns and indicate constraints on the evolution of gene expression dynamics. We conclude that changes in cis-regulatory sequences can tune gene expression dynamics and that the interplay between expression dynamics and other aspects of expression is relevant to the evolution of cis-regulatory sequences.
Affiliation(s)
- Ching-Hua Shih
  - Department of Biology, University of Rochester, Rochester, United States
- Justin Fay
  - Department of Biology, University of Rochester, Rochester, United States
|
26
|
Asma H, Halfon MS. Annotating the Insect Regulatory Genome. Insects 2021; 12:591. [PMID: 34209769 PMCID: PMC8305585 DOI: 10.3390/insects12070591] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 05/28/2021] [Revised: 06/23/2021] [Accepted: 06/25/2021] [Indexed: 11/17/2022]
Abstract
An ever-growing number of insect genomes is being sequenced across the evolutionary spectrum. Comprehensive annotation of not only genes but also regulatory regions is critical for reaping the full benefits of this sequencing. Driven by developments in sequencing technologies and in both empirical and computational discovery strategies, the past few decades have witnessed dramatic progress in our ability to identify cis-regulatory modules (CRMs), sequences such as enhancers that play a major role in regulating transcription. Nevertheless, providing a timely and comprehensive regulatory annotation of newly sequenced insect genomes is an ongoing challenge. We review here the methods being used to identify CRMs in both model and non-model insect species, and focus on two tools that we have developed, REDfly and SCRMshaw. These resources can be paired together in a powerful combination to facilitate insect regulatory annotation over a broad range of species, with an accuracy equal to or better than that of other state-of-the-art methods.
Affiliation(s)
- Hasiba Asma
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
- Marc S. Halfon
  - Program in Genetics, Genomics, and Bioinformatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biochemistry, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biomedical Informatics, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - Department of Biological Sciences, University at Buffalo-State University of New York, Buffalo, NY 14203, USA
  - NY State Center of Excellence in Bioinformatics & Life Sciences, Buffalo, NY 14203, USA
|
27
|
DiFrisco J, Jaeger J. Homology of process: developmental dynamics in comparative biology. Interface Focus 2021; 11:20210007. [PMID: 34055306 PMCID: PMC8086918 DOI: 10.1098/rsfs.2021.0007] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Accepted: 02/22/2021] [Indexed: 12/14/2022] Open
Abstract
Comparative biology builds up systematic knowledge of the diversity of life, across evolutionary lineages and levels of organization, starting with evidence from a sparse sample of model organisms. In developmental biology, a key obstacle to the growth of comparative approaches is that the concept of homology is not very well defined for levels of organization that are intermediate between individual genes and morphological characters. In this paper, we investigate what it means for ontogenetic processes to be homologous, focusing specifically on the examples of insect segmentation and vertebrate somitogenesis. These processes can be homologous without homology of the underlying genes or gene networks, since the latter can diverge over evolutionary time, while the dynamics of the process remain the same. Ontogenetic processes like these therefore constitute a dissociable level and distinctive unit of comparison requiring their own specific criteria of homology. In addition, such processes are typically complex and nonlinear, such that their rigorous description and comparison requires not only observation and experimentation, but also dynamical modelling. We propose six criteria of process homology, combining recognized indicators (sameness of parts, morphological outcome and topological position) with novel ones derived from dynamical systems modelling (sameness of dynamical properties, dynamical complexity and evidence for transitional forms). We show how these criteria apply to animal segmentation and other ontogenetic processes. We conclude by situating our proposed dynamical framework for homology of process in relation to similar research programmes, such as process structuralism and developmental approaches to morphological homology.
Affiliation(s)
- James DiFrisco
  - Institute of Philosophy, KU Leuven, 3000 Leuven, Belgium
- Johannes Jaeger
  - Complexity Science Hub (CSH) Vienna, Josefstädter Strasse 39, 1080 Vienna, Austria
|
28
|
Kwon SB, Ernst J. Learning a genome-wide score of human-mouse conservation at the functional genomics level. Nat Commun 2021; 12:2495. [PMID: 33941776 PMCID: PMC8093196 DOI: 10.1038/s41467-021-22653-8] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 06/18/2020] [Accepted: 03/24/2021] [Indexed: 01/06/2023] Open
Abstract
Identifying genomic regions with functional genomic properties that are conserved between human and mouse is an important challenge in the context of mouse model studies. To address this, we develop a method to learn a score of evidence of conservation at the functional genomics level by integrating information from a compendium of epigenomic, transcription factor binding, and transcriptomic data from human and mouse. The method, Learning Evidence of Conservation from Integrated Functional genomic annotations (LECIF), trains neural networks to generate this score for the human and mouse genomes. The resulting LECIF score highlights human and mouse regions with shared functional genomic properties and captures correspondence of biologically similar human and mouse annotations. Analysis with independent datasets shows the score also highlights loci associated with similar phenotypes in both species. LECIF will be a resource for mouse model studies by identifying loci whose functional genomic properties are likely conserved.
Affiliation(s)
- Soo Bin Kwon
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, USA
- Jason Ernst
  - Bioinformatics Interdepartmental Program, University of California, Los Angeles, CA, USA
  - Department of Biological Chemistry, University of California, Los Angeles, CA, USA
  - Eli and Edythe Broad Center of Regenerative Medicine and Stem Cell Research at University of California, Los Angeles, CA, USA
  - Computer Science Department, University of California, Los Angeles, CA, USA
  - Department of Computational Medicine, University of California, Los Angeles, CA, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
  - Molecular Biology Institute, University of California, Los Angeles, CA, USA
|
29
|
Conner WR, Delaney EK, Bronski MJ, Ginsberg PS, Wheeler TB, Richardson KM, Peckenpaugh B, Kim KJ, Watada M, Hoffmann AA, Eisen MB, Kopp A, Cooper BS, Turelli M. A phylogeny for the Drosophila montium species group: A model clade for comparative analyses. Mol Phylogenet Evol 2021; 158:107061. [PMID: 33387647 PMCID: PMC7946709 DOI: 10.1016/j.ympev.2020.107061] [Citation(s) in RCA: 18] [Impact Index Per Article: 4.5] [Received: 04/04/2020] [Revised: 12/18/2020] [Accepted: 12/24/2020] [Indexed: 12/22/2022]
Abstract
The Drosophila montium species group is a clade of 94 named species, closely related to the model species D. melanogaster. The montium species group is distributed over a broad geographic range throughout Asia, Africa, and Australasia. Species of this group possess a wide range of morphologies, mating behaviors, and endosymbiont associations, making this clade useful for comparative analyses. We use genomic data from 42 available species to estimate the phylogeny and relative divergence times within the montium species group, and its relative divergence time from D. melanogaster. To assess the robustness of our phylogenetic inferences, we use 3 non-overlapping sets of 20 single-copy coding sequences and analyze all 60 genes with both Bayesian and maximum likelihood methods. Our analyses support monophyly of the group. Apart from the uncertain placement of a single species, D. baimaii, our analyses also support the monophyly of all seven subgroups proposed within the montium group. Our phylograms and relative chronograms provide a highly resolved species tree, with discordance restricted to estimates of relatively short branches deep in the tree. In contrast, age estimates for the montium crown group, relative to its divergence from D. melanogaster, depend critically on prior assumptions concerning variation in rates of molecular evolution across branches, and hence have not been reliably determined. We discuss methodological issues that limit phylogenetic resolution - even when complete genome sequences are available - as well as the utility of the current phylogeny for understanding the evolutionary and biogeographic history of this clade.
Affiliation(s)
- William R Conner
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
  - Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA(1)
- Emily K Delaney
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Michael J Bronski
  - Department of Molecular & Cell Biology, University of California, Berkeley, CA 94720, USA
- Paul S Ginsberg
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
  - Department of Genetics, University of Georgia, Athens, GA 30602, USA(1)
- Timothy B Wheeler
  - Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA(1)
- Kelly M Richardson
  - Bio21 Institute, School of BioScience, University of Melbourne, Victoria 3010, Australia
- Brooke Peckenpaugh
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
  - Department of Biology, Indiana University, Bloomington, IN 47405, USA(1)
- Kevin J Kim
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Masayoshi Watada
  - Graduate School of Science and Engineering, Ehime University, Matsuyama, Ehime, Japan
- Ary A Hoffmann
  - Bio21 Institute, School of BioScience, University of Melbourne, Victoria 3010, Australia
- Michael B Eisen
  - Department of Molecular & Cell Biology, University of California, Berkeley, CA 94720, USA
- Artyom Kopp
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
- Brandon S Cooper
  - Division of Biological Sciences, University of Montana, Missoula, MT 59812, USA(1)
- Michael Turelli
  - Department of Evolution and Ecology, University of California, Davis, CA 95616, USA
|
30
|
Panigrahi A, O'Malley BW. Mechanisms of enhancer action: the known and the unknown. Genome Biol 2021; 22:108. [PMID: 33858480 PMCID: PMC8051032 DOI: 10.1186/s13059-021-02322-1] [Citation(s) in RCA: 191] [Impact Index Per Article: 47.8] [Received: 09/22/2020] [Accepted: 03/23/2021] [Indexed: 12/13/2022] Open
Abstract
Differential gene expression mechanisms ensure cellular differentiation and plasticity to shape ontogenetic and phylogenetic diversity of cell types. Key regulators of differential gene expression programs are the enhancers, the gene-distal cis-regulatory sequences that govern spatiotemporal and quantitative expression dynamics of target genes. Enhancers are widely believed to physically contact the target promoters to effect transcriptional activation. However, our understanding of the full complement of regulatory proteins and the definitive mechanics of enhancer action is incomplete. Here, we review recent findings to present some emerging concepts on enhancer action and also outline a set of outstanding questions.
Affiliation(s)
- Anil Panigrahi
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
- Bert W O'Malley
  - Department of Molecular and Cellular Biology, Baylor College of Medicine, One Baylor Plaza, Houston, TX, 77030, USA
|
31
|
Rothenberg EV. Logic and lineage impacts on functional transcription factor deployment for T-cell fate commitment. Biophys J 2021; 120:4162-4181. [PMID: 33838137 PMCID: PMC8516641 DOI: 10.1016/j.bpj.2021.04.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.8] [Received: 02/21/2021] [Revised: 03/22/2021] [Accepted: 04/02/2021] [Indexed: 11/19/2022] Open
Abstract
Transcription factors are the major agents that read the regulatory sequence information in the genome to initiate changes in expression of specific genes, both in development and in physiological activation responses. Their actions depend on site-specific DNA binding and are largely guided by their individual DNA target sequence specificities. However, their action is far more conditional in a real developmental context than would be expected for simple reading of local genomic DNA sequence, which is common to all cells in the organism. They are constrained by slow-changing chromatin states and by interactions with other transcription factors, which affect their occupancy patterns of potential sites across the genome. These mechanisms lead to emergent discontinuities in function even for transcription factors with minimally changing expression. This is well revealed by diverse lineages of blood cells developing throughout life from hematopoietic stem cells, which use overlapping combinations of transcription factors to drive strongly divergent gene regulation programs. Here, using development of T lymphocytes from hematopoietic multipotent progenitor cells as a focus, recent evidence is reviewed on how binding specificity and dynamics, transcription factor cooperativity, and chromatin state changes impact the effective regulatory functions of key transcription factors including PU.1, Runx1, Notch-RBPJ, and Bcl11b.
Affiliation(s)
- Ellen V Rothenberg
  - Division of Biology and Biological Engineering, California Institute of Technology, Pasadena, California
|
32
|
Giudicelli F, Roest Crollius H. On the importance of evolutionary constraint for regulatory sequence identification. Brief Funct Genomics 2021:elab015. [PMID: 33754633 DOI: 10.1093/bfgp/elab015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/09/2020] [Revised: 01/15/2021] [Accepted: 02/19/2021] [Indexed: 11/13/2022] Open
Abstract
Regulation of gene expression relies on the activity of specialized genomic elements, enhancers or silencers, which are sometimes located at large distances from their target gene promoters. A significant part of vertebrate genomes consists of such regulatory elements, but their identification and that of their target genes remains challenging, due to the lack of a clear signature at the nucleotide level. For many years the main hallmark used for identifying functional elements has been their sequence conservation between genomes of distant species, indicative of purifying selection. More recently, genome-wide biochemical assays have opened new avenues for detecting regulatory regions, shifting attention away from evolutionary constraints. Here, we review the respective contributions of comparative genomics and biochemical assays to the definition of regulatory elements and their targets, and advocate that both sequence conservation and preserved synteny, taken as signatures of functional constraint, remain essential tools in this task.
|
33
|
Yuan X, Scott IC, Wilson MD. Heart Enhancers: Development and Disease Control at a Distance. Front Genet 2021; 12:642975. [PMID: 33777110 PMCID: PMC7987942 DOI: 10.3389/fgene.2021.642975] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 12/17/2020] [Accepted: 01/29/2021] [Indexed: 12/14/2022] Open
Abstract
Bound by lineage-determining transcription factors and signaling effectors, enhancers play essential roles in controlling spatiotemporal gene expression profiles during development, homeostasis and disease. Recent synergistic advances in functional genomic technologies, combined with the developmental biology toolbox, have resulted in unprecedented genome-wide annotation of heart enhancers and their target genes. Starting with early studies of vertebrate heart enhancers and ending with state-of-the-art genome-wide enhancer discovery and testing, we will review how studying heart enhancers in metazoan species has helped inform our understanding of cardiac development and disease.
Affiliation(s)
- Xuefei Yuan
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
- Ian C. Scott
  - Program in Developmental and Stem Cell Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
- Michael D. Wilson
  - Program in Genetics and Genome Biology, The Hospital for Sick Children, Toronto, ON, Canada
  - Department of Molecular Genetics, University of Toronto, Toronto, ON, Canada
|
34
|
Abstract
Motivation: The universal expressibility assumption of Deep Neural Networks (DNNs) is the key motivation behind recent works in the systems biology community to employ DNNs to solve important problems in functional genomics and molecular genetics. Typically, such investigations have taken a ‘black box’ approach in which the internal structure of the model used is set purely by machine learning considerations with little consideration of representing the internal structure of the biological system by the mathematical structure of the DNN. DNNs have not yet been applied to the detailed modeling of transcriptional control in which mRNA production is controlled by the binding of specific transcription factors to DNA, in part because such models are in part formulated in terms of specific chemical equations that appear different in form from those used in neural networks. Results: In this paper, we give an example of a DNN which can model the detailed control of transcription in a precise and predictive manner. Its internal structure is fully interpretable and is faithful to underlying chemistry of transcription factor binding to DNA. We derive our DNN from a systems biology model that was not previously recognized as having a DNN structure. Although we apply our DNN to data from the early embryo of the fruit fly Drosophila, this system serves as a test bed for analysis of much larger datasets obtained by systems biology studies on a genomic scale. Availability and implementation: The implementation and data for the models used in this paper are in a zip file in the supplementary material. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yi Liu
  - Department of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- Kenneth Barr
  - Department of Human Genetics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
- John Reinitz
  - Departments of Statistics, Ecology and Evolution, Molecular Genetics & Cell Biology, Institute of Genomics and Systems Biology, University of Chicago, Chicago, IL 60637, USA
|
35
|
Abstract
Hemichordates, along with echinoderms and chordates, belong to the lineage of bilaterians called the deuterostomes. Their phylogenetic position as an outgroup to chordates provides an opportunity to investigate the evolutionary origins of the chordate body plan and reconstruct ancestral deuterostome characters. The body plans of the hemichordates and chordates are organizationally divergent making anatomical comparisons very challenging. The developmental underpinnings of animal body plans are often more conservative than the body plans they regulate, and offer a novel data set for making comparisons between morphologically divergent body architectures. Here I review the hemichordate developmental data generated over the past 20 years that further test hypotheses of proposed morphological affinities between the two taxa, but also compare the conserved anteroposterior, dorsoventral axial patterning programs and germ layer specification programs. These data provide an opportunity to determine which developmental programs are ancestral deuterostome or bilaterian innovations, and which ones occurred in stem chordates or vertebrates representing developmental novelties of the chordate body plan.
|
36
|
Jana T, Brodsky S, Barkai N. Speed-Specificity Trade-Offs in the Transcription Factors Search for Their Genomic Binding Sites. Trends Genet 2021; 37:421-432. [PMID: 33414013 DOI: 10.1016/j.tig.2020.12.001] [Citation(s) in RCA: 25] [Impact Index Per Article: 6.3] [Received: 10/11/2020] [Revised: 12/04/2020] [Accepted: 12/07/2020] [Indexed: 12/17/2022]
Abstract
Transcription factors (TFs) regulate gene expression by binding DNA sequences recognized by their DNA-binding domains (DBDs). DBD-recognized motifs are short and highly abundant in genomes. The ability of TFs to bind a specific subset of motif-containing sites, and to do so rapidly upon activation, is fundamental for gene expression in all eukaryotes. Despite extensive interest, our understanding of the TF-target search process is fragmented; although binding specificity and detection speed are two facets of this same process, trade-offs between them are rarely addressed. In this opinion article, we discuss potential speed-specificity trade-offs in the context of existing models. We further discuss the recently described 'distributed specificity' paradigm, suggesting that intrinsically disordered regions (IDRs) promote specificity while reducing the TF-target search time.
Affiliation(s)
- Tamar Jana
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Sagie Brodsky
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
- Naama Barkai
  - Department of Molecular Genetics, Weizmann Institute of Science, Rehovot 76100, Israel
|
37
|
Wong ES, Zheng D, Tan SZ, Bower NL, Garside V, Vanwalleghem G, Gaiti F, Scott E, Hogan BM, Kikuchi K, McGlinn E, Francois M, Degnan BM. Deep conservation of the enhancer regulatory code in animals. Science 2020; 370:eaax8137. [PMID: 33154111 DOI: 10.1126/science.aax8137] [Citation(s) in RCA: 78] [Impact Index Per Article: 15.6] [Received: 04/26/2019] [Revised: 04/29/2020] [Accepted: 09/30/2020] [Indexed: 12/15/2022]
Abstract
Interactions of transcription factors (TFs) with DNA regulatory sequences, known as enhancers, specify cell identity during animal development. Unlike TFs, the origin and evolution of enhancers has been difficult to trace. We drove zebrafish and mouse developmental transcription using enhancers from an evolutionarily distant marine sponge. Some of these sponge enhancers are located in highly conserved microsyntenic regions, including an Islet enhancer in the Islet-Scaper region. We found that Islet enhancers in humans and mice share a suite of TF binding motifs with sponges, and that they drive gene expression patterns similar to those of sponge and endogenous Islet enhancers in zebrafish. Our results suggest the existence of an ancient and conserved, yet flexible, genomic regulatory syntax that has been repeatedly co-opted into cell type-specific gene regulatory networks across the animal kingdom.
Affiliation(s)
- Emily S Wong
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
  - Victor Chang Cardiac Research Institute, Sydney, Australia
  - School of Biotechnology and Biomolecular Sciences, UNSW Sydney, Sydney, Australia
- Dawei Zheng
  - Victor Chang Cardiac Research Institute, Sydney, Australia
- Siew Z Tan
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, Australia
- Neil L Bower
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, Australia
- Victoria Garside
  - Australian Regenerative Medicine Institute, Monash University, Melbourne, Australia
- Federico Gaiti
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
- Ethan Scott
  - Queensland Brain Institute, University of Queensland, Brisbane, Australia
- Benjamin M Hogan
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Department of Anatomy and Neuroscience and Sir Peter MacCallum Department of Oncology, University of Melbourne, Melbourne, Australia
- Kazu Kikuchi
  - Victor Chang Cardiac Research Institute, Sydney, Australia
- Edwina McGlinn
  - Australian Regenerative Medicine Institute, Monash University, Melbourne, Australia
- Mathias Francois
  - Institute for Molecular Biosciences, University of Queensland, Brisbane, Australia
  - Centenary Institute, David Richmond Program for Cardio-Vascular Research: Gene Regulation and Editing, School of Life and Environmental Sciences, University of Sydney, Sydney, Australia
- Bernard M Degnan
  - School of Biological Sciences, University of Queensland, Brisbane, Australia
|
38
|
Harmston N. Regulation in common: Sponge to zebrafish. Science 2020; 370:657-658. [PMID: 33154124 DOI: 10.1126/science.abe9317] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/02/2022]
Affiliation(s)
- Nathan Harmston
  - Science Division, Yale-NUS College, 16 College Avenue West #01-220, 138527, Singapore
  - Programme in Cancer and Stem Cell Biology, Duke-NUS Medical School, 169857, Singapore
|
39
|
Halstead MM, Kern C, Saelao P, Wang Y, Chanthavixay G, Medrano JF, Van Eenennaam AL, Korf I, Tuggle CK, Ernst CW, Zhou H, Ross PJ. A comparative analysis of chromatin accessibility in cattle, pig, and mouse tissues. BMC Genomics 2020; 21:698. [PMID: 33028202 PMCID: PMC7541309 DOI: 10.1186/s12864-020-07078-9] [Citation(s) in RCA: 32] [Impact Index Per Article: 6.4] [Received: 06/19/2020] [Accepted: 09/17/2020] [Indexed: 12/25/2022] Open
Abstract
Background: Although considerable progress has been made towards annotating the noncoding portion of the human and mouse genomes, regulatory elements in other species, such as livestock, remain poorly characterized. This lack of functional annotation poses a substantial roadblock to agricultural research and diminishes the value of these species as model organisms. As active regulatory elements are typically characterized by chromatin accessibility, we implemented the Assay for Transposase Accessible Chromatin (ATAC-seq) to annotate and characterize regulatory elements in pigs and cattle, given a set of eight adult tissues. Results: Overall, 306,304 and 273,594 active regulatory elements were identified in pig and cattle, respectively. 71,478 porcine and 47,454 bovine regulatory elements were highly tissue-specific and were correspondingly enriched for binding motifs of known tissue-specific transcription factors. However, in every tissue the most prevalent accessible motif corresponded to the insulator CTCF, suggesting pervasive involvement in 3-D chromatin organization. Taking advantage of a similar dataset in mouse, open chromatin in pig, cattle, and mice was compared, revealing that the conservation of regulatory elements, in terms of sequence identity and accessibility, was consistent with evolutionary distance; whereas pig and cattle shared about 20% of accessible sites, mice and ungulates only had about 10% of accessible sites in common. Furthermore, conservation of accessibility was more prevalent at promoters than at intergenic regions. Conclusions: The lack of conserved accessibility at distal elements is consistent with rapid evolution of enhancers, and further emphasizes the need to annotate regulatory elements in individual species, rather than inferring elements based on homology. This atlas of chromatin accessibility in cattle and pig constitutes a substantial step towards annotating livestock genomes and dissecting the regulatory link between genome and phenome.
Affiliation(s)
- Michelle M Halstead
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Colin Kern
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Perot Saelao
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Ying Wang
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Ganrea Chanthavixay
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Juan F Medrano
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Ian Korf
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Catherine W Ernst
- Department of Animal Science, Michigan State University, East Lansing, 48824, MI, USA
- Huaijun Zhou
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
- Pablo J Ross
- Department of Animal Science, University of California Davis, Davis, CA, 95616, USA
|
40
|
Abstract
This study has taken advantage of the availability of the assembled genomic sequence of flies, mosquitos, ants and bees to explore the presence of ultraconserved sequence elements in these phylogenetic groups. We compared non-coding sequences found within and flanking Drosophila developmental genes to homologous sequences in Ceratitis capitata and Musca domestica. Many of the conserved sequence blocks (CSBs) that constitute Drosophila cis-regulatory DNA, recognized by EvoPrinter alignment protocols, are also conserved in Ceratitis and Musca. Also conserved is the position, but not necessarily the orientation, of many of these ultraconserved CSBs (uCSBs) with respect to flanking genes. Using the mosquito EvoPrint algorithm, we have also identified uCSBs shared among distantly related mosquito species. Side-by-side comparison of bee and ant EvoPrints of selected developmental genes identifies uCSBs shared between these two Hymenoptera, as well as less conserved CSBs in either one or the other taxon but not in both. Analysis of uCSBs in these Diptera and Hymenoptera will lead to a greater understanding of the evolutionary origin and function of their conserved non-coding sequences and aid in the discovery of core elements of enhancers.
This study applies the phylogenetic footprinting program EvoPrinter to the detection of ultraconserved non-coding sequence elements in Diptera, including flies and mosquitos, and Hymenoptera, including ants and bees. EvoPrinter outputs an interspecies comparison as a single sequence in terms of the input reference sequence. Ultraconserved sequences flanking known developmental genes were detected in Ceratitis and Musca when compared with Drosophila species, in Aedes and Culex when compared with Anopheles, and between ants and bees. Our methods are useful in detecting and understanding the core evolutionarily hardened sequences required for gene regulation.
|
41
|
Abstract
Key discoveries in Drosophila have shaped our understanding of cellular "enhancers." With a special focus on the fly, this chapter surveys properties of these adaptable cis-regulatory elements, whose actions are critical for the complex spatial/temporal transcriptional regulation of gene expression in metazoa. The powerful combination of genetics, molecular biology, and genomics available in Drosophila has provided an arena in which the developmental role of enhancers can be explored. Enhancers are characterized by diverse low- or high-throughput assays, which are challenging to interpret, as not all of these methods of identifying enhancers produce concordant results. As a model metazoan, the fly offers important advantages to comprehensive analysis of the central functions that enhancers play in gene expression, and their critical role in mediating the production of phenotypes from genotype and environmental inputs. A major challenge moving forward will be obtaining a quantitative understanding of how these cis-regulatory elements operate in development and disease.
Affiliation(s)
- Stephen Small
- Department of Biology, Developmental Systems Training Program, New York University, 10003
- David N Arnosti
- Department of Biochemistry and Molecular Biology, Michigan State University, East Lansing, Michigan 48824
|
42
|
Chen HM, Marques JG, Sugino K, Wei D, Miyares RL, Lee T. CAMIO: a transgenic CRISPR pipeline to create diverse targeted genome deletions in Drosophila. Nucleic Acids Res 2020; 48:4344-4356. [PMID: 32187363 PMCID: PMC7192631 DOI: 10.1093/nar/gkaa177] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 07/08/2019] [Revised: 02/06/2020] [Accepted: 03/10/2020] [Indexed: 02/07/2023] Open
Abstract
The genome is the blueprint for an organism. Interrogating the genome, especially locating critical cis-regulatory elements, requires deletion analysis. This is conventionally performed using synthetic constructs, making it cumbersome and non-physiological. Thus, we created Cas9-mediated Arrayed Mutagenesis of Individual Offspring (CAMIO) to achieve comprehensive analysis of a targeted region of native DNA. CAMIO utilizes CRISPR that is spatially restricted to generate independent deletions in the intact Drosophila genome. Controlled by recombination, a single guide RNA is stochastically chosen from a set targeting a specific DNA region. Combining two sets increases variability, leading to either indels at 1–2 target sites or inter-target deletions. Cas9 restriction to male germ cells elicits autonomous double-strand-break repair, consequently creating offspring with diverse mutations. Thus, from a single population cross, we can obtain a deletion matrix covering a large expanse of DNA at both coarse and fine resolution. We demonstrate the ease and power of CAMIO by mapping 5′UTR sequences crucial for chinmo's post-transcriptional regulation.
Affiliation(s)
- Hui-Min Chen
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Jorge Garcia Marques
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Ken Sugino
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Dingjun Wei
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Rosa Linda Miyares
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
- Tzumin Lee
- Howard Hughes Medical Institute, Janelia Research Campus, 19700 Helix Drive, Ashburn, VA 20147, USA
|
43
|
Dukler N, Huang YF, Siepel A. Phylogenetic Modeling of Regulatory Element Turnover Based on Epigenomic Data. Mol Biol Evol 2020; 37:2137-2152. [PMID: 32176292 PMCID: PMC7306682 DOI: 10.1093/molbev/msaa073] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/17/2022] Open
Abstract
Evolutionary changes in gene expression are often driven by gains and losses of cis-regulatory elements (CREs). The dynamics of CRE evolution can be examined using multispecies epigenomic data, but so far such analyses have generally been descriptive and model-free. Here, we introduce a probabilistic modeling framework for the evolution of CREs that operates directly on raw chromatin immunoprecipitation and sequencing (ChIP-seq) data and fully considers the phylogenetic relationships among species. Our framework includes a phylogenetic hidden Markov model, called epiPhyloHMM, for identifying the locations of multiply aligned CREs, and a combined phylogenetic and generalized linear model, called phyloGLM, for accounting for the influence of a rich set of genomic features in describing their evolutionary dynamics. We apply these methods to previously published ChIP-seq data for the H3K4me3 and H3K27ac histone modifications in liver tissue from nine mammals. We find that enhancers are gained and lost during mammalian evolution at about twice the rate of promoters, and that turnover rates are negatively correlated with DNA sequence conservation, expression level, and tissue breadth, and positively correlated with distance from the transcription start site, consistent with previous findings. In addition, we find that the predicted dosage sensitivity of target genes positively correlates with DNA sequence constraint in CREs but not with turnover rates, perhaps owing to differences in the effect sizes of the relevant mutations. Altogether, our probabilistic modeling framework enables a variety of powerful new analyses.
Affiliation(s)
- Noah Dukler
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
- Physiology, Biophysics, and Systems Biology, Weill Cornell Medical College, New York, NY
- Yi-Fei Huang
- Department of Biology and Huck Institute of Life Sciences, Pennsylvania State University, University Park, PA
- Adam Siepel
- Simons Center for Quantitative Biology, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY
|
44
|
Enhancer RNAs are an important regulatory layer of the epigenome. Nat Struct Mol Biol 2020; 27:521-528. [PMID: 32514177 DOI: 10.1038/s41594-020-0446-0] [Citation(s) in RCA: 214] [Impact Index Per Article: 42.8] [Received: 07/25/2019] [Accepted: 05/07/2020] [Indexed: 12/20/2022]
Abstract
Noncoding RNAs (ncRNAs) direct a remarkable number of diverse functions in development and disease through their regulation of transcription, RNA processing and translation. Leading the charge in the RNA revolution is a class of ncRNAs that are synthesized at active enhancers, called enhancer RNAs (eRNAs). Here, we review recent insights into the biogenesis of eRNAs and the mechanisms underlying their multifaceted functions and consider how these findings could inform future investigations into enhancer transcription and eRNA function.
|
45
|
Rivera J, Keränen SVE, Gallo SM, Halfon MS. REDfly: the transcriptional regulatory element database for Drosophila. Nucleic Acids Res 2020; 47:D828-D834. [PMID: 30329093 PMCID: PMC6323911 DOI: 10.1093/nar/gky957] [Citation(s) in RCA: 45] [Impact Index Per Article: 9.0] [Received: 09/06/2018] [Accepted: 10/04/2018] [Indexed: 12/21/2022] Open
Abstract
The REDfly database provides a comprehensive curation of experimentally-validated Drosophila transcriptional cis-regulatory elements and includes information on DNA sequence, experimental evidence, patterns of regulated gene expression, and more. Now in its thirteenth year, REDfly has grown to over 23 000 records of tested reporter gene constructs and 2200 tested transcription factor binding sites. Recent developments include the start of curation of predicted cis-regulatory modules in addition to experimentally-verified ones, improved search and filtering, and increased interaction with the authors of curated papers. An expanded data model that will capture information on temporal aspects of gene regulation, regulation in response to environmental and other non-developmental cues, sexually dimorphic gene regulation, and non-endogenous (ectopic) aspects of reporter gene expression is under development and expected to be in place within the coming year. REDfly is freely accessible at http://redfly.ccr.buffalo.edu, and news about database updates and new features can be followed on Twitter at @REDfly_database.
Affiliation(s)
- John Rivera
- Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Steven M Gallo
- Center for Computational Research, State University of New York at Buffalo, Buffalo, NY 14203, USA; New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA
- Marc S Halfon
- New York State Center of Excellence in Bioinformatics and Life Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biomedical Informatics, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Biological Sciences, State University of New York at Buffalo, Buffalo, NY 14203, USA; Department of Molecular and Cellular Biology and Program in Cancer Genetics, Roswell Park Cancer Institute, Buffalo, NY 14263, USA
|
46
|
King DM, Hong CKY, Shepherdson JL, Granas DM, Maricque BB, Cohen BA. Synthetic and genomic regulatory elements reveal aspects of cis-regulatory grammar in mouse embryonic stem cells. eLife 2020; 9:41279. [PMID: 32043966 PMCID: PMC7077988 DOI: 10.7554/elife.41279] [Citation(s) in RCA: 43] [Impact Index Per Article: 8.6] [Received: 08/20/2018] [Accepted: 02/07/2020] [Indexed: 01/08/2023] Open
Abstract
In embryonic stem cells (ESCs), a core transcription factor (TF) network establishes the gene expression program necessary for pluripotency. To address how interactions between four key TFs contribute to cis-regulation in mouse ESCs, we assayed two massively parallel reporter assay (MPRA) libraries composed of binding sites for SOX2, POU5F1 (OCT4), KLF4, and ESRRB. Comparisons between synthetic cis-regulatory elements and genomic sequences with comparable binding site configurations revealed some aspects of a regulatory grammar. The expression of synthetic elements is influenced by both the number and arrangement of binding sites. This grammar plays only a small role for genomic sequences, as the relative activities of genomic sequences are best explained by the predicted occupancy of binding sites, regardless of binding site identity and positioning. Our results suggest that the effects of transcription factor binding sites (TFBS) are influenced by the order and orientation of sites, but that in the genome the overall occupancy of TFs is the primary determinant of activity. Transcription factors are proteins that flip genetic switches; their role is to control when and where genes are active. They do this by binding to short stretches of DNA called cis-regulatory sequences. Each sequence can have several binding sites for different transcription factors, but it is largely unclear whether the transcription factors binding to the same regulatory sequence actually work together. It is possible that each transcription factor may work independently and there only needs to be critical mass of transcription factors bound to throw the genetic switch. If this is the case, the most important features of a cis-regulatory sequence should be the number of binding sites it contains, and how tightly the transcription factors bind to those sites. The more transcription factors and the more strongly they bind, the more active the gene should be. 
An alternative option is that certain transcription factors may work better together, enhancing each other's effects such that the total effect is more than the sum of its parts. If this is true, the order, orientation and spacing of the binding sites within a sequence should matter more than the number. One way to distinguish between these possibilities is to study mouse embryonic stem cells, which have a core set of four transcription factors. Looking directly at a real genome, however, can be confusing, and it is difficult to measure the effects of different cis-regulatory sequences because genes differ in so many other ways. To tackle this problem, King et al. created a synthetic set of cis-regulatory sequences based on the four core transcription factors found in mouse stem cells. The synthetic set had every combination of two, three or four of the binding sites, with each site either facing forwards or backwards along the DNA strand. King et al. attached each of the synthetic cis-regulatory sequences to a reporter gene to find out how well each sequence performed. This revealed that the cis-regulatory sequences with the most binding sites and the tightest binding affinities work best, suggesting that transcription factors mainly work independently. There was evidence of some interaction between some transcription factors, because, of the synthetic sequences with four binding sites, some worked better than others, and there were patterns in the most effective binding site combinations. However, these effects were small and when King et al. went on to test sequences from the real mouse genome, the most important factor by far was the number of binding sites. Synthetic libraries of DNA sequences allow researchers to examine gene regulation more clearly than is possible in real genomes. Yet this approach does have its limitations and it is impossible to capture every type of cis-regulatory sequence in one library. 
The next step to extend this work is to combine the two approaches, taking sequences from the real genome and manipulating them one by one. This could help to unravel the rules that govern how cis-regulatory sequences work in real cells.
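The "independent transcription factors" hypothesis described in the summary above can be illustrated with a minimal sketch. It assumes a simple equilibrium occupancy for each binding site, c / (c + K_d), and takes predicted activity to be the sum of occupancies over sites. The function names, the concentration, and the K_d values are hypothetical illustrations, not the model or parameters used by King et al.

```python
from typing import Sequence


def site_occupancy(tf_conc: float, kd: float) -> float:
    """Equilibrium occupancy of one site (assumed form: c / (c + K_d))."""
    return tf_conc / (tf_conc + kd)


def predicted_activity(kds: Sequence[float], tf_conc: float = 1.0) -> float:
    """Independent-sites assumption: activity is the sum of site occupancies.

    Site order and orientation do not enter the calculation at all, which is
    what distinguishes this picture from a cooperative 'grammar' model.
    """
    return sum(site_occupancy(tf_conc, kd) for kd in kds)


# Hypothetical dissociation constants (arbitrary units): low K_d = tight binding.
weak_pair = predicted_activity([10.0, 10.0])
strong_pair = predicted_activity([0.1, 0.1])
strong_quad = predicted_activity([0.1, 0.1, 0.1, 0.1])

# Swapping the order of two different sites leaves the prediction unchanged.
order_a = predicted_activity([0.1, 10.0])
order_b = predicted_activity([10.0, 0.1])
```

Under these assumptions, more sites and tighter binding both raise the predicted activity, while rearranging sites has no effect; a grammar-dependent model would instead add terms that depend on site order, orientation, or spacing.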
Affiliation(s)
- Dana M King
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Clarice Kit Yee Hong
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- James L Shepherdson
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- David M Granas
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Brett B Maricque
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
- Barak A Cohen
- Edison Center for Genome Sciences and Systems Biology, Washington University in St. Louis, St. Louis, United States; Department of Genetics, Washington University in St. Louis, St. Louis, United States
|
47
|
Peng PC, Khoueiry P, Girardot C, Reddington JP, Garfield DA, Furlong EEM, Sinha S. The Role of Chromatin Accessibility in cis-Regulatory Evolution. Genome Biol Evol 2020; 11:1813-1828. [PMID: 31114856 PMCID: PMC6601868 DOI: 10.1093/gbe/evz103] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.2] [Accepted: 05/13/2019] [Indexed: 02/07/2023] Open
Abstract
Transcription factor (TF) binding is determined by sequence as well as chromatin accessibility. Although the role of accessibility in shaping TF-binding landscapes is well recorded, its role in evolutionary divergence of TF binding, which in turn can alter cis-regulatory activities, is not well understood. In this work, we studied the evolution of genome-wide binding landscapes of five major TFs in the core network of mesoderm specification, between Drosophila melanogaster and Drosophila virilis, and examined its relationship to accessibility and sequence-level changes. We generated chromatin accessibility data from three important stages of embryogenesis in both Drosophila melanogaster and Drosophila virilis and recorded conservation and divergence patterns. We then used multivariable models to correlate accessibility and sequence changes to TF-binding divergence. We found that accessibility changes can in some cases, for example, for the master regulator Twist and for earlier developmental stages, more accurately predict binding change than is possible using TF-binding motif changes between orthologous enhancers. Accessibility changes also explain a significant portion of the codivergence of TF pairs. We noted that accessibility and motif changes offer complementary views of the evolution of TF binding and developed a combined model that captures the evolutionary data much more accurately than either view alone. Finally, we trained machine learning models to predict enhancer activity from TF binding and used these functional models to argue that motif and accessibility-based predictors of TF-binding change can substitute for experimentally measured binding change, for the purpose of predicting evolutionary changes in enhancer activity.
Affiliation(s)
- Pei-Chen Peng
- Department of Computer Science, University of Illinois at Urbana-Champaign; Center for Bioinformatics and Functional Genomics, Department of Biomedical Sciences, Cedars-Sinai Medical Center, Los Angeles, CA
- Pierre Khoueiry
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; American University of Beirut (AUB), Department of Biochemistry and Molecular Genetics, Beirut, Lebanon
- Charles Girardot
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- James P Reddington
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- David A Garfield
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany; IRI-Life Sciences, Humboldt Universität zu Berlin, Berlin, Germany
- Eileen E M Furlong
- European Molecular Biology Laboratory, Genome Biology Unit, Heidelberg, Germany
- Saurabh Sinha
- Department of Computer Science, University of Illinois at Urbana-Champaign; Carl R. Woese Institute for Genomic Biology, University of Illinois at Urbana-Champaign
|
48
|
Garcia HG, Berrocal A, Kim YJ, Martini G, Zhao J. Lighting up the central dogma for predictive developmental biology. Curr Top Dev Biol 2019; 137:1-35. [PMID: 32143740 DOI: 10.1016/bs.ctdb.2019.10.010] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/13/2022]
Abstract
Although the last 30 years have witnessed the mapping of the wiring diagrams of the gene regulatory networks that dictate cell fate and animal body plans, specific understanding building on such network diagrams that shows how DNA regulatory regions control gene expression lags far behind. These networks have yet to yield the predictive power necessary to, for example, calculate how the concentration dynamics of input transcription factors and DNA regulatory sequence prescribe output patterns of gene expression that, in turn, determine body plans themselves. Here, we argue that reaching a predictive understanding of developmental decision-making calls for an interplay between theory and experiment aimed at revealing how the regulation of the processes of the central dogma dictates network connections and how network topology guides cells toward their ultimate developmental fate. To make this possible, it is crucial to break free from the snapshot-based understanding of embryonic development facilitated by fixed-tissue approaches and embrace new technologies that capture the dynamics of developmental decision-making at the single cell level, in living embryos.
Affiliation(s)
- Hernan G Garcia
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States; Department of Physics, University of California at Berkeley, Berkeley, CA, United States; Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States; Quantitative Biosciences-QB3, University of California at Berkeley, Berkeley, CA, United States
- Augusto Berrocal
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States
- Yang Joon Kim
- Biophysics Graduate Group, University of California at Berkeley, Berkeley, CA, United States
- Gabriella Martini
- Department of Molecular and Cell Biology, University of California at Berkeley, Berkeley, CA, United States
- Jiaxi Zhao
- Department of Physics, University of California at Berkeley, Berkeley, CA, United States
|
49
|
Abstract
There is now compelling evidence that many arthropods pattern their segments using a clock-and-wavefront mechanism, analogous to that operating during vertebrate somitogenesis. In this Review, we discuss how the arthropod segmentation clock generates a repeating sequence of pair-rule gene expression, and how this is converted into a segment-polarity pattern by ‘timing factor’ wavefronts associated with axial extension. We argue that the gene regulatory network that patterns segments may be relatively conserved, although the timing of segmentation varies widely, and double-segment periodicity appears to have evolved at least twice. Finally, we describe how the repeated evolution of a simultaneous (Drosophila-like) mode of segmentation within holometabolan insects can be explained by heterochronic shifts in timing factor expression plus extensive pre-patterning of the pair-rule genes.
Affiliation(s)
- Erik Clark
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115, USA
- Department of Zoology, University of Cambridge, Cambridge, CB2 3EJ, UK
- Andrew D. Peel
- School of Biology, Faculty of Biological Sciences, University of Leeds, Leeds, LS2 9JT, UK
- Michael Akam
- Department of Zoology, University of Cambridge, Cambridge, CB2 3EJ, UK
|
50
|
Razy-Krajka F, Stolfi A. Regulation and evolution of muscle development in tunicates. EvoDevo 2019; 10:13. [PMID: 31249657 PMCID: PMC6589888 DOI: 10.1186/s13227-019-0125-6] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 12/14/2018] [Accepted: 06/08/2019] [Indexed: 12/16/2022] Open
Abstract
For more than a century, studies on tunicate muscle formation have revealed many principles of cell fate specification, gene regulation, morphogenesis, and evolution. Here, we review the key studies that have probed the development of all the various muscle cell types in a wide variety of tunicate species. We seize this occasion to explore the implications and questions raised by these findings in the broader context of muscle evolution in chordates.
Affiliation(s)
- Florian Razy-Krajka
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, USA
- Alberto Stolfi
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, USA
|