1
|
Mononen J, Taipale M, Malinen M, Velidendla B, Niskanen E, Levonen AL, Ruotsalainen AK, Heikkinen S. Genetic variation is a key determinant of chromatin accessibility and drives differences in the regulatory landscape of C57BL/6J and 129S1/SvImJ mice. Nucleic Acids Res 2024; 52:2904-2923. [PMID: 38153160 PMCID: PMC11014276 DOI: 10.1093/nar/gkad1225] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/23/2022] [Revised: 11/09/2023] [Accepted: 12/12/2023] [Indexed: 12/29/2023]
Abstract
Most common genetic variants associated with disease are located in non-coding regions of the genome. One mechanism by which they function is through altering transcription factor (TF) binding. In this study, we explore how genetic variation is connected to differences in the regulatory landscape of livers from C57BL/6J and 129S1/SvImJ mice fed either chow or a high-fat diet. To identify sites where regulatory variation affects TF binding and nearby gene expression, we employed an integrative analysis of H3K27ac ChIP-seq (active enhancers), ATAC-seq (chromatin accessibility) and RNA-seq (gene expression). We show that, across all these assays, the genetically driven (i.e. strain-specific) differences in the regulatory landscape are more pronounced than those modified by diet. Most notably, our analysis revealed that differentially accessible regions (DARs, N = 29635, FDR < 0.01 and fold change > 50%) are almost always strain-specific and enriched with genetic variation. Moreover, proximal DARs are highly correlated with differentially expressed genes. We also show that TF binding is affected by genetic variation, which we validate experimentally using ChIP-seq for TCF7L2 and CTCF. This study provides detailed insights into how non-coding genetic variation alters the gene regulatory landscape, and demonstrates how this can be used to study the regulatory variation influencing TF binding.
Affiliation(s)
- Juho Mononen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Mari Taipale
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Marjo Malinen
- Department of Environmental and Biological Sciences, Faculty of Science and Forestry, University of Eastern Finland, Joensuu FI-80101, Finland
- Department of Forestry and Environmental Engineering, South-Eastern Finland University of Applied Sciences, Kouvola FI-45100, Finland
| | - Bharadwaja Velidendla
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Einari Niskanen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Anna-Liisa Levonen
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Anna-Kaisa Ruotsalainen
- A.I. Virtanen Institute, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
| | - Sami Heikkinen
- Institute of Biomedicine, Faculty of Health Sciences, University of Eastern Finland, Kuopio FI-70211, Finland
|
2
|
Hecker D, Lauber M, Behjati Ardakani F, Ashrafiyan S, Manz Q, Kersting J, Hoffmann M, Schulz MH, List M. Computational tools for inferring transcription factor activity. Proteomics 2023; 23:e2200462. [PMID: 37706624 DOI: 10.1002/pmic.202200462] [Citation(s) in RCA: 7] [Impact Index Per Article: 3.5] [Received: 05/17/2023] [Revised: 08/11/2023] [Accepted: 08/22/2023] [Indexed: 09/15/2023]
Abstract
Transcription factors (TFs) are essential players in orchestrating the regulatory landscape in cells. Still, their exact modes of action and dependencies on other regulatory aspects remain elusive. Since TFs act in a cell type-specific manner and each TF has its own characteristics, untangling their regulatory interactions from an experimental point of view is laborious and convoluted. Thus, there is an ongoing development of computational tools that estimate transcription factor activity (TFA) from a variety of data modalities, either based on a mapping of TFs to their putative target genes or in a genome-wide, gene-unspecific fashion. These tools can help to gain insights into TF regulation and to prioritize candidates for experimental validation. We want to give an overview of available computational tools that estimate TFA, illustrate examples of their application, debate common result validation strategies, and discuss assumptions and concomitant limitations.
Affiliation(s)
- Dennis Hecker
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Michael Lauber
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Fatemeh Behjati Ardakani
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Shamim Ashrafiyan
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Quirin Manz
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
| | - Johannes Kersting
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- GeneSurge GmbH, München, Germany
| | - Markus Hoffmann
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
- Institute for Advanced Study, Technical University of Munich, Garching, Germany
- National Institute of Diabetes, Digestive, and Kidney Diseases, National Institutes of Health, Bethesda, Maryland, USA
| | - Marcel H Schulz
- Goethe University Frankfurt, Frankfurt am Main, Germany
- German Center for Cardiovascular Research, Partner site Rhein-Main, Frankfurt am Main, Germany
- Cardio-Pulmonary Institute, Goethe University Hospital, Frankfurt am Main, Germany
| | - Markus List
- Big Data in BioMedicine Group, Chair of Experimental Bioinformatics, TUM School of Life Sciences, Technical University of Munich, Freising, Germany
|
3
|
Khodursky S, Zheng EB, Svetec N, Durkin SM, Benjamin S, Gadau A, Wu X, Zhao L. The evolution and mutational robustness of chromatin accessibility in Drosophila. Genome Biol 2023; 24:232. [PMID: 37845780 PMCID: PMC10578003 DOI: 10.1186/s13059-023-03079-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/14/2022] [Accepted: 09/29/2023] [Indexed: 10/18/2023]
Abstract
BACKGROUND The evolution of genomic regulatory regions plays a critical role in shaping the diversity of life. While this process is primarily sequence-dependent, the enormous complexity of biological systems complicates the understanding of the factors underlying regulation and its evolution. Here, we apply deep neural networks as a tool to investigate the sequence determinants underlying chromatin accessibility in different species and tissues of Drosophila. RESULTS We train hybrid convolution-attention neural networks to accurately predict ATAC-seq peaks using only local DNA sequences as input. We show that our models generalize well across substantially evolutionarily diverged species of insects, implying that the sequence determinants of accessibility are highly conserved. Using our model to examine species-specific gains in accessibility, we find evidence suggesting that these regions may be ancestrally poised for evolution. Using in silico mutagenesis, we show that accessibility can be accurately predicted from short subsequences in each example. However, in silico knock-out of these sequences does not qualitatively impair classification, implying that accessibility is mutationally robust. Subsequently, we show that accessibility is predicted to be robust to large-scale random mutation even in the absence of selection. Conversely, simulations under strong selection demonstrate that accessibility can be extremely malleable despite its robustness. Finally, we identify motifs predictive of accessibility, recovering both novel and previously known motifs. CONCLUSIONS These results demonstrate the conservation of the sequence determinants of accessibility and the general robustness of chromatin accessibility, as well as the power of deep neural networks to explore fundamental questions in regulatory genomics and evolution.
Affiliation(s)
- Samuel Khodursky
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Eric B Zheng
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Nicolas Svetec
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Sylvia M Durkin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
- Present Address: Department of Integrative Biology and Museum of Vertebrate Zoology, University of California, Berkeley, Berkeley, CA, USA
| | - Sigi Benjamin
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Alice Gadau
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Xia Wu
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA
| | - Li Zhao
- Laboratory of Evolutionary Genetics and Genomics, The Rockefeller University, New York, NY, 10065, USA.
|
4
|
Caglayan E, Konopka G. Decoding DNA sequence-driven evolution of the human brain epigenome at cellular resolution. bioRxiv 2023:2023.09.14.557820. [PMID: 37745404 PMCID: PMC10515917 DOI: 10.1101/2023.09.14.557820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 09/26/2023]
Abstract
DNA-based evolutionary comparisons of regulatory genomic elements enable insight into functional changes, overcoming tissue inaccessibility. Here, we harnessed adult and fetal cortex single-cell ATAC-seq datasets to uncover DNA substitutions specific to the human and human-ancestral lineages within apes. We found that fetal microglia identity is evolutionarily divergent in all lineages, whereas other cell types are conserved. Using multiomic datasets, we further identified genes linked to multiple lineage-divergent gene regulatory elements and implicated biological pathways associated with these divergent features. We also uncovered patterns of transcription factor binding site evolution across lineages and identified expansion of bHLH-PAS factor targets in human-hominin lineages, and MEF2 factor targets in the ape lineage. Finally, conserved features were more enriched in brain disease variants, whereas there was no distinct enrichment on the human lineage compared to its ancestral lineages. Our study identifies major evolutionary patterns in the human brain epigenome at cellular resolution.
Affiliation(s)
- Emre Caglayan
- Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Peter O’Donnell Jr. Brain Institute, UT Southwestern Medical Center, Dallas, TX 75390, USA
| | - Genevieve Konopka
- Department of Neuroscience, UT Southwestern Medical Center, Dallas, TX 75390, USA
- Peter O’Donnell Jr. Brain Institute, UT Southwestern Medical Center, Dallas, TX 75390, USA
|
5
|
Kleinschmidt H, Xu C, Bai L. Using Synthetic DNA Libraries to Investigate Chromatin and Gene Regulation. Chromosoma 2023; 132:167-189. [PMID: 37184694 PMCID: PMC10542970 DOI: 10.1007/s00412-023-00796-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/05/2023] [Revised: 04/25/2023] [Accepted: 04/26/2023] [Indexed: 05/16/2023]
Abstract
Despite the recent explosion in genome-wide studies in chromatin and gene regulation, we are still far from extracting a set of genetic rules that can predict the function of the regulatory genome. One major reason for this deficiency is that gene regulation is a multi-layered process that involves an enormous variable space, which cannot be fully explored using native genomes. This problem can be partially solved by introducing synthetic DNA libraries into cells, a method that can test the regulatory roles of thousands to millions of sequences with limited variables. Here, we review recent applications of this method to study transcription factor (TF) binding, nucleosome positioning, and transcriptional activity. We discuss the design principles, experimental procedures, and major findings from these studies and compare the pros and cons of different approaches.
Affiliation(s)
- Holly Kleinschmidt
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Cheng Xu
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA
| | - Lu Bai
- Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, PA, 16802, USA.
- Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, PA, 16802, USA.
- Department of Physics, The Pennsylvania State University, University Park, PA, 16802, USA.
|
6
|
Patel T, Hammelman J, Aziz S, Jang S, Closser M, Michaels TL, Blum JA, Gifford DK, Wichterle H. Transcriptional dynamics of murine motor neuron maturation in vivo and in vitro. Nat Commun 2022; 13:5427. [PMID: 36109497 PMCID: PMC9477853 DOI: 10.1038/s41467-022-33022-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 8.3] [Received: 03/28/2022] [Accepted: 08/25/2022] [Indexed: 12/03/2022]
Abstract
Neurons born in the embryo can undergo a protracted period of maturation lasting well into postnatal life. How gene expression changes are regulated during maturation and whether they can be recapitulated in cultured neurons remain poorly understood. Here, we show that mouse motor neurons exhibit pervasive changes in gene expression and accessibility of associated regulatory regions from embryonic to juvenile age. While motifs of selector transcription factors, ISL1 and LHX3, are enriched in nascent regulatory regions, motifs of NFI factors, activity-dependent factors, and hormone receptors become more prominent in maturation-dependent enhancers. Notably, stem cell-derived motor neurons recapitulate ~40% of the maturation expression program in vitro, with neural activity playing only a modest role as a late-stage modulator. Thus, the genetic maturation program consists of a core hardwired subprogram that is correctly executed in vitro and an extrinsically controlled subprogram that is dependent on the in vivo context of the maturing organism.
Affiliation(s)
- Tulsi Patel
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA.
| | - Jennifer Hammelman
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
| | - Siaresh Aziz
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Sumin Jang
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Michael Closser
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Theodore L Michaels
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA
| | - Jacob A Blum
- Department of Genetics, Stanford University School of Medicine, Stanford, CA, USA
| | - David K Gifford
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, 02139, USA
| | - Hynek Wichterle
- Departments of Pathology & Cell Biology, Neuroscience, and Neurology, Columbia University Irving Medical Center, New York, NY, 10032, USA.
|
7
|
Zhao Y, Vartak SV, Conte A, Wang X, Garcia DA, Stevens E, Kyoung Jung S, Kieffer-Kwon KR, Vian L, Stodola T, Moris F, Chopp L, Preite S, Schwartzberg PL, Kulinski JM, Olivera A, Harly C, Bhandoola A, Heuston EF, Bodine DM, Urrutia R, Upadhyaya A, Weirauch MT, Hager G, Casellas R. "Stripe" transcription factors provide accessibility to co-binding partners in mammalian genomes. Mol Cell 2022; 82:3398-3411.e11. [PMID: 35863348 PMCID: PMC9481673 DOI: 10.1016/j.molcel.2022.06.029] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 10/08/2021] [Revised: 04/06/2022] [Accepted: 06/22/2022] [Indexed: 10/17/2022]
Abstract
Regulatory elements activate promoters by recruiting transcription factors (TFs) to specific motifs. Notably, TF-DNA interactions often depend on cooperativity with colocalized partners, suggesting an underlying cis-regulatory syntax. To explore TF cooperativity in mammals, we analyze ∼500 mouse and human primary cells by combining an atlas of TF motifs, footprints, ChIP-seq, transcriptomes, and accessibility. We uncover two TF groups that colocalize with most expressed factors, forming stripes in hierarchical clustering maps. The first group includes lineage-determining factors that occupy DNA elements broadly, consistent with their key role in tissue-specific transcription. The second one, dubbed universal stripe factors (USFs), comprises ∼30 SP, KLF, EGR, and ZBTB family members that recognize overlapping GC-rich sequences in all tissues analyzed. Knockouts and single-molecule tracking reveal that USFs impart accessibility to colocalized partners and increase their residence time. Mammalian cells have thus evolved a TF superfamily with overlapping DNA binding that facilitates chromatin accessibility.
Affiliation(s)
- Yongbing Zhao
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA.
| | - Supriya V Vartak
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | - Andrea Conte
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | - Xiang Wang
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | - David A Garcia
- Laboratory of Receptor Biology and Gene Expression, NCI, NIH, Bethesda, MD 20893, USA; Department of Physics, University of Maryland, College Park, MD 20742, USA
| | - Evan Stevens
- Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | - Seol Kyoung Jung
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | | | - Laura Vian
- Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA
| | - Timothy Stodola
- Genomic Sciences and Precision Medicine Center (GSPMC), Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Francisco Moris
- EntreChem S.L., Vivero Ciencias de la Salud, 33011 Oviedo, Spain
| | - Laura Chopp
- Laboratory of Immune Cell Biology, NCI, NIH, Bethesda, MD 20892, USA
| | - Silvia Preite
- Laboratory of Immune System Biology, NIAID, NIH, Bethesda, MD 20892, USA
| | | | - Joseph M Kulinski
- Mast cell Biology Section, Laboratory of Allergic Diseases, NIAID, NIH, Bethesda, MD 20892, USA
| | - Ana Olivera
- Mast cell Biology Section, Laboratory of Allergic Diseases, NIAID, NIH, Bethesda, MD 20892, USA
| | - Christelle Harly
- Laboratory of Genome Integrity, NCI, NIH, Bethesda, MD 20892, USA
| | | | | | - David M Bodine
- Genetics and Molecular Biology Branch, NHGRI, NIH, Bethesda, MD 20892, USA
| | - Raul Urrutia
- Genomic Sciences and Precision Medicine Center (GSPMC), Medical College of Wisconsin, Milwaukee, WI 53226, USA
| | - Arpita Upadhyaya
- Department of Physics, University of Maryland, College Park, MD 20742, USA
| | - Matthew T Weirauch
- Divisions of Biomedical Informatics and Developmental Biology, Center for Autoimmune Genomics and Etiology (CAGE), Cincinnati Children's Hospital Medical Center, Cincinnati, OH 45229, USA; Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45229, USA
| | - Gordon Hager
- Laboratory of Receptor Biology and Gene Expression, NCI, NIH, Bethesda, MD 20893, USA
| | - Rafael Casellas
- The NIH Regulome Project, National Institutes of Health, Bethesda, MD 20892, USA; Lymphocyte Nuclear Biology, NIAMS-NCI, NIH, Bethesda, MD 20892, USA.
|
8
|
Isbel L, Grand RS, Schübeler D. Generating specificity in genome regulation through transcription factor sensitivity to chromatin. Nat Rev Genet 2022; 23:728-740. [PMID: 35831531 DOI: 10.1038/s41576-022-00512-6] [Citation(s) in RCA: 57] [Impact Index Per Article: 19.0] [Accepted: 05/30/2022] [Indexed: 12/11/2022]
Abstract
Cell type-specific gene expression relies on transcription factors (TFs) binding DNA sequence motifs embedded in chromatin. Understanding how motifs are accessed in chromatin is crucial to comprehend differential transcriptional responses and the phenotypic impact of sequence variation. Chromatin obstacles to TF binding range from DNA methylation to restriction of DNA access by nucleosomes depending on their position, composition and modification. In vivo and in vitro approaches now enable the study of TF binding in chromatin at unprecedented resolution. Emerging insights suggest that TFs vary in their ability to navigate chromatin states. However, it remains challenging to link binding and transcriptional outcomes to molecular characteristics of TFs or the local chromatin substrate. Here, we discuss our current understanding of how TFs access DNA in chromatin and novel techniques and directions towards a better understanding of this critical step in genome regulation.
Affiliation(s)
- Luke Isbel
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- School of Biotechnology and Biomolecular Sciences, University of New South Wales, Sydney, New South Wales, Australia
| | - Ralph S Grand
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Zentrum für Molekulare Biologie der Universität Heidelberg, Heidelberg, Germany
| | - Dirk Schübeler
- Friedrich Miescher Institute for Biomedical Research, Basel, Switzerland
- Faculty of Sciences, University of Basel, Basel, Switzerland
|
9
|
Hammelman J, Patel T, Closser M, Wichterle H, Gifford D. Ranking reprogramming factors for cell differentiation. Nat Methods 2022; 19:812-822. [PMID: 35710610 PMCID: PMC10460539 DOI: 10.1038/s41592-022-01522-2] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Received: 05/12/2021] [Accepted: 05/13/2022] [Indexed: 12/16/2022]
Abstract
Transcription factor over-expression is a proven method for reprogramming cells to a desired cell type for regenerative medicine and therapeutic discovery. However, a general method for the identification of reprogramming factors to create an arbitrary cell type is an open problem. Here we examine the success rate of methods and data for differentiation by testing the ability of nine computational methods (CellNet, GarNet, EBseq, AME, DREME, HOMER, KMAC, diffTF and DeepAccess) to discover and rank candidate factors for eight target cell types with known reprogramming solutions. We compare methods that use gene expression, biological networks and chromatin accessibility data, and comprehensively test parameter and preprocessing of input data to optimize performance. We find the best factor identification methods can identify an average of 50-60% of reprogramming factors within the top ten candidates, and methods that use chromatin accessibility perform the best. Among the chromatin accessibility methods, complex methods DeepAccess and diffTF have higher correlation with the ranked significance of transcription factor candidates within reprogramming protocols for differentiation. We provide evidence that AME and diffTF are optimal methods for transcription factor recovery that will allow for systematic prioritization of transcription factor candidates to aid in the design of new reprogramming protocols.
Affiliation(s)
- Jennifer Hammelman
- Computational and Systems Biology, MIT, Cambridge, MA, USA
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA
| | - Tulsi Patel
- Departments of Pathology and Cell Biology, Neuroscience, Rehabilitation and Regenerative Medicine (in Neurology), Columbia University Irving Medical Center, New York, NY, USA
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA
- Columbia Stem Cell Initiative, Columbia University Irving Medical Center, New York, NY, USA
| | - Michael Closser
- Departments of Pathology and Cell Biology, Neuroscience, Rehabilitation and Regenerative Medicine (in Neurology), Columbia University Irving Medical Center, New York, NY, USA
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA
- Columbia Stem Cell Initiative, Columbia University Irving Medical Center, New York, NY, USA
| | - Hynek Wichterle
- Departments of Pathology and Cell Biology, Neuroscience, Rehabilitation and Regenerative Medicine (in Neurology), Columbia University Irving Medical Center, New York, NY, USA
- Center for Motor Neuron Biology and Disease, Columbia University Irving Medical Center, New York, NY, USA
- Columbia Stem Cell Initiative, Columbia University Irving Medical Center, New York, NY, USA
| | - David Gifford
- Computational and Systems Biology, MIT, Cambridge, MA, USA.
- Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, MA, USA.
- Department of Biological Engineering, MIT, Cambridge, MA, USA.
- Department of Electrical Engineering and Computer Science, MIT, Cambridge, MA, USA.
|
10
|
Zhang Q, Zhang J, Lei T, Liang Z, Dong X, Sun L, Zhao Y. Sirt6-mediated epigenetic modification of DNA accessibility is essential for Pou2f3-induced thymic tuft cell development. Commun Biol 2022; 5:544. [PMID: 35668088 PMCID: PMC9170729 DOI: 10.1038/s42003-022-03484-9] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 09/14/2021] [Accepted: 05/11/2022] [Indexed: 11/09/2022]
Abstract
Thymic epithelial cells (TECs) are essential for the production of self-tolerant T cells. The newly identified thymic tuft cells are regulated by Pou2f3 and represent important elements for host type 2 immunity. However, epigenetic involvement in thymic tuft cell development remains unclear. We performed single-cell ATAC-seq of medullary TEC (mTEC) and established single-cell chromatin accessibility profiling of mTECs. The results showed that mTEC III cells can be further divided into three groups (Late Aire 1, 2, and 3) and that thymic tuft cells may be derived from Late Aire 2 cells. Pou2f3 is expressed in both Late Aire 2 cells and thymic tuft cells, while Pou2f3-regulated genes are specifically expressed in thymic tuft cells with simultaneous opening of chromatin accessibility, indicating the involvement of epigenetic modification in this process. Using the epigenetic regulator Sirt6-defect mouse model, we found that Sirt6 deletion increased Late Aire 2 cells and decreased thymic tuft cells and Late Aire 3 cells without affecting Pou2f3 expression. However, Sirt6 deletion reduced the chromatin accessibility of Pou2f3-regulated genes in thymic tuft cells, which may be caused by Sirt6-mediated regulation of Hdac9 expression. These data indicate that epigenetic regulation is indispensable for Pou2f3-mediated thymic tuft cell development.
|
11
|
Boldyreva LV, Andreyeva EN, Pindyurin AV. Position Effect Variegation: Role of the Local Chromatin Context in Gene Expression Regulation. Mol Biol 2022. [DOI: 10.1134/s0026893322030049] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
12
|
Hammelman J, Gifford DK. Discovering differential genome sequence activity with interpretable and efficient deep learning. PLoS Comput Biol 2021; 17:e1009282. [PMID: 34370721 PMCID: PMC8376110 DOI: 10.1371/journal.pcbi.1009282] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 03/06/2021] [Revised: 08/19/2021] [Accepted: 07/16/2021] [Indexed: 11/23/2022]
Abstract
Discovering sequence features that differentially direct cells to alternate fates is key to understanding both cellular development and the consequences of disease-related mutations. We introduce Expected Pattern Effect and Differential Expected Pattern Effect, two black-box methods that can interpret genome regulatory sequences for cell type-specific or condition-specific patterns. We show that these methods identify relevant transcription factor motifs and spacings that are predictive of cell state-specific chromatin accessibility. Finally, we integrate these methods into a framework that is readily accessible to non-experts and available for download as a binary or installed via PyPI or bioconda at https://cgs.csail.mit.edu/deepaccess-package/.
Affiliation(s)
- Jennifer Hammelman
  - Computational and Systems Biology, MIT, Cambridge, Massachusetts, United States of America
  - Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts, United States of America
- David K. Gifford
  - Computer Science and Artificial Intelligence Laboratory, MIT, Cambridge, Massachusetts, United States of America
  - Department of Electrical Engineering & Computer Science, MIT, Cambridge, Massachusetts, United States of America
  - Department of Biological Engineering, MIT, Cambridge, Massachusetts, United States of America
|
13
|
Song B, Buckler ES, Wang H, Wu Y, Rees E, Kellogg EA, Gates DJ, Khaipho-Burch M, Bradbury PJ, Ross-Ibarra J, Hufford MB, Romay MC. Conserved noncoding sequences provide insights into regulatory sequence and loss of gene expression in maize. Genome Res 2021; 31:1245-1257. [PMID: 34045362 PMCID: PMC8256870 DOI: 10.1101/gr.266528.120] [Citation(s) in RCA: 22] [Impact Index Per Article: 5.5] [Received: 05/27/2020] [Accepted: 05/21/2021] [Indexed: 01/16/2023]
Abstract
Thousands of species will be sequenced in the next few years; however, understanding how their genomes work, without an unlimited budget, requires both molecular and novel evolutionary approaches. We developed a sensitive sequence alignment pipeline to identify conserved noncoding sequences (CNSs) in the Andropogoneae tribe (multiple crop species descended from a common ancestor ∼18 million years ago). The Andropogoneae share similar physiology while being tremendously genomically diverse, harboring a broad range of ploidy levels, structural variation, and transposons. These contribute to the potential of Andropogoneae as a powerful system for studying CNSs and are factors we leverage to understand the function of maize CNSs. We found that 86% of CNSs were comprised of annotated features, including introns, UTRs, putative cis-regulatory elements, chromatin loop anchors, noncoding RNA (ncRNA) genes, and several transposable element superfamilies. CNSs were enriched in active regions of DNA replication in the early S phase of the mitotic cell cycle and showed different DNA methylation ratios compared to the genome-wide background. More than half of putative cis-regulatory sequences (identified via other methods) overlapped with CNSs detected in this study. Variants in CNSs were associated with gene expression levels, and CNS absence contributed to loss of gene expression. Furthermore, the evolution of CNSs was associated with the functional diversification of duplicated genes in the context of maize subgenomes. Our results provide a quantitative understanding of the molecular processes governing the evolution of CNSs in maize.
Affiliation(s)
- Baoxing Song
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
- Edward S Buckler
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Hai Wang
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - National Maize Improvement Center, Key Laboratory of Crop Heterosis and Utilization, Joint Laboratory for International Cooperation in Crop Molecular Breeding, China Agricultural University, Beijing 100193, China
- Yaoyao Wu
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
  - Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen 518124, China
- Evan Rees
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Daniel J Gates
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
- Merritt Khaipho-Burch
  - Section of Plant Breeding and Genetics, Cornell University, Ithaca, New York 14853, USA
- Peter J Bradbury
  - Agricultural Research Service, United States Department of Agriculture, Ithaca, New York 14853, USA
- Jeffrey Ross-Ibarra
  - Department of Evolution and Ecology, University of California Davis, Davis, California 95616, USA
  - Center for Population Biology and Genome Center, University of California Davis, Davis, California 95616, USA
- Matthew B Hufford
  - Department of Ecology, Evolution, and Organismal Biology, Iowa State University, Ames, Iowa 50011, USA
- M Cinta Romay
  - Institute for Genomic Diversity, Cornell University, Ithaca, New York 14853, USA
|
14
|
Akinci E, Hamilton MC, Khowpinitchai B, Sherwood RI. Using CRISPR to understand and manipulate gene regulation. Development 2021; 148:dev182667. [PMID: 33913466 PMCID: PMC8126405 DOI: 10.1242/dev.182667] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Indexed: 12/16/2022]
Abstract
Understanding how genes are expressed in the correct cell types and at the correct level is a key goal of developmental biology research. Gene regulation has traditionally been approached largely through observational methods, whereas perturbational approaches have lacked precision. CRISPR-Cas9 has begun to transform the study of gene regulation, allowing for precise manipulation of genomic sequences, epigenetic functionalization and gene expression. CRISPR-Cas9 technology has already led to the discovery of new paradigms in gene regulation and, as new CRISPR-based tools and methods continue to be developed, promises to transform our knowledge of the gene regulatory code and our ability to manipulate cell fate. Here, we discuss the current and future application of the emerging CRISPR toolbox toward predicting gene regulatory network behavior, improving stem cell disease modeling, dissecting the epigenetic code, reprogramming cell fate and treating diseases of gene dysregulation.
Affiliation(s)
- Ersin Akinci
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Department of Agricultural Biotechnology, Faculty of Agriculture, Akdeniz University, Antalya, 07070, Turkey
- Marisa C. Hamilton
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Benyapa Khowpinitchai
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
- Richard I. Sherwood
  - Division of Genetics, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA
  - Hubrecht Institute, 3584 CT, Utrecht, The Netherlands
|