1
Lu Y, Lee J, Li J, Allu SR, Wang J, Kim H, Bullaughey KL, Fisher SA, Nordgren CE, Rosario JG, Anderson SA, Ulyanova AV, Brem S, Chen HI, Wolf JA, Grady MS, Vinogradov SA, Kim J, Eberwine J. CHEX-seq detects single-cell genomic single-stranded DNA with catalytical potential. Nat Commun 2023; 14:7346. [PMID: 37963886 PMCID: PMC10645931 DOI: 10.1038/s41467-023-43158-6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/30/2023] [Accepted: 11/02/2023] [Indexed: 11/16/2023] [Open access]
Abstract
Genomic DNA (gDNA) undergoes structural interconversion between single- and double-stranded states during transcription, DNA repair and replication, which is critical for cellular homeostasis. We describe "CHEX-seq" which identifies the single-stranded DNA (ssDNA) in situ in individual cells. CHEX-seq uses 3'-terminal blocked, light-activatable probes to prime the copying of ssDNA into complementary DNA that is sequenced, thereby reporting the genome-wide single-stranded chromatin landscape. CHEX-seq is benchmarked in human K562 cells, and its utilities are demonstrated in cultures of mouse and human brain cells as well as immunostained spatially localized neurons in brain sections. The amount of ssDNA is dynamically regulated in response to perturbation. CHEX-seq also identifies single-stranded regions of mitochondrial DNA in single cells. Surprisingly, CHEX-seq identifies single-stranded loci in mouse and human gDNA that catalyze porphyrin metalation in vitro, suggesting a catalytic activity for genomic ssDNA. We posit that endogenous DNA enzymatic activity is a function of genomic ssDNA.
Affiliation(s)
- Youtao Lu
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jaehee Lee
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jifen Li
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Srinivasa Rao Allu
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jinhui Wang
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- HyunBum Kim
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Kevin L Bullaughey
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Stephen A Fisher
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- C Erik Nordgren
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Jean G Rosario
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Stewart A Anderson
- Department of Psychiatry, Children's Hospital of Philadelphia, ARC 517, 3615 Civic Center Blvd, Philadelphia, PA, 19104, USA
- Alexandra V Ulyanova
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Steven Brem
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- H Isaac Chen
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- John A Wolf
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- M Sean Grady
- Department of Neurosurgery, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Sergei A Vinogradov
- Department of Biochemistry and Biophysics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
- Junhyong Kim
- Department of Biology, School of Arts and Sciences, University of Pennsylvania, Philadelphia, PA, 19104, USA
- James Eberwine
- Department of Systems Pharmacology and Translational Therapeutics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, 19104, USA
2
Pearson A, Lladser ME. On latent idealized models in symbolic datasets: unveiling signals in noisy sequencing data. J Math Biol 2023; 87:26. [PMID: 37428265 DOI: 10.1007/s00285-023-01961-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/29/2020] [Revised: 06/19/2023] [Accepted: 06/25/2023] [Indexed: 07/11/2023]
Abstract
Data taking values on discrete sample spaces are the embodiment of modern biological research. "Omics" experiments based on high-throughput sequencing produce millions of symbolic outcomes in the form of reads (i.e., DNA sequences of a few dozens to a few hundred nucleotides). Unfortunately, these intrinsically non-numerical datasets often deviate dramatically from natural assumptions a practitioner might make, and the possible sources of this deviation are usually poorly characterized. This contrasts with numerical datasets where Gaussian-type errors are often well-justified. To overcome this hurdle, we introduce the notion of latent weight, which measures the largest expected fraction of samples from a probabilistic source that conform to a model in a class of idealized models. We examine various properties of latent weights, which we specialize to the class of exchangeable probability distributions. As proof of concept, we analyze DNA methylation data from the 22 human autosome pairs. Contrary to what is usually assumed in the literature, we provide strong evidence that highly specific methylation patterns are overrepresented at some genomic locations when latent weights are taken into account.
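The latent weight described in this abstract can be illustrated with a minimal sketch. It assumes one possible formalization that the abstract itself does not spell out: the source distribution p is written as a mixture p = λ·q + (1 − λ)·r, where q is an idealized model and r is an arbitrary residual distribution, and the latent weight is the largest admissible λ. The sketch handles only a single fixed model q on a finite alphabet; the paper's quantity ranges over a whole class of models (for example, exchangeable distributions), which would need an optimization over that class. The function names and the example distributions are hypothetical.

```python
from fractions import Fraction


def latent_weight(p, q):
    """Largest lam such that p = lam*q + (1 - lam)*r for some distribution r.

    p and q map symbols to probabilities; q is one fixed idealized model.
    """
    # r must be non-negative, so lam * q(x) <= p(x) wherever q(x) > 0.
    return min(Fraction(p.get(x, 0)) / Fraction(qx) for x, qx in q.items() if qx > 0)


def residual(p, q, lam):
    """Non-conforming component r of the mixture; only defined when lam < 1."""
    symbols = set(p) | set(q)
    return {
        x: (Fraction(p.get(x, 0)) - lam * Fraction(q.get(x, 0))) / (1 - lam)
        for x in symbols
    }


# Hypothetical source over three symbols compared against a uniform model.
p = {"A": Fraction(1, 2), "C": Fraction(3, 10), "G": Fraction(1, 5)}
q = {"A": Fraction(1, 3), "C": Fraction(1, 3), "G": Fraction(1, 3)}

lam = latent_weight(p, q)  # Fraction(3, 5)
r = residual(p, q, lam)    # A: 3/4, C: 1/4, G: 0
```

Under this reading, up to 60% of the samples in the example could be attributed to the uniform model, with the remaining 40% drawn from a residual distribution concentrated on A and C.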
Affiliation(s)
- Antony Pearson
- Department of Applied Mathematics, University of Colorado Boulder, Boulder, CO, USA
- Manuel E Lladser
- Department of Applied Mathematics, University of Colorado Boulder, Boulder, CO, USA
3
Briend M, Rufiange A, Moncla LHM, Mathieu S, Bossé Y, Mathieu P. Connectome and regulatory hubs of CAGE highly active enhancers. Sci Rep 2023; 13:5594. [PMID: 37019979 PMCID: PMC10076288 DOI: 10.1038/s41598-023-32669-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/03/2022] [Accepted: 03/30/2023] [Indexed: 04/07/2023] [Open access]
Abstract
Evidence indicates that enhancers are transcriptionally active. Herein, we investigated transcriptionally active enhancers by using cap analysis of gene expression (CAGE) combined with epigenetic marks and chromatin interactions. We identified CAGE-tag highly active (CHA) enhancers as distant regulatory elements with CAGE-tag ≥ 90th percentile and overlapping with H3K27ac peaks (4.5% of enhancers). CHA enhancers were conserved between mouse and man and were independent from super-enhancers in predicting cell identity with lower P-values. CHA enhancers had increased open chromatin and a higher recruitment of cell-specific transcription factors as well as molecules involved in 3D genome interactions. HiChIP analysis of enhancer-promoter looping indicated that CHA enhancers had a higher density of anchor loops when compared to regular enhancers. A subset of CHA enhancers and promoters characterized by a high density of chromatin loops and forming hub regulatory units were connected to the promoter of immediate early response genes, genes involved in cancer and encoding for transcription factors. Promoters of genes within hub CHA regulatory units were less likely to be paused. CHA enhancers were enriched in gene variants associated with autoimmune disorders and had looping with causal candidate genes as revealed by Mendelian randomization. Hence, CHA enhancers form a dense hierarchical network of chromatin interactions between regulatory elements and genes involved in cell identity and disorders.
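The selection rule stated in this abstract (CAGE-tag signal at or above the 90th percentile, plus overlap with an H3K27ac peak) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the record layout, the nearest-rank percentile convention, the half-open interval convention, and all names and coordinates are hypothetical.

```python
def overlaps(a_start, a_end, b_start, b_end):
    """True when two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end


def percentile_threshold(values, pct):
    """Nearest-rank percentile (one of several common conventions)."""
    ordered = sorted(values)
    rank = max(1, (pct * len(ordered) + 99) // 100)  # integer ceiling of pct*n/100
    return ordered[rank - 1]


def select_cha(enhancers, h3k27ac_peaks, pct=90):
    """Names of enhancers with CAGE signal >= the pct-th percentile and a peak overlap."""
    threshold = percentile_threshold([e["cage"] for e in enhancers], pct)
    selected = []
    for e in enhancers:
        if e["cage"] < threshold:
            continue
        if any(
            peak["chrom"] == e["chrom"]
            and overlaps(e["start"], e["end"], peak["start"], peak["end"])
            for peak in h3k27ac_peaks
        ):
            selected.append(e["name"])
    return selected


# Hypothetical data: ten enhancers on chr1 with CAGE tag counts 1..10.
enhancers = [
    {"name": f"e{i}", "chrom": "chr1", "start": i * 1000, "end": i * 1000 + 500, "cage": i}
    for i in range(1, 11)
]
h3k27ac_peaks = [
    {"chrom": "chr1", "start": 10100, "end": 10300},  # overlaps e10
    {"chrom": "chr2", "start": 9000, "end": 9500},    # same coordinates as e9, other chromosome
]

cha = select_cha(enhancers, h3k27ac_peaks)  # ["e10"]
```

In this toy input both e9 and e10 pass the percentile cut, but only e10 also overlaps an H3K27ac peak on the same chromosome, so only e10 is retained.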
Affiliation(s)
- Mewen Briend
- Genomic Medicine Laboratory, Quebec Heart and Lung Institute, Laval University, Quebec, Canada
- Anne Rufiange
- Genomic Medicine Laboratory, Quebec Heart and Lung Institute, Laval University, Quebec, Canada
- Samuel Mathieu
- Genomic Medicine Laboratory, Quebec Heart and Lung Institute, Laval University, Quebec, Canada
- Yohan Bossé
- Quebec Heart and Lung Institute, Laval University, Quebec, Canada
- Department of Molecular Medicine, Laval University, Quebec, Canada
- Patrick Mathieu
- Genomic Medicine Laboratory, Quebec Heart and Lung Institute, Laval University, Quebec, Canada
- Institut de Cardiologie et de Pneumologie de Québec/Québec Heart and Lung Institute, 2725 Chemin Ste-Foy, Québec, Québec, G1V-4G5, Canada
4
Gally F, Sasse SK, Kurche JS, Gruca MA, Cardwell JH, Okamoto T, Chu HW, Hou X, Poirion OB, Buchanan J, Preissl S, Ren B, Colgan SP, Dowell RD, Yang IV, Schwartz DA, Gerber AN. The MUC5B-associated variant rs35705950 resides within an enhancer subject to lineage- and disease-dependent epigenetic remodeling. JCI Insight 2021; 6:144294. [PMID: 33320836 PMCID: PMC7934873 DOI: 10.1172/jci.insight.144294] [Citation(s) in RCA: 16] [Impact Index Per Article: 5.3] [Received: 09/17/2020] [Accepted: 12/09/2020] [Indexed: 12/19/2022] [Open access]
Abstract
The G/T transversion rs35705950, located approximately 3 kb upstream of the MUC5B start site, is the cardinal risk factor for idiopathic pulmonary fibrosis (IPF). Here, we investigate the function and chromatin structure of this –3 kb region and provide evidence that it functions as a classically defined enhancer subject to epigenetic programming. We use nascent transcript analysis to show that RNA polymerase II loads within 10 bp of the G/T transversion site, definitively establishing enhancer function for the region. By integrating Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) analysis of fresh and cultured human airway epithelial cells with nuclease sensitivity data, we demonstrate that this region is in accessible chromatin that affects the expression of MUC5B. Through applying paired single-nucleus RNA- and ATAC-seq to frozen tissue from IPF lungs, we extend these findings directly to disease, with results indicating that epigenetic programming of the –3 kb enhancer in IPF occurs in both MUC5B-expressing and nonexpressing lineages. In aggregate, our results indicate that the MUC5B-associated variant rs35705950 resides within an enhancer that is subject to epigenetic remodeling and contributes to pathologic misexpression in IPF.
Affiliation(s)
- Fabienne Gally
- Department of Immunology and Genomic Medicine, National Jewish Health, Denver, Colorado, USA; Department of Medicine, University of Colorado, Aurora, Colorado, USA
- Sarah K Sasse
- Department of Medicine, National Jewish Health, Denver, Colorado, USA
- Jonathan S Kurche
- Department of Medicine, University of Colorado, Aurora, Colorado, USA
- Margaret A Gruca
- BioFrontiers Institute, University of Colorado-Boulder (CU Boulder), Boulder, Colorado, USA
- Tsukasa Okamoto
- Department of Medicine, University of Colorado, Aurora, Colorado, USA; Department of Respiratory Medicine, Tokyo Medical and Dental University, Tokyo, Japan
- Hong W Chu
- Department of Medicine, National Jewish Health, Denver, Colorado, USA
- Xiaomeng Hou
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, California, USA
- Olivier B Poirion
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, California, USA
- Justin Buchanan
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, California, USA
- Sebastian Preissl
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, California, USA
- Bing Ren
- Center for Epigenomics, Department of Cellular and Molecular Medicine, University of California, San Diego School of Medicine, La Jolla, California, USA; Ludwig Institute for Cancer Research, La Jolla, California, USA
- Sean P Colgan
- Department of Medicine, University of Colorado, Aurora, Colorado, USA
- Robin D Dowell
- BioFrontiers Institute, University of Colorado-Boulder (CU Boulder), Boulder, Colorado, USA; Molecular, Cellular and Developmental Biology, and Computer Science, CU Boulder, Boulder, Colorado, USA
- Ivana V Yang
- Department of Medicine, University of Colorado, Aurora, Colorado, USA
- David A Schwartz
- Department of Medicine, University of Colorado, Aurora, Colorado, USA
- Anthony N Gerber
- Department of Immunology and Genomic Medicine, National Jewish Health, Denver, Colorado, USA; Department of Medicine, University of Colorado, Aurora, Colorado, USA; Department of Medicine, National Jewish Health, Denver, Colorado, USA
5
TDP-43 regulates transcription at protein-coding genes and Alu retrotransposons. Biochimica et Biophysica Acta - Gene Regulatory Mechanisms 2019; 1862:194434. [PMID: 31655156 DOI: 10.1016/j.bbagrm.2019.194434] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 03/06/2019] [Revised: 09/19/2019] [Accepted: 09/24/2019] [Indexed: 12/13/2022]
Abstract
The 43-kDa transactive response DNA-binding protein (TDP-43) is an example of an RNA-binding protein that regulates RNA metabolism at multiple levels from transcription and splicing to translation. Its role in post-transcriptional RNA processing has been a primary focus of recent research, but its role in regulating transcription has been studied for only a few human genes. We characterized the effects of TDP-43 on transcription genome-wide and found that TDP-43 broadly affects transcription of protein-coding and noncoding RNA genes. Among protein-coding genes, the effects of TDP-43 were greatest for genes <30 thousand base pairs in length. Surprisingly, we found that the loss of TDP-43 resulted in increased evidence for transcription activity near repetitive Alu elements found within expressed genes. The highest densities of affected Alu elements were found in the shorter genes, whose transcription was most affected by TDP-43. Thus, in addition to its role in post-transcriptional RNA processing, TDP-43 plays a critical role in maintaining the transcriptional stability of protein-coding genes and transposable DNA elements.
6
Samata M, Akhtar A. Dosage Compensation of the X Chromosome: A Complex Epigenetic Assignment Involving Chromatin Regulators and Long Noncoding RNAs. Annu Rev Biochem 2018; 87:323-350. [PMID: 29668306 DOI: 10.1146/annurev-biochem-062917-011816] [Citation(s) in RCA: 86] [Impact Index Per Article: 14.3] [Indexed: 11/09/2022]
Abstract
X chromosome regulation represents a prime example of an epigenetic phenomenon where coordinated regulation of a whole chromosome is required. In flies, this is achieved by transcriptional upregulation of X chromosomal genes in males to equalize the gene dosage differences in females. Chromatin-bound proteins and long noncoding RNAs (lncRNAs) constituting a ribonucleoprotein complex known as the male-specific lethal (MSL) complex or the dosage compensation complex mediate this process. MSL complex members decorate the male X chromosome, and their absence leads to male lethality. The male X chromosome is also enriched with histone H4 lysine 16 acetylation (H4K16ac), indicating that the chromatin compaction status of the X chromosome also plays an important role in transcriptional activation. How the X chromosome is specifically targeted and how dosage compensation is mechanistically achieved are central questions for the field. Here, we review recent advances, which reveal a complex interplay among lncRNAs, the chromatin landscape, transcription, and chromosome conformation that fine-tune X chromosome gene expression.
Affiliation(s)
- Maria Samata
- Department of Chromatin Regulation, Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany; Faculty of Biology, University of Freiburg, 79104 Freiburg im Breisgau, Germany
- Asifa Akhtar
- Department of Chromatin Regulation, Max Planck Institute of Immunobiology and Epigenetics, 79108 Freiburg im Breisgau, Germany
7
Azofeifa JG, Dowell RD. A generative model for the behavior of RNA polymerase. Bioinformatics 2016; 33:227-234. [PMID: 27663494 PMCID: PMC5942361 DOI: 10.1093/bioinformatics/btw599] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.5] [Received: 05/29/2016] [Revised: 08/31/2016] [Accepted: 09/12/2016] [Indexed: 12/30/2022] [Open access]
Abstract
MOTIVATION: Transcription by RNA polymerase is a highly dynamic process involving multiple distinct points of regulation. Nascent transcription assays are a relatively new set of high throughput techniques that measure the location of actively engaged RNA polymerase genome wide. Hence, nascent transcription is a rich source of information on the regulation of RNA polymerase activity. To fully dissect this data requires the development of stochastic models that can both deconvolve the stages of polymerase activity and identify significant changes in activity between experiments. RESULTS: We present a generative, probabilistic model of RNA polymerase that fully describes loading, initiation, elongation and termination. We fit this model genome wide and profile the enzymatic activity of RNA polymerase across various loci and following experimental perturbation. We observe striking correlation of predicted loading events and regulatory chromatin marks. We provide principled statistics that compute probabilities reminiscent of traveler's and divergent ratios. We finish with a systematic comparison of RNA Polymerase activity at promoter versus non-promoter associated loci. AVAILABILITY AND IMPLEMENTATION: Transcription Fit (Tfit) is a freely available, open source software package written in C/C++ that requires GNU compilers 4.7.3 or greater. Tfit is available from GitHub (https://github.com/azofeifa/Tfit). CONTACT: robin.dowell@colorado.edu. SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Joseph G Azofeifa
- Department of Computer Science, University of Colorado, Boulder, CO, USA
- Robin D Dowell
- Department of Computer Science, University of Colorado, Boulder, CO, USA; Department of Molecular, Cellular and Developmental Biology, University of Colorado, Boulder, CO, USA; BioFrontiers Institute, University of Colorado, Boulder, CO, USA