1
Kim SH, Marinov GK, Greenleaf WJ. KAS-ATAC reveals the genome-wide single-stranded accessible chromatin landscape of the human genome. Genome Res 2025; 35:124-134. [PMID: 39572230 PMCID: PMC11789636 DOI: 10.1101/gr.279621.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/24/2024] [Accepted: 11/19/2024] [Indexed: 01/24/2025]
Abstract
Gene regulation in most eukaryotes involves two fundamental processes: alterations in genome packaging by nucleosomes, with active cis-regulatory elements (CREs) generally characterized by open-chromatin configuration, and transcriptional activation. Mapping these physical properties and biochemical activities, through profiling chromatin accessibility and active transcription, is a key tool for understanding the logic and mechanisms of transcription and its regulation. However, the relationship between these two states has not been accessible to simultaneous measurement. To this end, we developed KAS-ATAC, a combination of the kethoxal-assisted ssDNA sequencing (KAS-seq) and assay for transposase-accessible chromatin using sequencing (ATAC-seq) methods for mapping single-stranded DNA (and thus active transcription) and chromatin accessibility, respectively, enabling the genome-wide identification of DNA fragments that are simultaneously accessible and contain ssDNA. We use KAS-ATAC to evaluate levels of active transcription over different CRE classes, to estimate absolute levels of transcribed accessible DNA over CREs, to map nucleosomal configurations associated with RNA polymerase activities, and to assess transcription factor association with transcribed DNA through transcription factor binding site (TFBS) footprinting. We observe lower levels of transcription over distal enhancers compared with promoters and distinct nucleosomal configurations around transcription initiation sites associated with active transcription. We find that most TFs associate equally with transcribed and nontranscribed DNA, but a few factors specifically do not exhibit footprints over ssDNA-containing fragments. We anticipate KAS-ATAC to continue to derive useful insights into chromatin organization and transcriptional regulation in other contexts in the future.
Affiliation(s)
- Samuel H Kim
  - Cancer Biology Programs, School of Medicine, Stanford University, Stanford, California 94305, USA
- Georgi K Marinov
  - Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
- William J Greenleaf
  - Department of Genetics, School of Medicine, Stanford University, Stanford, California 94305, USA
  - Department of Applied Physics, Stanford University, Stanford, California 94305, USA
  - Center for Personal Dynamic Regulomes, Stanford University, Stanford, California 94305, USA
  - Chan Zuckerberg Biohub, San Francisco, California 94158, USA
2
Kwok AWC, Shim H, McCarthy DJ. Going beyond cell clustering and feature aggregation: Is there single cell level information in single-cell ATAC-seq data? BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.12.04.626927. [PMID: 39713401 PMCID: PMC11661094 DOI: 10.1101/2024.12.04.626927] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 12/24/2024]
Abstract
Single-cell Assay for Transposase Accessible Chromatin with sequencing (scATAC-seq) has become a widely used method for investigating chromatin accessibility at single-cell resolution. However, the resulting data is highly sparse with most data entries being zeros. As such, currently available computational methods for scATAC-seq feature a range of transformation procedures to extract meaningful information from the sparse data. Most notably, these transformations can be categorized into: 1) feature aggregation with known biological associations, 2) pseudo-bulking cells of similar biology, and 3) binarisation of count data. These strategies beg the question of whether or not scATAC-seq data actually has usable single-cell and single-region information as intended from the assay. If we can go beyond aggregated features and pooled cells, it opens up the possibility of more complex statistical tasks that require that degree of granularity. To reach the finest possible resolution of single-cell, single-region information there are inevitably many computational challenges to overcome. Here, we review the major data analysis challenges lying between raw data readout and biological discovery, and discuss the limitations of current data analysis approaches. Lastly, we conclude that chromatin accessibility profiling at true single-cell resolution is not yet achieved with current technology, but that it may be achieved with promising developments in optimising the efficiency of scATAC-seq assays.
Affiliation(s)
- Aaron Wing Cheung Kwok
  - Bioinformatics and Cellular Genomics, St Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia
  - Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Parkville, VIC, 3010, Australia
- Heejung Shim
  - Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Parkville, VIC, 3010, Australia
- Davis J McCarthy
  - Bioinformatics and Cellular Genomics, St Vincent's Institute of Medical Research, Fitzroy, VIC 3065, Australia
  - Melbourne Integrative Genomics, University of Melbourne, Parkville, VIC, 3010, Australia
  - School of Mathematics and Statistics, Faculty of Science, University of Melbourne, Parkville, VIC, 3010, Australia
  - Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC, 3010, Australia
3
Thompson M, Byrd A. Untargeted CUT&Tag and BG4 CUT&Tag are both enriched at G-quadruplexes and accessible chromatin. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.09.26.615263. [PMID: 39386625 PMCID: PMC11463444 DOI: 10.1101/2024.09.26.615263] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/12/2024]
Abstract
G-quadruplex DNA structures (G4s) form within single-stranded DNA in nucleosome-free chromatin. As G4s modulate gene expression and genomic stability, genome-wide mapping of G4s has generated strong research interest. Recently, the Cleavage Under Targets and Tagmentation (CUT&Tag) method was performed with the G4-specific BG4 antibody to target Tn5 transposase to G4s. While this method generated a novel high-resolution map of G4s, we unexpectedly observed a strong correlation between the genome-wide signal distribution of BG4 CUT&Tag and accessible chromatin. To examine whether untargeted Tn5 cutting at accessible chromatin contributes to BG4 CUT&Tag signal, we examined the genome-wide distribution of signal from untargeted (i.e. negative control) CUT&Tag datasets. We observed that untargeted CUT&Tag signal distribution was highly similar to both that of accessible chromatin and of BG4 CUT&Tag. We also observed that BG4 CUT&Tag signal increased at mapped G4s, but this increase was accompanied by a concomitant increase in untargeted CUT&Tag at the same loci. Consequently, enrichment of BG4 CUT&Tag over untargeted CUT&Tag was not increased at mapped G4s. These results imply that either the vast majority of accessible chromatin regions contain mappable G4s or that the presence of G4s within accessible chromatin cannot reliably be determined using BG4 CUT&Tag alone.
Affiliation(s)
- Matthew Thompson
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
- Alicia Byrd
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA
  - Winthrop P. Rockefeller Cancer Institute, Little Rock, AR, 72205, USA
4
Soroczynski J, Anderson LJ, Yeung JL, Rendleman JM, Oren DA, Konishi HA, Risca VI. OpenTn5: Open-Source Resource for Robust and Scalable Tn5 Transposase Purification and Characterization. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2024:2024.07.11.602973. [PMID: 39026714 PMCID: PMC11257509 DOI: 10.1101/2024.07.11.602973] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/20/2024]
Abstract
Tagmentation combines DNA fragmentation and sequencing adapter addition by leveraging the transposition activity of the bacterial cut-and-paste Tn5 transposase, to enable efficient sequencing library preparation. Here we present an open-source protocol for the generation of multi-purpose hyperactive Tn5 transposase, including its benchmarking in CUT&Tag, bulk and single-cell ATAC-seq. The OpenTn5 protocol yields multi-milligram quantities of pG-Tn5(E54K, L372P) protein per liter of E. coli culture, sufficient for thousands of tagmentation reactions, and the enzyme retains activity in storage for more than a year.
Affiliation(s)
- Jan Soroczynski
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, New York, NY
- Lauren J. Anderson
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, New York, NY
- Joanna L. Yeung
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, New York, NY
- Justin M. Rendleman
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, New York, NY
- Deena A. Oren
  - Structural Biology Resource Center, The Rockefeller University, New York, NY
- Hide A. Konishi
  - Laboratory of Chromosome and Cell Biology, The Rockefeller University, New York, NY
- Viviana I. Risca
  - Laboratory of Genome Architecture and Dynamics, The Rockefeller University, New York, NY
5
Miao Z, Kim J. Uniform quantification of single-nucleus ATAC-seq data with Paired-Insertion Counting (PIC) and a model-based insertion rate estimator. Nat Methods 2024; 21:32-36. [PMID: 38049698 PMCID: PMC10776405 DOI: 10.1038/s41592-023-02103-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/13/2022] [Accepted: 10/25/2023] [Indexed: 12/06/2023]
Abstract
Existing approaches to scoring single-nucleus assay for transposase-accessible chromatin with sequencing (snATAC-seq) feature matrices from sequencing reads are inconsistent, affecting downstream analyses and displaying artifacts. We show that, even with sparse single-cell data, quantitative counts are informative for estimating the regulatory state of a cell, which calls for a consistent treatment. We propose Paired-Insertion Counting as a uniform method for snATAC-seq feature characterization and provide a probability model for inferring latent insertion dynamics from snATAC-seq count matrices.
Affiliation(s)
- Zhen Miao
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
- Junhyong Kim
  - Graduate Group in Genomics and Computational Biology, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA
  - Department of Biology, University of Pennsylvania, Philadelphia, PA, USA
6
Arshadi A, Tolomeo D, Venuto S, Storlazzi CT. Advancements in Focal Amplification Detection in Tumor/Liquid Biopsies and Emerging Clinical Applications. Genes (Basel) 2023; 14:1304. [PMID: 37372484 PMCID: PMC10298061 DOI: 10.3390/genes14061304] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/17/2023] [Revised: 06/14/2023] [Accepted: 06/16/2023] [Indexed: 06/29/2023]
Abstract
Focal amplifications (FAs) are crucial in cancer research due to their significant diagnostic, prognostic, and therapeutic implications. FAs manifest in various forms, such as episomes, double minute chromosomes, and homogeneously staining regions, arising through different mechanisms and mainly contributing to cancer cell heterogeneity, the leading cause of drug resistance in therapy. Numerous wet-lab, mainly FISH, PCR-based assays, next-generation sequencing, and bioinformatics approaches have been set up to detect FAs, unravel the internal structure of amplicons, assess their chromatin compaction status, and investigate the transcriptional landscape associated with their occurrence in cancer cells. Most of them are tailored for tumor samples, even at the single-cell level. Conversely, very few approaches have been set up to detect FAs in liquid biopsies. This evidence suggests the need to improve these non-invasive investigations for early tumor detection, monitoring disease progression, and evaluating treatment response. Despite the potential therapeutic implications of FAs, such as the use of HER2-specific compounds for patients with ERBB2 amplification, challenges remain, including developing selective and effective FA-targeting agents and understanding the molecular mechanisms underlying FA maintenance and replication. This review details the state of the art of FA investigation, with a particular focus on liquid biopsies and single-cell approaches in tumor samples, emphasizing their potential to revolutionize the future diagnosis, prognosis, and treatment of cancer patients.
Affiliation(s)
- Clelia Tiziana Storlazzi
  - Department of Biosciences, Biotechnology and Environment, University of Bari Aldo Moro, 70125 Bari, Italy; (A.A.); (D.T.); (S.V.)
7
Recent advances in genetic tools for engineering probiotic lactic acid bacteria. Biosci Rep 2023; 43:232386. [PMID: 36597861 PMCID: PMC9842951 DOI: 10.1042/bsr20211299] [Citation(s) in RCA: 21] [Impact Index Per Article: 10.5] [Received: 09/08/2022] [Revised: 12/19/2022] [Accepted: 01/03/2023] [Indexed: 01/05/2023]
Abstract
Synthetic biology has grown exponentially in the last few years, with a variety of biological applications. One of the emerging applications of synthetic biology is to exploit the link between microorganisms, biologics, and human health. To exploit this link, it is critical to select effective synthetic biology tools for use in appropriate microorganisms that would address unmet needs in human health through the development of new game-changing applications and by complementing existing technological capabilities. Lactic acid bacteria (LAB) are considered appropriate chassis organisms that can be genetically engineered for therapeutic and industrial applications. Here, we have reviewed comprehensively various synthetic biology techniques for engineering probiotic LAB strains, such as clustered regularly interspaced short palindromic repeats (CRISPR)/Cas9 mediated genome editing, homologous recombination, and recombineering. In addition, we also discussed heterologous protein expression systems used in engineering probiotic LAB. By combining computational biology with genetic engineering, there is a lot of potential to develop next-generation synthetic LAB with capabilities to address bottlenecks in industrial scale-up and complex biologics production. Recently, we started working on the Lactochassis project, in which we aim to develop next-generation synthetic LAB for biomedical applications.
8
Investigating chromatin accessibility during development and differentiation by ATAC-sequencing to guide the identification of cis-regulatory elements. Biochem Soc Trans 2022; 50:1167-1177. [PMID: 35604124 PMCID: PMC9246326 DOI: 10.1042/bst20210834] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 03/16/2022] [Revised: 05/11/2022] [Accepted: 05/13/2022] [Indexed: 11/17/2022]
Abstract
Mapping accessible chromatin across time scales can give insights into its dynamic nature, for example during cellular differentiation and tissue or organism development. Analysis of such data can be utilised to identify functional cis-regulatory elements (CRE) and transcription factor binding sites and, when combined with transcriptomics, can reveal gene regulatory networks (GRNs) of expressed genes. Chromatin accessibility mapping is a powerful approach and can be performed using ATAC-sequencing (ATAC-seq), whereby Tn5 transposase inserts sequencing adaptors into genomic DNA to identify differentially accessible regions of chromatin in different cell populations. It requires low sample input and can be performed and analysed relatively quickly compared with other methods. The data generated from ATAC-seq, along with other genomic approaches, can help uncover chromatin packaging and potential cis-regulatory elements that may be responsible for gene expression. Here, we describe the ATAC-seq approach and give examples from mainly vertebrate embryonic development, where such datasets have identified the highly dynamic nature of chromatin, with differing landscapes between cellular precursors for different lineages.