1
Kindel F, Triesch S, Schlüter U, Randarevitch LA, Reichel-Deland V, Weber APM, Denton AK. Predmoter-cross-species prediction of plant promoter and enhancer regions. Bioinformatics Advances 2024; 4:vbae074. [PMID: 38841126 PMCID: PMC11150885 DOI: 10.1093/bioadv/vbae074] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/04/2024] [Revised: 04/10/2024] [Accepted: 05/22/2024] [Indexed: 06/07/2024]
Abstract
Motivation: Identifying cis-regulatory elements (CREs) is crucial for analyzing gene regulatory networks. Next-generation sequencing methods were developed to identify CREs but represent a considerable expenditure for targeted analysis of a few genomic loci. Thus, predicting the outputs of these methods would significantly cut costs and time investment. Results: We present Predmoter, a deep neural network that predicts base-wise Assay for Transposase Accessible Chromatin using sequencing (ATAC-seq) and histone Chromatin immunoprecipitation DNA-sequencing (ChIP-seq) read coverage for plant genomes. Predmoter uses only the DNA sequence as input. We trained our final model on 21 species, for 13 of which ATAC-seq data and for 17 of which ChIP-seq data were publicly available. We evaluated our models on Arabidopsis thaliana and Oryza sativa. Our best models showed accurate predictions in peak position and pattern for ATAC- and histone ChIP-seq. Annotating putatively accessible chromatin regions provides valuable input for the identification of CREs. In conjunction with other in silico data, this can significantly reduce the search space for experimentally verifiable DNA-protein interaction pairs. Availability and implementation: The source code for Predmoter is available at: https://github.com/weberlab-hhu/Predmoter. Predmoter takes a fasta file as input and outputs h5, and optionally bigWig and bedGraph files.
Affiliation(s)
- Felicitas Kindel
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
- Sebastian Triesch
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Germany
- Urte Schlüter
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
- Laura Alexandra Randarevitch
  - Cluster of Excellence on Plant Sciences (CEPLAS), Germany
  - Institute of Population Genetics, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
- Vanessa Reichel-Deland
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
- Andreas P M Weber
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Germany
- Alisandra K Denton
  - Institute of Plant Biochemistry, Math.-Nat. Faculty, Heinrich Heine University, Düsseldorf 40225, Germany
  - Cluster of Excellence on Plant Sciences (CEPLAS), Germany
  - Valence Labs, Montréal, Québec H2S 3H1, Canada
2
Sunitha Kumary VUN, Venters BJ, Raman K, Sen S, Estève PO, Cowles MW, Keogh MC, Pradhan S. Emerging Approaches to Profile Accessible Chromatin from Formalin-Fixed Paraffin-Embedded Sections. Epigenomes 2024; 8:20. [PMID: 38804369 PMCID: PMC11130958 DOI: 10.3390/epigenomes8020020] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/16/2024] [Accepted: 05/06/2024] [Indexed: 05/29/2024] Open
Abstract
Nucleosomes are non-uniformly distributed across eukaryotic genomes, with stretches of 'open' chromatin strongly associated with transcriptionally active promoters and enhancers. Understanding chromatin accessibility patterns in normal tissue and how they are altered in pathologies can provide critical insights to development and disease. With the advent of high-throughput sequencing, a variety of strategies have been devised to identify open regions across the genome, including DNase-seq, MNase-seq, FAIRE-seq, ATAC-seq, and NicE-seq. However, the broad application of such methods to FFPE (formalin-fixed paraffin-embedded) tissues has been curtailed by the major technical challenges imposed by highly fixed and often damaged genomic material. Here, we review the most common approaches for mapping open chromatin regions, recent optimizations to overcome the challenges of working with FFPE tissue, and a brief overview of a typical data pipeline with analysis considerations.
Affiliation(s)
- Bryan J. Venters
  - EpiCypher Inc., Durham, NC 27709, USA; (V.U.N.S.K.); (B.J.V.); (M.W.C.)
- Karthikeyan Raman
  - Genome Biology Division, New England Biolabs, Ipswich, MA 01983, USA; (K.R.); (S.S.); (P.-O.E.)
- Sagnik Sen
  - Genome Biology Division, New England Biolabs, Ipswich, MA 01983, USA; (K.R.); (S.S.); (P.-O.E.)
- Pierre-Olivier Estève
  - Genome Biology Division, New England Biolabs, Ipswich, MA 01983, USA; (K.R.); (S.S.); (P.-O.E.)
- Martis W. Cowles
  - EpiCypher Inc., Durham, NC 27709, USA; (V.U.N.S.K.); (B.J.V.); (M.W.C.)
- Sriharsa Pradhan
  - Genome Biology Division, New England Biolabs, Ipswich, MA 01983, USA; (K.R.); (S.S.); (P.-O.E.)
3
Lee KH, Kim J, Kim JH. 3D epigenomics and 3D epigenopathies. BMB Rep 2024; 57:216-231. [PMID: 38627948 PMCID: PMC11139681] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/04/2023] [Revised: 01/15/2024] [Accepted: 03/18/2024] [Indexed: 05/25/2024] Open
Abstract
Mammalian genomes are intricately compacted to form sophisticated 3-dimensional structures within the tiny nucleus, a process known as 3D genome folding. Although their shapes are reminiscent of entangled yarn, the rapid development of molecular and next-generation sequencing (NGS) technologies has revealed that mammalian genomes are highly organized in a hierarchical order that delicately affects transcription activities. An increasing amount of evidence suggests that 3D genome folding is implicated in diseases, giving us a clue on how to identify novel therapeutic approaches. In this review, we examine what 3D genome folding means in epigenetics, what types of 3D genome structures there are, how they are formed, and how the technologies to explore them have developed. We also discuss the pathological implications of 3D genome folding. Finally, we discuss how to leverage 3D genome folding and engineering for future studies. [BMB Reports 2024; 57(5): 216-231].
Affiliation(s)
- Kyung-Hwan Lee
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Jungyu Kim
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
- Ji Hun Kim
  - Graduate School of Medical Science and Engineering, Korea Advanced Institute of Science and Technology, Daejeon 34141, Korea
4
Nava AA, Arboleda VA. The omics era: a nexus of untapped potential for Mendelian chromatinopathies. Hum Genet 2024; 143:475-495. [PMID: 37115317 PMCID: PMC11078811 DOI: 10.1007/s00439-023-02560-2] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 08/27/2022] [Accepted: 04/10/2023] [Indexed: 04/29/2023]
Abstract
The OMICs cascade describes the hierarchical flow of information through biological systems. The epigenome sits at the apex of the cascade, where it regulates the RNA and protein expression of the human genome and governs cellular identity and function. Genes that regulate the epigenome, termed epigenes, orchestrate complex biological signaling programs that drive human development. The broad expression patterns of epigenes during human development mean that pathogenic germline mutations in epigenes can lead to clinically significant multi-system malformations, developmental delay, intellectual disabilities, and stem cell dysfunction. In this review, we refer to germline developmental disorders caused by epigene mutation as "chromatinopathies". We curated the largest number of human chromatinopathies to date, and our expanded approach more than doubled the number of established chromatinopathies to 179 disorders caused by 148 epigenes. Our study revealed that 20.6% (148/720) of epigenes cause at least one chromatinopathy. In this review, we highlight key examples in which OMICs approaches have been applied to chromatinopathy patient biospecimens to identify underlying disease pathogenesis. The rapidly evolving OMICs technologies that couple molecular biology with high-throughput sequencing or proteomics allow us to dissect out the causal mechanisms driving temporal-, cellular-, and tissue-specific expression. Using the full repertoire of data generated by the OMICs cascade to study chromatinopathies will provide invaluable insight into the developmental impact of these epigenes and point toward future precision targets for these rare disorders.
Affiliation(s)
- Aileen A Nava
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Broad Stem Cell Research Center, University of California, Los Angeles, CA, USA
- Valerie A Arboleda
  - Department of Human Genetics, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Pathology & Laboratory Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Department of Computational Medicine, David Geffen School of Medicine, UCLA, Los Angeles, CA, USA
  - Broad Stem Cell Research Center, University of California, Los Angeles, CA, USA
  - Molecular Biology Institute, University of California, Los Angeles, CA, USA
  - Jonsson Comprehensive Cancer Center, University of California, Los Angeles, CA, USA
5
Bai D, Zhang X, Xiang H, Guo Z, Zhu C, Yi C. Simultaneous single-cell analysis of 5mC and 5hmC with SIMPLE-seq. Nat Biotechnol 2024:10.1038/s41587-024-02148-9. [PMID: 38336903 DOI: 10.1038/s41587-024-02148-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/24/2022] [Accepted: 01/18/2024] [Indexed: 02/12/2024]
Abstract
Dynamic 5-methylcytosine (5mC) and 5-hydroxymethylcytosine (5hmC) modifications to DNA regulate gene expression in a cell-type-specific manner and are associated with various biological processes, but the two modalities have not yet been measured simultaneously from the same genome at the single-cell level. Here we present SIMPLE-seq, a scalable, base resolution method for joint analysis of 5mC and 5hmC from thousands of single cells. Based on orthogonal labeling and recording of 'C-to-T' mutational signals from 5mC and 5hmC sites, SIMPLE-seq detects these two modifications from the same molecules in single cells and enables unbiased DNA methylation dynamics analysis of heterogeneous biological samples. We applied this method to mouse embryonic stem cells, human peripheral blood mononuclear cells and mouse brain to give joint epigenome maps at single-cell and single-molecule resolution. Integrated analysis of these two cytosine modifications reveals distinct epigenetic patterns associated with divergent regulatory programs in different cell types as well as cell states.
Affiliation(s)
- Dongsheng Bai
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Xiaoting Zhang
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
- Huifen Xiang
  - Department of Obstetrics and Gynecology, First Affiliated Hospital of Anhui Medical University, Anhui, China
  - NHC Key Laboratory of Study on Abnormal Gametes and Reproductive Tract, Anhui Medical University, Anhui, China
- Zijian Guo
  - State Key Laboratory of Coordination Chemistry, Coordination Chemistry Institute, School of Chemistry and Chemical Engineering, Nanjing University, Nanjing, China
- Chenxu Zhu
  - New York Genome Center, New York, NY, USA
  - Department of Physiology and Biophysics, Institute for Computational Biomedicine, Weill Cornell Medicine, New York, NY, USA
- Chengqi Yi
  - State Key Laboratory of Protein and Plant Gene Research, School of Life Sciences, Peking University, Beijing, China
  - Peking-Tsinghua Center for Life Sciences, Peking University, Beijing, China
  - Department of Chemical Biology and Synthetic and Functional Biomolecules Center, College of Chemistry and Molecular Engineering, Peking University, Beijing, China
6
Taing L, Dandawate A, L’Yi S, Gehlenborg N, Brown M, Meyer C. Cistrome Data Browser: integrated search, analysis and visualization of chromatin data. Nucleic Acids Res 2024; 52:D61-D66. [PMID: 37971305 PMCID: PMC10767960 DOI: 10.1093/nar/gkad1069] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/15/2023] [Revised: 10/14/2023] [Accepted: 11/02/2023] [Indexed: 11/19/2023] Open
Abstract
The Cistrome Data Browser is a resource of ChIP-seq, ATAC-seq and DNase-seq data from humans and mice. It provides maps of the genome-wide locations of transcription factors, cofactors, chromatin remodelers, histone post-translational modifications and regions of chromatin accessible to endonuclease activity. Cistrome DB v3.0 contains approximately 45 000 human and 44 000 mouse samples with about 32 000 newly collected datasets compared to the previous release. The Cistrome DB v3.0 user interface is implemented as a single page application that unifies menu driven and data driven search functions and provides an embedded genome browser, which allows users to find and visualize data more effectively. Users can find informative chromatin profiles through keyword, menu, and data-driven search tools. Browser search functions can predict the regulators of query genes as well as the cell type and factor dependent functionality of potential cis-regulatory elements. Cistrome DB v3.0 expands the display of quality control statistics, incorporates sequence logos into motif enrichment displays and includes more expansive sample metadata. Cistrome DB v3.0 is available at http://db3.cistrome.org/browser.
Affiliation(s)
- Len Taing
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
- Ariaki Dandawate
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
- Sehi L’Yi
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Nils Gehlenborg
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Myles Brown
  - Center for Functional Cancer Epigenetics, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Medical Oncology, Dana-Farber Cancer Institute, Brigham and Women's Hospital, and Harvard Medical School, Boston, MA, USA
- Clifford A Meyer
  - Department of Data Science, Dana-Farber Cancer Institute, Boston, MA, USA
  - Department of Biostatistics, Harvard T.H. Chan School of Public Health, Boston, MA, USA
7
Liu Z, Wong HM, Chen X, Lin J, Zhang S, Yan S, Wang F, Li X, Wong KC. MotifHub: Detection of trans-acting DNA motif group with probabilistic modeling algorithm. Comput Biol Med 2024; 168:107753. [PMID: 38039889 DOI: 10.1016/j.compbiomed.2023.107753] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/06/2023] [Revised: 10/30/2023] [Accepted: 11/20/2023] [Indexed: 12/03/2023]
Abstract
BACKGROUND: Trans-acting factors, a group of proteins that can directly or indirectly recognize or bind to the 8-12 bp core sequence of cis-acting elements and regulate the transcription efficiency of target genes, are of special importance in transcription regulation. Progress in high-throughput chromatin capture technology (e.g., Hi-C) enables the identification of chromatin-interacting sequence groups in which trans-acting DNA motif groups can be discovered. The difficulty of the problem lies in the combinatorial nature of DNA sequence pattern matching and its underlying sequence pattern search space. METHOD: Here, we propose MotifHub for trans-acting DNA motif group discovery on grouped sequences. Specifically, the main approach is to develop probabilistic modeling that accommodates the stochastic nature of DNA motif patterns. RESULTS: Based on the modeling, we develop global sampling techniques based on EM and Gibbs sampling to address the global optimization challenge of model fitting with latent variables. The results show that our proposed approaches achieve promising performance with linear time complexity. CONCLUSION: MotifHub is a novel algorithm that considers the identification of both DNA co-binding motif groups and trans-acting TFs. Our study paves the way for identifying hub TFs of stem cell development (OCT4 and SOX2) and determining potential therapeutic targets of prostate cancer (FOXA1 and MYC). To ensure scientific reproducibility and long-term impact, its matrix-algebra-optimized source code is released at http://bioinfo.cs.cityu.edu.hk/MotifHub.
Affiliation(s)
- Zhe Liu
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Hiu-Man Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Xingjian Chen
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Jiecong Lin
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Shixiong Zhang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Shankai Yan
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Fuzhou Wang
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
- Xiangtao Li
  - School of Artificial Intelligence, Jilin University, Jilin, China
- Ka-Chun Wong
  - Department of Computer Science, City University of Hong Kong, Kowloon Tong, Kowloon, Hong Kong, China
8
Müller-Dott S, Tsirvouli E, Vazquez M, Ramirez Flores R, Badia-i-Mompel P, Fallegger R, Türei D, Lægreid A, Saez-Rodriguez J. Expanding the coverage of regulons from high-confidence prior knowledge for accurate estimation of transcription factor activities. Nucleic Acids Res 2023; 51:10934-10949. [PMID: 37843125 PMCID: PMC10639077 DOI: 10.1093/nar/gkad841] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 03/31/2023] [Revised: 08/08/2023] [Accepted: 09/22/2023] [Indexed: 10/17/2023] Open
Abstract
Gene regulation plays a critical role in the cellular processes that underlie human health and disease. The regulatory relationships between transcription factors (TFs), key regulators of gene expression, and their target genes, the so-called TF regulons, can be coupled with computational algorithms to estimate the activity of TFs. However, to interpret these findings accurately, regulons of high reliability and coverage are needed. In this study, we present and evaluate a collection of regulons created using the CollecTRI meta-resource containing signed TF-gene interactions for 1186 TFs. In this context, we introduce a workflow to integrate information from multiple resources and assign the sign of regulation to TF-gene interactions that could be applied to other comprehensive knowledge bases. We find that the signed CollecTRI-derived regulons outperform other public collections of regulatory interactions in accurately inferring changes in TF activities in perturbation experiments. Furthermore, we showcase the value of the regulons by examining TF activity profiles in three different cancer types and exploring TF activities at the level of single cells. Overall, the CollecTRI-derived TF regulons enable the accurate and comprehensive estimation of TF activities and thereby help to interpret transcriptomics data.
Affiliation(s)
- Sophia Müller-Dott
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Eirini Tsirvouli
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
  - Department of Biology, Norwegian University of Science and Technology, Trondheim, Norway
- Ricardo O Ramirez Flores
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Pau Badia-i-Mompel
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Robin Fallegger
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Dénes Türei
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
- Astrid Lægreid
  - Department of Clinical and Molecular Medicine, Norwegian University of Science and Technology, Trondheim, Norway
- Julio Saez-Rodriguez
  - Heidelberg University, Faculty of Medicine, and Heidelberg University Hospital, Institute for Computational Biomedicine, Bioquant, Heidelberg, Germany
9
Zhu T, Zhou X, You Y, Wang L, He Z, Chen D. cisDynet: An integrated platform for modeling gene-regulatory dynamics and networks. iMeta 2023; 2:e152. [PMID: 38868212 PMCID: PMC10989917 DOI: 10.1002/imt2.152] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 10/28/2023] [Accepted: 11/06/2023] [Indexed: 06/14/2024]
Abstract
Chromatin accessibility sequencing has been widely used for uncovering genetic regulatory mechanisms and inferring gene regulatory networks. However, effectively integrating large-scale chromatin accessibility datasets has posed a significant challenge. This is due to the lack of a comprehensive end-to-end solution, as many existing tools primarily emphasize data preprocessing and overlook downstream analyses. To bridge this gap, we have introduced cisDynet, a holistic solution that combines streamlined data preprocessing using Snakemake and R functions with advanced downstream analysis capabilities. cisDynet excels in conventional data analyses, encompassing peak statistics, peak annotation, differential analysis, motif enrichment analysis, and more. Additionally, it supports sophisticated data exploration, such as tissue-specific peak identification, time course data modeling, integration of RNA-seq data to establish peak-to-gene associations, construction of regulatory networks, and enrichment analysis of genome-wide association study (GWAS) variants. As a proof of concept, we applied cisDynet to reanalyze comprehensive ATAC-seq datasets across various tissues from the Encyclopedia of DNA Elements (ENCODE) project. The analysis successfully delineated tissue-specific open chromatin regions (OCRs), established connections between OCRs and target genes, and effectively linked these discoveries with 1861 GWAS variants. Furthermore, cisDynet was instrumental in dissecting the time course open chromatin data of mouse embryonic development, revealing the dynamic behavior of OCRs over developmental stages and identifying key transcription factors governing differentiation trajectories. In summary, cisDynet offers researchers a user-friendly solution that minimizes the need for extensive coding, ensures the reproducibility of results, and greatly simplifies the exploration of epigenomic data.
Affiliation(s)
- Tao Zhu
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Xinkai Zhou
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Yuxin You
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Lin Wang
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Zhaohui He
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
- Dijun Chen
  - State Key Laboratory of Pharmaceutical Biotechnology, School of Life Sciences, Nanjing University, Nanjing, China
10
Ramalingam V, Yu X, Slaughter BD, Unruh JR, Brennan KJ, Onyshchenko A, Lange JJ, Natarajan M, Buck M, Zeitlinger J. Lola-I is a promoter pioneer factor that establishes de novo Pol II pausing during development. Nat Commun 2023; 14:5862. [PMID: 37735176 PMCID: PMC10514308 DOI: 10.1038/s41467-023-41408-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/15/2023] [Accepted: 08/30/2023] [Indexed: 09/23/2023] Open
Abstract
While the accessibility of enhancers is dynamically regulated during development, promoters tend to be constitutively accessible and poised for activation by paused Pol II. By studying Lola-I, a Drosophila zinc finger transcription factor, we show here that the promoter state can also be subject to developmental regulation independently of gene activation. Lola-I is ubiquitously expressed at the end of embryogenesis and causes its target promoters to become accessible and acquire paused Pol II throughout the embryo. This promoter transition is required but not sufficient for tissue-specific target gene activation. Lola-I mediates this function by depleting promoter nucleosomes, similar to the action of pioneer factors at enhancers. These results uncover a level of regulation for promoters that is normally found at enhancers and reveal a mechanism for the de novo establishment of paused Pol II at promoters.
Affiliation(s)
- Vivekanandan Ramalingam
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, USA
  - Department of Genetics, Stanford University, Palo Alto, CA, USA
- Xinyang Yu
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY, USA
- Jay R Unruh
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Jeffrey J Lange
  - Stowers Institute for Medical Research, Kansas City, MO, USA
- Michael Buck
  - Department of Biochemistry, State University of New York at Buffalo, Buffalo, NY, USA
  - Department of Biomedical Informatics, Jacobs School of Medicine & Biomedical Sciences, Buffalo, NY, USA
- Julia Zeitlinger
  - Stowers Institute for Medical Research, Kansas City, MO, USA
  - Department of Pathology and Laboratory Medicine, University of Kansas Medical Center, Kansas City, KS, USA
11
Armendariz DA, Sundarrajan A, Hon GC. Breaking enhancers to gain insights into developmental defects. eLife 2023; 12:e88187. [PMID: 37497775 PMCID: PMC10374278 DOI: 10.7554/elife.88187] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/29/2023] [Accepted: 07/19/2023] [Indexed: 07/28/2023] Open
Abstract
Despite ground-breaking genetic studies that have identified thousands of risk variants for developmental diseases, how these variants lead to molecular and cellular phenotypes remains a gap in knowledge. Many of these variants are non-coding and occur at enhancers, which orchestrate key regulatory programs during development. The prevailing paradigm is that non-coding variants alter the activity of enhancers, impacting gene expression programs, and ultimately contributing to disease risk. A key obstacle to progress is the systematic functional characterization of non-coding variants at scale, especially since enhancer activity is highly specific to cell type and developmental stage. Here, we review the foundational studies of enhancers in developmental disease and current genomic approaches to functionally characterize developmental enhancers and their variants at scale. In the coming decade, we anticipate systematic enhancer perturbation studies to link non-coding variants to molecular mechanisms, changes in cell state, and disease phenotypes.
Affiliation(s)
- Daniel A Armendariz
  - Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Anjana Sundarrajan
  - Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
- Gary C Hon
  - Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, United States
  - Hamon Center for Regenerative Science and Medicine, University of Texas Southwestern Medical Center, Dallas, United States
  - Lyda Hill Department of Bioinformatics, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, United States
12
Laub V, Devraj K, Elias L, Schulte D. Bioinformatics for wet-lab scientists: practical application in sequencing analysis. BMC Genomics 2023; 24:382. [PMID: 37420172 DOI: 10.1186/s12864-023-09454-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2023] [Accepted: 06/15/2023] [Indexed: 07/09/2023] Open
Abstract
BACKGROUND: Genomics data is available to the scientific community after publication of research projects and can be investigated for a multitude of research questions. However, in many cases deposited data is only assessed and used for the initial publication, resulting in valuable resources not being exploited to their full depth. MAIN: A likely reason for this is that many wet-lab researchers are not formally trained to apply bioinformatic tools and may therefore assume that they lack the necessary experience to do so themselves. In this article, we present a series of freely available, predominantly web-based platforms and bioinformatic tools that can be combined in analysis pipelines to interrogate different types of next-generation sequencing data. In addition to the presented exemplary route, we also list a number of alternative tools that can be combined in a mix-and-match fashion. We place special emphasis on tools that can be followed and used correctly without extensive prior knowledge of programming. Such analysis pipelines can be applied to existing data downloaded from the public domain or be compared to the results of one's own experiments. CONCLUSION: Integrating transcription factor binding to chromatin (ChIP-seq) with transcriptional output (RNA-seq) and chromatin accessibility (ATAC-seq) can not only help form a deeper understanding of the molecular interactions underlying transcriptional regulation but will also help establish new hypotheses and pre-test them in silico.
Affiliation(s)
- Vera Laub
- Neurological Institute (Edinger Institute), University Hospital Frankfurt, Goethe University, Frankfurt, Germany.
| | - Kavi Devraj
- Neurological Institute (Edinger Institute), University Hospital Frankfurt, Goethe University, Frankfurt, Germany
- Department of Biological Sciences, Birla Institute of Technology and Science Pilani, Hyderabad Campus, Hyderabad, Telangana, India
| | - Lena Elias
- Neurological Institute (Edinger Institute), University Hospital Frankfurt, Goethe University, Frankfurt, Germany
| | - Dorothea Schulte
- Neurological Institute (Edinger Institute), University Hospital Frankfurt, Goethe University, Frankfurt, Germany
| |
|
13
|
Wang LS, Sun ZL. iDHS-FFLG: Identifying DNase I Hypersensitive Sites by Feature Fusion and Local-Global Feature Extraction Network. Interdiscip Sci 2023; 15:155-170. [PMID: 36166165 DOI: 10.1007/s12539-022-00538-8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/07/2022] [Revised: 09/12/2022] [Accepted: 09/12/2022] [Indexed: 05/01/2023]
Abstract
The DNase I hypersensitive sites (DHSs) are active regions on chromatin that have been found to be highly sensitive to DNase I. These regions contain various cis-regulatory elements, including promoters, enhancers and silencers. Accurate identification of DHSs helps researchers better understand the transcriptional machinery of DNA and deepen the knowledge of functional DNA elements in non-coding sequences. Researchers have developed many methods based on traditional experiments and machine learning to identify DHSs. However, low prediction accuracy and robustness limit their application in genetics research. In this paper, a novel computational approach based on deep learning is proposed by feature fusion and local-global feature extraction network to identify DHSs in mouse, named iDHS-FFLG. First of all, multiple binary features of nucleotides are fused to better express sequence information. Then, a network consisting of the convolutional neural network (CNN), bidirectional long short-term memory (BiLSTM) and self-attention mechanism is designed to extract local features and global contextual associations. In the end, the prediction module is applied to distinguish between DHSs and non-DHSs. The results of several experiments demonstrate the superior performances of iDHS-FFLG compared to the latest methods.
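The feature-fusion step mentioned in the abstract can be pictured with a minimal Python sketch. The abstract does not say which binary features iDHS-FFLG fuses or how, so everything here is an assumption for illustration: a one-hot code and a three-bit nucleotide chemical property code are concatenated per base, and the names `ONE_HOT`, `CHEMICAL`, and `fuse_binary_features` are hypothetical.

```python
# Illustrative sketch only, not the iDHS-FFLG implementation.
# Assumption: two binary encodings are fused by per-base concatenation.

# One-hot code: one bit per nucleotide.
ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}

# Chemical property code: (purine ring, amino group, weak hydrogen bonding).
CHEMICAL = {
    "A": (1, 1, 1),
    "C": (0, 1, 0),
    "G": (1, 0, 0),
    "T": (0, 0, 1),
}


def fuse_binary_features(seq):
    """Return one 7-bit tuple per base; unknown bases (e.g. N) become all zeros."""
    rows = []
    for base in seq.upper():
        one_hot = ONE_HOT.get(base, (0, 0, 0, 0))
        chemical = CHEMICAL.get(base, (0, 0, 0))
        rows.append(one_hot + chemical)  # fusion by concatenation (assumed)
    return rows
```

A matrix of this shape (sequence length by 7) is the kind of input a CNN or BiLSTM layer could consume; the network itself is outside the scope of this sketch.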
Affiliation(s)
- Lei-Shan Wang
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China
| | - Zhan-Li Sun
- Information Materials and Intelligent Sensing Laboratory of Anhui Province, Anhui University, Hefei, 230601, Anhui, China.
- School of Electrical Engineering and Automation, Anhui University, Hefei, 230601, Anhui, China.
| |
|
14
|
Bina M. Defining Candidate Imprinted loci in Bos taurus. Genes (Basel) 2023; 14:1036. [PMID: 37239396 PMCID: PMC10217866 DOI: 10.3390/genes14051036] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/31/2023] [Revised: 04/27/2023] [Accepted: 04/30/2023] [Indexed: 05/28/2023] Open
Abstract
Using a whole-genome assembly of Bos taurus, I applied my bioinformatics strategy to locate candidate imprinting control regions (ICRs) genome-wide. In mammals, genomic imprinting plays essential roles in embryogenesis. In my strategy, peaks in plots mark the locations of known, inferred, and candidate ICRs. Genes in the vicinity of candidate ICRs correspond to potential imprinted genes. By displaying my datasets on the UCSC genome browser, one could view peak positions with respect to genomic landmarks. I give two examples of candidate ICRs in loci that influence spermatogenesis in bulls: CNNM1 and CNR1. I also give examples of candidate ICRs in loci that influence muscle development: SIX1 and BCL6. By examining the ENCODE data reported for mice, I deduced regulatory clues about cattle. I focused on DNase I hypersensitive sites (DHSs). Such sites reveal accessibility of chromatin to regulators of gene expression. For inspection, I chose DHSs in chromatin from mouse embryonic stem cells (ESCs) ES-E14, mesoderm, brain, heart, and skeletal muscle. The ENCODE data revealed that the SIX1 promoter was accessible to the transcription initiation apparatus in mouse ESCs, mesoderm, and skeletal muscles. The data also revealed accessibility of BCL6 locus to regulatory proteins in mouse ESCs and examined tissues.
Affiliation(s)
- Minou Bina
- Department of Chemistry, Purdue University, West Lafayette, IN 47907, USA
| |
|
15
|
Hui-Yuen J, Jiang K, Malkiel S, Eberhard BA, Walters H, Diamond B, Jarvis J. B lymphocytes in treatment-naive paediatric patients with lupus are epigenetically distinct from healthy children. Lupus Sci Med 2023; 10:e000921. [PMID: 37202122 DOI: 10.1136/lupus-2023-000921] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/23/2023] [Accepted: 05/04/2023] [Indexed: 05/20/2023]
Abstract
BACKGROUND SLE is likely triggered by gene-environment interactions. We have shown that most SLE-associated haplotypes encompass genomic regions enriched for epigenetic marks associated with enhancer function in lymphocytes, suggesting genetic risk is exerted through altered gene regulation. Data remain scarce on how epigenetic variance contributes to disease risk in paediatric SLE (pSLE). We aim to identify differences in epigenetically regulated chromatin architecture in treatment-naive patients with pSLE compared with healthy children. METHODS Using the assay for transposase-accessible chromatin with sequencing (ATACseq), we surveyed open chromatin in 10 treatment-naive patients with pSLE, with at least moderate disease severity, and 5 healthy children. We investigated whether regions of open chromatin unique to patients with pSLE demonstrate enrichment for specific transcriptional regulators, using standard computational approaches to identify unique peaks and a false discovery rate of <0.05. Further analyses for histone modification enrichment and variant calling were performed using bioinformatics packages in R and Linux. RESULTS We identified 30 139 differentially accessible regions (DAR) unique to pSLE B cells; 64.3% are more accessible in pSLE than healthy children. Many DAR are found in distal, intergenic regions and enriched for enhancer histone marks (p=0.027). B cells from adult patients with SLE contain more regions of inaccessible chromatin than those in pSLE. In pSLE B cells, 65.2% of the DAR are located within or near known SLE haplotypes. Further analysis revealed enrichment of transcription factor binding motifs within these DAR that may regulate genes involved in pro-inflammatory responses and cellular adhesion. CONCLUSIONS We demonstrate an epigenetically distinct profile in pSLE B cells when compared with healthy children and adults with lupus, indicating that pSLE B cells are predisposed for disease onset/development. 
Increased chromatin accessibility in non-coding genomic regions controlling activation of inflammation suggests that transcriptional dysregulation by regulatory elements controlling B cell activation plays an important role in pSLE pathogenesis.
Affiliation(s)
- Joyce Hui-Yuen
- Pediatric Rheumatology, Northwell Health, Lake Success, New York, USA
- Pediatrics, Hofstra Northwell School of Medicine at Hofstra University, Hempstead, New York, USA
- Center for Autoimmune, Musculoskeletal, and Hematopoietic Diseases Research, The Feinstein Institutes for Medical Research, Manhasset, New York, USA
| | - Kaiyu Jiang
- Pediatrics, University at Buffalo School of Medicine and Biomedical Sciences, Buffalo, New York, USA
| | - Susan Malkiel
- Center for Autoimmune, Musculoskeletal, and Hematopoietic Diseases Research, The Feinstein Institutes for Medical Research, Manhasset, New York, USA
| | - Barbara Anne Eberhard
- Pediatric Rheumatology, Northwell Health, Lake Success, New York, USA
- Pediatrics, Hofstra Northwell School of Medicine at Hofstra University, Hempstead, New York, USA
| | - Heather Walters
- Pediatric Rheumatology, Northwell Health, Lake Success, New York, USA
- Pediatrics, Hofstra Northwell School of Medicine at Hofstra University, Hempstead, New York, USA
| | - Betty Diamond
- Center for Autoimmune, Musculoskeletal, and Hematopoietic Diseases Research, The Feinstein Institutes for Medical Research, Manhasset, New York, USA
| | - James Jarvis
- Pediatrics, University at Buffalo School of Medicine and Biomedical Sciences, Buffalo, New York, USA
| |
|
16
|
Marinov GK, Shipony Z, Kundaje A, Greenleaf WJ. Genome-Wide Mapping of Active Regulatory Elements Using ATAC-seq. Methods Mol Biol 2023; 2611:3-19. [PMID: 36807060 DOI: 10.1007/978-1-0716-2899-7_1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
Active cis-regulatory elements (cREs) in eukaryotes are characterized by nucleosomal depletion and, accordingly, higher accessibility. This property has turned out to be immensely useful for identifying cREs genome-wide and tracking their dynamics across different cellular states and is the basis of numerous methods taking advantage of the preferential enzymatic cleavage/labeling of accessible DNA. ATAC-seq (Assay for Transposase-Accessible Chromatin using sequencing) has emerged as the most versatile and widely adaptable method and has been widely adopted as the standard tool for mapping open chromatin regions. Here, we discuss the current optimal practices and important considerations for carrying out ATAC-seq experiments, primarily in the context of mammalian systems.
Affiliation(s)
| | - Zohar Shipony
- Department of Genetics, Stanford University, Stanford, CA, USA
| | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA; Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA; Department of Applied Physics, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
| |
|
17
|
Hinks M, Marinov GK, Kundaje A, Bintu L, Greenleaf WJ. Single-Molecule Mapping of Chromatin Accessibility Using NOMe-seq/dSMF. Methods Mol Biol 2023; 2611:101-119. [PMID: 36807067 DOI: 10.1007/978-1-0716-2899-7_8] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
The bulk of gene expression regulation in most organisms is accomplished through the action of transcription factors (TFs) on cis-regulatory elements (CREs). In eukaryotes, these CREs are generally characterized by nucleosomal depletion and thus higher physical accessibility of DNA. Many methods exploit this property to map regions of high average accessibility, and thus putative active CREs, in bulk. However, these techniques do not provide information about coordinated patterns of accessibility along the same DNA molecule, nor do they map the absolute levels of occupancy/accessibility. SMF (Single-Molecule Footprinting) fills these gaps by leveraging recombinant DNA cytosine methyltransferases (MTase) to mark accessible locations on individual DNA molecules. In this chapter, we discuss current methods and important considerations for performing SMF experiments.
Affiliation(s)
- Michaela Hinks
- Department of Genetics, Stanford University, Stanford, CA, USA.
| | | | - Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA; Department of Computer Science, Stanford University, Stanford, CA, USA
| | - Lacramioara Bintu
- Department of Bioengineering, Stanford University, Stanford, CA, USA
| | - William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA; Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA; Department of Applied Physics, Stanford University, Stanford, CA, USA; Chan Zuckerberg Biohub, San Francisco, CA, USA
| |
|
18
|
Estève PO, Vishnu US, Chin HG, Pradhan S. NicE-viewSeq: An Integrative Visualization and Genomics Method to Detect Accessible Chromatin in Fixed Cells. Methods Mol Biol 2023; 2611:293-302. [PMID: 36807075 DOI: 10.1007/978-1-0716-2899-7_16] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/23/2023]
Abstract
A novel genome-wide accessible chromatin visualization, quantitation, and sequencing method is described, which allows in situ fluorescence visualization and sequencing of the accessible chromatin in the mammalian cell. The cells are fixed by formaldehyde crosslinking and processed using a modified nick translation method, in which a nicking enzyme nicks one strand of DNA and DNA polymerase incorporates biotin-conjugated dCTP, 5-methyl-dCTP, Fluorescein-12-dATP or Texas Red-5-dATP, dGTP, and dTTP. This allows accessible chromatin DNA to be labeled for visualization and on-bead NGS library preparation. This technology allows cellular-level chromatin accessibility quantification and genomic analysis of the epigenetic information in the chromatin, particularly accessible promoters, enhancers, nucleosome positioning, transcription factor occupancy, and other chromosomal protein binding.
Affiliation(s)
| | | | - Hang Gyeong Chin
- Genome Biology Division, New England Biolabs, Inc., Ipswich, MA, USA
| | - Sriharsa Pradhan
- Genome Biology Division, New England Biolabs, Inc., Ipswich, MA, USA.
| |
|
19
|
Enhancer-promoter entanglement explains their transcriptional interdependence. Proc Natl Acad Sci U S A 2023; 120:e2216436120. [PMID: 36656865 PMCID: PMC9942820 DOI: 10.1073/pnas.2216436120] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/20/2023] Open
Abstract
Enhancers not only activate target promoters to stimulate messenger RNA (mRNA) synthesis, but they themselves also undergo transcription to produce enhancer RNAs (eRNAs), the significance of which is not well understood. Transcription at the participating enhancer-promoter pair appears coordinated, but it is unclear why and how. Here, we employ cell-free transcription assays using constructs derived from the human GREB1 locus to demonstrate that transcription at an enhancer and its target promoter is interdependent. This interdependence is observable under conditions where direct enhancer-promoter contact (EPC) takes place. We demonstrate that transcription activation at a participating enhancer-promoter pair is dependent on i) the mutual availability of the enhancer and promoter, ii) the state of transcription at both the enhancer and promoter, iii) local abundance of both eRNA and mRNA, and iv) direct EPC. Our results suggest transcriptional interdependence between the enhancer and the promoter as the basis of their transcriptional concurrence and coordination throughout the genome. We propose a model where transcriptional concurrence, coordination and interdependence are possible if the participating enhancer and promoter are entangled in the form of EPC, reside in a proteinaceous bubble, and utilize shared transcriptional resources and regulatory inputs.
|
20
|
Tabrizi S, Martin-Alonso C, Xiong K, Blewett T, Sridhar S, An Z, Patel S, Rodriguez-Aponte S, Naranjo CA, Wang ST, Shea D, Golub TR, Bhatia SN, Adalsteinsson V, Love JC. An intravenous DNA-binding priming agent protects cell-free DNA and improves the sensitivity of liquid biopsies. BIORXIV : THE PREPRINT SERVER FOR BIOLOGY 2023:2023.01.13.523947. [PMID: 36711455 PMCID: PMC9882106 DOI: 10.1101/2023.01.13.523947] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/16/2023]
Abstract
Blood-based, or "liquid," biopsies enable minimally invasive diagnostics but have limits on sensitivity due to scarce cell-free DNA (cfDNA). Improvements to sensitivity have primarily relied on enhancing sequencing technology ex vivo. Here, we sought to augment the level of circulating tumor DNA (ctDNA) detected in a blood draw by attenuating the clearance of cfDNA in vivo. We report a first-in-class intravenous DNA-binding priming agent given 2 hours prior to a blood draw to recover more cfDNA. The DNA-binding antibody minimizes nuclease digestion and organ uptake of cfDNA, decreasing its clearance at 1 hour by over 150-fold. To improve plasma persistence and limit potential immune interactions, we abrogated its Fc-effector function. We found that it protects GC-rich sequences and DNase-hypersensitive sites, which are ordinarily underrepresented in cfDNA. In tumor-bearing mice, priming improved tumor DNA recovery by 19-fold and sensitivity for detecting cancer from 6% to 84%. These results suggest a novel method to enhance the sensitivity of existing DNA-based cancer testing using blood biopsies.
|
21
|
Tang X, Zheng P, Liu Y, Yao Y, Huang G. LangMoDHS: A deep learning language model for predicting DNase I hypersensitive sites in mouse genome. MATHEMATICAL BIOSCIENCES AND ENGINEERING : MBE 2023; 20:1037-1057. [PMID: 36650801 DOI: 10.3934/mbe.2023048] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/17/2023]
Abstract
DNase I hypersensitive sites (DHSs) are specific genomic regions that are critical for detecting and understanding cis-regulatory elements. Although many methods have been developed to detect DHSs, there is still a big gap in practice. We present a deep learning-based language model for predicting DHSs, named LangMoDHS. LangMoDHS mainly comprises a convolutional neural network (CNN), a bi-directional long short-term memory (Bi-LSTM) network, and feed-forward attention. The CNN and the Bi-LSTM were stacked in a parallel manner, which helped to accumulate multiple-view representations from primary DNA sequences. We conducted 5-fold cross-validations and independent tests over 14 tissues and 4 developmental stages. The empirical experiments showed that LangMoDHS is competitive with or slightly better than iDHS-Deep, which is the latest method for predicting DHSs. The empirical experiments also implied a substantial contribution of the CNN, Bi-LSTM, and attention to DHS prediction. We implemented LangMoDHS as a user-friendly web server, which is accessible at http://www.biolscience.cn/LangMoDHS/. We used indices related to information entropy to explore the sequence motif of DHSs. The analysis provided some insight into DHSs.
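The abstract mentions indices related to information entropy for exploring sequence motifs but does not define them. As a hedged illustration, the following Python sketch computes per-position Shannon entropy, H = sum over bases of p * log2(1/p), across equal-length sequences. The function name `column_entropy` is hypothetical, and this is not claimed to be the index used by LangMoDHS.

```python
import math
from collections import Counter


def column_entropy(sequences):
    """Shannon entropy in bits at each position of equal-length sequences."""
    if not sequences:
        return []
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same length")
    entropies = []
    for pos in range(length):
        # Count how often each base occurs at this position.
        counts = Counter(seq[pos].upper() for seq in sequences)
        total = sum(counts.values())
        # H = sum of p * log2(1/p) over the observed bases.
        entropy = sum(
            (n / total) * math.log2(total / n) for n in counts.values()
        )
        entropies.append(entropy)
    return entropies
```

Low entropy at a position indicates a conserved base (a candidate motif position); for DNA the value ranges from 0 bits (fully conserved) to 2 bits (all four bases equally frequent).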
Affiliation(s)
- Xingyu Tang
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
| | - Peijie Zheng
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
| | - Yuewu Liu
- College of Information and Intelligence, Hunan Agricultural University, Changsha 410128, China
| | - Yuhua Yao
- School of Mathematics and Statistics, Hainan Normal University, Haikou 571158, China
| | - Guohua Huang
- School of Electrical Engineering, Shaoyang University, Shaoyang 422000, China
| |
|
22
|
Dapas M, Dunaif A. Deconstructing a Syndrome: Genomic Insights Into PCOS Causal Mechanisms and Classification. Endocr Rev 2022; 43:927-965. [PMID: 35026001 PMCID: PMC9695127 DOI: 10.1210/endrev/bnac001] [Citation(s) in RCA: 58] [Impact Index Per Article: 29.0] [Received: 03/15/2021] [Indexed: 01/16/2023]
Abstract
Polycystic ovary syndrome (PCOS) is among the most common disorders in women of reproductive age, affecting up to 15% worldwide, depending on the diagnostic criteria. PCOS is characterized by a constellation of interrelated reproductive abnormalities, including disordered gonadotropin secretion, increased androgen production, chronic anovulation, and polycystic ovarian morphology. It is frequently associated with insulin resistance and obesity. These reproductive and metabolic derangements cause major morbidities across the lifespan, including anovulatory infertility and type 2 diabetes (T2D). Despite decades of investigative effort, the etiology of PCOS remains unknown. Familial clustering of PCOS cases has indicated a genetic contribution to PCOS. There are rare Mendelian forms of PCOS associated with extreme phenotypes, but PCOS typically follows a non-Mendelian pattern of inheritance consistent with a complex genetic architecture, analogous to T2D and obesity, that reflects the interaction of susceptibility genes and environmental factors. Genomic studies of PCOS have provided important insights into disease pathways and have indicated that current diagnostic criteria do not capture underlying differences in biology associated with different forms of PCOS. We provide a state-of-the-science review of genetic analyses of PCOS, including an overview of genomic methodologies aimed at a general audience of non-geneticists and clinicians. Applications in PCOS will be discussed, including strengths and limitations of each study. The contributions of environmental factors, including developmental origins, will be reviewed. Insights into the pathogenesis and genetic architecture of PCOS will be summarized. Future directions for PCOS genetic studies will be outlined.
Affiliation(s)
- Matthew Dapas
- Department of Human Genetics, University of Chicago, Chicago, IL, USA
| | - Andrea Dunaif
- Division of Endocrinology, Diabetes and Bone Disease, Department of Medicine, Icahn School of Medicine at Mount Sinai, New York, NY, USA
| |
|
23
|
Li Z, Zhao B, Qin C, Wang Y, Li T, Wang W. Chromatin Dynamics in Digestive System Cancer: Commander and Regulator. Front Oncol 2022; 12:935877. [PMID: 35965507 PMCID: PMC9372441 DOI: 10.3389/fonc.2022.935877] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Accepted: 06/23/2022] [Indexed: 11/30/2022] Open
Abstract
Digestive system tumors have a poor prognosis due to complex anatomy, insidious onset, challenges in early diagnosis, and chemoresistance. Epidemiological statistics have verified that digestive system tumors rank first in tumor-related death. Although a great number of studies are devoted to the molecular biological mechanism, early diagnostic markers, and application of new targeted drugs in digestive system tumors, the therapeutic effect is still not satisfactory. Epigenomic alterations including histone modification and chromatin remodeling are present in human cancers and are now known to cooperate with genetic changes to drive the cancer phenotype. Chromatin is the carrier of genetic information and consists of DNA, histones, non-histone proteins, and a small amount of RNA. Chromatin and nucleosomes control the stability of the eukaryotic genome and regulate DNA processes such as transcription, replication, and repair. The dynamic structure of chromatin plays a key role in this regulatory function. Structural fluctuations expose internal DNA and thus provide access to the nuclear machinery. The dynamic changes are affected by various complexes and epigenetic modifications. Variation of chromatin dynamics produces early and superior regulation of the expression of related genes and downstream pathways, thereby controlling tumor development. Intervention at the chromatin level can change the process of cancer earlier and is a feasible option for future tumor diagnosis and treatment. In this review, we introduced chromatin dynamics including chromatin remodeling, histone modifications, and chromatin accessibility, and summarized current research on chromatin regulation in digestive system tumors.
|
24
|
Tu X, Marand AP, Schmitz RJ, Zhong S. A combinatorial indexing strategy for low-cost epigenomic profiling of plant single cells. PLANT COMMUNICATIONS 2022; 3:100308. [PMID: 35605196 PMCID: PMC9284282 DOI: 10.1016/j.xplc.2022.100308] [Citation(s) in RCA: 10] [Impact Index Per Article: 5.0] [Received: 11/07/2021] [Revised: 02/22/2022] [Accepted: 02/28/2022] [Indexed: 06/15/2023]
Abstract
Understanding how cis-regulatory elements facilitate gene expression is a key question in biology. Recent advances in single-cell genomics have led to the discovery of cell-specific chromatin landscapes that underlie transcription programs in animal models. However, the high equipment and reagent costs of commercial systems limit their applications for many laboratories. In this study, we developed a combinatorial index and dual PCR barcode strategy to profile the Arabidopsis thaliana root single-cell epigenome without any specialized equipment. We generated chromatin accessibility profiles for 13 576 root nuclei with an average of 12 784 unique Tn5 integrations per cell. Integration of the single-cell assay for transposase-accessible chromatin sequencing and RNA sequencing data sets enabled the identification of 24 cell clusters with unique transcription, chromatin, and cis-regulatory signatures. Comparison with single-cell data generated using the commercial microfluidic platform from 10X Genomics revealed that this low-cost combinatorial index method is capable of unbiased identification of cell-type-specific chromatin accessibility. We anticipate that, by removing cost, instrumentation, and other technical obstacles, this method will be a valuable tool for routine investigation of single-cell epigenomes and provide new insights into plant growth and development and plant interactions with the environment.
Affiliation(s)
- Xiaoyu Tu
- Joint Center for Single Cell Biology, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China; Shanghai Collaborative Innovation Center of Agri-Seeds, School of Agriculture and Biology, Shanghai Jiao Tong University, Shanghai 200240, China
| | | | - Robert J Schmitz
- Department of Genetics, University of Georgia, Athens, GA 30602, USA.
| | - Silin Zhong
- State Key Laboratory of Agrobiotechnology, School of Life Sciences, The Chinese University of Hong Kong, Hong Kong, China.
| |
|
25
|
Hou TY, Kraus WL. Analysis of estrogen-regulated enhancer RNAs identifies a functional motif required for enhancer assembly and gene expression. Cell Rep 2022; 39:110944. [PMID: 35705040 PMCID: PMC9246336 DOI: 10.1016/j.celrep.2022.110944] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 06/09/2021] [Revised: 04/12/2022] [Accepted: 05/20/2022] [Indexed: 11/03/2022] Open
Abstract
To better understand the functions of non-coding enhancer RNAs (eRNAs), we annotated the estrogen-regulated eRNA transcriptome in estrogen receptor α (ERα)-positive breast cancer cells using PRO-cap and RNA sequencing. We then cloned a subset of the eRNAs identified, fused them to single guide RNAs, and targeted them to their ERα enhancers of origin using CRISPR/dCas9. Some of the eRNAs tested modulated the expression of cognate, but not heterologous, target genes after estrogen treatment by increasing ERα recruitment and stimulating p300-catalyzed H3K27 acetylation at the enhancer. We identified a ∼40 nucleotide functional eRNA regulatory motif (FERM) present in many eRNAs that was necessary and sufficient to modulate gene expression, but not the specificity of activation, after estrogen treatment. The FERM interacted with BCAS2, an RNA-binding protein amplified in breast cancers. The ectopic expression of a targeted eRNA controlling the expression of an oncogene resulted in increased cell proliferation, demonstrating the regulatory potential of eRNAs in breast cancer.
Affiliation(s)
- Tim Y Hou
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA; Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA
| | - W Lee Kraus
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA; Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, 5323 Harry Hines Boulevard, Dallas, TX 75390, USA.
| |
|
26
|
Liu Q, Hua K, Zhang X, Wong WH, Jiang R. DeepCAGE: Incorporating Transcription Factors in Genome-wide Prediction of Chromatin Accessibility. GENOMICS, PROTEOMICS & BIOINFORMATICS 2022; 20:496-507. [PMID: 35293310 PMCID: PMC9801045 DOI: 10.1016/j.gpb.2021.08.015] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 01/31/2021] [Revised: 05/31/2021] [Accepted: 09/27/2021] [Indexed: 01/26/2023]
Abstract
Although computational approaches have been complementing high-throughput biological experiments for the identification of functional regions in the human genome, it remains a great challenge to systematically decipher interactions between transcription factors (TFs) and regulatory elements to achieve interpretable annotations of chromatin accessibility across diverse cellular contexts. To solve this problem, we propose DeepCAGE, a deep learning framework that integrates sequence information and binding statuses of TFs, for the accurate prediction of chromatin accessible regions at a genome-wide scale in a variety of cell types. DeepCAGE takes advantage of a densely connected deep convolutional neural network architecture to automatically learn sequence signatures of known chromatin accessible regions and then incorporates such features with expression levels and binding activities of human core TFs to predict novel chromatin accessible regions. In a series of systematic comparisons with existing methods, DeepCAGE exhibits superior performance in not only the classification but also the regression of chromatin accessibility signals. In a detailed analysis of TF activities, DeepCAGE successfully extracts novel binding motifs and measures the contribution of a TF to the regulation with respect to a specific locus in a certain cell type. When applied to whole-genome sequencing data analysis, our method successfully prioritizes putative deleterious variants underlying a human complex trait and thus provides insights into the understanding of disease-associated genetic variants. DeepCAGE can be downloaded from https://github.com/kimmo1019/DeepCAGE.
Affiliation(s)
- Qiao Liu
- Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China; Department of Statistics, Stanford University, Stanford, CA 94305, USA
- Kui Hua
- Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Xuegong Zhang
- Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China
- Wing Hung Wong
- Department of Statistics, Stanford University, Stanford, CA 94305, USA. Corresponding authors.
- Rui Jiang
- Ministry of Education Key Laboratory of Bioinformatics; Bioinformatics Division, Beijing National Research Center for Information Science and Technology; Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing 100084, China. Corresponding authors.
27
Hesami M, Alizadeh M, Jones AMP, Torkamaneh D. Machine learning: its challenges and opportunities in plant system biology. Appl Microbiol Biotechnol 2022; 106:3507-3530. [PMID: 35575915 DOI: 10.1007/s00253-022-11963-6] [Citation(s) in RCA: 18] [Impact Index Per Article: 9.0] [Received: 03/14/2022] [Revised: 03/14/2022] [Accepted: 05/07/2022] [Indexed: 12/25/2022]
Abstract
Sequencing technologies are evolving at a rapid pace, enabling the generation of massive amounts of data in multiple dimensions (e.g., genomics, epigenomics, transcriptomics, metabolomics, proteomics, and single-cell omics) in plants. To provide comprehensive insights into the complexity of plant biological systems, it is important to integrate different omics datasets. Although recent advances in computational analytical pipelines have enabled efficient and high-quality exploration and exploitation of single omics data, the integration of multidimensional, heterogeneous, and large datasets (i.e., multi-omics) remains a challenge. In this regard, machine learning (ML) offers promising approaches to integrate large datasets and to recognize fine-grained patterns and relationships. Nevertheless, ML approaches require rigorous optimization to process multi-omics-derived datasets. In this review, we discuss the main concepts of machine learning as well as the key challenges and solutions related to the big data derived from plant system biology. We also provide in-depth insight into the principles of data integration using ML, as well as challenges and opportunities in different contexts including multi-omics, single-cell omics, protein function, and protein-protein interaction. KEY POINTS: • The key challenges and solutions related to the big data derived from plant system biology have been highlighted. • Different methods of data integration have been discussed. • Challenges and opportunities of the application of machine learning in plant system biology have been highlighted and discussed.
Affiliation(s)
- Mohsen Hesami
- Department of Plant Agriculture, University of Guelph, Guelph, ON, N1G 2W1, Canada
- Milad Alizadeh
- Department of Botany, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada
- Davoud Torkamaneh
- Département de Phytologie, Université Laval, Québec City, QC, G1V 0A6, Canada; Institut de Biologie Intégrative et des Systèmes (IBIS), Université Laval, Québec City, QC, G1V 0A6, Canada
28
Grandi FC, Modi H, Kampman L, Corces MR. Chromatin accessibility profiling by ATAC-seq. Nat Protoc 2022; 17:1518-1552. [PMID: 35478247 DOI: 10.1038/s41596-022-00692-9] [Citation(s) in RCA: 65] [Impact Index Per Article: 32.5] [Received: 06/01/2021] [Accepted: 02/22/2022] [Indexed: 12/13/2022]
Abstract
The assay for transposase-accessible chromatin using sequencing (ATAC-seq) provides a simple and scalable way to detect the unique chromatin landscape associated with a cell type and how it may be altered by perturbation or disease. ATAC-seq requires a relatively small number of input cells and does not require a priori knowledge of the epigenetic marks or transcription factors governing the dynamics of the system. Here we describe an updated and optimized protocol for ATAC-seq, called Omni-ATAC, that is applicable across a broad range of cell and tissue types. The ATAC-seq workflow has five main steps: sample preparation, transposition, library preparation, sequencing and data analysis. This protocol details the steps to generate and sequence ATAC-seq libraries, with recommendations for sample preparation and downstream bioinformatic analysis. ATAC-seq libraries for roughly 12 samples can be generated in 10 h by someone familiar with basic molecular biology, and downstream sequencing analysis can be implemented using benchmarked pipelines by someone with basic bioinformatics skills and with access to a high-performance computing environment.
Affiliation(s)
- Fiorella C Grandi
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Hailey Modi
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- Lucas Kampman
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
- M Ryan Corces
- Gladstone Institute of Neurological Disease, San Francisco, CA, USA; Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA; Department of Neurology, University of California San Francisco, San Francisco, CA, USA
29
Ashuach T, Reidenbach DA, Gayoso A, Yosef N. PeakVI: A deep generative model for single-cell chromatin accessibility analysis. CELL REPORTS METHODS 2022; 2:100182. [PMID: 35475224 PMCID: PMC9017241 DOI: 10.1016/j.crmeth.2022.100182] [Citation(s) in RCA: 20] [Impact Index Per Article: 10.0] [Received: 05/18/2021] [Revised: 01/08/2022] [Accepted: 02/23/2022] [Indexed: 12/20/2022]
Abstract
Single-cell ATAC sequencing (scATAC-seq) is a powerful and increasingly popular technique to explore the regulatory landscape of heterogeneous cellular populations. However, the high noise levels, degree of sparsity, and scale of the generated data make its analysis challenging. Here, we present PeakVI, a probabilistic framework that leverages deep neural networks to analyze scATAC-seq data. PeakVI fits an informative latent space that preserves biological heterogeneity while correcting batch effects and accounting for technical effects, such as library size and region-specific biases. In addition, PeakVI provides a technique for identifying differential accessibility at a single-region resolution, which can be used for cell-type annotation as well as identification of key cis-regulatory elements. We use public datasets to demonstrate that PeakVI is scalable, stable, robust to low-quality data, and outperforms current analysis methods on a range of critical analysis tasks. PeakVI is publicly available and implemented in the scvi-tools framework.
Affiliation(s)
- Tal Ashuach
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Daniel A. Reidenbach
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Adam Gayoso
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Nir Yosef
- Center for Computational Biology, University of California, Berkeley, Berkeley, CA, USA
- Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, Berkeley, CA, USA
- Ragon Institute of MGH, MIT, and Harvard, Cambridge, MA, USA
- Chan Zuckerberg BioHub, San Francisco, CA, USA
30
Luo L, Gribskov M, Wang S. Bibliometric review of ATAC-Seq and its application in gene expression. Brief Bioinform 2022; 23:6543486. [PMID: 35255493 PMCID: PMC9116206 DOI: 10.1093/bib/bbac061] [Citation(s) in RCA: 12] [Impact Index Per Article: 6.0] [Received: 11/17/2021] [Revised: 02/06/2022] [Accepted: 02/09/2022] [Indexed: 11/30/2022] Open
Abstract
With recent advances in high-throughput next-generation sequencing, it is possible to describe the regulation and expression of genes at multiple levels. An assay for transposase-accessible chromatin using sequencing (ATAC-seq), which uses Tn5 transposase to sequence protein-free binding regions of the genome, can be combined with chromatin immunoprecipitation coupled with deep sequencing (ChIP-seq) and ribonucleic acid sequencing (RNA-seq) to provide a detailed description of gene expression. Here, we reviewed the literature on ATAC-seq and described the characteristics of ATAC-seq publications. We then briefly introduced the principles of RNA-seq, ChIP-seq and ATAC-seq, focusing on the main features of the techniques. We built a phylogenetic tree from species that had been previously studied by using ATAC-seq. Studies of Mus musculus and Homo sapiens account for approximately 90% of the total ATAC-seq data, while other species are still in the process of accumulating data. We summarized the findings from human diseases and other species, illustrating the cutting-edge discoveries and the role of multi-omics data analysis in current research. Moreover, we collected and compared ATAC-seq analysis pipelines, which allowed biological researchers who lack programming skills to better analyze and explore ATAC-seq data. Through this review, it is clear that multi-omics analysis and single-cell sequencing technology will become the mainstream approach in future research.
Affiliation(s)
- Liheng Luo
- School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
- Michael Gribskov
- Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Sufang Wang
- School of Life Sciences, Northwestern Polytechnical University, Xi'an, Shaanxi, China, 710072
31
Zibetti C. Deciphering the Retinal Epigenome during Development, Disease and Reprogramming: Advancements, Challenges and Perspectives. Cells 2022; 11:cells11050806. [PMID: 35269428 PMCID: PMC8908986 DOI: 10.3390/cells11050806] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/14/2022] [Revised: 02/15/2022] [Accepted: 02/18/2022] [Indexed: 02/01/2023] Open
Abstract
Retinal neurogenesis is driven by concerted actions of transcription factors, some of which are expressed in a continuum and across several cell subtypes throughout development. While seemingly redundant, many factors diversify their regulatory outcome on gene expression, by coordinating variations in chromatin landscapes to drive divergent retinal specification programs. Recent studies have furthered the understanding of the epigenetic contribution to the progression of age-related macular degeneration, a leading cause of blindness in the elderly. The knowledge of the epigenomic mechanisms that control the acquisition and stabilization of retinal cell fates and are evoked upon damage, holds the potential for the treatment of retinal degeneration. Herein, this review presents the state-of-the-art approaches to investigate the retinal epigenome during development, disease, and reprogramming. A pipeline is then reviewed to functionally interrogate the epigenetic and transcriptional networks underlying cell fate specification, relying on a truly unbiased screening of open chromatin states. The related work proposes an inferential model to identify gene regulatory networks, features the first footprinting analysis and the first tentative, systematic query of candidate pioneer factors in the retina ever conducted in any model organism, leading to the identification of previously uncharacterized master regulators of retinal cell identity, such as the nuclear factor I, NFI. This pipeline is virtually applicable to the study of genetic programs and candidate pioneer factors in any developmental context. Finally, challenges and limitations intrinsic to the current next-generation sequencing techniques are discussed, as well as recent advances in super-resolution imaging, enabling spatio-temporal resolution of the genome.
Affiliation(s)
- Cristina Zibetti
- Department of Ophthalmology, Institute of Clinical Medicine, University of Oslo, Kirkeveien 166, Building 36, 0455 Oslo, Norway
32
Yuan J, Sun H, Wang Y, Li L, Chen S, Jiao W, Jia G, Wang L, Mao J, Ni Z, Wang X, Song Q. Open chromatin interaction maps reveal functional regulatory elements and chromatin architecture variations during wheat evolution. Genome Biol 2022; 23:34. [PMID: 35073966 PMCID: PMC8785527 DOI: 10.1186/s13059-022-02611-3] [Citation(s) in RCA: 16] [Impact Index Per Article: 8.0] [Received: 09/09/2021] [Accepted: 01/14/2022] [Indexed: 12/13/2022] Open
Abstract
BACKGROUND Bread wheat (Triticum aestivum) is an allohexaploid that is generated by two subsequent allopolyploidization events. The large genome size (16 Gb) and polyploid complexity impede our understanding of how regulatory elements and their interactions shape chromatin structure and gene expression in wheat. The open chromatin enrichment and network Hi-C (OCEAN-C) is a powerful antibody-independent method to detect chromatin interactions between open chromatin regions throughout the genome. RESULTS Here we generate open chromatin interaction maps for hexaploid wheat and its tetraploid and diploid relatives using OCEAN-C. The anchors of chromatin loops show high chromatin accessibility and are concomitant with several active histone modifications, with 67% of them interacting with multiple loci. Binding motifs of various transcription factors are significantly enriched in the hubs of open chromatin interactions (HOCIs). The genes linked by HOCIs represent higher expression level and lower coefficient expression variance than the genes linked by other loops, which suggests HOCIs may coordinate co-expression of linked genes. Thousands of interchromosomal loops are identified, while limited interchromosomal loops (0.4%) are identified between homoeologous genes in hexaploid wheat. Moreover, we find structure variations contribute to chromatin interaction divergence of homoeologs and chromatin topology changes between different wheat species. The genes with discrepant chromatin interactions show expression alteration in hexaploid wheat compared with its tetraploid and diploid relatives. CONCLUSIONS Our results reveal open chromatin interactions in different wheat species, which provide new insights into the role of open chromatin interactions in gene expression during the evolution of polyploid wheat.
Affiliation(s)
- Jingya Yuan
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Haojie Sun
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Yijin Wang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Lulu Li
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Shiting Chen
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Wu Jiao
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Guanghong Jia
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Longfei Wang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Junrong Mao
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Zhongfu Ni
- State Key Laboratory for Agrobiotechnology, Key Laboratory of Crop Heterosis and Utilization (MOE), Beijing Key Laboratory of Crop Genetic Improvement, China Agricultural University, Beijing, 100193, China
- Xiue Wang
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China
- Qingxin Song
- State Key Laboratory of Crop Genetics and Germplasm Enhancement, Jiangsu Collaborative Innovation Center for Modern Crop Production, Nanjing Agricultural University, No. 1 Weigang, Nanjing, 210095, Jiangsu, China.
33
Marinov GK, Shipony Z, Kundaje A, Greenleaf WJ. Single-Molecule Multikilobase-Scale Profiling of Chromatin Accessibility Using m6A-SMAC-Seq and m6A-CpG-GpC-SMAC-Seq. Methods Mol Biol 2022; 2458:269-298. [PMID: 35103973 PMCID: PMC9531602 DOI: 10.1007/978-1-0716-2140-0_15] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
A hallmark feature of active cis-regulatory elements (CREs) in eukaryotes is their nucleosomal depletion and, accordingly, higher accessibility to enzymatic treatment. This property has been the basis of a number of sequencing-based assays for genome-wide identification and tracking of the activity of CREs across different biological conditions, such as DNase-seq, ATAC-seq, NOMe-seq, and others. However, the fragmentation of DNA inherent to many of these assays and the limited read length of short-read sequencing platforms have so far not allowed the simultaneous measurement of the chromatin accessibility state of CREs located distally from each other. The combination of labeling accessible DNA with DNA modifications and nanopore sequencing has made it possible to develop such assays. Here, we provide a detailed protocol for carrying out the SMAC-seq assay (Single-Molecule long-read Accessible Chromatin mapping sequencing), in its m6A-SMAC-seq and m6A-CpG-GpC-SMAC-seq variants, together with methods for data processing and analysis, and discuss key experimental and analytical considerations for working with SMAC-seq datasets.
Affiliation(s)
- Zohar Shipony
- Department of Genetics, Stanford University, Stanford, CA, USA.
- Anshul Kundaje
- Department of Genetics, Stanford University, Stanford, CA, USA
- Department of Computer Science, Stanford University, Stanford, CA, USA
- William J Greenleaf
- Department of Genetics, Stanford University, Stanford, CA, USA
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Department of Applied Physics, Stanford University, Stanford, CA, USA
- Chan Zuckerberg Biohub, San Francisco, CA, USA
34
L'Yi S, Wang Q, Lekschas F, Gehlenborg N. Gosling: A Grammar-based Toolkit for Scalable and Interactive Genomics Data Visualization. IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS 2022; 28:140-150. [PMID: 34596551 PMCID: PMC8826597 DOI: 10.1109/tvcg.2021.3114876] [Citation(s) in RCA: 17] [Impact Index Per Article: 8.5] [Indexed: 05/20/2023]
Abstract
The combination of diverse data types and analysis tasks in genomics has resulted in the development of a wide range of visualization techniques and tools. However, most existing tools are tailored to a specific problem or data type and offer limited customization, making it challenging to optimize visualizations for new analysis tasks or datasets. To address this challenge, we designed Gosling, a grammar for interactive and scalable genomics data visualization. Gosling balances expressiveness for comprehensive multi-scale genomics data visualizations with accessibility for domain scientists. Our accompanying JavaScript toolkit called Gosling.js provides scalable and interactive rendering. Gosling.js is built on top of an existing platform for web-based genomics data visualization to further simplify the visualization of common genomics data formats. We demonstrate the expressiveness of the grammar through a variety of real-world examples. Furthermore, we show how Gosling supports the design of novel genomics visualizations. An online editor and examples of Gosling.js, its source code, and documentation are available at https://gosling.js.org.
35
Loft A, Andersen MW, Madsen JGS, Mandrup S. Analysis of Enhancers and Transcriptional Networks in Thermogenic Adipocytes. Methods Mol Biol 2022; 2448:155-175. [PMID: 35167097 DOI: 10.1007/978-1-0716-2087-8_11] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]
Abstract
Transcription factor (TF) networks orchestrate the regulation of gene programs in mammalian cells, including white and brown adipocytes. In this protocol, we outline how genomics and transcriptomics data can be integrated to infer causal TFs of a given cellular response or cell type using "Integrated analysis of Motif Activity and Gene Expression changes of transcription factors" (IMAGE). Here, we show how key regulatory TFs controlling white and brown adipocyte gene programs can be predicted from chromatin accessibility and RNA-seq data. Furthermore, we demonstrate how information about target sites and target genes of the predicted key regulators can be integrated to propose testable hypotheses regarding the role and mechanisms of TFs.
Affiliation(s)
- Anne Loft
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark.
- Maja Worm Andersen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Jesper Grud Skat Madsen
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark
- Susanne Mandrup
- Department of Biochemistry and Molecular Biology, University of Southern Denmark, Odense, Denmark.
- Center for Functional Genomics and Tissue Plasticity (ATLAS), University of Southern Denmark, Odense, Denmark.
36
iDHS-DT: Identifying DNase I hypersensitive sites by integrating DNA dinucleotide and trinucleotide information. Biophys Chem 2021; 281:106717. [PMID: 34798459 DOI: 10.1016/j.bpc.2021.106717] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/15/2021] [Revised: 11/10/2021] [Accepted: 11/10/2021] [Indexed: 01/02/2023]
Abstract
DNase I hypersensitive sites (DHSs) are important for identifying the location of gene regulatory elements, such as promoters, enhancers, and silencers. Thus, it is crucial to discriminate DHSs from non-DHSs. Although some traditional methods, such as Southern blots and the DNase-seq technique, can identify DHSs, these approaches are time-consuming, laborious, and expensive. To address these issues, researchers have turned their attention to computational approaches. In this study, we developed a novel predictor called iDHS-DT to identify DHSs. In this predictor, the DNA sequences were first represented by physicochemical properties (PC) of DNA dinucleotides and trinucleotides. Then, three different descriptors, including auto-covariance, cross-covariance, and discrete wavelet transform, were used to collect related features from the PC matrix. Next, the least absolute shrinkage and selection operator (LASSO) algorithm was employed to remove irrelevant and redundant features. Finally, the selected features were fed into a support vector machine (SVM) for distinguishing DHSs from non-DHSs. The proposed method achieved 97.64% and 98.22% classification accuracy on datasets S1 and S2, respectively. Compared with existing predictors, our proposed model shows a significant improvement in classification performance. Experimental results demonstrated that the proposed method is powerful in identifying DHSs.
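As an illustration of the auto-covariance descriptor named in the abstract above, the following is a minimal Python sketch, not the authors' implementation. It assumes the common lag-based definition (products of mean-centred property values at positions i and i + lag, averaged over the sequence). The `TOY_PROPERTY` table, the function names, and the example sequence are hypothetical placeholders; the actual physicochemical values and lag settings used by iDHS-DT are not given here.

```python
# Hypothetical toy property: one made-up value per dinucleotide.
# These are NOT the physicochemical values used by iDHS-DT.
TOY_PROPERTY = {
    a + b: 0.1 * (4 * i + j)
    for i, a in enumerate("ACGT")
    for j, b in enumerate("ACGT")
}


def dinucleotide_track(seq, prop):
    """Map a DNA sequence to the property values of its overlapping dinucleotides."""
    return [prop[seq[i:i + 2]] for i in range(len(seq) - 1)]


def auto_covariance(values, lag):
    """Average product of mean-centred values that are `lag` positions apart."""
    n = len(values) - lag
    if lag < 1 or n < 1:
        raise ValueError("lag must be >= 1 and shorter than the value track")
    mean = sum(values) / len(values)
    total = sum((values[i] - mean) * (values[i + lag] - mean) for i in range(n))
    return total / n


def ac_features(seq, prop, max_lag):
    """Return one auto-covariance feature per lag in 1..max_lag for a single property."""
    track = dinucleotide_track(seq.upper(), prop)
    return [auto_covariance(track, lag) for lag in range(1, max_lag + 1)]


if __name__ == "__main__":
    # Example call on an arbitrary sequence; yields a fixed-length vector of 3 features.
    print(ac_features("ACGTTGCAACGT", TOY_PROPERTY, max_lag=3))
```

Under these assumptions, every sequence is reduced to a vector of `max_lag` values per property regardless of its length, which is what allows such descriptors to feed a fixed-input classifier such as an SVM.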
37
Peng S, Petersen JL, Bellone RR, Kalbfleisch T, Kingsley NB, Barber AM, Cappelletti E, Giulotto E, Finno CJ. Decoding the Equine Genome: Lessons from ENCODE. Genes (Basel) 2021; 12:genes12111707. [PMID: 34828313 PMCID: PMC8625040 DOI: 10.3390/genes12111707] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 09/30/2021] [Revised: 10/24/2021] [Accepted: 10/26/2021] [Indexed: 12/23/2022] Open
Abstract
The horse reference genome assemblies, EquCab2.0 and EquCab3.0, have enabled great advancements in the equine genomics field, from tools to novel discoveries. However, significant gaps of knowledge regarding genome function remain, hindering the study of complex traits in horses. In an effort to address these gaps and with inspiration from the Encyclopedia of DNA Elements (ENCODE) project, the equine Functional Annotation of Animal Genome (FAANG) initiative was proposed to bridge the gap between genome and gene expression, providing further insights into functional regulation within the horse genome. Three years after launching the initiative, the equine FAANG group has generated data from more than 400 experiments using over 50 tissues, targeting a variety of regulatory features of the equine genome. In this review, we examine how valuable lessons learned from the ENCODE project informed our decisions in the equine FAANG project. We report the current state of the equine FAANG project and discuss how FAANG can serve as a template for future expansion of functional annotation in the equine genome and be used as a reference for studies of complex traits in horse. A well-annotated reference functional atlas will also help advance equine genetics in the pan-genome and precision medicine era.
Affiliation(s)
- Sichong Peng
- Department of Population Health and Reproduction, School of Veterinary Medicine, University of California-Davis, Davis, CA 95616, USA; (S.P.); (R.R.B.); (N.B.K.)
- Jessica L. Petersen
- Department of Animal Science, University of Nebraska, Lincoln, NE 68583-0908, USA; (J.L.P.); (A.M.B.)
- Rebecca R. Bellone
- Department of Population Health and Reproduction, School of Veterinary Medicine, University of California-Davis, Davis, CA 95616, USA; (S.P.); (R.R.B.); (N.B.K.)
- Veterinary Genetics Laboratory, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
- Ted Kalbfleisch
- Department of Veterinary Science, Gluck Equine Research Center, University of Kentucky, Lexington, KY 40503, USA;
- N. B. Kingsley
- Department of Population Health and Reproduction, School of Veterinary Medicine, University of California-Davis, Davis, CA 95616, USA; (S.P.); (R.R.B.); (N.B.K.)
- Veterinary Genetics Laboratory, School of Veterinary Medicine, University of California, Davis, CA 95616, USA
- Alexa M. Barber
- Department of Animal Science, University of Nebraska, Lincoln, NE 68583-0908, USA; (J.L.P.); (A.M.B.)
- Eleonora Cappelletti
- Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, 27100 Pavia, Italy; (E.C.); (E.G.)
- Elena Giulotto
- Department of Biology and Biotechnology “L. Spallanzani”, University of Pavia, 27100 Pavia, Italy; (E.C.); (E.G.)
- Carrie J. Finno
- Department of Population Health and Reproduction, School of Veterinary Medicine, University of California-Davis, Davis, CA 95616, USA; (S.P.); (R.R.B.); (N.B.K.)
- Correspondence:
38
Xu W, Wen Y, Liang Y, Xu Q, Wang X, Jin W, Chen X. A plate-based single-cell ATAC-seq workflow for fast and robust profiling of chromatin accessibility. Nat Protoc 2021; 16:4084-4107. [PMID: 34282334 DOI: 10.1038/s41596-021-00583-5] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 10/13/2020] [Accepted: 06/04/2021] [Indexed: 02/06/2023]
Abstract
Profiling chromatin accessibility at the single-cell level provides critical information about cell type composition and cell-to-cell variation within a complex tissue. Emerging techniques for the interrogation of chromatin accessibility in individual cells allow investigation of the fundamental mechanisms that lead to the variability of different cells. This protocol describes a fast and robust method for single-cell chromatin accessibility profiling based on the assay for transposase-accessible chromatin using sequencing (ATAC-seq). The method combines up-front bulk Tn5 tagging of chromatin with flow cytometry to isolate single nuclei or cells. Reagents required to generate sequencing libraries are added to the same well in the plate where cells are sorted. The protocol described here generates data of high complexity and excellent signal-to-noise ratio and can be combined with index sorting for in-depth characterization of cell types. The whole experimental procedure can be finished within 1 or 2 d with a throughput of hundreds to thousands of nuclei, and the data can be processed by the provided computational pipeline. The execution of the protocol only requires basic techniques and equipment in a molecular biology laboratory with flow cytometry support.
Affiliation(s)
- Wei Xu
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Yi Wen
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Yingying Liang
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Qiushi Xu
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Xuefei Wang
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Wenfei Jin
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China
| | - Xi Chen
- Shenzhen Key Laboratory of Gene Regulation and Systems Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, China.
|
39
|
Challenges and Opportunities in Understanding Genetics of Fungal Diseases: Towards a Functional Genomics Approach. Infect Immun 2021; 89:e0000521. [PMID: 34031131 DOI: 10.1128/iai.00005-21] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Indexed: 12/16/2022] Open
Abstract
Infectious diseases are a leading cause of morbidity and mortality worldwide, and human pathogens have long been recognized as one of the main sources of evolutionary pressure, resulting in a highly variable genetic background in immune-related genes. The study of the genetic contribution to infectious diseases has undergone tremendous advances over the last decades. Here, focusing on genetic predisposition to fungal diseases, we provide an overview of the available approaches for studying human genetic susceptibility to infections, reviewing current methodological and practical limitations. We describe how the classical methods available, such as family-based studies and candidate gene studies, have contributed to the discovery of crucial susceptibility factors for fungal infections. We will also discuss the contribution of novel unbiased approaches to the field, highlighting their success but also their limitations for the fungal immunology field. Finally, we show how a systems genomics approach can overcome those limitations and can lead to efficient prioritization and identification of genes and pathways with a critical role in susceptibility to fungal diseases. This knowledge will help to stratify at-risk patient groups and, subsequently, develop early appropriate prophylactic and treatment strategies.
|
40
|
Kern C, Wang Y, Xu X, Pan Z, Halstead M, Chanthavixay G, Saelao P, Waters S, Xiang R, Chamberlain A, Korf I, Delany ME, Cheng HH, Medrano JF, Van Eenennaam AL, Tuggle CK, Ernst C, Flicek P, Quon G, Ross P, Zhou H. Functional annotations of three domestic animal genomes provide vital resources for comparative and agricultural research. Nat Commun 2021; 12:1821. [PMID: 33758196 PMCID: PMC7988148 DOI: 10.1038/s41467-021-22100-8] [Citation(s) in RCA: 83] [Impact Index Per Article: 27.7] [Received: 10/26/2020] [Accepted: 03/01/2021] [Indexed: 01/31/2023] Open
Abstract
Gene regulatory elements are central drivers of phenotypic variation and thus of critical importance towards understanding the genetics of complex traits. The Functional Annotation of Animal Genomes consortium was formed to collaboratively annotate the functional elements in animal genomes, starting with domesticated animals. Here we present an expansive collection of datasets from eight diverse tissues in three important agricultural species: chicken (Gallus gallus), pig (Sus scrofa), and cattle (Bos taurus). Comparative analysis of these datasets and those from the human and mouse Encyclopedia of DNA Elements projects reveals that a core set of regulatory elements is functionally conserved independent of divergence between species, and that tissue-specific transcription factor occupancy at regulatory elements and their predicted target genes are also conserved. These datasets represent a unique opportunity for the emerging field of comparative epigenomics, as well as the agricultural research community, including species that are globally important food resources.
Affiliation(s)
- Colin Kern
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Ying Wang
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Xiaoqin Xu
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Zhangyuan Pan
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Michelle Halstead
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Ganrea Chanthavixay
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Perot Saelao
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Susan Waters
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Ruidong Xiang
- Faculty of Veterinary and Agricultural Sciences, The University of Melbourne, Melbourne, VIC, Australia
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | - Amanda Chamberlain
- Agriculture Victoria, AgriBio, Centre for AgriBioscience, Bundoora, VIC, Australia
| | - Ian Korf
- Genome Center, University of California, Davis, Davis, CA, USA
| | - Mary E Delany
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | - Hans H Cheng
- USDA-ARS, Avian Disease and Oncology Laboratory, East Lansing, MI, USA
| | - Juan F Medrano
- Department of Animal Science, University of California, Davis, Davis, CA, USA
| | | | - Chris K Tuggle
- Department of Animal Science, Iowa State University, Ames, IA, USA
| | - Catherine Ernst
- Department of Animal Science, Michigan State University, East Lansing, MI, USA
| | - Paul Flicek
- European Molecular Biology Laboratory, European Bioinformatics Institute, Wellcome Genome Campus, Hinxton, Cambridge, UK
| | - Gerald Quon
- Department of Molecular and Cellular Biology, University of California, Davis, Davis, CA, USA
| | - Pablo Ross
- Department of Animal Science, University of California, Davis, Davis, CA, USA.
| | - Huaijun Zhou
- Department of Animal Science, University of California, Davis, Davis, CA, USA.
|
41
|
Gao J, Pan Y, Xu Y, Zhang W, Zhang L, Li X, Tian Z, Chen H, Wang Y. Unveiling the long non-coding RNA profile of porcine reproductive and respiratory syndrome virus-infected porcine alveolar macrophages. BMC Genomics 2021; 22:177. [PMID: 33711920 PMCID: PMC7953715 DOI: 10.1186/s12864-021-07482-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 09/11/2020] [Accepted: 02/25/2021] [Indexed: 12/13/2022] Open
Abstract
Background Long noncoding RNA (lncRNA) is highly associated with inflammatory response and virus-induced interferon production. By far the majority of studies have focused on the immune-related lncRNAs of mice and humans, but the functions of lncRNAs in porcine immune cells are poorly understood. Porcine reproductive and respiratory syndrome virus (PRRSV) impairs local immune responses in the lungs of nursery and growing pigs, whereas the virus triggers the inflammatory responses. Porcine alveolar macrophage (PAM) is the primary target cell of PRRSV, thus PRRSV is used as an in vitro model of inflammation. Here, we profiled lncRNA and mRNA repertoires from PRRSV-infected PAMs to explore the underlying mechanism of porcine lncRNAs in regulating host immune responses. Results In this study, a total of 350 annotated lncRNAs and 1792 novel lncRNAs in PAMs were identified through RNA-seq analysis. Among them, 86 differentially expressed (DE) lncRNAs and 406 DE protein-coding mRNAs were identified upon PRRSV incubation. GO category and KEGG pathway enrichment analyses revealed that these DE lncRNAs and mRNAs were mainly involved in inflammation- and pathogen infection-induced pathways. The results of dynamic correlated expression networks between lncRNAs and their predicted target genes uncovered that numerous lncRNAs, such as XLOC-022175, XLOC-019295, and XLOC-017089, were correlated with innate immune genes. Further analysis validated that these three lncRNAs were positively correlated with their predicted target genes including CXCL2, IFI6, and CD163. This study suggests that porcine lncRNAs affect immune responses against PRRSV infection through regulating their target genes in PAMs. Conclusion This study provides both transcriptomic and epigenetic status of porcine macrophages. In response to PRRSV infection, comprehensive DE lncRNAs and mRNAs were profiled from PAMs. Co-expression analysis demonstrated that lncRNAs are emerging as the important modulators of immune gene activities through their critical influence upon PRRSV infection in porcine macrophages. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07482-9.
Affiliation(s)
- Junxin Gao
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Yu Pan
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Yunfei Xu
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Wenli Zhang
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Lin Zhang
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Xi Li
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Zhijun Tian
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Hongyan Chen
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China
| | - Yue Wang
- State Key Laboratory of Veterinary Biotechnology, Heilongjiang Provincial Key Laboratory of Laboratory Animal and Comparative Medicine, Harbin Veterinary Research Institute, Chinese Academy of Agricultural Sciences, Harbin, China.
|
42
|
Talukder A, Hu H, Li X. An intriguing characteristic of enhancer-promoter interactions. BMC Genomics 2021; 22:163. [PMID: 33685407 PMCID: PMC7938488 DOI: 10.1186/s12864-021-07440-5] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/18/2020] [Accepted: 02/12/2021] [Indexed: 01/22/2023] Open
Abstract
Background It is still challenging to predict interacting enhancer-promoter pairs (IEPs), partially because of our limited understanding of their characteristics. To understand IEPs better, here we studied the IEPs in nine cell lines and nine primary cell types. Results By measuring the bipartite clustering coefficient of the graphs constructed from these experimentally supported IEPs, we observed that one enhancer is likely to interact with either none or all of the target genes of another enhancer. This observation implies that enhancers form clusters, and every enhancer in the same cluster synchronously interacts with almost every member of a set of genes and only this set of genes. We perceived that an enhancer can be up to two megabase pairs away from other enhancers in the same cluster. We also noticed that although a fraction of these clusters of enhancers do overlap with super-enhancers, the majority of the enhancer clusters are different from the known super-enhancers. Conclusions Our study showed a new characteristic of IEPs, which may shed new light on distal gene regulation and the identification of IEPs. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07440-5.
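The "none or all" observation in this abstract can be illustrated with a small sketch. The abstract does not say which variant of the bipartite clustering coefficient was used, so the code below assumes one common choice: the Jaccard overlap of two enhancers' target-gene sets, averaged over every other enhancer that shares at least one target gene. The function names and the toy enhancer-gene map are hypothetical.

```python
def pairwise_overlap(targets_a, targets_b):
    """Jaccard overlap between the target-gene sets of two enhancers."""
    union = targets_a | targets_b
    if not union:
        return 0.0
    return len(targets_a & targets_b) / len(union)


def enhancer_clustering(enhancer_targets):
    """Per-enhancer score: mean Jaccard overlap with every other enhancer
    that shares at least one target gene (0.0 if there is none)."""
    scores = {}
    for enh, genes in enhancer_targets.items():
        overlaps = [
            pairwise_overlap(genes, other_genes)
            for other, other_genes in enhancer_targets.items()
            if other != enh and genes & other_genes
        ]
        scores[enh] = sum(overlaps) / len(overlaps) if overlaps else 0.0
    return scores


# Hypothetical toy data: enhancer -> set of interacting genes.
toy = {
    "enh1": {"geneA", "geneB"},
    "enh2": {"geneA", "geneB"},   # same targets as enh1 ("all")
    "enh3": {"geneC"},
    "enh4": {"geneC", "geneD"},   # partial overlap with enh3
}
scores = enhancer_clustering(toy)
for name in sorted(scores):
    print(name, scores[name])
```

Under this definition, scores close to 1 mean that enhancers sharing any target tend to share all of them, which is the pattern the abstract describes.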
Affiliation(s)
- Amlan Talukder
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA
| | - Haiyan Hu
- Department of Computer Science, University of Central Florida, Orlando, FL 32816, USA.
| | - Xiaoman Li
- Burnett School of Biomedical Science, College of Medicine, University of Central Florida, Orlando, FL 32816, USA.
|
43
|
Zhu C, Zhang Y, Li YE, Lucero J, Behrens MM, Ren B. Joint profiling of histone modifications and transcriptome in single cells from mouse brain. Nat Methods 2021; 18:283-292. [PMID: 33589836 PMCID: PMC7954905 DOI: 10.1038/s41592-021-01060-3] [Citation(s) in RCA: 119] [Impact Index Per Article: 39.7] [Received: 09/09/2020] [Accepted: 01/04/2021] [Indexed: 12/13/2022]
Abstract
Genome-wide profiling of histone modifications can reveal not only the location and activity state of regulatory elements, but also the regulatory mechanisms involved in cell-type-specific gene expression during development and disease pathology. Conventional assays to profile histone modifications in bulk tissues lack single-cell resolution. Here we describe an ultra-high-throughput method, Paired-Tag, for joint profiling of histone modifications and transcriptome in single cells to produce cell-type-resolved maps of chromatin state and transcriptome in complex tissues. We used this method to profile five histone modifications jointly with transcriptome in the adult mouse frontal cortex and hippocampus. Integrative analysis of the resulting maps identified distinct groups of genes subject to divergent epigenetic regulatory mechanisms. Our single-cell multiomics approach enables comprehensive analysis of chromatin state and gene regulation in complex tissues and characterization of gene regulatory programs in the constituent cell types.
Affiliation(s)
- Chenxu Zhu
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Yanxiao Zhang
- Ludwig Institute for Cancer Research, La Jolla, CA, USA
| | - Yang Eric Li
- Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA
| | - Jacinta Lucero
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - M Margarita Behrens
- Computational Neurobiology Laboratory, The Salk Institute for Biological Studies, La Jolla, CA, USA
| | - Bing Ren
- Ludwig Institute for Cancer Research, La Jolla, CA, USA.
- Department of Cellular and Molecular Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Center for Epigenomics, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Institute of Genomic Medicine, University of California San Diego School of Medicine, La Jolla, CA, USA.
- Moores Cancer Center, University of California San Diego School of Medicine, La Jolla, CA, USA.
|
44
|
Zhang S, Duan Z, Yang W, Qian C, You Y. iDHS-DASTS: identifying DNase I hypersensitive sites based on LASSO and stacking learning. Mol Omics 2021; 17:130-141. [PMID: 33295914 DOI: 10.1039/d0mo00115e] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Indexed: 11/21/2022]
Abstract
The DNase I hypersensitivity site is an important marker of the DNA regulatory region, and its identification in the DNA sequence is of great significance for biomedical research. However, traditional identification methods are extremely time-consuming and cannot obtain an accurate result. In this paper, we proposed a predictor called iDHS-DASTS to identify the DHS based on benchmark datasets. First, we adopt a feature extraction method called PseDNC which can incorporate the original DNA properties and spatial information of the DNA sequence. Then we use a method called LASSO to reduce the dimensions of the original data. Finally, we utilize stacking learning as a classifier, which includes Adaboost, random forest, gradient boosting, extra trees and SVM. Before we train the classifier, we use SMOTE-Tomek to overcome the imbalance of the datasets. In the experiment, our iDHS-DASTS achieves remarkable performance on three benchmark datasets. We achieve state-of-the-art results with over 92.06%, 91.06% and 90.72% accuracy for datasets 𝕊1, 𝕊2 and 𝕊3, respectively. To verify the validity and transferability of our model, we establish another independent dataset 𝕊4, for which the accuracy can reach 90.31%. Furthermore, we used the proposed model to construct a user-friendly web server called iDHS-DASTS, which is available at http://www.xdu-duan.cn/.
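As a rough illustration of the feature-extraction step named in this abstract, the sketch below computes plain dinucleotide composition: the 16 normalized dinucleotide frequencies of a sequence. This is a deliberately simplified stand-in, not PseDNC itself, which adds further sequence-order terms based on physicochemical properties that are not reproduced here. The function name and example sequence are hypothetical.

```python
from itertools import product

# All 16 dinucleotides in a fixed order: AA, AC, AG, AT, CA, ...
DINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=2)]


def dinucleotide_composition(seq):
    """Return the 16 dinucleotide frequencies of seq (overlapping windows).

    Windows containing characters other than A/C/G/T are skipped.
    """
    seq = seq.upper()
    counts = dict.fromkeys(DINUCLEOTIDES, 0)
    total = 0
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
            total += 1
    if total == 0:
        return [0.0] * len(DINUCLEOTIDES)
    return [counts[d] / total for d in DINUCLEOTIDES]


# Hypothetical example sequence.
features = dinucleotide_composition("ACGTACGT")
print(dict(zip(DINUCLEOTIDES, features)))
```

A fixed-length vector like this is the kind of input that a feature-selection step (LASSO in the abstract) and a downstream classifier would consume.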
Affiliation(s)
- Shengli Zhang
- School of Mathematics and Statistics, Xidian University, Xi'an 710071, P. R. China.
| | - Zhengpeng Duan
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
| | - Wenhao Yang
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
| | - Chenlai Qian
- School of Electronic Engineering, Xidian University, Xi'an 710071, P. R. China
| | - Yiwei You
- International Business School, Shanghai University of International Business and Economics, Shanghai, 201620, P. R. China
|
45
|
Abstract
The ATAC-seq assay has emerged as the most useful, versatile, and widely adaptable method for profiling accessible chromatin regions and tracking the activity of cis-regulatory elements (cREs) in eukaryotes. Thanks to its great utility, it is now being applied to map active chromatin in the context of a very wide diversity of biological systems and questions. In the course of these studies, considerable experience working with ATAC-seq data has accumulated and a standard set of computational tasks that need to be carried out for most ATAC-seq analyses has emerged. Here, we review and provide examples of such common analytical procedures (including data processing, quality control, peak calling, identifying differentially accessible open chromatin regions, and variable transcription factor (TF) motif accessibility) and discuss recommended optimal practices.
|
46
|
Ko DK, Brandizzi F. Network-based approaches for understanding gene regulation and function in plants. THE PLANT JOURNAL : FOR CELL AND MOLECULAR BIOLOGY 2020; 104:302-317. [PMID: 32717108 PMCID: PMC8922287 DOI: 10.1111/tpj.14940] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 04/12/2020] [Accepted: 07/14/2020] [Indexed: 05/03/2023]
Abstract
Expression reprogramming directed by transcription factors is a primary mode of gene regulation underlying most aspects of the biology of any organism. Our views of how gene regulation is coordinated are dramatically changing thanks to the advent and constant improvement of high-throughput profiling and transcriptional network inference methods: from activities of individual genes to functional interactions across genes. These technical and analytical advances can reveal the topology of transcriptional networks in which hundreds of genes are hierarchically regulated by multiple transcription factors at systems level. Here we review the state of the art of experimental and computational methods used in plant biology research to obtain large-scale datasets and model transcriptional networks. Examples of direct use of these network models and perspectives on their limitations and future directions are also discussed.
Affiliation(s)
- Dae Kwan Ko
- MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
- Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
| | - Federica Brandizzi
- MSU-DOE Plant Research Lab, Michigan State University, East Lansing, MI 48824, USA
- Great Lakes Bioenergy Research Center, Michigan State University, East Lansing, MI 48824, USA
- Department of Plant Biology, Michigan State University, East Lansing, MI 48824, USA
|
47
|
Hou TY, Kraus WL. Spirits in the Material World: Enhancer RNAs in Transcriptional Regulation. Trends Biochem Sci 2020; 46:138-153. [PMID: 32888773 DOI: 10.1016/j.tibs.2020.08.007] [Citation(s) in RCA: 29] [Impact Index Per Article: 7.3] [Received: 05/23/2020] [Revised: 08/04/2020] [Accepted: 08/07/2020] [Indexed: 12/15/2022]
Abstract
Responses to developmental and environmental cues depend on precise spatiotemporal control of gene transcription. Enhancers, which comprise DNA elements bound by regulatory proteins, can activate target genes in response to these external signals. Recent studies have shown that enhancers are transcribed to produce enhancer RNAs (eRNAs). Do eRNAs play a functional role in activating gene expression or are they non-functional byproducts of nearby transcription machinery? The unstable nature of eRNAs and over-reliance on knockdown approaches have made elucidating the possible functions of eRNAs challenging. We focus here on studies using cloned eRNAs to study their function as transcripts, revealing roles for eRNAs in enhancer-promoter looping, recruiting transcriptional machinery, and facilitating RNA polymerase pause-release to regulate gene expression.
Affiliation(s)
- Tim Y Hou
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA
| | - W Lee Kraus
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA; Division of Basic Research, Department of Obstetrics and Gynecology, University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.
|
48
|
Han J, Wang P, Wang Q, Lin Q, Chen Z, Yu G, Miao C, Dao Y, Wu R, Schnable JC, Tang H, Wang K. Genome-Wide Characterization of DNase I-Hypersensitive Sites and Cold Response Regulatory Landscapes in Grasses. THE PLANT CELL 2020; 32:2457-2473. [PMID: 32471863 PMCID: PMC7401015 DOI: 10.1105/tpc.19.00716] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 09/16/2019] [Revised: 05/11/2020] [Accepted: 05/23/2020] [Indexed: 05/05/2023]
Abstract
Deep sequencing of DNase-I treated chromatin (DNase-seq) can be used to identify DNase I-hypersensitive sites (DHSs) and facilitates genome-scale mining of de novo cis-regulatory DNA elements. Here, we adapted DNase-seq to generate genome-wide maps of DHSs using control and cold-treated leaf, stem, and root tissues of three widely studied grass species: Brachypodium distachyon, foxtail millet (Setaria italica), and sorghum (Sorghum bicolor). Functional validation demonstrated that 12 of 15 DHSs drove reporter gene expression in transiently transgenic B. distachyon protoplasts. DHSs under both normal and cold treatment substantially differed among tissues and species. Intriguingly, the putative DHS-derived transcription factors (TFs) are largely colocated among tissues and species and include 17 ubiquitous motifs covering all grass taxa and all tissues examined in this study. This feature allowed us to reconstruct a regulatory network that responds to cold stress. Ethylene-responsive TFs SHINE3, ERF2, and ERF9 occurred frequently in cold feedback loops in the tissues examined, pointing to their possible roles in the regulatory network. Overall, we provide experimental annotation of 322,713 DHSs and 93 derived cold-response TF binding motifs in multiple grasses, which could serve as a valuable resource for elucidating the transcriptional networks that function in the cold-stress response and other physiological processes.
Affiliation(s)
- Jinlei Han
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Pengxi Wang
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Qiongli Wang
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Qingfang Lin
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Zhiyong Chen
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Guangrun Yu
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Chenyong Miao
- Center for Plant Science Innovation, University of Nebraska, Lincoln, Nebraska 68588
| | - Yihang Dao
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Ruoxi Wu
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - James C Schnable
- Center for Plant Science Innovation, University of Nebraska, Lincoln, Nebraska 68588
| | - Haibao Tang
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
| | - Kai Wang
- Key Laboratory of Genetics, Breeding, and Multiple Utilization of Crops, Ministry of Education, Fujian Provincial Key Laboratory of Haixia Applied Plant Systems Biology, Center for Genomics and Biotechnology, Fujian Agriculture and Forestry University, 350002 Fuzhou, China
|
49
|
Use Chou’s 5-steps rule to identify DNase I hypersensitive sites via dinucleotide property matrix and extreme gradient boosting. Mol Genet Genomics 2020; 295:1431-1442. [DOI: 10.1007/s00438-020-01711-8] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 04/14/2020] [Accepted: 07/11/2020] [Indexed: 01/08/2023]
|
50
|
Malladi VS, Nagari A, Franco HL, Kraus WL. Total Functional Score of Enhancer Elements Identifies Lineage-Specific Enhancers That Drive Differentiation of Pancreatic Cells. Bioinform Biol Insights 2020; 14:1177932220938063. [PMID: 32655276 PMCID: PMC7331761 DOI: 10.1177/1177932220938063] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/25/2020] [Accepted: 06/02/2020] [Indexed: 01/10/2023] Open
Abstract
The differentiation of embryonic stem cells into various lineages is highly dependent on the chromatin state of the genome and patterns of gene expression. To identify lineage-specific enhancers driving the differentiation of progenitors into pancreatic cells, we used a previously described computational framework called Total Functional Score of Enhancer Elements (TFSEE), which integrates multiple genomic assays that probe both transcriptional and epigenomic states. First, we evaluated and compared TFSEE as an enhancer-calling algorithm with enhancers called using GRO-seq-defined enhancer transcripts (method 1) versus enhancers called using histone modification ChIP-seq data (method 2). Second, we used TFSEE to define the enhancer landscape and identify transcription factors (TFs) that maintain the multipotency of a subpopulation of endodermal stem cells during differentiation into pancreatic lineages. Collectively, our results demonstrate that TFSEE is a robust enhancer-calling algorithm that can be used to perform multilayer genomic data integration to uncover cell type-specific TFs that control lineage-specific enhancers.
Affiliation(s)
- Venkat S Malladi
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, The University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Bioinformatics, The University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Anusha Nagari
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, The University of Texas Southwestern Medical Center, Dallas, TX, USA
| | - Hector L Franco
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, The University of Texas Southwestern Medical Center, Dallas, TX, USA; Department of Genetics and Lineberger Comprehensive Cancer Center, The University of North Carolina at Chapel Hill, Chapel Hill, NC, USA
| | - W Lee Kraus
- Laboratory of Signaling and Gene Regulation, Cecil H. and Ida Green Center for Reproductive Biology Sciences, The University of Texas Southwestern Medical Center, Dallas, TX, USA
|