1
Li D, He M, Tang Q, Tian S, Zhang J, Li Y, Wang D, Jin L, Ning C, Zhu W, Hu S, Long K, Ma J, Liu J, Zhang Z, Li M. Comparative 3D genome architecture in vertebrates. BMC Biol 2022; 20:99. [PMID: 35524220 PMCID: PMC9077971 DOI: 10.1186/s12915-022-01301-7] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 09/08/2021] [Accepted: 04/20/2022] [Indexed: 12/14/2022] Open
Abstract
Background: The three-dimensional (3D) architecture of the genome has a highly ordered and hierarchical nature, which influences the regulation of essential nuclear processes at the basis of gene expression, such as gene transcription. While the hierarchical organization of heterochromatin and euchromatin can underlie differences in gene expression that determine evolutionary differences among species, the way 3D genome architecture is affected by evolutionary forces within major lineages remains unclear. Here, we report a comprehensive comparison of 3D genomes, using high resolution Hi-C data in fibroblast cells of fish, chickens, and 10 mammalian species. Results: This analysis shows a correlation between genome size and chromosome length that affects chromosome territory (CT) organization in the upper hierarchy of genome architecture, whereas lower hierarchical features, including local transcriptional availability of DNA, are selected through the evolution of vertebrates. Furthermore, conservation of topologically associating domains (TADs) appears strongly associated with the modularity of expression profiles across species. Additionally, LINE and SINE transposable elements likely contribute to heterochromatin and euchromatin organization, respectively, during the evolution of genome architecture. Conclusions: Our analysis uncovers organizational features that appear to determine the conservation and transcriptional regulation of functional genes across species. These findings can guide ongoing investigations of genome evolution by extending our understanding of the mechanisms shaping genome architecture. Supplementary Information: The online version contains supplementary material available at 10.1186/s12915-022-01301-7.
Affiliation(s)
- Diyan Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Mengnan He
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Qianzi Tang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Shilin Tian
- Department of Ecology, Tibetan Centre for Ecology and Conservation at WHU-TU, Hubei Key Laboratory of Cell Homeostasis, College of Life Sciences, Wuhan University, Wuhan, 430072, China; Novogene Bioinformatics Institute, Beijing, 100000, China
- Jiaman Zhang
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Yan Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Danyang Wang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China; University of Chinese Academy of Sciences, Beijing, 100049, China
- Long Jin
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Chunyou Ning
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Wei Zhu
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Silu Hu
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Keren Long
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Jideng Ma
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
- Jing Liu
- University of Chinese Academy of Sciences, Beijing, 100049, China
- Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences and China National Center for Bioinformation, Beijing, 100101, China; University of Chinese Academy of Sciences, Beijing, 100049, China; School of Artificial Intelligence, University of Chinese Academy of Sciences, Beijing, 100049, China
- Mingzhou Li
- Institute of Animal Genetics and Breeding, College of Animal Science and Technology, Sichuan Agricultural University, Chengdu, 611130, China
2
Zhu Y, Gujar AD, Wong CH, Tjong H, Ngan CY, Gong L, Chen YA, Kim H, Liu J, Li M, Mil-Homens A, Maurya R, Kuhlberg C, Sun F, Yi E, deCarvalho AC, Ruan Y, Verhaak RGW, Wei CL. Oncogenic extrachromosomal DNA functions as mobile enhancers to globally amplify chromosomal transcription. Cancer Cell 2021; 39:694-707.e7. [PMID: 33836152 PMCID: PMC8119378 DOI: 10.1016/j.ccell.2021.03.006] [Citation(s) in RCA: 94] [Impact Index Per Article: 31.3] [Received: 07/14/2020] [Revised: 11/05/2020] [Accepted: 03/12/2021] [Indexed: 12/13/2022]
Abstract
Extrachromosomal, circular DNA (ecDNA) is emerging as a prevalent yet less characterized oncogenic alteration in cancer genomes. We leverage ChIA-PET and ChIA-Drop chromatin interaction assays to characterize genome-wide ecDNA-mediated chromatin contacts that impact transcriptional programs in cancers. ecDNAs in glioblastoma patient-derived neurosphere and prostate cancer cell cultures are marked by widespread intra-ecDNA and genome-wide chromosomal interactions. ecDNA-chromatin contact foci are characterized by broad and high-level H3K27ac signals converging predominantly on chromosomal genes of increased expression levels. Prostate cancer cells harboring synthetic ecDNA circles composed of characterized enhancers result in the genome-wide activation of chromosomal gene transcription. Deciphering the chromosomal targets of ecDNAs at single-molecule resolution reveals an association with actively expressed oncogenes spatially clustered within ecDNA-directed interaction networks. Our results suggest that ecDNA can function as mobile transcriptional enhancers to promote tumor progression and manifest a potential synthetic aneuploidy mechanism of transcription control in cancer.
Affiliation(s)
- Yanfen Zhu
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Amit D Gujar
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Chee-Hong Wong
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Harianto Tjong
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Chew Yee Ngan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Liang Gong
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Yi-An Chen
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Hoon Kim
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Jihe Liu
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Meihong Li
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Adam Mil-Homens
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Rahul Maurya
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Chris Kuhlberg
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Fanyue Sun
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Eunhee Yi
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Ana C deCarvalho
- Department of Neurosurgery, Henry Ford Hospital, Detroit, MI 48202, USA
- Yijun Ruan
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Roel G W Verhaak
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
- Chia-Lin Wei
- The Jackson Laboratory for Genomic Medicine, Farmington, CT 06032, USA
3
Abstract
Three-dimensional (3D) genome organization has emerged as an important layer of gene regulation in development and disease. The functional properties of chromatin folding within individual chromosomes (i.e., intra-chromosomal or in cis) have been studied extensively. On the other hand, interactions across different chromosomes (i.e., inter-chromosomal or in trans) have received less attention, being often regarded as background noise or technical artifacts. This viewpoint has been challenged by emerging evidence of functional relationships between specific trans chromatin interactions and epigenetic control, transcription, and splicing. Therefore, it is an intriguing possibility that the key processes involved in the biogenesis of RNAs may both shape and be in turn influenced by inter-chromosomal genome architecture. Here I present the rationale behind this hypothesis, and discuss a potential experimental framework aimed at its formal testing. I present a specific example in the cardiac myocyte, a well-studied post-mitotic cell whose development and response to stress are associated with marked rearrangements of chromatin topology both in cis and in trans. I argue that RNA polymerase II clusters (i.e., transcription factories) and foci of the cardiac-specific splicing regulator RBM20 (i.e., splicing factories) exemplify the existence of trans-interacting chromatin domains (TIDs) with important roles in cellular homeostasis. Overall, I propose that inter-molecular 3D proximity between co-regulated nucleic acids may be a pervasive functional mechanism in biology.
Affiliation(s)
- Alessandro Bertero
- Department of Laboratory Medicine and Pathology, Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA, United States
4
Abstract
BACKGROUND: Chromosome conformation capture-based methods, especially Hi-C, enable scientists to detect genome-wide chromatin interactions and study the spatial organization of chromatin, which plays important roles in gene expression regulation, DNA replication and repair, etc. Thus, developing computational methods to unravel patterns behind the data becomes critical. Existing computational methods focus on intrachromosomal interactions and ignore interchromosomal interactions, partly because there is no prior knowledge for interchromosomal interactions and the frequency of interchromosomal interactions is much lower while the search space is much larger. With the development of single-cell technologies, the advent of single-cell Hi-C makes interrogating the spatial structure of chromatin at single-cell resolution possible. It also brings a new type of frequency information, the number of single cells with chromatin interactions between two disjoint chromosome regions. RESULTS: Considering the lack of computational methods on interchromosomal interactions and the unsurprisingly frequent intrachromosomal interactions along the diagonal of a chromatin contact map, we propose a computational method dedicated to analyzing interchromosomal interactions of single-cell Hi-C with this new frequency information. To the best of our knowledge, our proposed tool is the first to identify regions with statistically frequent interchromosomal interactions at single-cell resolution. We demonstrate that the tool utilizing networks and binomial statistical tests can identify interesting structural regions through visualization, comparison and enrichment analysis and it also supports different configurations to provide users with flexibility. CONCLUSIONS: It will be a useful tool for analyzing single-cell Hi-C interchromosomal interactions.
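The abstract above mentions binomial tests on the number of single cells that show a contact between two regions. A minimal sketch of that idea follows, assuming a single background contact probability shared by all region pairs; the function names, the background probability, and the toy counts are hypothetical and are not taken from the tool itself.

```python
from math import comb


def binomial_upper_tail(k: int, n: int, p: float) -> float:
    """Return P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def frequent_pairs(cell_counts, n_cells, background_p, alpha=0.05):
    """Keep region pairs seen in contact in unexpectedly many cells.

    cell_counts maps (region_a, region_b) to the number of single cells
    in which that interchromosomal contact was observed.
    background_p is an assumed per-cell chance of seeing the contact.
    """
    hits = {}
    for pair, k in cell_counts.items():
        p_value = binomial_upper_tail(k, n_cells, background_p)
        if p_value < alpha:
            hits[pair] = p_value
    return hits


# Toy data (hypothetical): 20 cells, two candidate region pairs.
counts = {
    ("chr1:0-1Mb", "chr2:5-6Mb"): 9,
    ("chr1:0-1Mb", "chr3:2-3Mb"): 1,
}
significant = frequent_pairs(counts, n_cells=20, background_p=0.1)
print(sorted(significant))
```

This sketch applies no multiple-testing correction, which a genome-wide scan over many region pairs would likely need.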
Affiliation(s)
- Lu Liu
- North Dakota State University, 1340 Administration Ave, Fargo, 58102, USA
5
Ye C, Paccanaro A, Gerstein M, Yan KK. The corrected gene proximity map for analyzing the 3D genome organization using Hi-C data. BMC Bioinformatics 2020; 21:222. [PMID: 32471347 PMCID: PMC7260828 DOI: 10.1186/s12859-020-03545-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/17/2020] [Accepted: 05/11/2020] [Indexed: 11/11/2022] Open
Abstract
BACKGROUND: Genome-wide ligation-based assays such as Hi-C provide us with an unprecedented opportunity to investigate the spatial organization of the genome. Results of a typical Hi-C experiment are often summarized in a chromosomal contact map, a matrix whose elements reflect the co-location frequencies of genomic loci. To elucidate the complex structural and functional interactions between those genomic loci, networks offer a natural and powerful framework. RESULTS: We propose a novel graph-theoretical framework, the Corrected Gene Proximity (CGP) map to study the effect of the 3D spatial organization of genes in transcriptional regulation. The starting point of the CGP map is a weighted network, the gene proximity map, whose weights are based on the contact frequencies between genes extracted from genome-wide Hi-C data. We derive a null model for the network based on the signal contributed by the 1D genomic distance and use it to "correct" the gene proximity for cell type 3D specific arrangements. The CGP map, therefore, provides a network framework for the 3D structure of the genome on a global scale. On human cell lines, we show that the CGP map can detect and quantify gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies. Analyzing the expression pattern of metabolic pathways of two hematopoietic cell lines, we find that the relative positioning of the genes, as captured and quantified by the CGP, is highly correlated with their expression change. We further show that the CGP map can be used to form an inter-chromosomal proximity map that allows large-scale abnormalities, such as chromosomal translocations, to be identified. CONCLUSIONS: The Corrected Gene Proximity map is a map of the 3D structure of the genome on a global scale. It allows the simultaneous analysis of intra- and inter-chromosomal interactions and of gene co-regulation and co-localization more effectively than the map obtained by raw contact frequencies, thus revealing hidden associations between global spatial positioning and gene expression. The flexible graph-based formalism of the CGP map can be easily generalized to study any existing Hi-C datasets.
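As a rough illustration of correcting a proximity map for 1D genomic distance, the sketch below divides each observed contact by the mean contact at the same separation along one chromosome. This observed/expected ratio is a common stand-in and is an assumption here; it is not the null model derived in the paper, and the function names and toy matrix are hypothetical.

```python
from collections import defaultdict


def distance_expected(contacts):
    """Mean contact frequency at each 1D separation |i - j| (upper triangle)."""
    n = len(contacts)
    totals = defaultdict(float)
    counts = defaultdict(int)
    for i in range(n):
        for j in range(i + 1, n):
            totals[j - i] += contacts[i][j]
            counts[j - i] += 1
    return {d: totals[d] / counts[d] for d in totals}


def corrected_map(contacts):
    """Observed / expected-by-distance ratio for every off-diagonal pair."""
    expected = distance_expected(contacts)
    n = len(contacts)
    out = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i != j:
                e = expected[abs(i - j)]
                out[i][j] = contacts[i][j] / e if e > 0 else 0.0
    return out


# Toy symmetric contact matrix for four genes on one chromosome (hypothetical).
contacts = [
    [0, 8, 6, 4],
    [8, 0, 8, 2],
    [6, 8, 0, 8],
    [4, 2, 8, 0],
]
corrected = corrected_map(contacts)
for row in corrected:
    print(row)
```

Values above 1 mark gene pairs that are closer than their linear separation alone would suggest; values below 1 mark pairs that are farther apart.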
Affiliation(s)
- Cheng Ye
- Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- Alberto Paccanaro
- Department of Computer Science, Centre for Systems and Synthetic Biology, Royal Holloway, University of London, Egham, TW20 0EX, UK
- School of Applied Mathematics, Fundação Getulio Vargas, Rio de Janeiro, Brazil
- Mark Gerstein
- Program in Computational Biology and Bioinformatics, Department of Molecular Biophysics and Biochemistry, Department of Computer Science, Department of Statistics and Data Science, Yale University, New Haven, CT, 06520, USA
- Koon-Kiu Yan
- Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, 38105-3678, USA
6
Ito EA, Katahira I, Vicente FFDR, Pereira LFP, Lopes FM. BASiNET-BiologicAl Sequences NETwork: a case study on coding and non-coding RNAs identification. Nucleic Acids Res 2019; 46:e96. [PMID: 29873784 PMCID: PMC6144827 DOI: 10.1093/nar/gky462] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 12/15/2017] [Accepted: 05/22/2018] [Indexed: 01/23/2023] Open
Abstract
With the emergence of Next Generation Sequencing (NGS) technologies, a large volume of sequence data, in particular from de novo sequencing, has been rapidly produced at relatively low cost. In this context, computational tools are increasingly important to assist in the identification of relevant information to understand the functioning of organisms. This work introduces BASiNET, an alignment-free tool for classifying biological sequences based on feature extraction from complex network measurements. The method first transforms the sequences and represents them as complex networks. Then it extracts topological measures and constructs a feature vector that is used to classify the sequences. The method was evaluated in the classification of coding and non-coding RNAs of 13 species and compared to the CNCI, PLEK and CPC2 methods. BASiNET outperformed all compared methods in all adopted organisms and datasets. BASiNET classified sequences in all organisms with high accuracy and low standard deviation, showing that the method is robust and not biased by the organism. The proposed methodology is implemented as open source in the R language and freely available for download at https://cran.r-project.org/package=BASiNET.
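To make the sequence-to-network step concrete, here is a minimal sketch, assuming that nodes are overlapping words of a fixed length and that an undirected edge joins words adjacent in the sequence. The word size, step, and feature set are illustrative assumptions, and the function names are hypothetical; this is not the BASiNET package API.

```python
from collections import defaultdict


def sequence_to_network(seq, word=3, step=1):
    """Map a sequence to undirected edges between consecutive words."""
    words = [seq[i:i + word] for i in range(0, len(seq) - word + 1, step)]
    edges = defaultdict(int)
    for a, b in zip(words, words[1:]):
        if a != b:  # skip self-loops
            edges[frozenset((a, b))] += 1  # edge weight = co-occurrence count
    return edges


def topological_features(edges):
    """Compute a small feature vector from the unweighted topology."""
    degree = defaultdict(int)
    for pair in edges:
        for node in pair:
            degree[node] += 1
    n_nodes = len(degree)
    n_edges = len(edges)
    return {
        "nodes": n_nodes,
        "edges": n_edges,
        "mean_degree": 2 * n_edges / n_nodes if n_nodes else 0.0,
        "max_degree": max(degree.values(), default=0),
    }


network = sequence_to_network("AUGGCAUGG")
features = topological_features(network)
print(features)
```

A feature dictionary like this one could then be passed to any standard classifier; the choice of classifier is outside the scope of this sketch.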
Affiliation(s)
- Eric Augusto Ito
- Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil
- Isaque Katahira
- Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil
- Fábio Fernandes da Rocha Vicente
- Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil
- Luiz Filipe Protasio Pereira
- Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil; Empresa Brasileira de Pesquisa Agropecuária, Embrapa Café, Brasília, DF 70770-901, Brazil
- Fabrício Martins Lopes
- Department of Computer Science, Bioinformatics Graduate Program, Federal University of Technology - Paraná, Cornélio Procópio, PR 86300-000, Brazil
7
Huang H, Chen ST, Titus KR, Emerson DJ, Bassett DS, Phillips-Cremins JE. A subset of topologically associating domains fold into mesoscale core-periphery networks. Sci Rep 2019; 9:9526. [PMID: 31266973 DOI: 10.1038/s41598-019-45457-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/17/2018] [Accepted: 06/07/2019] [Indexed: 12/21/2022] Open
Abstract
Mammalian genomes are folded into a hierarchy of compartments, topologically associating domains (TADs), subTADs, and long-range looping interactions. The higher-order folding patterns of chromatin contacts within TADs and how they localize to disease-associated single nucleotide variants (daSNVs) remain an open area of investigation. Here, we analyze high-resolution Hi-C data with graph theory to understand possible mesoscale network architecture within chromatin domains. We identify a subset of TADs exhibiting strong core-periphery mesoscale structure in embryonic stem cells, neural progenitor cells, and cortical neurons. Hyper-connected core nodes co-localize with genomic segments engaged in multiple looping interactions and enriched for occupancy of the architectural protein CCCTC binding protein (CTCF). CTCF knockdown and in silico deletion of CTCF-bound core nodes disrupt core-periphery structure, whereas in silico mutation of cell type-specific enhancer or gene nodes has a negligible effect. Importantly, neuropsychiatric daSNVs are significantly more likely to localize with TADs folded into core-periphery networks compared to domains devoid of such structure. Together, our results reveal that a subset of TADs encompasses looping interactions connected into a core-periphery mesoscale network. We hypothesize that daSNVs in the periphery of genome folding networks might preserve global nuclear architecture but cause local topological and functional disruptions contributing to human disease. By contrast, daSNVs co-localized with hyper-connected core nodes might cause severe topological and functional disruptions. Overall, these findings shed new light on the mesoscale network structure of fine-scale genome folding within chromatin domains and its link to common genetic variants in human disease.
8
Diament A, Tuller T. Modeling three-dimensional genomic organization in evolution and pathogenesis. Semin Cell Dev Biol 2018; 90:78-93. [PMID: 30030143 DOI: 10.1016/j.semcdb.2018.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Received: 03/28/2018] [Accepted: 07/08/2018] [Indexed: 12/17/2022]
Abstract
The regulation of gene expression is mediated via the complex three-dimensional (3D) conformation of the genetic material and its interactions with various intracellular factors. Various experimental and computational approaches have been developed in recent years for understanding the relation between the 3D conformation of the genome and the phenotypes of cells under normal conditions and in disease. In this review, we will discuss novel approaches for analyzing and modeling the 3D genomic conformation, focusing on deciphering disease-causing mutations that affect gene expression. We conclude that as this is a very challenging mission, an important direction should involve the comparative analysis of various 3D models from various organisms or cells.
Affiliation(s)
- Alon Diament
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Tamir Tuller
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel; The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv 6997801, Israel
9
Chang P, Gohain M, Yen MR, Chen PY. Computational Methods for Assessing Chromatin Hierarchy. Comput Struct Biotechnol J 2018; 16:43-53. [PMID: 29686798 PMCID: PMC5910504 DOI: 10.1016/j.csbj.2018.02.003] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.2] [Received: 09/07/2017] [Revised: 01/29/2018] [Accepted: 02/11/2018] [Indexed: 12/27/2022] Open
Abstract
The hierarchical organization of chromatin is known to associate with diverse cellular functions; however, the precise mechanisms and the 3D structure remain to be determined. With recent advances in high-throughput next generation sequencing (NGS) techniques, genome-wide profiling of chromatin structures is made possible. Here, we provide a comprehensive overview of NGS-based methods for profiling "higher-order" and "primary-order" chromatin structures from both experimental and computational aspects. Experimental requirements and considerations specific for each method were highlighted. For computational analysis, we summarized a common analysis strategy for both levels of chromatin assessment, focusing on the characteristic computing steps and the tools. The recently developed single-cell level techniques based on Hi-C and ATAC-seq present great potential to reveal cell-to-cell variability in chromosome architecture. A brief discussion on these methods in terms of experimental and data analysis features is included. We also touch upon the biological relevance of chromatin organization and how the combination with other techniques uncovers the underlying mechanisms. We conclude with a summary and our prospects on necessary improvements of currently available methods in order to advance understanding of chromatin hierarchy. Our review brings together the analyses of both higher- and primary-order chromatin structures, and serves as a roadmap when choosing appropriate experimental and computational methods for assessing chromatin hierarchy.
Affiliation(s)
- Pearl Chang
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Moloya Gohain
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Ming-Ren Yen
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
- Pao-Yang Chen
- Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
10
Abstract
New Hi-C technologies have revealed that chromosomes have a complex network of spatial contacts in the cell nucleus of higher organisms, whose organisation is only partially understood. Here, we investigate the structure of such a network in human GM12878 cells, to derive a large scale picture of nuclear architecture. We find that the intensity of intra-chromosomal interactions is power-law distributed. Inter-chromosomal interactions are two orders of magnitude weaker and exponentially distributed, yet they are not randomly arranged along the genomic sequence. Intra-chromosomal contacts broadly occur between epigenomically homologous regions, whereas inter-chromosomal contacts are especially associated with regions rich in highly expressed genes. Overall, genomic contacts in the nucleus appear to be structured as a network of networks where a set of strongly individual chromosomal units, as envisaged in the ‘chromosomal territory’ scenario derived from microscopy, interact with each other via on average weaker, yet far from random and functionally important interactions.
Affiliation(s)
- Sergio Sarnataro
- Institut de Génétique et de Biologie Moléculaire et Cellulaire, 67404 Illkirch, France
- Andrea M. Chiariello
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, CNR-SPIN, Complesso Universitario di Monte Sant’Angelo, 80126 Naples, Italy
- Andrea Esposito
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, CNR-SPIN, Complesso Universitario di Monte Sant’Angelo, 80126 Naples, Italy
- Max Delbrück Center for Molecular Medicine in the Helmholtz Association (MDC), Berlin, Germany
- Mario Nicodemi
- Dipartimento di Fisica, Università di Napoli Federico II, and INFN Napoli, CNR-SPIN, Complesso Universitario di Monte Sant’Angelo, 80126 Naples, Italy
11
Diament A, Tuller T. Tracking the evolution of 3D gene organization demonstrates its connection to phenotypic divergence. Nucleic Acids Res 2017; 45:4330-4343. [PMID: 28369658 PMCID: PMC5416853 DOI: 10.1093/nar/gkx205] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 11/17/2016] [Accepted: 03/20/2017] [Indexed: 12/20/2022] Open
Abstract
It has recently been shown that the organization of genes in eukaryotic genomes, and specifically in 3D, is strongly related to gene expression and function and partially conserved between organisms. However, previous studies of 3D genomic organization analyzed each organism independently from others. Here, we propose an approach for unified inter-organismal analysis of gene organization based on a network representation of Hi-C data. We define and detect four classes of spatially co-evolving orthologous modules (SCOMs), i.e. gene families that co-evolve in their 3D organization, based on patterns of divergence and conservation of distances. We demonstrate our methodology on Hi-C data from Saccharomyces cerevisiae and Schizosaccharomyces pombe, and identify, among others, modules relating to RNA splicing machinery and chromatin silencing by small RNA which are central to S. pombe's lifestyle. Our results emphasize the importance of 3D genomic organization in eukaryotes and suggest that the evolutionary mechanisms that shape gene organization affect the organism fitness and phenotypes. The proposed algorithms can be utilized in future studies of genome evolution and comparative analysis of spatial genomic organization in different tissues, conditions and single cells.
Affiliation(s)
- Alon Diament
- Biomedical Engineering Dept., Tel Aviv University, Tel Aviv 6997801, Israel
- Tamir Tuller
- Biomedical Engineering Dept., Tel Aviv University, Tel Aviv 6997801, Israel; The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 6997801, Israel
12
Yan KK, Lou S, Gerstein M. MrTADFinder: A network modularity based approach to identify topologically associating domains in multiple resolutions. PLoS Comput Biol 2017; 13:e1005647. [PMID: 28742097 PMCID: PMC5546724 DOI: 10.1371/journal.pcbi.1005647] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 12/31/2016] [Revised: 08/07/2017] [Accepted: 06/27/2017] [Indexed: 11/18/2022] Open
Abstract
Genome-wide proximity ligation based assays such as Hi-C have revealed that eukaryotic genomes are organized into structural units called topologically associating domains (TADs). From a visual examination of the chromosomal contact map, however, it is clear that the organization of the domains is not simple or obvious. Instead, TADs exhibit various length scales and, in many cases, a nested arrangement. Here, by exploiting the resemblance between TADs in a chromosomal contact map and densely connected modules in a network, we formulate TAD identification as a network optimization problem and propose an algorithm, MrTADFinder, to identify TADs from intra-chromosomal contact maps. MrTADFinder is based on the network-science concept of modularity. A key component of it is deriving an appropriate background model for contacts in a random chain, by numerically solving a set of matrix equations. The background model preserves the observed coverage of each genomic bin as well as the distance dependence of the contact frequency for any pair of bins exhibited by the empirical map. Also, by introducing a tunable resolution parameter, MrTADFinder provides a self-consistent approach for identifying TADs at different length scales, hence the acronym "Mr" standing for Multiple Resolutions. We then apply MrTADFinder to various Hi-C datasets. The identified domain boundaries are marked by characteristic signatures in chromatin marks and transcription factors (TF) that are consistent with earlier work. Moreover, by calling TADs at different length scales, we observe that boundary signatures change with resolution, with different chromatin features having different characteristic length scales. Furthermore, we report an enrichment of HOT (high-occupancy target) regions near TAD boundaries and investigate the role of different TFs in determining boundaries at various resolutions. 
To further explore the interplay between TADs and epigenetic marks, as tumor mutational burden is known to be coupled to chromatin structure, we examine how somatic mutations are distributed across boundaries and find a clear stepwise pattern. Overall, MrTADFinder provides a novel computational framework to explore the multi-scale structures in Hi-C contact maps.

The accommodation of the roughly 2 m of DNA in the nuclei of mammalian cells results in an intricate structure, in which the topologically associating domains (TADs) formed by densely interacting genomic regions emerge as a fundamental structural unit. Identification of TADs is essential for understanding the role of 3D genome organization in gene regulation. By viewing the chromosomal contact map as a network, TADs correspond to the densely connected regions in the network. Motivated by this mapping, we propose a novel method, MrTADFinder, to identify TADs based on the concept of modularity in network science. Using MrTADFinder, we identify domains at various resolutions, and further explore the interplay between domains and other chromatin features like transcription factors binding and histone modifications at different resolutions. Overall, MrTADFinder provides a new computational framework to investigate the multiple length scales that are built inside the organization of the genome.
Affiliation(s)
- Koon-Kiu Yan
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States of America
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States of America
- Shaoke Lou
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States of America
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States of America
- Mark Gerstein
  - Program in Computational Biology and Bioinformatics, Yale University, New Haven, CT, United States of America
  - Department of Molecular Biophysics and Biochemistry, Yale University, New Haven, CT, United States of America
  - Department of Computer Science, Yale University, New Haven, CT, United States of America
|
13
|
Boudaoud I, Fournier É, Baguette A, Vallée M, Lamaze FC, Droit A, Bilodeau S. Connected Gene Communities Underlie Transcriptional Changes in Cornelia de Lange Syndrome. Genetics 2017; 207:139-51. [PMID: 28679547 DOI: 10.1534/genetics.117.202291] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Received: 03/24/2017] [Accepted: 06/28/2017] [Indexed: 12/25/2022] Open
Abstract
Cornelia de Lange syndrome (CdLS) is a complex multisystem developmental disorder caused by mutations in cohesin subunits and regulators. While its precise molecular mechanisms are not well defined, they point toward a global deregulation of the transcriptional gene expression program. Cohesin is associated with the boundaries of chromosome domains and with enhancer and promoter regions connecting the three-dimensional genome organization with transcriptional regulation. Here, we show that connected gene communities, structures emerging from the interactions of noncoding regulatory elements and genes in the three-dimensional chromosomal space, provide a molecular explanation for the pathoetiology of CdLS associated with mutations in the cohesin-loading factor NIPBL and the cohesin subunit SMC1A. NIPBL and cohesin are important constituents of connected gene communities that are centrally positioned at noncoding regulatory elements. Accordingly, genes deregulated in CdLS are positioned within reach of NIPBL- and cohesin-occupied regions through promoter-promoter interactions. Our findings suggest a dynamic model where NIPBL loads cohesin to connect genes in communities, offering an explanation for the gene expression deregulation in CdLS.
|
14
|
Robert A. Beagrie, Antonio Scialdone, Markus Schueler, Dorothee C.A. Kraemer, Mita Chotalia, Sheila Q. Xie, Mariano Barbieri, Inês de Santiago, Liron-Mark Lavitas, Miguel R. Branco, James Fraser, Josée Dostie, Laurence Game, Niall Dillon, Paul A.W. Edwards, Mario Nicodemi, Ana Pombo. Complex multi-enhancer contacts captured by Genome Architecture Mapping (GAM). Nature 2017; 543. [PMID: 28273065 DOI: 10.1038/nature21411] [Citation(s) in RCA: 425] [Impact Index Per Article: 60.7] [Received: 05/02/2015] [Accepted: 01/18/2017] [Indexed: 12/22/2022]
Abstract
The organization of the genome in the nucleus and the interactions of genes with their regulatory elements are key features of transcriptional control and their disruption can cause disease. We developed a novel genome-wide method, Genome Architecture Mapping (GAM), for measuring chromatin contacts, and other features of three-dimensional chromatin topology, based on sequencing DNA from a large collection of thin nuclear sections. We apply GAM to mouse embryonic stem cells and identify an enrichment for specific interactions between active genes and enhancers across very large genomic distances, using a mathematical model ‘SLICE’ (Statistical Inference of Co-segregation). GAM also reveals an abundance of three-way contacts genome-wide, especially between regions that are highly transcribed or contain super-enhancers, highlighting a previously inaccessible complexity in genome architecture and a major role for gene-expression specific contacts in organizing the genome in mammalian nuclei.
|
15
|
Hahn S, Kim D. Relationship between spatial organization and biological function, analyzed using gene ontology and chromosome conformation capture of human and fission yeast genomes. Genes Genomics 2016; 38:693-705. [DOI: 10.1007/s13258-016-0413-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
16
|
Pancaldi V, Carrillo-de-Santa-Pau E, Javierre BM, Juan D, Fraser P, Spivakov M, Valencia A, Rico D. Integrating epigenomic data and 3D genomic structure with a new measure of chromatin assortativity. Genome Biol 2016; 17:152. [PMID: 27391817 PMCID: PMC4939006 DOI: 10.1186/s13059-016-1003-3] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.1] [Received: 05/10/2016] [Accepted: 06/07/2016] [Indexed: 01/09/2023] Open
Abstract
BACKGROUND Network analysis is a powerful way of modeling chromatin interactions. Assortativity is a network property used in social sciences to identify factors affecting how people establish social ties. We propose a new approach, using chromatin assortativity, to integrate the epigenomic landscape of a specific cell type with its chromatin interaction network and thus investigate which proteins or chromatin marks mediate genomic contacts. RESULTS We use high-resolution promoter capture Hi-C and Hi-Cap data as well as ChIA-PET data from mouse embryonic stem cells to investigate promoter-centered chromatin interaction networks and calculate the presence of specific epigenomic features in the chromatin fragments constituting the nodes of the network. We estimate the association of these features with the topology of four chromatin interaction networks and identify features localized in connected areas of the network. Polycomb group proteins and associated histone marks are the features with the highest chromatin assortativity in promoter-centered networks. We then ask which features distinguish contacts amongst promoters from contacts between promoters and other genomic elements. We observe higher chromatin assortativity of the actively elongating form of RNA polymerase 2 (RNAPII) compared with inactive forms only in interactions between promoters and other elements. CONCLUSIONS Contacts among promoters and between promoters and other elements have different characteristic epigenomic features. We identify a possible role for the elongating form of RNAPII in mediating interactions among promoters, enhancers, and transcribed gene bodies. Our approach facilitates the study of multiple genome-wide epigenomic profiles, considering network topology and allowing the comparison of chromatin interaction networks.
Affiliation(s)
- Vera Pancaldi
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- David Juan
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Peter Fraser
  - Nuclear Dynamics Programme, The Babraham Institute, Cambridge, UK
- Mikhail Spivakov
  - Nuclear Dynamics Programme, The Babraham Institute, Cambridge, UK
- Alfonso Valencia
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Daniel Rico
  - Structural Biology and BioComputing Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
|
17
|
Meluzzi D, Arya G. Quantification of DNA cleavage specificity in Hi-C experiments. Nucleic Acids Res 2016; 44:e4. [PMID: 26264668 PMCID: PMC4705682 DOI: 10.1093/nar/gkv820] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/23/2014] [Accepted: 08/02/2015] [Indexed: 11/14/2022] Open
Abstract
Hi-C experiments produce large numbers of DNA sequence read pairs that are typically analyzed to deduce genomewide interactions between arbitrary loci. A key step in these experiments is the cleavage of cross-linked chromatin with a restriction endonuclease. Although this cleavage should happen specifically at the enzyme's recognition sequence, an unknown proportion of cleavage events may involve other sequences, owing to the enzyme's star activity or to random DNA breakage. A quantitative estimation of these non-specific cleavages may enable simulating realistic Hi-C read pairs for validation of downstream analyses, monitoring the reproducibility of experimental conditions and investigating biophysical properties that correlate with DNA cleavage patterns. Here we describe a computational method for analyzing Hi-C read pairs to estimate the fractions of cleavages at different possible targets. The method relies on expressing an observed local target distribution downstream of aligned reads as a linear combination of known conditional local target distributions. We validated this method using Hi-C read pairs obtained by computer simulation. Application of the method to experimental Hi-C datasets from murine cells revealed interesting similarities and differences in patterns of cleavage across the various experiments considered.
Affiliation(s)
- Dario Meluzzi
  - Department of NanoEngineering, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
- Gaurav Arya
  - Department of NanoEngineering, University of California San Diego, 9500 Gilman Dr., La Jolla, CA 92093, USA
|
18
|
Diament A, Tuller T. Three-dimensional Genomic Organization of Genes’ Function in Eukaryotes. Evol Biol 2016. [DOI: 10.1007/978-3-319-41324-2_14] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/21/2022]
|
19
|
Abstract
The study of the 3D architecture of chromosomes has been advancing rapidly in recent years. While a number of methods for 3D reconstruction of genomic models based on Hi-C data were proposed, most of the analyses in the field have been performed on different 3D representation forms (such as graphs). Here, we reproduce most of the previous results on the 3D genomic organization of the eukaryote Saccharomyces cerevisiae using analysis of 3D reconstructions. We show that many of these results can be reproduced in sparse reconstructions, generated from a small fraction of the experimental data (5% of the data), and study the properties of such models. Finally, we propose for the first time a novel approach for improving the accuracy of 3D reconstructions by introducing additional predicted physical interactions to the model, based on orthologous interactions in an evolutionary-related organism and based on predicted functional interactions between genes. We demonstrate that this approach indeed leads to the reconstruction of improved models.

Understanding the importance of genome architecture, the arrangement of genes within the genome and how this organization evolved has been intensively studied in recent years. Despite rapid progress in the field, accurate 3D modeling of genome organization remains a challenge. While a number of methods for 3D reconstruction of genomic models based on genome-wide experimental data were proposed, most of the analyses in the field have been performed on different 3D representation forms (such as graphs). Here, we reproduce most of the previous results on the 3D genome organization of the eukaryote Saccharomyces cerevisiae using analysis of 3D reconstructions. We show that many of these results can be reproduced in sparse reconstructions, generated from a small fraction of the experimental data (5% of the data), and study the properties of such models. Finally, we propose for the first time a novel approach for improving the accuracy of 3D reconstructions by introducing additional predicted physical interactions to the model, based on orthologous interactions in a different organism and based on predicted functional interactions between genes. Our proposed approach can facilitate future studies of 3D genome organization via improved models.
Affiliation(s)
- Alon Diament
  - Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv, Israel
- Tamir Tuller
  - Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv, Israel
  - The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv, Israel
|
20
|
Kaufmann S, Fuchs C, Gonik M, Khrameeva EE, Mironov AA, Frishman D. Inter-chromosomal contact networks provide insights into Mammalian chromatin organization. PLoS One 2015; 10:e0126125. [PMID: 25961318 PMCID: PMC4427453 DOI: 10.1371/journal.pone.0126125] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 11/28/2014] [Accepted: 03/30/2015] [Indexed: 11/18/2022] Open
Abstract
The recent advent of conformation capture techniques has provided unprecedented insights into the spatial organization of chromatin. We present a large-scale investigation of the inter-chromosomal segment and gene contact networks in embryonic stem cells of two mammalian organisms: humans and mice. Both interaction networks are characterized by a high degree of clustering of genome regions and the existence of hubs. Both genomes exhibit similar structural characteristics such as increased flexibility of certain Y chromosome regions and co-localization of centromere-proximal regions. Spatial proximity is correlated with the functional similarity of genes in both species. We also found a significant association between spatial proximity and the co-expression of genes in the human genome. The structural properties of chromatin are also species specific, including the presence of two highly interactive regions in mouse chromatin and an increased contact density on short, gene-rich human chromosomes, thereby indicating their central nuclear position. Trans-interacting segments are enriched in active marks in human and had no distinct feature profile in mouse. Thus, in contrast to interactions within individual chromosomes, the inter-chromosomal interactions in human and mouse embryonic stem cells do not appear to be conserved.
Affiliation(s)
- Stefanie Kaufmann
  - Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
- Christiane Fuchs
  - Helmholtz Zentrum München, German Research Center for Environmental Health, Institute of Computational Biology, Neuherberg, Germany
  - Technical University Munich, Institute for Mathematical Sciences, Garching, Germany
- Mariya Gonik
  - Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
  - Institute for Stroke and Dementia Research (ISD), Klinikum der Universität München, Munich, Germany
- Ekaterina E. Khrameeva
  - Research and Training Center on Bioinformatics, Institute for Information Transmission Problems, RAS, Moscow, Russia
- Andrey A. Mironov
  - Department of Bioengineering and Bioinformatics, M.V. Lomonosov Moscow State University, Moscow, Russia
- Dmitrij Frishman
  - Department of Genome Oriented Bioinformatics, Technische Universität München, Wissenschaftszentrum Weihenstephan, Freising, Germany
  - Institute for Bioinformatics and Systems Biology, HMGU German Research Center for Environmental Health, Neuherberg, Germany
  - Department of Bioinformatics, St Petersburg State Polytechnical University, St Petersburg, Russia
|
21
|
Diament A, Pinter RY, Tuller T. Three-dimensional eukaryotic genomic organization is strongly correlated with codon usage expression and function. Nat Commun 2014; 5:5876. [PMID: 25510862 DOI: 10.1038/ncomms6876] [Citation(s) in RCA: 22] [Impact Index Per Article: 2.2] [Received: 06/28/2014] [Accepted: 11/17/2014] [Indexed: 01/08/2023] Open
Abstract
It has been shown that the distribution of genes in eukaryotic genomes is not random; however, formerly reported relations between gene function and genomic organization were relatively weak. Previous studies have demonstrated that codon usage bias is related to all stages of gene expression and to protein function. Here we apply a novel tool for assessing functional relatedness, codon usage frequency similarity (CUFS), which measures similarity between genes in terms of codon and amino acid usage. By analyzing chromosome conformation capture data, describing the three-dimensional (3D) conformation of the DNA, we show that the functional similarity between genes captured by CUFS is directly and very strongly correlated with their 3D distance in Saccharomyces cerevisiae, Schizosaccharomyces pombe, Arabidopsis thaliana, mouse and human. This emphasizes the importance of three-dimensional genomic localization in eukaryotes and indicates that codon usage is tightly linked to genome architecture.
Affiliation(s)
- Alon Diament
  - Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
- Ron Y Pinter
  - Department of Computer Science, Technion-Israel Institute of Technology, Haifa 32000, Israel
- Tamir Tuller
  - [1] Department of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
  - [2] The Sagol School of Neuroscience, Tel Aviv University, Tel Aviv 6997801, Israel
|
22
|
Capurso D, Segal MR. Distance-based assessment of the localization of functional annotations in 3D genome reconstructions. BMC Genomics 2014; 15:992. [PMID: 25407917 PMCID: PMC4254257 DOI: 10.1186/1471-2164-15-992] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Received: 05/06/2014] [Accepted: 11/04/2014] [Indexed: 11/20/2022] Open
Abstract
Background Recent studies used the contact data or three-dimensional (3D) genome reconstructions from Hi-C (chromosome conformation capture with next-generation sequencing) to assess the co-localization of functional genomic annotations in the nucleus. These analyses dichotomized data point pairs belonging to a functional annotation as “close” or “far” based on some threshold and then tested for enrichment of “close” pairs. We propose an alternative approach that avoids dichotomization of the data and instead directly estimates the significance of distances within the 3D reconstruction. Results We applied this approach to 3D genome reconstructions for Plasmodium falciparum, the causative agent of malaria, and Saccharomyces cerevisiae and compared the results to previous approaches. We found significant 3D co-localization of centromeres, telomeres, virulence genes, and several sets of genes with developmentally regulated expression in P. falciparum; and significant 3D co-localization of centromeres and long terminal repeats in S. cerevisiae. Additionally, we tested the experimental observation that telomeres form three to seven clusters in P. falciparum and S. cerevisiae. Applying affinity propagation clustering to telomere coordinates in the 3D reconstructions yielded six telomere clusters for both organisms. Conclusions Distance-based assessment replicated key findings, while avoiding dichotomization of the data (which previously yielded threshold-sensitive results). Electronic supplementary material The online version of this article (doi:10.1186/1471-2164-15-992) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Mark R Segal
  - Department of Epidemiology and Biostatistics, University of California, San Francisco, CA 94107, USA
|
23
|
Paulsen J, Sandve GK, Gundersen S, Lien TG, Trengereid K, Hovig E. HiBrowse: multi-purpose statistical analysis of genome-wide chromatin 3D organization. Bioinformatics 2014; 30:1620-2. [PMID: 24511080 PMCID: PMC4029040 DOI: 10.1093/bioinformatics/btu082] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 01/05/2023]
Abstract
Summary: Recently developed methods that couple next-generation sequencing with chromosome conformation capture-based techniques, such as Hi-C and ChIA-PET, allow for characterization of genome-wide chromatin 3D structure. Understanding the organization of chromatin in three dimensions is a crucial next step in the unraveling of global gene regulation, and methods for analyzing such data are needed. We have developed HiBrowse, a user-friendly web-tool consisting of a range of hypothesis-based and descriptive statistics, using realistic assumptions in null-models. Availability and implementation: HiBrowse is supported by all major browsers, and is freely available at http://hyperbrowser.uio.no/3d. Software is implemented in Python, and source code is available for download by following instructions on the main site. Contact: jonaspau@ifi.uio.no Supplementary Information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Jonas Paulsen
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
- Geir Kjetil Sandve
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
- Sveinung Gundersen
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
- Tonje G Lien
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
- Kai Trengereid
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
- Eivind Hovig
  - Institute for Cancer Genetics and Informatics, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Department of Tumor Biology, Institute for Cancer Research, Oslo University Hospital, PO Box 4950, Nydalen, 0424 Oslo, Department of Mathematics, University of Oslo, Problemveien 7, 0313 Oslo and ELIXIR project, Department of Informatics, University of Oslo, Problemveien 7, 0313 Oslo, Norway
|
24
|
Abstract
Long-range chromosomal associations between genomic regions, and their repositioning in the 3D space of the nucleus, are now considered to be key contributors to the regulation of gene expression, and important links have been highlighted with other genomic features involved in DNA rearrangements. Recent Chromosome Conformation Capture (3C) measurements performed with high-throughput sequencing (Hi-C) and molecular dynamics studies show that there is a strong correlation between colocalization and coregulation of genes, but this research is hampered by the lack of biologist-friendly analysis and visualisation software. Here, we describe NuChart, an R package that allows the user to annotate and statistically analyse a list of input genes with information derived from Hi-C data, integrating knowledge about genomic features that are involved in the chromosome spatial organization. NuChart works directly with sequenced reads to identify the related Hi-C fragments, with the aim of creating gene-centric neighbourhood graphs on which multi-omics features can be mapped. Predictions about CTCF binding sites, isochores and cryptic Recombination Signal Sequences are provided directly with the package for mapping, although other annotation data in bed format can be used (such as methylation profiles and histone patterns). Gene expression data can be automatically retrieved and processed from the Gene Expression Omnibus and ArrayExpress repositories to highlight the expression profile of genes in the identified neighbourhood. Moreover, statistical inferences about the graph structure and correlations between its topology and multi-omics features can be performed using Exponential-family Random Graph Models. The Hi-C fragment visualisation provided by NuChart allows the comparison of cells in different conditions, thus making it possible to identify novel biomarkers. NuChart is compliant with the Bioconductor standard and is freely available at ftp://fileserver.itb.cnr.it/nuchart.
Affiliation(s)
- Ivan Merelli
  - Institute for Biomedical Technologies, National Research Council, Segrate (Milan), Italy
- Pietro Liò
  - Computer Laboratory, University of Cambridge, Cambridge, United Kingdom
- Luciano Milanesi
  - Institute for Biomedical Technologies, National Research Council, Segrate (Milan), Italy
|
25
|
Peng C, Fu LY, Dong PF, Deng ZL, Li JX, Wang XT, Zhang HY. The sequencing bias relaxed characteristics of Hi-C derived data and implications for chromatin 3D modeling. Nucleic Acids Res 2013; 41:e183. [PMID: 23965308 PMCID: PMC3799458 DOI: 10.1093/nar/gkt745] [Citation(s) in RCA: 42] [Impact Index Per Article: 3.8] [Indexed: 01/01/2023] Open
Abstract
Modeling 3D chromatin structure from chromatin interactions derived from Hi-C experiments is significantly challenged by the intrinsic sequencing biases in these experiments. Conventional modeling methods focus only on the bias among different chromatin regions within the same experiment but neglect the bias arising from differences in sequencing depth between experiments. We now show that the regional interaction bias is tightly coupled with the sequencing depth, and we further identify a chromatin structure parameter as an inherent characteristic of Hi-C derived data for chromatin regions. We then present an approach for chromatin structure prediction capable of relaxing both kinds of sequencing bias by using this parameter. The method is validated by intra- and inter-cell-line comparisons among various chromatin regions for four human cell lines (K562, GM12878, IMR90 and H1hESC), which show that the openness of a chromatin region is well correlated with chromatin function. The method has been implemented as an automatic pipeline (AutoChrom3D) and thus can be conveniently used.
Affiliation(s)
- Cheng Peng
  - National Key Laboratory of Crop Genetic Improvement, Center for Bioinformatics, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
|
26
|
Irimia M, Maeso I, Roy SW, Fraser HB. Ancient cis-regulatory constraints and the evolution of genome architecture. Trends Genet 2013; 29:521-8. [PMID: 23791467 DOI: 10.1016/j.tig.2013.05.008] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.4] [Received: 03/13/2013] [Revised: 05/02/2013] [Accepted: 05/15/2013] [Indexed: 01/18/2023]
Abstract
The order of genes along metazoan chromosomes has generally been thought to be largely random, with few implications for organismal function. However, two recent studies, reporting hundreds of pairs of genes that have remained linked in diverse metazoan species over hundreds of millions of years of evolution, suggest widespread functional implications for gene order. These associations appear to largely reflect cis-regulatory constraints, with either (i) multiple genes sharing transcriptional regulatory elements, or (ii) regulatory elements for a developmental gene being found within a neighboring 'bystander' gene (known as a genomic regulatory block). We discuss implications, questions raised, and new research directions arising from these studies, as well as evidence for similar phenomena in other eukaryotic groups.
Affiliation(s)
- Manuel Irimia
  - The Donnelly Centre, University of Toronto, Toronto, Ontario, Canada
|