1. Raynal F, Sengupta K, Plewczynski D, Aliaga B, Pancaldi V. Global chromatin reorganization and regulation of genes with specific evolutionary ages during differentiation and cancer. Nucleic Acids Res 2025; 53:gkaf084. [PMID: 39964480 PMCID: PMC11833689 DOI: 10.1093/nar/gkaf084] [Received: 09/18/2024] [Revised: 01/18/2025] [Accepted: 02/07/2025] [Indexed: 02/21/2025]
Abstract
Cancer cells are highly plastic, favoring adaptation to changing conditions. Genes related to basic cellular processes evolved in ancient species, while more specialized genes appeared later with multicellularity (metazoan genes) or even after mammals evolved. Transcriptomic analyses have shown that ancient genes are up-regulated in cancer, while metazoan-origin genes are inactivated. Despite the importance of these observations, the underlying mechanisms remain unexplored. Here, we study local and global epigenomic mechanisms that may regulate genes from specific evolutionary periods. Using evolutionary gene age data, we characterize the epigenomic landscape, gene expression regulation, and chromatin organization in several cell types: human embryonic stem cells, normal primary B-cells, primary chronic lymphocytic leukemia malignant B-cells, and primary colorectal cancer samples. We identify topological changes in chromatin organization during differentiation observing patterns in Polycomb repression and RNA polymerase II pausing, which are reversed during oncogenesis. Beyond the non-random organization of genes and chromatin features in the 3D epigenome, we suggest that these patterns lead to preferential interactions among ancient, intermediate, and recent genes, mediated by RNA polymerase II, Polycomb, and the lamina, respectively. Our findings shed light on gene regulation according to evolutionary age and suggest this organization changes across differentiation and oncogenesis.
Affiliation(s)
- Flavien Raynal
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, 31100 Toulouse, France
- Kaustav Sengupta
  - Laboratory of Functional and Structural Genomics, Center of New Technologies (CeNT), University of Warsaw, Mazowieckie, 02-097 Warsaw, Poland
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland
  - Department of Molecular Genetics, Erasmus University Medical Center, Erasmus MC Cancer Institute, 3015 GD Rotterdam, the Netherlands
- Dariusz Plewczynski
  - Laboratory of Functional and Structural Genomics, Center of New Technologies (CeNT), University of Warsaw, Mazowieckie, 02-097 Warsaw, Poland
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, 00-662 Warsaw, Poland
- Benoît Aliaga
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, 31100 Toulouse, France
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, 31100 Toulouse, France
  - Barcelona Supercomputing Center, 08034 Barcelona, Spain
2. Xu Y, Das P, McCord RP, Shen T. Node features of chromosome structure networks and their connections to genome annotation. Comput Struct Biotechnol J 2024; 23:2240-2250. [PMID: 38827231 PMCID: PMC11140560 DOI: 10.1016/j.csbj.2024.05.026] [Received: 12/13/2023] [Revised: 05/14/2024] [Accepted: 05/15/2024] [Indexed: 06/04/2024]
Abstract
The 3D conformations of chromosomes can encode biological significance, and the implications of such structures have been increasingly appreciated recently. Certain chromosome structural features, such as A/B compartmentalization, are frequently extracted from Hi-C pairwise genome contact information (physical association between different regions of the genome) and compared with linear annotations of the genome, such as histone modifications and lamina association. We investigate how additional properties of chromosome structure can be deduced using an abstract graph representation of the contact heatmap, and describe specific network properties that can have a strong connection with some of these biological annotations. We constructed chromosome structure networks (CSNs) from bulk Hi-C data and calculated a set of site-resolved (node-based) network properties. These properties are useful for characterizing certain aspects of chromosomal structure. We examined the ability of network properties to differentiate several scenarios, such as haploid vs diploid cells, partially inverted nuclei vs conventional architecture, depletion of chromosome architectural proteins, and structural changes during cell development. We also examined the connection between network properties and a series of other linear annotations, such as histone modifications and chromatin states including poised promoter and enhancer labels. We found that semi-local network properties exhibit greater capability in characterizing genome annotations compared to diffusive or ultra-local node features. For example, the local square clustering coefficient can be a strong classifier of lamina-associated domains. We demonstrated that network properties can be useful for highlighting large-scale chromosome structure differences that emerge in different biological situations.
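As a rough illustration of the kind of node-based network property this abstract refers to, the sketch below computes a square clustering coefficient on a toy graph. It assumes a binarized, undirected contact graph (in practice this might come from thresholding a Hi-C contact matrix) and uses one common formulation of square clustering, which may differ from the exact definition used in the paper. The function name `square_clustering` and the toy graphs are hypothetical.

```python
from itertools import combinations


def square_clustering(adj):
    """Square clustering per node for an undirected graph.

    adj maps each node to the set of its neighbors (assumed symmetric).
    For a node v, the score is the number of observed squares through v
    divided by the number of squares that could exist given the degrees
    of v's neighbors. This is one common formulation, used here as an
    illustrative assumption.
    """
    result = {}
    for v, nbrs in adj.items():
        observed = 0
        potential = 0
        for u, w in combinations(sorted(nbrs), 2):
            # Common neighbors of u and w (other than v) each close a square.
            squares = len((adj[u] & adj[w]) - {v})
            observed += squares
            # Edges of u and w that cannot contribute to a new square.
            used = squares + 1 + (1 if w in adj[u] else 0)
            potential += (len(adj[u]) - used) + (len(adj[w]) - used) + squares
        result[v] = observed / potential if potential > 0 else 0.0
    return result


# Toy graphs: a 4-cycle (one square) and a 3-node path (no square).
cycle = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
path = {0: {1}, 1: {0, 2}, 2: {1}}
cycle_scores = square_clustering(cycle)
path_scores = square_clustering(path)
```

On the 4-cycle every node sits on the only possible square, so each score is 1.0; on the path no square exists and every score is 0.0.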
Affiliation(s)
- Yingjie Xu
  - Graduate School of Genome Science & Technology, University of Tennessee, Knoxville, TN 37996, USA
- Priyojit Das
  - Department of Genetics, Harvard Medical School, Boston, MA 02115, USA
  - Department of Molecular Biology, Massachusetts General Hospital, Boston, MA 02114, USA
- Rachel Patton McCord
  - Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
- Tongye Shen
  - Department of Biochemistry & Cellular and Molecular Biology, University of Tennessee, Knoxville, TN 37996, USA
3. Raynal F, Sengupta K, Plewczynski D, Aliaga B, Pancaldi V. Global chromatin reorganization and regulation of genes with specific evolutionary ages during differentiation and cancer. bioRxiv 2024:2023.10.30.564438. [PMID: 39149250 PMCID: PMC11326123 DOI: 10.1101/2023.10.30.564438] [Indexed: 08/17/2024]
Abstract
Cancer cells are highly plastic, allowing them to adapt to changing conditions. Genes related to basic cellular processes evolved in ancient species, while more specialized genes appeared later with multicellularity (metazoan genes) or even after mammals evolved. Transcriptomic analyses have shown that ancient genes are up-regulated in cancer, while metazoan-origin genes are inactivated. Despite the importance of these observations, the underlying mechanisms remain unexplored. Here, we study local and global epigenomic mechanisms that may regulate genes from specific evolutionary periods. Using evolutionary gene age data, we characterize the epigenomic landscape, gene expression regulation, and chromatin organization in three cell types: human embryonic stem cells, normal B-cells, and primary cells from Chronic Lymphocytic Leukemia, a B-cell malignancy. We identify topological changes in chromatin organization during differentiation observing patterns in Polycomb repression and RNA Polymerase II pausing, which are reversed during oncogenesis. Beyond the non-random organization of genes and chromatin features in the 3D epigenome, we suggest that these patterns lead to preferential interactions among ancient, intermediate, and recent genes, mediated by RNA Polymerase II, Polycomb, and the lamina, respectively. Our findings shed light on gene regulation according to evolutionary age and suggest this organization changes across differentiation and oncogenesis.
Affiliation(s)
- Flavien Raynal
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Kaustav Sengupta
  - Laboratory of Functional and Structural Genomics, Center of New Technologies (CeNT), University of Warsaw, Mazowieckie, Poland
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
  - Department of Molecular Genetics, Erasmus University Medical Center, Erasmus MC Cancer Institute, Rotterdam, the Netherlands
- Dariusz Plewczynski
  - Laboratory of Functional and Structural Genomics, Center of New Technologies (CeNT), University of Warsaw, Mazowieckie, Poland
  - Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Benoît Aliaga
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
  - Barcelona Supercomputing Center, Barcelona, Spain
4. Melkus G, Sizovs A, Rucevskis P, Silina S. Transcriptional Hubs Within Cliques in Ensemble Hi-C Chromatin Interaction Networks. J Comput Biol 2024; 31:589-596. [PMID: 38768423 DOI: 10.1089/cmb.2024.0515] [Indexed: 05/22/2024]
Abstract
Chromatin conformation capture technologies permit the study of chromatin spatial organization on a genome-wide scale at a variety of resolutions. Despite the increasing precision and resolution of high-throughput chromatin conformation capture (Hi-C) methods, it remains challenging to conclusively link transcriptional activity to spatial organizational phenomena. We have developed a clique-based approach for analyzing Hi-C data that helps identify chromosomal hotspots that feature considerable enrichment of chromatin annotations for transcriptional start sites and, building on previously published work, show that these chromosomal hotspots are not only significantly enriched in RNA polymerase II binding sites as identified by the ENCODE project, but also identify a noticeable increase in FANTOM5 and GTEx transcription within our identified cliques across a variety of tissue types. From the obtained data, we surmise that our cliques are a suitable method for identifying transcription factories in Hi-C data, and outline further extensions to the method that may make it useful for locating regions of increased transcriptional activity in datasets where in-depth expression or polymerase data may not be available.
Affiliation(s)
- Gatis Melkus
  - Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
- Andrejs Sizovs
  - Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
- Peteris Rucevskis
  - Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
- Sandra Silina
  - Institute of Mathematics and Computer Science, University of Latvia, Riga, Latvia
5. Heer M, Giudice L, Mengoni C, Giugno R, Rico D. Esearch3D: propagating gene expression in chromatin networks to illuminate active enhancers. Nucleic Acids Res 2023; 51:e55. [PMID: 37021559 PMCID: PMC10250221 DOI: 10.1093/nar/gkad229] [Received: 08/05/2022] [Revised: 03/06/2023] [Accepted: 04/03/2023] [Indexed: 04/07/2023]
Abstract
Most cell type-specific genes are regulated by the interaction of enhancers with their promoters. The identification of enhancers is not trivial as enhancers are diverse in their characteristics and dynamic in their interaction partners. We present Esearch3D, a new method that exploits network theory approaches to identify active enhancers. Our work is based on the fact that enhancers act as a source of regulatory information to increase the rate of transcription of their target genes and that the flow of this information is mediated by the folding of chromatin in the three-dimensional (3D) nuclear space between the enhancer and the target gene promoter. Esearch3D reverse engineers this flow of information to calculate the likelihood of enhancer activity in intergenic regions by propagating the transcription levels of genes across 3D genome networks. Regions predicted to have high enhancer activity are shown to be enriched in annotations indicative of enhancer activity. These include: enhancer-associated histone marks, bidirectional CAGE-seq, STARR-seq, P300, RNA polymerase II and expression quantitative trait loci (eQTLs). Esearch3D leverages the relationship between chromatin architecture and transcription, allowing the prediction of active enhancers and an understanding of the complex underpinnings of regulatory networks. The method is available at: https://github.com/InfOmics/Esearch3D and https://doi.org/10.5281/zenodo.7737123.
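The idea of propagating gene expression across a chromatin contact network can be sketched with a simple iterative smoothing scheme. The example below is only an illustration under stated assumptions: it uses a generic propagation-with-restart update (each node takes a weighted mix of its neighbors' mean score and its own initial value), which is not necessarily the update rule implemented in Esearch3D. The function name `propagate`, the parameters `alpha` and `n_iter`, and the toy network are all hypothetical.

```python
def propagate(adj, initial, alpha=0.5, n_iter=50):
    """Propagate initial node scores over an undirected network.

    adj maps each node to a list of neighbors; initial maps each node to
    its starting score (for example, expression for genes and 0 for
    intergenic fragments). At every step a node's score becomes
    alpha * (mean score of its neighbors) + (1 - alpha) * (its initial score).
    This is a generic scheme chosen for illustration, not the published method.
    """
    scores = dict(initial)
    for _ in range(n_iter):
        updated = {}
        for node, nbrs in adj.items():
            neighbor_mean = sum(scores[n] for n in nbrs) / len(nbrs) if nbrs else 0.0
            updated[node] = alpha * neighbor_mean + (1 - alpha) * initial[node]
        scores = updated
    return scores


# Toy network: an expressed gene contacts enh1, which in turn contacts enh2.
contacts = {"geneA": ["enh1"], "enh1": ["geneA", "enh2"], "enh2": ["enh1"]}
expression = {"geneA": 10.0, "enh1": 0.0, "enh2": 0.0}
scores = propagate(contacts, expression)
```

In this toy case the fragment in direct contact with the expressed gene ends up with a higher score than the fragment two steps away, which is the qualitative behavior one would expect from ranking candidate enhancers by propagated signal.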
Affiliation(s)
- Maninder Heer
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
- Luca Giudice
  - Department of Computer Science, University of Verona, Strada le Grazie 15, 37134, Verona, Italy
  - A.I. Virtanen Institute for Molecular Sciences, University of Eastern Finland, Kuopio, Finland
- Claudia Mengoni
  - Department of Computer Science, University of Verona, Strada le Grazie 15, 37134, Verona, Italy
- Rosalba Giugno
  - Department of Computer Science, University of Verona, Strada le Grazie 15, 37134, Verona, Italy
- Daniel Rico
  - Biosciences Institute, Faculty of Medical Sciences, Newcastle University, Newcastle upon Tyne, UK
6. Pancaldi V. Network models of chromatin structure. Curr Opin Genet Dev 2023; 80:102051. [PMID: 37245241 DOI: 10.1016/j.gde.2023.102051] [Received: 12/26/2022] [Revised: 04/13/2023] [Accepted: 04/14/2023] [Indexed: 05/30/2023]
Abstract
Increasing numbers of datasets and experimental assays that capture the organization of chromatin inside the nucleus warrant an effort to develop tools to visualize and analyze these structures. Alongside polymer physics or constraint-based modeling, network theory approaches to describe 3D epigenome organization have gained in popularity. Representing genomic regions as nodes in a network enables visualization of 1D epigenomics datasets in the context of chromatin structure maps, while network theory metrics can be used to describe 3D epigenome organization and dynamics. In this review, we summarize the most salient applications of network theory to the study of chromatin contact maps, demonstrating its potential in revealing epigenomic patterns and relating them to cellular phenotypes.
Affiliation(s)
- Vera Pancaldi
  - CRCT, Université de Toulouse, Inserm, CNRS, Université Toulouse III-Paul Sabatier, Centre de Recherches en Cancérologie de Toulouse, Toulouse, France
7. Wang J, Xue Y, He Y, Quan H, Zhang J, Gao YQ. Characterization of network hierarchy reflects cell state specificity in genome organization. Genome Res 2023; 33:247-260. [PMID: 36828586 PMCID: PMC10069467 DOI: 10.1101/gr.277206.122] [Received: 08/15/2022] [Accepted: 01/31/2023] [Indexed: 02/26/2023]
Abstract
Dynamic chromatin structure acts as the regulator of transcription program in crucial processes including cancer and cell development, but a unified framework for characterizing chromatin structural evolution remains to be established. Here, we performed graph inferences on Hi-C data sets and derived the chromatin contact networks. We discovered significant decreases in information transmission efficiencies in chromatin of colorectal cancer (CRC) and T-cell acute lymphoblastic leukemia (T-ALL) compared to corresponding normal controls through graph statistics. Using network embedding in the Poincaré disk, the hierarchy depths of chromatin from CRC and T-ALL patients were found to be significantly shallower compared to their normal controls. A reverse trend of change in chromatin structure was observed during early embryo development. We found tissue-specific conservation of hierarchy order in chromatin contact networks. Our findings reveal the top-down hierarchy of chromatin organization, which is significantly attenuated in cancer.
Affiliation(s)
- Jingyao Wang
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Yue Xue
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Yueying He
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Hui Quan
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
- Jun Zhang
  - Changping Laboratory, Beijing, 102206, China
- Yi Qin Gao
  - Beijing National Laboratory for Molecular Sciences, College of Chemistry and Molecular Engineering, Peking University, Beijing, 100871, China
  - Biomedical Pioneering Innovation Center (BIOPIC), Peking University, Beijing, 100871, China
  - Changping Laboratory, Beijing, 102206, China
8. Gong H, Li M, Ji M, Zhang X, Yuan Z, Zhang S, Yang Y, Li C, Chen Y. MINE is a method for detecting spatial density of regulatory chromatin interactions based on a multi-modal network. Cell Rep Methods 2023; 3:100386. [PMID: 36814847 PMCID: PMC9939382 DOI: 10.1016/j.crmeth.2022.100386] [Received: 06/30/2022] [Revised: 09/15/2022] [Accepted: 12/16/2022] [Indexed: 06/18/2023]
Abstract
Chromatin interactions play essential roles in chromatin conformation and gene expression. However, few tools exist to analyze the spatial density of regulatory chromatin interactions (SD-RCI). Here, we present the multi-modal network (MINE) toolkit, including MINE-Loop, MINE-Density, and MINE-Viewer. The MINE-Loop network aims to enhance the detection of RCIs, MINE-Density quantifies the SD-RCI, and MINE-Viewer facilitates 3D visualization of the density of chromatin interactions and participating regulatory factors (e.g., transcription factors). We applied MINE to investigate the relationship between the SD-RCI and chromatin volume change in HeLa cells before and after liquid-liquid phase separation. Changes in SD-RCI before and after treating the HeLa cells with 1,6-hexanediol suggest that changes in chromatin organization were related to the degree of activation or repression of genes. Together, the MINE toolkit enables quantitative studies on different aspects of chromatin conformation and regulatory activity.
Affiliation(s)
- Haiyan Gong
  - Beijing Advanced Innovation Center for Materials Genome Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Minghong Li
  - Beijing Advanced Innovation Center for Materials Genome Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Mengdie Ji
  - State Key Laboratory of Medical Molecular Biology, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100005, China
- Xiaotong Zhang
  - Beijing Advanced Innovation Center for Materials Genome Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
  - Shunde Innovation School, University of Science and Technology Beijing, Foshan 528399, China
- Zan Yuan
  - State Key Laboratory of Medical Molecular Biology, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100005, China
- Sichen Zhang
  - Beijing Advanced Innovation Center for Materials Genome Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Yi Yang
  - Beijing Advanced Innovation Center for Materials Genome Engineering, School of Computer and Communication Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Chun Li
  - School of Mechanical Engineering, University of Science and Technology Beijing, Beijing 100083, China
- Yang Chen
  - State Key Laboratory of Medical Molecular Biology, Department of Biochemistry and Molecular Biology, Institute of Basic Medical Sciences, School of Basic Medicine, Chinese Academy of Medical Sciences, Peking Union Medical College, Beijing 100005, China
9. Jodkowska K, Pancaldi V, Rigau M, Almeida R, Fernández-Justel J, Graña-Castro O, Rodríguez-Acebes S, Rubio-Camarillo M, Carrillo-de Santa Pau E, Pisano D, Al-Shahrour F, Valencia A, Gómez M, Méndez J. 3D chromatin connectivity underlies replication origin efficiency in mouse embryonic stem cells. Nucleic Acids Res 2022; 50:12149-12165. [PMID: 36453993 PMCID: PMC9757045 DOI: 10.1093/nar/gkac1111] [Received: 04/01/2022] [Revised: 10/31/2022] [Accepted: 11/25/2022] [Indexed: 12/03/2022]
Abstract
In mammalian cells, chromosomal replication starts at thousands of origins at which replisomes are assembled. Replicative stress triggers additional initiation events from 'dormant' origins whose genomic distribution and regulation are not well understood. In this study, we have analyzed origin activity in mouse embryonic stem cells in the absence or presence of mild replicative stress induced by aphidicolin, a DNA polymerase inhibitor, or by deregulation of origin licensing factor CDC6. In both cases, we observe that the majority of stress-responsive origins are also active in a small fraction of the cell population in a normal S phase, and stress increases their frequency of activation. In a search for the molecular determinants of origin efficiency, we compared the genetic and epigenetic features of origins displaying different levels of activation, and integrated their genomic positions in three-dimensional chromatin interaction networks derived from high-depth Hi-C and promoter-capture Hi-C data. We report that origin efficiency is directly proportional to the proximity to transcriptional start sites and to the number of contacts established between origin-containing chromatin fragments, supporting the organization of origins in higher-level DNA replication factories.
Affiliation(s)
- José M Fernández-Justel
  - Functional Organization of the Mammalian Genome Group, Centro de Biología Molecular “Severo Ochoa” (CSIC-UAM), Madrid, Spain
- Osvaldo Graña-Castro
  - Bioinformatics Unit, Structural Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
  - Institute of Applied Molecular Medicine (IMMA-Nemesio Díez), San Pablo-CEU University, Boadilla del Monte, Madrid, Spain
- Sara Rodríguez-Acebes
  - DNA Replication Group, Molecular Oncology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Miriam Rubio-Camarillo
  - Bioinformatics Unit, Structural Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- David Pisano
  - Bioinformatics Unit, Structural Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Fátima Al-Shahrour
  - Bioinformatics Unit, Structural Biology Programme, Spanish National Cancer Research Centre (CNIO), Madrid, Spain
- Alfonso Valencia
  - Computational Biology Life Sciences Group, Barcelona Supercomputing Center (BSC), Barcelona, Spain
- María Gómez
  - Correspondence may also be addressed to María Gómez. Tel: +34 911964724; Fax: +34 911964420
- Juan Méndez
  - To whom correspondence should be addressed. Tel: +34 917328000; Fax: +34 917328033
10. Zhang K, Wang C, Sun L, Zheng J. Prediction of gene co-expression from chromatin contacts with graph attention network. Bioinformatics 2022; 38:4457-4465. [PMID: 35929807 PMCID: PMC9525008 DOI: 10.1093/bioinformatics/btac535] [Received: 12/07/2021] [Revised: 07/12/2022] [Indexed: 12/24/2022]
Abstract
MOTIVATION The technology of high-throughput chromatin conformation capture (Hi-C) allows genome-wide measurement of chromatin interactions. Several studies have shown statistically significant relationships between gene-gene spatial contacts and their co-expression. It is desirable to uncover epigenetic mechanisms of transcriptional regulation behind such relationships using computational modeling. Existing methods for predicting gene co-expression from Hi-C data use manual feature engineering or unsupervised learning, which either limits the prediction accuracy or lacks interpretability. RESULTS To address these issues, we propose HiCoEx (Hi-C predicts gene co-expression), a novel end-to-end framework for explainable prediction of gene co-expression from Hi-C data based on graph neural network. We apply graph attention mechanism to a gene contact network inferred from Hi-C data to distinguish the importance among different neighboring genes of each gene, and learn the gene representation to predict co-expression in a supervised and task-specific manner. Then, from the trained model, we extract the learned gene embeddings as a model interpretation to distill biological insights. Experimental results show that HiCoEx can learn gene representation from 3D genomics signals automatically to improve prediction accuracy, and make the black box model explainable by capturing some biologically meaningful patterns, e.g., in a gene contact network, the common neighbors of two central genes might contribute to the co-expression of the two central genes through sharing enhancers. AVAILABILITY AND IMPLEMENTATION The source code is freely available at https://github.com/JieZheng-ShanghaiTech/HiCoEx. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Ke Zhang
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 200050, China
- Chenxi Wang
  - iHuman Institute, ShanghaiTech University, Shanghai 201210, China
  - School of Life Science and Technology, ShanghaiTech University, Shanghai 201210, China
- Liping Sun
  - iHuman Institute, ShanghaiTech University, Shanghai 201210, China
- Jie Zheng
  - School of Information Science and Technology, ShanghaiTech University, Shanghai 201210, China
  - Shanghai Engineering Research Center of Intelligent Vision and Imaging, ShanghaiTech University, Shanghai 201210, China
11. Mora A, Huang X, Jauhari S, Jiang Q, Li X. Chromatin Hubs: A biological and computational outlook. Comput Struct Biotechnol J 2022; 20:3796-3813. [PMID: 35891791 PMCID: PMC9304431 DOI: 10.1016/j.csbj.2022.07.002] [Received: 04/14/2022] [Revised: 07/02/2022] [Accepted: 07/02/2022] [Indexed: 11/20/2022]
Abstract
This review discusses our current understanding of chromatin biology and bioinformatics under the unifying concept of “chromatin hubs.” The first part reviews the biology of chromatin hubs, including chromatin–chromatin interaction hubs, chromatin hubs at the nuclear periphery, hubs around macromolecules such as RNA polymerase or lncRNAs, and hubs around nuclear bodies such as the nucleolus or nuclear speckles. The second part reviews existing computational methods, including enhancer–promoter interaction prediction, network analysis, chromatin domain callers, transcription factory predictors, and multi-way interaction analysis. We introduce an integrated model that makes sense of the existing evidence. Understanding chromatin hubs may allow us (i) to explain long-unsolved biological questions such as interaction specificity and redundancy of mechanisms, (ii) to develop more realistic kinetic and functional predictions, and (iii) to explain the etiology of genomic disease.
Affiliation(s)
- Antonio Mora
  - Joint School of Life Sciences, Guangzhou Medical University and Guangzhou Institutes of Biomedicine and Health (Chinese Academy of Sciences), Guangzhou 511436, PR China
  - Corresponding author.
- Xiaowei Huang
  - Joint School of Life Sciences, Guangzhou Medical University and Guangzhou Institutes of Biomedicine and Health (Chinese Academy of Sciences), Guangzhou 511436, PR China
- Shaurya Jauhari
  - Joint School of Life Sciences, Guangzhou Medical University and Guangzhou Institutes of Biomedicine and Health (Chinese Academy of Sciences), Guangzhou 511436, PR China
- Qin Jiang
  - Affiliated Eye Hospital of Nanjing Medical University, Nanjing 210000, PR China
- Xuri Li
  - State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, and Guangdong Provincial Key Laboratory of Ophthalmology and Visual Science, Guangzhou 510060, PR China
  - Corresponding author.
12. Feng Y, Cai L, Hong W, Zhang C, Tan N, Wang M, Wang C, Liu F, Wang X, Ma J, Gao C, Kumar M, Mo Y, Geng Q, Luo C, Lin Y, Chen H, Wang SY, Watson MJ, Jegga AG, Pedersen RA, Fu JD, Wang ZV, Fan GC, Sadayappan S, Wang Y, Pauklin S, Huang F, Huang W, Jiang L. Rewiring of 3D Chromatin Topology Orchestrates Transcriptional Reprogramming and the Development of Human Dilated Cardiomyopathy. Circulation 2022; 145:1663-1683. [PMID: 35400201 PMCID: PMC9251830 DOI: 10.1161/circulationaha.121.055781] [Received: 05/19/2021] [Accepted: 02/18/2022] [Indexed: 02/05/2023]
Abstract
BACKGROUND Transcriptional reconfiguration is central to heart failure, the most common cause of which is dilated cardiomyopathy (DCM). The effect of 3-dimensional chromatin topology on transcriptional dysregulation and pathogenesis in human DCM remains elusive.
METHODS We generated a compendium of 3-dimensional epigenome and transcriptome maps from 101 biobanked human DCM and nonfailing heart tissues through highly integrative chromatin immunoprecipitation (H3K27ac [acetylation of lysine 27 on histone H3]), in situ high-throughput chromosome conformation capture, chromatin immunoprecipitation sequencing, assay for transposase-accessible chromatin using sequencing, and RNA sequencing. We used human induced pluripotent stem cell-derived cardiomyocytes and mouse models to interrogate the key transcription factor implicated in 3-dimensional chromatin organization and transcriptional regulation in DCM pathogenesis.
RESULTS We discovered that the active regulatory elements (H3K27ac peaks) and their connectome (H3K27ac loops) were extensively reprogrammed in DCM hearts and contributed to transcriptional dysregulation implicated in DCM development. For example, we identified that nontranscribing NPPA-AS1 (natriuretic peptide A antisense RNA 1) promoter functions as an enhancer and physically interacts with the NPPA (natriuretic peptide A) and NPPB (natriuretic peptide B) promoters, leading to the cotranscription of NPPA and NPPB in DCM hearts. We revealed that DCM-enriched H3K27ac loops largely resided in conserved high-order chromatin architectures (compartments, topologically associating domains) and their anchors unexpectedly had equivalent chromatin accessibility. We discovered that the DCM-enriched H3K27ac loop anchors exhibited a strong enrichment for HAND1 (heart and neural crest derivatives expressed 1), a key transcription factor involved in early cardiogenesis. In line with this, its protein expression was upregulated in human DCM and mouse failing hearts. To further validate whether HAND1 is a causal driver for the reprogramming of enhancer-promoter connectome in DCM hearts, we performed comprehensive 3-dimensional epigenome mappings in human induced pluripotent stem cell-derived cardiomyocytes. We found that forced overexpression of HAND1 in human induced pluripotent stem cell-derived cardiomyocytes induced a distinct gain of enhancer-promoter connectivity and correspondingly increased the expression of their connected genes implicated in DCM pathogenesis, thus recapitulating the transcriptional signature in human DCM hearts. Electrophysiology analysis demonstrated that forced overexpression of HAND1 in human induced pluripotent stem cell-derived cardiomyocytes induced abnormal calcium handling. Furthermore, cardiomyocyte-specific overexpression of Hand1 in the mouse hearts resulted in dilated cardiac remodeling with impaired contractility/Ca2+ handling in cardiomyocytes, increased ratio of heart weight/body weight, and compromised cardiac function, which were ascribed to recapitulation of transcriptional reprogramming in DCM.
CONCLUSIONS This study provided novel chromatin topology insights into DCM pathogenesis and illustrated a model whereby a single transcription factor (HAND1) reprograms the genome-wide enhancer-promoter connectome to drive DCM pathogenesis.
Affiliation(s)
- Yuliang Feng
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
- These authors contributed equally to this work
| | - Liuyang Cai
- Department of Microbiology, Faculty of Medicine, The Chinese University of Hong Kong, Hong Kong, SAR 999077, China
- These authors contributed equally to this work
| | - Wanzi Hong
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
- These authors contributed equally to this work
| | - Chunxiang Zhang
- Institute of Cardiovascular Research, Southwest Medical University, Luzhou, Sichuan 646000, China
- These authors contributed equally to this work
| | - Ning Tan
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
| | - Mingyang Wang
- College of Engineering and Applied Science, University of Cincinnati, Cincinnati, OH 45221, USA
| | - Cheng Wang
- Smurfit Institute of Genetics, Trinity College Dublin, Dublin 2, Ireland D02 VF25
| | - Feng Liu
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
| | - Xiaohong Wang
- Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Jianyong Ma
- Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Chen Gao
- Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Mohit Kumar
- Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Heart, Lung and Vascular Institute, Department of Internal Medicine, Division of Cardiovascular Health and Disease, University of Cincinnati, Cincinnati, OH 45236, USA
| | - Yuanxi Mo
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
| | - Qingshan Geng
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
| | - Changjun Luo
- Institute of Cardiovascular Diseases, the First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China
| | - Yan Lin
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
| | - Haiyang Chen
- National Clinical Research Center for Geriatrics, West China Hospital, Sichuan University, Chengdu 610041, Sichuan, China
| | - Shuang-Yin Wang
- Department of Immunology, Weizmann Institute of Science, Rehovot WR35+R8, Israel
| | - Michael J. Watson
- Department of Surgery, Cardiovascular & Thoracic, Duke University, Durham, NC 27710, USA
| | - Anil G. Jegga
- Division of Biomedical Informatics, Cincinnati Children’s Hospital Medical Center, Cincinnati, OH 45229, USA
- Department of Pediatrics, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
- Department of Computer Science, University of Cincinnati College of Engineering, Cincinnati, OH 45221, USA
| | - Roger A. Pedersen
- Department of OB-GYN/Reproductive, Perinatal and Stem Cell Biology Research, Stanford University, Stanford, California, USA
| | - Ji-dong Fu
- Departments of Physiology and Cell Biology, the Dorothy M. Davis Heart and Lung Research Institute, Frick Center for Heart Failure and Arrhythmia, the Ohio State University, Columbus, OH 43210, USA
| | - Zhao V. Wang
- Division of Cardiology, Department of Internal Medicine, University of Texas Southwestern Medical Center, Dallas, Texas, USA, 75390-8573
| | - Guo-Chang Fan
- Department of Pharmacology and Systems Physiology, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Sakthivel Sadayappan
- Heart, Lung and Vascular Institute, Department of Internal Medicine, Division of Cardiovascular Health and Disease, University of Cincinnati, Cincinnati, OH 45236, USA
| | - Yigang Wang
- Department of Pathology and Laboratory Medicine, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Siim Pauklin
- Botnar Research Centre, Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
| | - Feng Huang
- Institute of Cardiovascular Diseases, the First Affiliated Hospital of Guangxi Medical University, Nanning, Guangxi 530021, China
| | - Wei Huang
- Department of Pathology and Laboratory Medicine, University of Cincinnati College of Medicine, Cincinnati, OH 45267, USA
| | - Lei Jiang
- Guangdong Provincial Geriatrics Institute, Guangdong Provincial People’s Hospital, Guangdong Academy of Medical Sciences, Guangzhou, Guangdong 510080, China
- Lead contact
| |
|
13
|
Yadav VK, Singh S, Yadav A, Agarwal N, Singh B, Jalmi SK, Yadav VK, Tiwari VK, Kumar V, Singh R, Sawant SV. Stress Conditions Modulate the Chromatin Interactions Network in Arabidopsis. Front Genet 2022; 12:799805. [PMID: 35069698 PMCID: PMC8766718 DOI: 10.3389/fgene.2021.799805] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 10/22/2021] [Accepted: 11/15/2021] [Indexed: 11/26/2022] Open
Abstract
Stress is known to elicit a range of responses in organisms, including changes in cellular physiology, gene regulation, and genome remodeling, that help them cope and survive. Here, we assessed the impact of stress conditions on the chromatin-interactome network of Arabidopsis thaliana. We identified thousands of chromatin interactions genome-wide under native conditions as well as under salicylic acid treatment and high temperature. Our analysis revealed a definite pattern of chromatin interactions and showed that stress conditions can modulate their dynamics. We found that the heterochromatic region of the genome is actively involved in chromatin interactions. We further observed that the establishment or loss of interactions in response to stress does not result in a global change in the expression profile of interacting genes; however, interacting regions (genes) containing motifs for known TFs showed either lower expression than non-interacting genes or no difference. The present study also revealed that interactions occur preferentially between regions in the same epigenetic state (ES), suggesting that interactions cluster regions of the same ES together in the 3D space of the nucleus. Our analysis showed that stress conditions affect the dynamics of interactions among chromatin loci and that these interaction networks govern the folding principle of chromatin by bringing together similar epigenetic marks.
Affiliation(s)
- Vikash Kumar Yadav
- CSIR-National Botanical Research Institute, Lucknow, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Swadha Singh
- CSIR-National Botanical Research Institute, Lucknow, India; School of Natural Sciences, University of California, Merced, Merced, CA, United States
| | - Amrita Yadav
- CSIR-National Botanical Research Institute, Lucknow, India
| | - Neha Agarwal
- CSIR-National Botanical Research Institute, Lucknow, India
| | - Babita Singh
- CSIR-National Botanical Research Institute, Lucknow, India
| | | | | | - Vipin Kumar Tiwari
- CSIR-National Botanical Research Institute, Lucknow, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| | - Verandra Kumar
- Department of Botany, Manyawar Kanshiram Government Degree College, Aligarh, India
| | | | - Samir Vishwanath Sawant
- CSIR-National Botanical Research Institute, Lucknow, India; Academy of Scientific and Innovative Research (AcSIR), Ghaziabad, India
| |
|
14
|
Okada H, Saga Y. Repurposing of the enhancer-promoter communication underlies the compensation of Mesp2 by Mesp1. PLoS Genet 2022; 18:e1010000. [PMID: 35025872 PMCID: PMC8791502 DOI: 10.1371/journal.pgen.1010000] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/05/2021] [Revised: 01/26/2022] [Accepted: 12/17/2021] [Indexed: 11/25/2022] Open
Abstract
Organisms are inherently equipped with buffering systems against genetic perturbations. Genetic compensation, the compensatory response by upregulating another gene or genes, is one such buffering mechanism. Recently, a well-conserved compensatory mechanism was proposed: transcriptional adaptation of homologs under the nonsense-mediated mRNA decay pathways. However, this model cannot explain the onset of all compensatory events. We report a novel genetic compensation mechanism operating over the Mesp gene locus. Mesp1 and Mesp2 are paralogs located adjacently in the genome. Mesp2 loss is partially rescued by Mesp1 upregulation in the presomitic mesoderm (PSM). Using a cultured PSM induction system, we reproduced the compensatory response in vitro and found that the Mesp2-enhancer is required to promote Mesp1. We revealed that the Mesp2-enhancer directly interacts with the Mesp1 promoter, thereby upregulating Mesp1 expression upon the loss of Mesp2. Of note, this interaction is established by genomic arrangement upon PSM development independently of Mesp2 disruption. We propose that the repurposing of this established enhancer-promoter communication is the mechanism underlying this compensatory response for the upregulation of the adjacent gene. Genetic compensation, the compensatory response by upregulating another gene or genes, is one of the inherent mechanisms against gene disruption to confer cellular fitness. However, the regulatory mechanisms are largely unknown. Nonsense-mediated mutant mRNA degradation was recently proposed as a conserved mechanism across species to upregulate homologous genes to compensate for a disrupted gene, but this cannot explain compensation events with no mutant mRNA. This study investigated the compensation mechanism operating over adjacent paralogs, Mesp1 and Mesp2, in the genome. Mesp genes encode essential transcription factors in the presomitic mesoderm for development. 
In general, an enhancer is considered to activate a target gene when it physically interacts with the target. The communication of the Mesp2-enhancer with the Mesp1 promoter is established upon differentiation of the presomitic mesoderm, but this communication activates Mesp1 only when Mesp2 is disrupted, leading to compensation. We revealed a novel compensation mechanism depending on the repurposing of this enhancer-promoter communication by gene disruption. Our study also provides new insight into transcriptional regulation by providing the concept that an enhancer changes its target even among its physically interacting genes in a context-dependent manner.
Affiliation(s)
- Hajime Okada
- Department of Gene Function and Phenomics, National Institute of Genetics, Mishima, Japan
| | - Yumiko Saga
- Department of Gene Function and Phenomics, National Institute of Genetics, Mishima, Japan
- Department of Genetics, School of Life Science, The Graduate University for Advanced Studies (SOKENDAI), Mishima, Japan
- Department of Biological Sciences, Graduate School of Science, The University of Tokyo, Tokyo, Japan
| |
|
15
|
|
16
|
Pancaldi V. Chromatin Network Analyses: Towards Structure-Function Relationships in Epigenomics. FRONTIERS IN BIOINFORMATICS 2021; 1:742216. [PMID: 36303769 PMCID: PMC9581029 DOI: 10.3389/fbinf.2021.742216] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 07/15/2021] [Accepted: 10/04/2021] [Indexed: 01/16/2023] Open
Abstract
Recent technological advances have allowed us to map chromatin conformation and uncover the spatial organization of the genome inside the nucleus. These experiments have revealed the complexities of genome folding, characterized by the presence of loops and domains at different scales, which can change across development and in different cell types. There is strong evidence for a relationship between the topological properties of chromatin contacts and cellular phenotype. Chromatin can be represented as a network, in which genomic fragments are the nodes and connections represent experimentally observed spatial proximity of two genomically distant regions in a specific cell type or biological condition. With this approach we can consider a variety of chromatin features in association with the 3D structure, investigating how nuclear chromatin organization can be related to gene regulation, replication, malignancy, phenotypic variability and plasticity. We briefly review the results obtained on genome architecture through network-theoretic approaches. As previously observed in protein-protein interaction networks and many types of non-biological networks, external conditions could shape network topology through a yet unidentified structure-function relationship. Similar to scientists studying the brain, we are confronted with a duality between a spatially embedded network of physical contacts, a related network of correlation in the dynamics of network nodes and, finally, an abstract definition of function of this network, related to phenotype. We summarise major developments in the study of networks in other fields, which we think can suggest a path towards better understanding how 3D genome configuration can impact biological function and adaptation to the environment.
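The network representation described in this abstract (genomic fragments as nodes, observed spatial proximity as edges) can be sketched in a few lines. This is a minimal illustration only, assuming contacts arrive as pairs of fragment identifiers; the fragment names, the toy contact list and the helper functions `build_contact_network` and `connected_components` are hypothetical and do not come from the review.

```python
from collections import defaultdict


def build_contact_network(contacts):
    """Build an undirected adjacency map from (fragment_a, fragment_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in contacts:
        if a == b:
            continue  # ignore self-contacts
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def connected_components(adjacency):
    """Return the connected components as a list of sets of nodes."""
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        stack = [start]
        component = set()
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(adjacency[node] - component)
        seen |= component
        components.append(component)
    return components


# Hypothetical contacts between genomic fragments (toy data, not real loci).
contacts = [
    ("chr1:100-200", "chr1:900-1000"),
    ("chr1:900-1000", "chr1:5000-5100"),
    ("chr2:300-400", "chr2:7000-7100"),
]
network = build_contact_network(contacts)
degree = {node: len(neighbours) for node, neighbours in network.items()}
components = connected_components(network)
```

Once the contacts are in this form, chromatin features such as histone marks or expression values could be attached to the nodes and compared against topological quantities like degree or component membership.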
Affiliation(s)
- Vera Pancaldi
- Centre de Recherches en Cancérologie de Toulouse (CRCT), Institut National de la Santé et de la Recherche Médicale (Inserm) U1037, Centre National de la Recherche Scientifique (CNRS) U5071, Université Paul Sabatier, Toulouse, France
- Barcelona Supercomputing Center, Barcelona, Spain
| |
|
17
|
Saint-André V. Computational biology approaches for mapping transcriptional regulatory networks. Comput Struct Biotechnol J 2021; 19:4884-4895. [PMID: 34522292 PMCID: PMC8426465 DOI: 10.1016/j.csbj.2021.08.028] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 04/24/2021] [Revised: 08/16/2021] [Accepted: 08/16/2021] [Indexed: 12/13/2022] Open
Abstract
Transcriptional Regulatory Networks (TRNs) are mainly responsible for the cell-type- or cell-state-specific expression of gene sets from the same DNA sequence. However, so far there are no precise maps of TRNs available for each cell-type or cell-state, and no ideal tool to map those networks clearly and in full from biological samples. In this review, major approaches and tools to map TRNs from high-throughput data are presented, depending on the type of methods or data used to infer them, and their advantages and limitations are discussed. After summarizing the main principles defining the topology and structure–function relationships in TRNs, an overview of the extensive work done to map TRNs from bulk transcriptomic data will be presented by type of methodological approach. The most recent approaches to modelling TRNs using other types of molecular data or integrating different data types, including single-cell RNA-sequencing and chromatin information, will then be discussed, before briefly concluding with improvements expected to come in the field.
Affiliation(s)
- Violaine Saint-André
- Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
| |
|
18
|
Chiliński M, Sengupta K, Plewczynski D. From DNA human sequence to the chromatin higher order organisation and its biological meaning: Using biomolecular interaction networks to understand the influence of structural variation on spatial genome organisation and its functional effect. Semin Cell Dev Biol 2021; 121:171-185. [PMID: 34429265 DOI: 10.1016/j.semcdb.2021.08.007] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 02/19/2021] [Revised: 08/06/2021] [Accepted: 08/12/2021] [Indexed: 12/30/2022]
Abstract
The three-dimensional structure of the human genome has been proven to have a significant functional impact on gene expression. The high-order spatial chromatin is organised first by looping mediated by multiple protein factors, and then it is further formed into larger structures of topologically associated domains (TADs) or chromatin contact domains (CCDs), followed by A/B compartments and finally the chromosomal territories (CTs). The genetic variation observed in human population influences the multi-scale structures, posing a question regarding the functional impact of structural variants reflected by the variability of the genes expression patterns. The current methods of evaluating the functional effect include eQTLs analysis which uses statistical testing of influence of variants on spatially close genes. Rarely, non-coding DNA sequence changes are evaluated by their impact on the biomolecular interaction network (BIN) reflecting the cellular interactome that can be analysed by the classical graph-theoretic algorithms. Therefore, in the second part of the review, we introduce the concept of BIN, i.e. a meta-network model of the complete molecular interactome developed by integrating various biological networks. The BIN meta-network model includes DNA-protein binding by the plethora of protein factors as well as chromatin interactions, therefore allowing connection of genomics with the downstream biomolecular processes present in a cell. As an illustration, we scrutinise the chromatin interactions mediated by the CTCF protein detected in a ChIA-PET experiment in the human lymphoblastoid cell line GM12878. In the corresponding BIN meta-network the DNA spatial proximity is represented as a graph model, combined with the Proteins-Interaction Network (PIN) of human proteome using the Gene Association Network (GAN). Furthermore, we enriched the BIN with the signalling and metabolic pathways and Gene Ontology (GO) terms to assert its functional context. 
Finally, we mapped the Single Nucleotide Polymorphisms (SNPs) from the GWAS studies and identified the chromatin mutational hot-spots associated with a significant enrichment of SNPs related to autoimmune diseases. Afterwards, we mapped Structural Variants (SVs) from healthy individuals of 1000 Genomes Project and identified an interesting example of the missing protein complex associated with protein Q6GYQ0 due to a deletion on chromosome 14. Such an analysis using the meta-network BIN model is therefore helpful in evaluating the influence of genetic variation on spatial organisation of the genome and its functional effect in a cell.
Affiliation(s)
- Mateusz Chiliński
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Kaustav Sengupta
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
| | - Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland; Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland.
| |
|
19
|
Liyakat Ali TM, Brunet A, Collas P, Paulsen J. TAD cliques predict key features of chromatin organization. BMC Genomics 2021; 22:499. [PMID: 34217222 PMCID: PMC8254932 DOI: 10.1186/s12864-021-07815-8] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 11/05/2020] [Accepted: 06/17/2021] [Indexed: 01/03/2023] Open
Abstract
BACKGROUND Mechanisms underlying genome 3D organization and domain formation in the mammalian nucleus are not completely understood. Multiple processes such as transcriptional compartmentalization, DNA loop extrusion and interactions with the nuclear lamina dynamically act on chromatin at multiple levels. Here, we explore long-range interaction patterns between topologically associated domains (TADs) in several cell types.
RESULTS We find that TAD long-range interactions are connected to many key features of chromatin organization, including open and closed compartments, compaction and loop extrusion processes. Domains that form large TAD cliques tend to be repressive across cell types, when comparing gene expression, LINE/SINE repeat content and chromatin subcompartments. Further, TADs in large cliques are larger in genomic size, less dense and depleted of convergent CTCF motifs, in contrast to smaller and denser TADs formed by a loop extrusion process.
CONCLUSIONS Our results shed light on the organizational principles that govern repressive and active domains in the human genome.
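To make the notion of a TAD clique concrete, the sketch below enumerates maximal cliques in a small graph whose nodes are TADs and whose edges are long-range TAD-TAD interactions. It assumes that a clique means a set of TADs in which every pair interacts, and it uses the basic Bron-Kerbosch recursion as a stand-in; the abstract does not state which algorithm or size threshold the authors used, and the TAD names and the `maximal_cliques` helper are hypothetical.

```python
def maximal_cliques(adjacency):
    """Enumerate maximal cliques with the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # A clique is maximal when no node can extend it any further.
        if not candidates and not excluded:
            cliques.append(current)
            return
        for node in list(candidates):
            expand(
                current | {node},
                candidates & adjacency[node],
                excluded & adjacency[node],
            )
            candidates = candidates - {node}
            excluded = excluded | {node}

    expand(set(), set(adjacency), set())
    return cliques


# Toy TAD interaction graph (hypothetical): TAD1, TAD2 and TAD3 all interact
# with one another, and TAD4 interacts only with TAD3.
tad_graph = {
    "TAD1": {"TAD2", "TAD3"},
    "TAD2": {"TAD1", "TAD3"},
    "TAD3": {"TAD1", "TAD2", "TAD4"},
    "TAD4": {"TAD3"},
}
all_cliques = maximal_cliques(tad_graph)
# Assumed threshold for illustration: keep cliques of three or more TADs.
large_cliques = [clique for clique in all_cliques if len(clique) >= 3]
```

The basic recursion is exponential in the worst case, so for genome-scale TAD graphs a pivoting variant or a dedicated graph library would be a more practical choice.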
Affiliation(s)
- Tharvesh M Liyakat Ali
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Annaël Brunet
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway
| | - Philippe Collas
- Department of Molecular Medicine, Institute of Basic Medical Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway.
- Department of Immunology and Transfusion Medicine, Oslo University Hospital, Oslo, Norway.
| | - Jonas Paulsen
- Institute of Biosciences, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway.
| |
|
20
|
Halder AK, Denkiewicz M, Sengupta K, Basu S, Plewczynski D. Aggregated network centrality shows non-random structure of genomic and proteomic networks. Methods 2020; 181-182:5-14. [PMID: 31740366 DOI: 10.1016/j.ymeth.2019.11.006] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 04/16/2019] [Revised: 11/02/2019] [Accepted: 11/08/2019] [Indexed: 11/25/2022] Open
Abstract
Network analysis is a powerful tool for modelling biological systems. We propose a new approach that integrates genomic interaction data at the population level with proteomic interaction data. In our approach we use chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) data from the human genome to construct a set of genomic interaction networks, considering the natural partitioning of chromatin into chromatin contact domains (CCDs). The genomic networks are then mapped onto proteomic interactions to create protein-protein interaction (PPI) subnetworks. Furthermore, the network-based topological properties of these proteomic subnetworks are investigated, namely closeness centrality, betweenness centrality and clustering coefficient. We statistically confirm that networks identified by our method differ significantly from random networks in these network properties. Additionally, we identify one of the regions, namely chr6:32014923-33217929, as having an above-random concentration of single nucleotide polymorphisms (SNPs) related to autoimmune diseases. We then present it in the form of a meta-network, which includes multi-omic data: genomic contact sites (anchors), genes, proteins and SNPs. Using this example we demonstrate that the created networks provide a valid mapping of genes to SNPs, expanding on the raw SNP dataset used.
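Two of the topological properties named in this abstract, the clustering coefficient and closeness centrality, can be computed directly from an adjacency map. The sketch below uses their standard textbook definitions on an unweighted, undirected graph; the protein names and the toy subnetwork are hypothetical, betweenness centrality is omitted for brevity, and nothing here reproduces the authors' aggregation or randomisation procedure.

```python
from collections import deque


def clustering_coefficient(adjacency, node):
    """Fraction of pairs of neighbours of `node` that are themselves linked."""
    neighbours = list(adjacency[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbours[j] in adjacency[neighbours[i]]
    )
    return 2.0 * links / (k * (k - 1))


def closeness_centrality(adjacency, node):
    """Reachable nodes divided by the sum of shortest-path distances (BFS)."""
    distances = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbour in adjacency[current]:
            if neighbour not in distances:
                distances[neighbour] = distances[current] + 1
                queue.append(neighbour)
    total = sum(distances.values())
    return (len(distances) - 1) / total if total else 0.0


# Toy PPI subnetwork (hypothetical proteins): a triangle P1-P2-P3 plus P3-P4.
ppi = {
    "P1": {"P2", "P3"},
    "P2": {"P1", "P3"},
    "P3": {"P1", "P2", "P4"},
    "P4": {"P3"},
}
cc_p3 = clustering_coefficient(ppi, "P3")
close_p3 = closeness_centrality(ppi, "P3")
close_p4 = closeness_centrality(ppi, "P4")
```

In this toy graph P3 is adjacent to every other node, so its closeness is 1.0, while only one of the three pairs of its neighbours is linked, giving a clustering coefficient of 1/3.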
Affiliation(s)
- Anup Kumar Halder
- Centre of New Technologies, University of Warsaw, Warsaw, Poland; Department of Computer Science and Engineering, Jadavpur University, Kolkata, India.
| | - Michał Denkiewicz
- Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | - Kaustav Sengupta
- Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics, Informatics and Mechanics, University of Warsaw, Warsaw, Poland
| | - Subhadip Basu
- Department of Computer Science and Engineering, Jadavpur University, Kolkata, India.
| | - Dariusz Plewczynski
- Centre of New Technologies, University of Warsaw, Warsaw, Poland; Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland; Computer Science Department, University of California, 2063 Kemper Hall, One Shields Avenue, Davis, CA 95616-8562, United States.
| |
|
21
|
Sun J, Zhang Y, Wang M, Guan Q, Yang X, Ou JX, Yan M, Wang C, Zhang Y, Li ZH, Lan C, Mao C, Zhou HW, Hao B, Zhang Z. The Biological Significance of Multi-copy Regions and Their Impact on Variant Discovery. GENOMICS, PROTEOMICS & BIOINFORMATICS 2020; 18:516-524. [PMID: 32827758 PMCID: PMC8377240 DOI: 10.1016/j.gpb.2019.05.004] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/21/2019] [Revised: 05/07/2019] [Accepted: 06/06/2019] [Indexed: 11/23/2022]
Abstract
Identification of genetic variants via high-throughput sequencing (HTS) technologies has been essential for both fundamental and clinical studies. However, to what extent the genome sequence composition affects variant calling remains unclear. In this study, we identified 63,897 multi-copy sequences (MCSs) with a minimum length of 300 bp, each of which occurs at least twice in the human genome. The 151,749 genomic loci (multi-copy regions, or MCRs) harboring these MCSs account for 1.98% of the genome and are distributed unevenly across chromosomes. MCRs containing the same MCS tend to be located on the same chromosome. Gene Ontology (GO) analyses revealed that 3800 genes whose UTRs or exons overlap with MCRs are enriched for Golgi-related cellular component terms and various enzymatic activities in the GO biological function category. MCRs are also enriched for loci that are sensitive to neocarzinostatin-induced double-strand breaks. Moreover, genetic variants discovered by genome-wide association studies and recorded in dbSNP are significantly underrepresented in MCRs. Using simulated HTS datasets, we show that false variant discovery rates are significantly higher in MCRs than in other genomic regions. These results suggest that extra caution must be taken when identifying genetic variants in the MCRs via HTS technologies.
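The idea of a sequence that occurs at least twice in a genome can be illustrated with exact k-mer counting on a toy string. This is only a sketch under simplifying assumptions: it uses a fixed length of 4 rather than the 300 bp minimum mentioned in the abstract, considers one strand only, and requires exact matches. The `multi_copy_kmers` helper and the toy sequence are hypothetical and are not the method used in the study.

```python
from collections import defaultdict


def multi_copy_kmers(sequence, k):
    """Map each k-mer that occurs at least twice to its 0-based start positions."""
    positions = defaultdict(list)
    for start in range(len(sequence) - k + 1):
        positions[sequence[start:start + k]].append(start)
    return {kmer: starts for kmer, starts in positions.items() if len(starts) >= 2}


# Toy sequence (hypothetical); real analyses would scan whole chromosomes.
toy_genome = "ACGTTGCAACGTAGGCTTGCA"
repeats = multi_copy_kmers(toy_genome, 4)
# repeats == {"ACGT": [0, 8], "TTGC": [3, 16], "TGCA": [4, 17]}
```

Each start position here plays the role of a multi-copy region for its k-mer, which is why reads from such loci are ambiguous to place and variant calls inside them deserve extra scrutiny.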
Affiliation(s)
- Jing Sun
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China; Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou 510515, China; Center for Precision Medicine, Shunde Hospital of Southern Medical University, Foshan 528399, China
| | - Yanfang Zhang
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China; Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou 510515, China
| | - Minhui Wang
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
| | - Qian Guan
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China
| | - Xiujia Yang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou 510515, China
| | - Jin Xia Ou
- Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China
| | - Mingchen Yan
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Chengrui Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Yan Zhang
- Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China
| | - Zhi-Hao Li
- Division of Epidemiology, School of Public Health, Southern Medical University, Guangzhou 510515, China
| | - Chunhong Lan
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China; Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou 510515, China; Center for Precision Medicine, Shunde Hospital of Southern Medical University, Foshan 528399, China
- Chen Mao
- Division of Epidemiology, School of Public Health, Southern Medical University, Guangzhou 510515, China
- Hong-Wei Zhou
- Microbiome Medicine Center, Division of Laboratory Medicine, Zhujiang Hospital, Southern Medical University, Guangzhou 510282, China
- Bingtao Hao
- Center for Precision Medicine, Shunde Hospital of Southern Medical University, Foshan 528399, China.
- Zhenhai Zhang
- State Key Laboratory of Organ Failure Research, National Clinical Research Center for Kidney Disease, Division of Nephrology, Nanfang Hospital, Southern Medical University, Guangzhou 510515, China; Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University, Guangzhou 510515, China; Key Laboratory of Mental Health of the Ministry of Education, Guangdong-Hong Kong-Macao Greater Bay Area Center for Brain Science and Brain-Inspired Intelligence, Southern Medical University, Guangzhou 510515, China; Center for Precision Medicine, Shunde Hospital of Southern Medical University, Foshan 528399, China.
22
Chen Y, Wang Y, Liu X, Xu J, Zhang MQ. Model-based analysis of chromatin interactions from dCas9-Based CAPTURE-3C-seq. PLoS One 2020; 15:e0236666. [PMID: 32735574 PMCID: PMC7394367 DOI: 10.1371/journal.pone.0236666] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/26/2020] [Accepted: 07/10/2020] [Indexed: 11/18/2022]
Abstract
Deciphering long-range chromatin interactions is critical for understanding temporal and tissue-specific gene expression regulated by cis- and trans-acting factors. By combining the chromosome conformation capture (3C) and biotinylated dCas9 system, we previously established a method CAPTURE-3C-seq to unbiasedly identify high-resolution and locus-specific long-range DNA interactions. Here we present the statistical model and a flexible pipeline, C3S, for analysing CAPTURE-3C-seq or similar experimental data from raw sequencing reads to significantly interacting chromatin loci. C3S provides all steps for data processing, quality control and result illustration. It can automatically define the bin size based on the binding peak of the dCas9-targeted regions. Furthermore, it supports the analysis of intra- and inter-chromosomal interactions for different mammalian cell types. We successfully applied C3S across multiple datasets in human K562 cells and mouse embryonic stem cells (mESC) for detecting known and new chromatin interactions at multiple scales. Integrative and topological analysis of the interacted loci at the human β-globin gene cluster provides new insights into mechanisms in developmental gene regulation and network structure in local chromosomal architecture. Furthermore, computational results in mESCs reveal a role for chromatin interacting loops between enhancers and promoters in regulating alternative transcripts of the pluripotency gene OCT4.
Affiliation(s)
- Yong Chen
- Department of Molecular and Cellular Biosciences, Rowan University, Glassboro, New Jersey, United States of America
- Department of Biological Sciences, Center for Systems Biology, University of Texas, Dallas, Richardson, Texas, United States of America
- * E-mail: (YC); (MZ)
- Yunfei Wang
- Department of Melanoma Medical Oncology, University of Texas MD Anderson Cancer Center, Houston, Texas, United States of America
- Xin Liu
- Children’s Medical Center Research Institute, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Jian Xu
- Children’s Medical Center Research Institute, Department of Pediatrics, University of Texas Southwestern Medical Center, Dallas, Texas, United States of America
- Michael Q. Zhang
- Department of Biological Sciences, Center for Systems Biology, University of Texas, Dallas, Richardson, Texas, United States of America
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, China
- Bioinformatics Division and Center for Synthetic & Systems Biology, BNRist, Tsinghua University, Beijing, China
- Department of Automation, Tsinghua University, Beijing, China
- * E-mail: (YC); (MZ)
23
Dawson WK, Lazniewski M, Plewczynski D. Free energy-based model of CTCF-mediated chromatin looping in the human genome. Methods 2020; 181-182:35-51. [PMID: 32645447 DOI: 10.1016/j.ymeth.2020.05.025] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 05/25/2019] [Revised: 04/21/2020] [Accepted: 05/31/2020] [Indexed: 12/23/2022]
Abstract
In recent years, high-throughput techniques have revealed considerable structural organization of the human genome with diverse regions of the chromatin interacting with each other in the form of loops. Some of these loops are quite complex and may encompass regions comprised of many interacting chain segments around a central locus. Popular techniques for extracting this information are chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) and high-throughput chromosome conformation capture (Hi-C). Here, we introduce a physics-based method to predict the three-dimensional structure of chromatin from population-averaged ChIA-PET data. The approach uses experimentally-validated data from human B-lymphoblastoid cells to generate 2D meta-structures of chromatin using a dynamic programming algorithm that explores the chromatin free energy landscape. By generating both optimal and suboptimal meta-structures we can calculate both the free energy and the relative thermodynamic probability. A 3D structure prediction program with applied restraints can then be used to generate the tertiary structures. The main advantage of this approach for population-averaged experimental data is that it provides a way to distinguish between the principal and the spurious contacts. This study also finds that euchromatin appears to have rather precisely regulated 2D meta-structures compared to heterochromatin. The program source-code is available at https://github.com/plewczynski/looper.
Affiliation(s)
- Wayne K Dawson
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-089, Poland; Department of Biotechnology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo 103-8657, Japan
- Michal Lazniewski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-089, Poland; Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
- Dariusz Plewczynski
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, Warsaw 02-089, Poland; Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
24
Madrid-Mencía M, Raineri E, Cao T, Pancaldi V. Using GARDEN-NET and ChAseR to explore human haematopoietic 3D chromatin interaction networks. Nucleic Acids Res 2020; 48:4066-4080. [PMID: 32182345 PMCID: PMC7192625 DOI: 10.1093/nar/gkaa159] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.6] [Received: 09/12/2019] [Revised: 02/21/2020] [Accepted: 03/02/2020] [Indexed: 12/31/2022]
Abstract
We introduce an R package and a web-based visualization tool for the representation, analysis and integration of epigenomic data in the context of 3D chromatin interaction networks. GARDEN-NET allows for the projection of user-submitted genomic features on pre-loaded chromatin interaction networks, exploiting the functionalities of the ChAseR package to explore the features in combination with chromatin network topology properties. We demonstrate the approach using published epigenomic and chromatin structure datasets in haematopoietic cells, including a collection of gene expression, DNA methylation and histone modifications data in primary healthy myeloid cells from hundreds of individuals. These datasets allow us to test the robustness of chromatin assortativity, which highlights which epigenomic features, alone or in combination, are more strongly associated with 3D genome architecture. We find evidence for genomic regions with specific histone modifications, DNA methylation, and gene expression levels to be forming preferential contacts in 3D nuclear space, to a different extent depending on the cell type and lineage. Finally, we examine replication timing data and find it to be the genomic feature most strongly associated with overall 3D chromatin organization at multiple scales, consistent with previous results from the literature.
Affiliation(s)
- Miguel Madrid-Mencía
- Centre de Recherches en Cancérologie de Toulouse (CRCT), INSERM U1037, Toulouse 31037, France
- Université Paul Sabatier III, Toulouse 31400, Toulouse, France
- Barcelona Supercomputing Center, Barcelona 08034, Spain
- Emanuele Raineri
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Barcelona 08028, Spain
- Tran Bich Ngoc Cao
- Pharmacological, Medical and Agronomical Biotechnology Department, University of Science and Technology of Hanoi, 100000, Vietnam
- Vera Pancaldi
- Centre de Recherches en Cancérologie de Toulouse (CRCT), INSERM U1037, Toulouse 31037, France
- Université Paul Sabatier III, Toulouse 31400, Toulouse, France
- Barcelona Supercomputing Center, Barcelona 08034, Spain
25
Lee KS, Bang H, Choi JK, Kim K. Accelerated Evolution of the Regulatory Sequences of Brain Development in the Human Genome. Mol Cells 2020; 43:331-339. [PMID: 32235023 PMCID: PMC7191052 DOI: 10.14348/molcells.2020.2282] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.6] [Received: 11/20/2019] [Revised: 03/03/2020] [Accepted: 03/08/2020] [Indexed: 12/14/2022]
Abstract
Genetic modifications in noncoding regulatory regions are likely critical to human evolution. Human-accelerated noncoding elements are highly conserved noncoding regions among vertebrates but have large differences across humans, which implies human-specific regulatory potential. In this study, we found that human-accelerated noncoding elements were frequently coupled with DNase I hypersensitive sites (DHSs), together with monomethylated and trimethylated histone H3 lysine 4, which are active regulatory markers. This coupling was particularly pronounced in fetal brains relative to adult brains, non-brain fetal tissues, and embryonic stem cells. However, fetal brain DHSs were also specifically enriched in deeply conserved sequences, implying coexistence of universal maintenance and human-specific fitness in human brain development. We assessed whether this coexisting pattern was a general one by quantitatively measuring evolutionary rates of DHSs. As a result, fetal brain DHSs showed a mixed but distinct signature of regional conservation and outlier point acceleration as compared to other DHSs. This finding suggests that brain developmental sequences are selectively constrained in general, whereas specific nucleotides are under positive selection or constraint relaxation simultaneously. Hence, we hypothesize that human- or primate-specific changes to universally conserved regulatory codes of brain development may drive the accelerated, and most likely adaptive, evolution of the regulatory network of the human brain.
Affiliation(s)
- Kang Seon Lee
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 344, Korea
- Hyoeun Bang
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 344, Korea
- Jung Kyoon Choi
- Department of Bio and Brain Engineering, Korea Advanced Institute of Science and Technology (KAIST), Daejeon 344, Korea
- Kwoneel Kim
- Department of Biology, Kyung Hee University, Seoul 0447, Korea
26
Hemerich D, Pei J, Harakalova M, van Setten J, Boymans S, Boukens BJ, Efimov IR, Michels M, van der Velden J, Vink A, Cheng C, van der Harst P, Moore JH, Mokry M, Tragante V, Asselbergs FW. Integrative Functional Annotation of 52 Genetic Loci Influencing Myocardial Mass Identifies Candidate Regulatory Variants and Target Genes. Circulation: Genomic and Precision Medicine 2020; 12:e002328. [PMID: 30681347 DOI: 10.1161/circgen.118.002328] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Indexed: 11/16/2022]
Abstract
BACKGROUND Regulatory elements may be involved in the mechanisms by which 52 loci influence myocardial mass, reflected by abnormal amplitude and duration of the QRS complex on the ECG. Functional annotation thus far did not take into account how these elements are affected in disease context. METHODS We generated maps of regulatory elements on hypertrophic cardiomyopathy patients (ChIP-seq N=14 and RNA-seq N=11) and nondiseased hearts (ChIP-seq N=4 and RNA-seq N=11). We tested enrichment of QRS-associated loci on elements differentially acetylated and directly regulating differentially expressed genes between hypertrophic cardiomyopathy patients and controls. We further performed functional annotation on QRS-associated loci using these maps of differentially active regulatory elements. RESULTS Regions differentially affected in disease showed a stronger enrichment (P=8.6×10⁻⁵) for QRS-associated variants than those not showing differential activity (P=0.01). Promoters of genes differentially regulated between hypertrophic cardiomyopathy patients and controls showed more enrichment (P=0.001) than differentially acetylated enhancers (P=0.8) and super-enhancers (P=0.025). We also identified 74 potential causal variants overlapping these differential regulatory elements. Eighteen of the genes mapped confirmed previous findings, now also pinpointing the potentially affected regulatory elements and candidate causal variants. Fourteen new genes were also mapped. CONCLUSIONS Our results suggest differentially active regulatory elements between hypertrophic cardiomyopathy patients and controls can offer more insights into the mechanisms of QRS-associated loci than elements not affected by disease.
Affiliation(s)
- Daiane Hemerich
- Department of Cardiology (D.H., M.H., J.v.S., V.T., F.W.A.), UMC Utrecht, Utrecht University, The Netherlands; CAPES Foundation, Ministry of Education of Brazil, Brasília (D.H.)
- Jiayi Pei
- CAPES Foundation, Ministry of Education of Brazil, Brasília (D.H.); Department of Nephrology and Hypertension (J.P., C.C.), UMC Utrecht
- Magdalena Harakalova
- Department of Cardiology (D.H., M.H., J.v.S., V.T., F.W.A.), UMC Utrecht, Utrecht University, The Netherlands
- Jessica van Setten
- Department of Cardiology (D.H., M.H., J.v.S., V.T., F.W.A.), UMC Utrecht, Utrecht University, The Netherlands
- Sander Boymans
- Department of Genetics, Center for Molecular Medicine, Cancer Genomics Netherlands (S.B.), UMC Utrecht
- Bas J Boukens
- Department of Medical Biology, Academic Medical Center, Amsterdam, The Netherlands (B.J.B.)
- Igor R Efimov
- Department of Biomedical Engineering, The George Washington University, Washington, DC (I.R.E.)
- Michelle Michels
- Department of Cardiology, Erasmus MC, Rotterdam, The Netherlands (M. Michels)
- Jolanda van der Velden
- Department of Physiology, Amsterdam Cardiovascular Sciences, Amsterdam University Medical Center (J.v.d.V.)
- Aryan Vink
- Department of Pathology (A.V.), UMC Utrecht, Utrecht University, The Netherlands
- Caroline Cheng
- Department of Nephrology and Hypertension (J.P., C.C.), UMC Utrecht
- Jason H Moore
- Department of Biostatistics and Epidemiology, Institute for Biomedical Informatics, University of Pennsylvania, PA (J.H.M.)
- Michal Mokry
- Department of Pediatrics, Wilhelmina Children's Hospital, Utrecht (M. Mokry)
- Vinicius Tragante
- Department of Cardiology (D.H., M.H., J.v.S., V.T., F.W.A.), UMC Utrecht, Utrecht University, The Netherlands
- Folkert W Asselbergs
- Department of Cardiology (D.H., M.H., J.v.S., V.T., F.W.A.), UMC Utrecht, Utrecht University, The Netherlands; Durrer Center for Cardiogenetic Research, ICIN-Netherlands Heart Institute, Utrecht (F.W.A.); Institute of Cardiovascular Science, Faculty of Population Health Sciences (F.W.A.), University College London, United Kingdom; Health Data Research UK London, Institute of Health Informatics (F.W.A.), University College London, United Kingdom
27
Morer I, Cardillo A, Díaz-Guilera A, Prignano L, Lozano S. Comparing spatial networks: A one-size-fits-all efficiency-driven approach. Phys Rev E 2020; 101:042301. [PMID: 32422764 DOI: 10.1103/physreve.101.042301] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 04/25/2019] [Accepted: 03/03/2020] [Indexed: 11/07/2022]
Abstract
Spatial networks are a powerful framework for studying a large variety of systems belonging to a broad diversity of contexts: from transportation to biology, from epidemiology to communications, and migrations, to cite a few. Spatial networks can be described in terms of their total cost (i.e., the total amount of resources needed for building or traveling their connections). Here, we address the issue of how to gauge and compare the quality of spatial network designs (i.e., efficiency vs. total cost) by proposing a two-step methodology. First, we assess the network's design by introducing a quality function based on the concept of network's efficiency. Second, we propose an algorithm to estimate computationally the upper bound of our quality function for a given network. Complementarily, we provide a universal expression to obtain an approximated upper bound to any spatial network, regardless of its size. Smaller differences between the upper bound and the empirical value correspond to better designs. Finally, we test the applicability of this analytic tool set on spatial network data-sets of different nature.
Affiliation(s)
- Ignacio Morer
- Departament de Fisica de la Matèria Condensada, Universitat de Barcelona, Barcelona, Spain; Universitat de Barcelona Institute of Complex Systems (UBICS) Universitat de Barcelona, Barcelona, Spain
- Alessio Cardillo
- Institut Català de Paleoecologia Humana i Evolució Social (IPHES), E-43007 Tarragona, Spain; Department of Engineering Mathematics, University of Bristol, Bristol, BS8 1UB, United Kingdom; Department of Computer Science and Mathematics, Universitat Rovira i Virgili, E-43007 Tarragona, Spain; GOTHAM Lab - Institute for Biocomputation and Physics of Complex Systems (BIFI), University of Zaragoza, E-50018 Zaragoza, Spain
- Albert Díaz-Guilera
- Departament de Fisica de la Matèria Condensada, Universitat de Barcelona, Barcelona, Spain; Universitat de Barcelona Institute of Complex Systems (UBICS) Universitat de Barcelona, Barcelona, Spain
- Luce Prignano
- Departament de Fisica de la Matèria Condensada, Universitat de Barcelona, Barcelona, Spain; Universitat de Barcelona Institute of Complex Systems (UBICS) Universitat de Barcelona, Barcelona, Spain
- Sergi Lozano
- Universitat de Barcelona Institute of Complex Systems (UBICS) Universitat de Barcelona, Barcelona, Spain; Institut Català de Paleoecologia Humana i Evolució Social (IPHES), E-43007 Tarragona, Spain; Àrea de Prehistòria, Universitat Rovira i Virgili, Tarragona, Spain; Departament d'Història Econòmica, Institucions, Política i Economia Mundial, Universitat de Barcelona, Barcelona, Spain
28
Erenpreisa J, Giuliani A. Resolution of Complex Issues in Genome Regulation and Cancer Requires Non-Linear and Network-Based Thermodynamics. Int J Mol Sci 2019; 21:E240. [PMID: 31905791 PMCID: PMC6981914 DOI: 10.3390/ijms21010240] [Citation(s) in RCA: 14] [Impact Index Per Article: 2.3] [Received: 12/03/2019] [Revised: 12/22/2019] [Accepted: 12/27/2019] [Indexed: 02/06/2023]
Abstract
The apparent lack of success in curing cancer that was evidenced in the last four decades of molecular medicine indicates the need for a global re-thinking of both its nature and the biological approaches that we are taking to its solution. The reductionist, one gene/one protein method that has served us well until now, and that still dominates in biomedicine, requires complementation with a more systemic/holistic approach, to address the huge problem of cross-talk between more than 20,000 protein-coding genes, about 100,000 protein types, and the multiple layers of biological organization. In this perspective, the relationship between the chromatin network organization and gene expression regulation plays a fundamental role. The elucidation of such a relationship requires a non-linear thermodynamics approach to these biological systems. This change of perspective is a necessary step for developing successful 'tumour-reversion' therapeutic strategies.
Affiliation(s)
- Jekaterina Erenpreisa
- Cancer Research Division, Latvian Biomedicine Research and Study Centre, LV1067 Riga, Latvia
- Alessandro Giuliani
- Environmental and Health Department, Istituto Superiore di Sanità, 00161 Rome, Italy
29
The first enhancer in an enhancer chain safeguards subsequent enhancer-promoter contacts from a distance. Genome Biol 2019; 20:197. [PMID: 31514731 PMCID: PMC6739990 DOI: 10.1186/s13059-019-1808-y] [Citation(s) in RCA: 17] [Impact Index Per Article: 2.8] [Received: 07/12/2019] [Accepted: 09/02/2019] [Indexed: 01/01/2023]
Abstract
Background Robustness and evolutionary stability of gene expression in the human genome are established by an array of redundant enhancers. Results Using Hi-C data in multiple cell lines, we report a comprehensive map of promoters and active enhancers connected by chromatin contacts, spanning 9000 enhancer chains in 4 human cell lines associated with 2600 human genes. We find that the first enhancer in a chain that directly contacts the target promoter is commonly located at a greater genomic distance from the promoter than the second enhancer in a chain, 96 kb vs. 45 kb, respectively. The first enhancer also features higher similarity to the promoter in terms of tissue specificity and higher enrichment of loop factors, suggestive of a stable primary contact with the promoter. In contrast, a chain of enhancers which connects to the target promoter through a neutral DNA segment instead of an enhancer is associated with a significant decrease in target gene expression, suggesting an important role of the first enhancer in initiating transcription using the target promoter and bridging the promoter with other regulatory elements in the locus. Conclusions The widespread chained structure of gene enhancers in humans reveals that the primary, critical enhancer is distal, commonly located further away than other enhancers. This first, distal enhancer establishes contacts with multiple regulatory elements and safeguards a complex regulatory program of its target gene.
30
Ramirez RN, Bedirian K, Gray SM, Diallo A. DNA Rchitect: an R based visualizer for network analysis of chromatin interaction data. Bioinformatics 2019; 36:644-646. [PMID: 31373608 PMCID: PMC7867998 DOI: 10.1093/bioinformatics/btz608] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 11/27/2018] [Revised: 07/30/2019] [Accepted: 08/01/2019] [Indexed: 02/06/2023]
Abstract
MOTIVATION Visualization of multiple genomic data generally requires the use of public or commercially hosted browsers. Flexible visualization of chromatin interaction data as genomic features and network components offers informative insights into gene expression. An open source application for visualizing HiC and chromatin conformation-based data as 2D-arcs accompanied by interactive network analyses is valuable. RESULTS DNA Rchitect is a new tool created to visualize HiC and chromatin conformation-based contacts at high (Kb) and low (Mb) genomic resolutions. The user can upload their pre-filtered HiC experiment in bedpe format to the DNA Rchitect web app that we have hosted or to a version they themselves have deployed. DNA Rchitect lets the user visualize different interactions in the uploaded sample and perform simple network analyses, while also offering visualization of other genomic data types. The user can then download their results for additional network functionality offered in network based programs such as Cytoscape. AVAILABILITY AND IMPLEMENTATION DNA Rchitect is freely available both as a web application written primarily in R available at http://shiny.immgen.org/DNARchitect/ and as open source software released under an MIT license at: https://github.com/alosdiallo/DNA_Rchitect.
Affiliation(s)
- S M Gray
- Department of Immunology, Harvard Medical School, Boston, MA, 02115, USA
- A Diallo
- To whom correspondence should be addressed.
31
Fischer P, Chen H, Pacho F, Rieder D, Kimmel RA, Meyer D. FoxH1 represses miR-430 during early embryonic development of zebrafish via non-canonical regulation. BMC Biol 2019; 17:61. [PMID: 31362746 PMCID: PMC6664792 DOI: 10.1186/s12915-019-0683-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 04/15/2019] [Accepted: 07/19/2019] [Indexed: 12/19/2022]
Abstract
BACKGROUND FoxH1 is a forkhead transcription factor with conserved key functions in vertebrate mesoderm induction and left-right patterning downstream of the TGF-beta/Nodal signaling pathway. Binding of the forkhead domain (FHD) of FoxH1 to a highly conserved proximal sequence motif was shown to regulate target gene expression. RESULTS We identify the conserved microRNA-430 family (miR-430) as a novel target of FoxH1. miR-430 levels are increased in foxH1 mutants, resulting in a reduced expression of transcripts that are targeted by miR-430 for degradation. To determine the underlying mechanism of miR-430 repression, we performed chromatin immunoprecipitation studies and overexpression experiments with mutant as well as constitutive active and repressive forms of FoxH1. Our studies reveal a molecular interaction of FoxH1 with miR-430 loci independent of the FHD. Furthermore, we show that previously described mutant forms of FoxH1 that disrupt DNA binding or that lack the C-terminal Smad Interaction Domain (SID) dominantly interfere with miR-430 repression, but not with the regulation of previously described FoxH1 targets. CONCLUSIONS We were able to identify the distinct roles of protein domains of FoxH1 in the regulation process of miR-430. We provide evidence that the indirect repression of miR-430 loci depends on the connection to a distal repressive chromosome environment via a non-canonical mode. The widespread distribution of such non-canonical binding sites of FoxH1, found not only in our study, argues against a function restricted to regulating miR-430 and for a more global role of FoxH1 in chromatin folding.
Affiliation(s)
- Patrick Fischer
- Institute of Molecular Biology/CMBI, University of Innsbruck, Technikerstrasse 25, 6020, Innsbruck, Austria
- Hao Chen
- Institute of Molecular Biology/CMBI, University of Innsbruck, Technikerstrasse 25, 6020, Innsbruck, Austria
- Frederic Pacho
- Institute of Molecular Biology/CMBI, University of Innsbruck, Technikerstrasse 25, 6020, Innsbruck, Austria
- Dietmar Rieder
- Division of Bioinformatics, Biocenter, Innsbruck Medical University, Innrain 80, 6020, Innsbruck, Austria
- Robin A Kimmel
- Institute of Molecular Biology/CMBI, University of Innsbruck, Technikerstrasse 25, 6020, Innsbruck, Austria
- Dirk Meyer
- Institute of Molecular Biology/CMBI, University of Innsbruck, Technikerstrasse 25, 6020, Innsbruck, Austria
32
A subset of topologically associating domains fold into mesoscale core-periphery networks. Sci Rep 2019; 9:9526. [PMID: 31266973 PMCID: PMC6606598 DOI: 10.1038/s41598-019-45457-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 08/17/2018] [Accepted: 06/07/2019] [Indexed: 12/21/2022]
Abstract
Mammalian genomes are folded into a hierarchy of compartments, topologically associating domains (TADs), subTADs, and long-range looping interactions. The higher-order folding patterns of chromatin contacts within TADs and how they localize to disease-associated single nucleotide variants (daSNVs) remain an open area of investigation. Here, we analyze high-resolution Hi-C data with graph theory to understand possible mesoscale network architecture within chromatin domains. We identify a subset of TADs exhibiting strong core-periphery mesoscale structure in embryonic stem cells, neural progenitor cells, and cortical neurons. Hyper-connected core nodes co-localize with genomic segments engaged in multiple looping interactions and enriched for occupancy of the architectural protein CCCTC binding protein (CTCF). CTCF knockdown and in silico deletion of CTCF-bound core nodes disrupt core-periphery structure, whereas in silico mutation of cell type-specific enhancer or gene nodes has a negligible effect. Importantly, neuropsychiatric daSNVs are significantly more likely to localize with TADs folded into core-periphery networks compared to domains devoid of such structure. Together, our results reveal that a subset of TADs encompasses looping interactions connected into a core-periphery mesoscale network. We hypothesize that daSNVs in the periphery of genome folding networks might preserve global nuclear architecture but cause local topological and functional disruptions contributing to human disease. By contrast, daSNVs co-localized with hyper-connected core nodes might cause severe topological and functional disruptions. Overall, these findings shed new light on the mesoscale network structure of fine scale genome folding within chromatin domains and its link to common genetic variants in human disease.
33
Long-range interactions between proximal and distal regulatory regions in maize. Nat Commun 2019; 10:2633. [PMID: 31201330 PMCID: PMC6572780 DOI: 10.1038/s41467-019-10603-4] [Citation(s) in RCA: 77] [Impact Index Per Article: 12.8] [Received: 06/22/2018] [Accepted: 05/20/2019] [Indexed: 12/30/2022]
Abstract
Long-range chromatin interactions are important for transcriptional regulation of genes, many of which are related to complex agronomic traits. However, the pattern of three-dimensional chromatin interactions remains unclear in plants. Here we report the generation of chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) data and the construction of extensive H3K4me3- and H3K27ac-centered chromatin interaction maps in maize. Results show that the interacting patterns between proximal and distal regulatory regions of genes are highly complex and dynamic. Genes with chromatin interactions have higher expression levels than those without interactions. Genes with proximal–proximal interactions tend to be transcriptionally coordinated. Tissue-specific proximal–distal interactions are associated with tissue-specific expression of genes. Interactions between proximal and distal regulatory regions further interweave into organized network communities that are enriched in specific biological functions. The high-resolution chromatin interaction maps will help to understand the transcriptional regulation of genes associated with complex agronomic traits of maize. Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) can discover specific protein-centered chromatin interactions at high resolution. Here, the authors use ChIA-PET to reveal the complex and dynamic interactions between proximal and distal regulatory regions of genes in maize.
|
34
|
Piro RM, Marsico A. Network-Based Methods and Other Approaches for Predicting lncRNA Functions and Disease Associations. Methods Mol Biol 2019; 1912:301-321. [PMID: 30635899 DOI: 10.1007/978-1-4939-8982-9_12] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 12/14/2022]
Abstract
The discovery that a considerable portion of eukaryotic genomes is transcribed and gives rise to long noncoding RNAs (lncRNAs) provides an important new perspective on the transcriptome and raises questions about the centrality of these lncRNAs in gene-regulatory processes and diseases. The rapidly increasing number of mechanistically investigated lncRNAs has provided evidence for distinct functional classes, such as enhancer-like lncRNAs, which modulate gene expression via chromatin looping, and noncoding competing endogenous RNAs (ceRNAs), which act as microRNA decoys. Despite great progress in recent years, the majority of lncRNAs are functionally uncharacterized and their implications for disease biogenesis and progression are unknown. Here, we summarize recent developments in lncRNA function prediction in general and lncRNA-disease associations in particular, with emphasis on in silico methods based on network analysis and on ceRNA function prediction. We believe that such computational techniques provide a valuable aid for prioritizing functional or disease-relevant lncRNAs for targeted experimental follow-up studies.
Affiliation(s)
- Rosario Michael Piro
- Institut für Informatik, Freie Universität Berlin, Berlin, Germany; Institut für Medizinische Genetik und Humangenetik, Charité-Universitätsmedizin Berlin, Berlin, Germany
| | - Annalisa Marsico
- Institut für Informatik, Freie Universität Berlin, Berlin, Germany; Max-Planck-Institut für molekulare Genetik, Berlin, Germany.
| |
|
35
|
Fu S, Zhang L, Lv J, Zhu B, Wang W, Wang X. Two main stream methods analysis and visual 3D genome architecture. Semin Cell Dev Biol 2019; 90:43-53. [DOI: 10.1016/j.semcdb.2018.07.011] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 07/10/2018] [Accepted: 07/10/2018] [Indexed: 01/07/2023]
|
36
|
Thiel D, Conrad ND, Ntini E, Peschutter RX, Siebert H, Marsico A. Identifying lncRNA-mediated regulatory modules via ChIA-PET network analysis. BMC Bioinformatics 2019; 20:292. [PMID: 31142264 PMCID: PMC6540383 DOI: 10.1186/s12859-019-2900-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 04/29/2019] [Accepted: 05/13/2019] [Indexed: 12/12/2022] Open
Abstract
Background: Although several studies have provided insights into the role of long non-coding RNAs (lncRNAs), the majority of lncRNAs have unknown function. Recent evidence has shown the importance of both lncRNAs and chromatin interactions in transcriptional regulation. Although network-based methods, mainly exploiting gene-lncRNA co-expression, have been applied to characterize lncRNAs of unknown function by means of 'guilt-by-association', no strategy exists so far that identifies mRNA-lncRNA functional modules based on the 3D chromatin interaction graph. Results: To better understand the function of chromatin interactions in the context of lncRNA-mediated gene regulation, we have developed a multi-step graph analysis approach to examine the RNA polymerase II ChIA-PET chromatin interaction network in the K562 human cell line. We have annotated the network with gene and lncRNA coordinates, and chromatin states from the ENCODE project. We used centrality measures, as well as an adaptation of our previously developed Markov State Models (MSM) clustering method, to gain a better understanding of lncRNAs in transcriptional regulation. The novelty of our approach resides in the detection of fuzzy regulatory modules based on network properties and their optimization based on co-expression analysis between genes and gene-lncRNA pairs. As a result, our method returns more bona fide regulatory modules than other state-of-the-art approaches for clustering on graphs. Conclusions: Interestingly, we find that lncRNA network hubs tend to be significantly enriched in evolutionarily conserved lncRNAs and enhancer-like functions. We validated regulatory functions for well-known lncRNAs, such as MALAT1 and the enhancer-like lncRNA FALEC. In addition, by investigating the modular structure of bigger components, we mine putative regulatory functions for uncharacterized lncRNAs.
Affiliation(s)
- Denise Thiel
- Max Planck Institute for Molecular Genetics, Berlin, Ihnestraße 63-73, Berlin, 14195, Germany
| | | | - Evgenia Ntini
- Max Planck Institute for Molecular Genetics, Berlin, Ihnestraße 63-73, Berlin, 14195, Germany; Department of Mathematics and Informatics, Freie Universität, Berlin, Arnimallee 7, Berlin, 14195, Germany
| | - Ria X Peschutter
- Max Planck Institute for Molecular Genetics, Berlin, Ihnestraße 63-73, Berlin, 14195, Germany
| | - Heike Siebert
- Department of Mathematics and Informatics, Freie Universität, Berlin, Arnimallee 7, Berlin, 14195, Germany
| | - Annalisa Marsico
- Max Planck Institute for Molecular Genetics, Berlin, Ihnestraße 63-73, Berlin, 14195, Germany; Department of Mathematics and Informatics, Freie Universität, Berlin, Arnimallee 7, Berlin, 14195, Germany; Institute of Computational Biology (ICB), Helmholtz Zentrum München, Ingolstädter Landstraße 1, Oberschleißheim, 85764, Germany.
| |
|
37
|
Ben Zouari Y, Molitor AM, Sikorska N, Pancaldi V, Sexton T. ChiCMaxima: a robust and simple pipeline for detection and visualization of chromatin looping in Capture Hi-C. Genome Biol 2019; 20:102. [PMID: 31118054 PMCID: PMC6532271 DOI: 10.1186/s13059-019-1706-3] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 10/15/2018] [Accepted: 05/03/2019] [Indexed: 12/19/2022] Open
Abstract
Capture Hi-C (CHi-C) is a new technique for assessing genome organization based on chromosome conformation capture coupled to oligonucleotide capture of regions of interest, such as gene promoters. Chromatin loop detection is challenging because existing Hi-C/4C-like tools, which make different assumptions about the technical biases presented, are often unsuitable. We describe a new approach, ChiCMaxima, which uses local maxima combined with limited filtering to detect DNA looping interactions, integrating information from biological replicates. ChiCMaxima shows more stringency and robustness compared to previously developed tools. The tool includes a GUI browser for flexible visualization of CHi-C profiles alongside epigenomic tracks.
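The general idea of calling loops as local maxima that recur across biological replicates can be illustrated with a minimal Python sketch. This is not the ChiCMaxima implementation: the moving-average smoother, the function names (`moving_average`, `local_maxima`, `reproducible_peaks`), the window sizes, the height threshold, and the toy profiles are all assumptions made for illustration.

```python
from typing import List


def moving_average(signal: List[float], half_width: int) -> List[float]:
    """Smooth a profile with a centered moving average (window shrinks at the edges)."""
    smoothed = []
    for i in range(len(signal)):
        lo = max(0, i - half_width)
        hi = min(len(signal), i + half_width + 1)
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed


def local_maxima(signal: List[float], window: int, min_height: float) -> List[int]:
    """Return bins that are strict maxima within +/- window bins and reach min_height."""
    peaks = []
    for i, value in enumerate(signal):
        if value < min_height:
            continue
        lo = max(0, i - window)
        hi = min(len(signal), i + window + 1)
        neighbours = signal[lo:i] + signal[i + 1:hi]
        if all(value > other for other in neighbours):
            peaks.append(i)
    return peaks


def reproducible_peaks(peaks_a: List[int], peaks_b: List[int], tolerance: int) -> List[int]:
    """Keep peaks of replicate A that have a peak in replicate B within `tolerance` bins."""
    return [p for p in peaks_a if any(abs(p - q) <= tolerance for q in peaks_b)]


# Hypothetical read counts per bin around one bait, for two replicates.
rep1 = [1, 1, 2, 8, 2, 1, 1, 2, 6, 2, 1, 1]
rep2 = [1, 1, 1, 2, 9, 2, 1, 1, 1, 1, 1, 1]

peaks1 = local_maxima(moving_average(rep1, 1), window=2, min_height=2.0)
peaks2 = local_maxima(moving_average(rep2, 1), window=2, min_height=2.0)
print(peaks1)                                   # [3, 8]
print(peaks2)                                   # [4]
print(reproducible_peaks(peaks1, peaks2, 1))    # [3]
```

In this toy case the second peak of the first replicate (bin 8) has no counterpart in the second replicate and is dropped, which is the kind of stringency that replicate integration is meant to provide.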
Affiliation(s)
- Yousra Ben Zouari
- Institute of Genetics and Molecular and Cellular Biology (IGBMC), Illkirch, France
- CNRS UMR7104, Illkirch, France
- INSERM U1258, Illkirch, France
- University of Strasbourg, Illkirch, France
| | - Anne M Molitor
- Institute of Genetics and Molecular and Cellular Biology (IGBMC), Illkirch, France
- CNRS UMR7104, Illkirch, France
- INSERM U1258, Illkirch, France
- University of Strasbourg, Illkirch, France
| | - Natalia Sikorska
- Institute of Genetics and Molecular and Cellular Biology (IGBMC), Illkirch, France
- CNRS UMR7104, Illkirch, France
- INSERM U1258, Illkirch, France
- University of Strasbourg, Illkirch, France
| | - Vera Pancaldi
- Centre de Recherches en Cancérologie de Toulouse (CRCT), INSERM U1037, Toulouse, France
- University Paul Sabatier III, Toulouse, France
- Barcelona Supercomputing Center, Barcelona, Spain
| | - Tom Sexton
- Institute of Genetics and Molecular and Cellular Biology (IGBMC), Illkirch, France.
- CNRS UMR7104, Illkirch, France.
- INSERM U1258, Illkirch, France.
- University of Strasbourg, Illkirch, France.
| |
|
38
|
Tan ZW, Guarnera E, Berezovsky IN. Exploring chromatin hierarchical organization via Markov State Modelling. PLoS Comput Biol 2018; 14:e1006686. [PMID: 30596637 PMCID: PMC6355033 DOI: 10.1371/journal.pcbi.1006686] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 05/15/2018] [Revised: 01/31/2019] [Accepted: 11/27/2018] [Indexed: 01/02/2023] Open
Abstract
We propose a new computational method for exploring chromatin structural organization based on Markov State Modelling of Hi-C data represented as an interaction network between genomic loci. A Markov process describes the random walk of a traveling probe in the corresponding energy landscape, mimicking the motion of a biomolecule involved in chromatin function. By studying the metastability of the associated Markov State Model upon annealing, the hierarchical structure of individual chromosomes is observed, and the corresponding set of structural partitions is identified at each level of the hierarchy. Then, the notion of effective interaction between partitions is derived, delineating the overall topology and architecture of chromosomes. Mapping epigenetic data on the graphs of intra-chromosomal effective interactions helps in understanding how chromosome organization facilitates its function. A sketch of whole-genome interactions obtained from the analysis of 539 partitions from all 23 chromosomes, complemented by distributions of gene expression regulators and epigenetic factors, sheds light on the structure-function relationships in chromatin, delineating chromosomal territories, as well as structural partitions analogous to topologically associating domains and active/passive epigenomic compartments. In addition to the overall genome architecture shown by effective interactions, the affinity between partitions of different chromosomes was analyzed as an indicator of the degree of association between partitions in functionally relevant genomic interactions. The overall static picture of whole-genome interactions obtained with the method presented in this work provides a foundation for chromatin structural reconstruction, for the modelling of chromatin dynamics, and for exploring the regulation of genome function. The algorithms used in this study are implemented in a freely available Python package ChromaWalker (https://bitbucket.org/ZhenWahTan/chromawalker).
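The basic ingredient of such an approach, a random walk over loci whose transition probabilities come from contact frequencies, can be sketched in a few lines of Python. This is only an illustrative toy and does not use or reproduce ChromaWalker: the row-normalisation rule, the function names, the power-iteration scheme, and the 3-locus contact matrix are assumptions. The sketch relies on a standard property of random walks on symmetric weighted graphs, namely that the stationary probability of a locus is proportional to its total contact weight.

```python
def transition_matrix(contacts):
    """Row-normalise a contact matrix into random-walk transition probabilities."""
    matrix = []
    for row in contacts:
        total = sum(row)
        if total == 0:
            raise ValueError("every locus needs at least one contact")
        matrix.append([value / total for value in row])
    return matrix


def step(distribution, matrix):
    """Advance a probability distribution over loci by one step of the walk."""
    n = len(matrix)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def stationary_distribution(matrix, iterations=200):
    """Approximate the stationary distribution by repeated application of the walk."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(iterations):
        dist = step(dist, matrix)
    return dist


# Hypothetical symmetric contact counts between three loci (diagonal = self-contacts).
contacts = [
    [4, 2, 0],
    [2, 2, 2],
    [0, 2, 6],
]

P = transition_matrix(contacts)
pi = stationary_distribution(P)

# For a symmetric contact matrix the stationary probability of each locus
# equals its row sum divided by the sum of all entries: 6/20, 6/20, 8/20.
expected = [sum(row) / sum(map(sum, contacts)) for row in contacts]
print([round(p, 6) for p in pi])
print(expected)
```

Metastability analysis of the kind described in the abstract would build on the spectrum of such a transition matrix; that step is beyond this sketch.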
Affiliation(s)
- Zhen Wah Tan
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Matrix, Singapore
| | - Enrico Guarnera
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Matrix, Singapore
| | - Igor N. Berezovsky
- Bioinformatics Institute (BII), Agency for Science, Technology and Research (A*STAR), Matrix, Singapore
- Department of Biological Sciences (DBS), National University of Singapore (NUS), Singapore
| |
|
39
|
Kai Y, Andricovich J, Zeng Z, Zhu J, Tzatsos A, Peng W. Predicting CTCF-mediated chromatin interactions by integrating genomic and epigenomic features. Nat Commun 2018; 9:4221. [PMID: 30310060 PMCID: PMC6181989 DOI: 10.1038/s41467-018-06664-6] [Citation(s) in RCA: 40] [Impact Index Per Article: 5.7] [Received: 11/30/2017] [Accepted: 09/17/2018] [Indexed: 01/27/2023] Open
Abstract
The CCCTC-binding zinc-finger protein (CTCF)-mediated network of long-range chromatin interactions is important for genome organization and function. Although this network has been considered largely invariant, we find that it exhibits extensive cell-type-specific interactions that contribute to cell identity. Here, we present Lollipop, a machine-learning framework, which predicts CTCF-mediated long-range interactions using genomic and epigenomic features. Using ChIA-PET data as a benchmark, we demonstrate that Lollipop accurately predicts CTCF-mediated chromatin interactions both within and across cell types, and outperforms other methods based only on CTCF motif orientation. Predictions are confirmed computationally and experimentally by Chromatin Conformation Capture (3C). Moreover, our approach identifies other determinants of CTCF-mediated chromatin wiring, such as gene expression within the loops. Our study contributes to a better understanding of the underlying principles of CTCF-mediated chromatin interactions and their impact on gene expression. CTCF mediates long-range chromatin interactions which are important for genome organization and function. Here, the authors demonstrate that the CTCF-mediated interactome exhibits extensive plasticity and present Lollipop, a machine-learning framework which predicts CTCF-mediated long-range interactions using genomic and epigenomic features.
Affiliation(s)
- Yan Kai
- Department of Physics, George Washington University (GWU), Washington, DC, 20052, USA; Department of Anatomy and Cell Biology, Cancer Epigenetics Laboratory, GWU, Washington, DC, 20052, USA; GWU Cancer Center, GWU School of Medicine and Health Sciences, Washington, DC, 20052, USA
| | - Jaclyn Andricovich
- Department of Anatomy and Cell Biology, Cancer Epigenetics Laboratory, GWU, Washington, DC, 20052, USA; GWU Cancer Center, GWU School of Medicine and Health Sciences, Washington, DC, 20052, USA
| | - Zhouhao Zeng
- Department of Physics, George Washington University (GWU), Washington, DC, 20052, USA
| | - Jun Zhu
- Systems Biology Center, National Heart Lung and Blood Institute, National Institutes of Health, Bethesda, MD, 20892, USA
| | - Alexandros Tzatsos
- Department of Anatomy and Cell Biology, Cancer Epigenetics Laboratory, GWU, Washington, DC, 20052, USA; GWU Cancer Center, GWU School of Medicine and Health Sciences, Washington, DC, 20052, USA.
| | - Weiqun Peng
- Department of Physics, George Washington University (GWU), Washington, DC, 20052, USA.
| |
|
40
|
Barreiro E, Munteanu CR, Cruz-Monteagudo M, Pazos A, González-Díaz H. Net-Net Auto Machine Learning (AutoML) Prediction of Complex Ecosystems. Sci Rep 2018; 8:12340. [PMID: 30120369 PMCID: PMC6098100 DOI: 10.1038/s41598-018-30637-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 05/09/2018] [Accepted: 07/24/2018] [Indexed: 11/09/2022] Open
Abstract
Biological Ecosystem Networks (BENs) are webs of biological species (nodes) establishing trophic relationships (links). Experimental confirmation of all possible links is difficult and generates a huge volume of information. Consequently, computational prediction becomes an important goal. Artificial Neural Networks (ANNs) are Machine Learning (ML) algorithms that may be used to predict BENs when trained on Shannon entropy information measures (Shk) of known ecosystems. However, it is difficult to select a priori which ANN topology will have the highest accuracy. Interestingly, Auto Machine Learning (AutoML) methods focus on the automatic selection of the most efficient ML algorithms for specific problems. In this work, a preliminary study of a new approach to AutoML selection of ANNs is proposed for the prediction of BENs. We call it the Net-Net AutoML approach, because it uses for the first time Shk values of both the BENs (networks to be predicted) and the ANN topologies (networks to be tested). Twelve types of classifiers have been tested for the Net-Net model, including linear, Bayesian, and tree-based methods, multilayer perceptrons, and deep neural networks. The best Net-Net AutoML model for 338,050 outputs of 10 ANN topologies for links of 69 BENs was obtained with a deep fully connected neural network, characterized by a test accuracy of 0.866 and a test AUROC of 0.935. This work paves the way for the application of Net-Net AutoML to other systems or ML algorithms.
Affiliation(s)
- Enrique Barreiro
- Department of Computation, Computer Science Faculty, University of A Coruña (UDC), 15071, A Coruña, Spain; Center for Computational Science (CCS), University of Miami (UM), Miami, 33136, FL, USA; West Coast University, Miami Campus, 33178, FL, USA
| | - Cristian R Munteanu
- Department of Computation, Computer Science Faculty, University of A Coruña (UDC), 15071, A Coruña, Spain
| | - Maykel Cruz-Monteagudo
- Center for Computational Science (CCS), University of Miami (UM), Miami, 33136, FL, USA; West Coast University, Miami Campus, 33178, FL, USA
| | - Alejandro Pazos
- Biomedical Research Institute of A Coruña (INIBIC), University Hospital Complex of A Coruña (CHUAC), A Coruña, 15006, Spain
| | - Humbert González-Díaz
- Faculty of Science and Technology, University of the Basque Country (UPV/EHU), 48940, Biscay, Spain; IKERBASQUE, Basque Foundation for Science, 48011, Bilbao, Biscay, Spain.
| |
|
41
|
Diament A, Tuller T. Modeling three-dimensional genomic organization in evolution and pathogenesis. Semin Cell Dev Biol 2018; 90:78-93. [PMID: 30030143 DOI: 10.1016/j.semcdb.2018.07.008] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.3] [Received: 03/28/2018] [Accepted: 07/08/2018] [Indexed: 12/17/2022]
Abstract
The regulation of gene expression is mediated via the complex three-dimensional (3D) conformation of the genetic material and its interactions with various intracellular factors. Various experimental and computational approaches have been developed in recent years for understanding the relation between the 3D conformation of the genome and the phenotypes of cells under normal conditions and in disease. In this review, we will discuss novel approaches for analyzing and modeling the 3D genomic conformation, focusing on deciphering disease-causing mutations that affect gene expression. We conclude that, as this is a very challenging mission, an important direction should involve the comparative analysis of various 3D models from various organisms or cells.
Affiliation(s)
- Alon Diament
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel
| | - Tamir Tuller
- Dept. of Biomedical Engineering, Tel Aviv University, Tel Aviv 6997801, Israel; The Sagol School of Neuroscience, Tel-Aviv University, Tel Aviv 6997801, Israel.
| |
|
42
|
Xu D, Ma R, Zhang J, Liu Z, Wu B, Peng J, Zhai Y, Gong Q, Shi Y, Wu J, Wu Q, Zhang Z, Ruan K. Dynamic Nature of CTCF Tandem 11 Zinc Fingers in Multivalent Recognition of DNA As Revealed by NMR Spectroscopy. J Phys Chem Lett 2018; 9:4020-4028. [PMID: 29965776 DOI: 10.1021/acs.jpclett.8b01440] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.4] [Indexed: 06/08/2023]
Abstract
The 11 zinc fingers (ZFs) of the transcription factor CTCF play a versatile role in the regulation of gene expression. CTCF binds to numerous genomic sites to form chromatin loops and topologically associated domains and thus mediates the 3D architecture of chromatin. Although CTCF inter-ZF plasticity is essential for the recognition of multiple genomic sites, the dynamic nature of its 11 ZFs remains unknown. We assigned the chemical shifts of the CTCF ZFs 1-11 and solved the solution structures of each ZF. NMR backbone dynamics, residual dipolar couplings, and small-angle X-ray scattering experiments suggest a high inter-ZF plasticity of the free-form ZFs 1-11. As exemplified by two different protocadherin DNA sequences, the titration of DNAs to 15N-labeled CTCF ZFs 1-11 enabled systematic mapping of the binding of CTCF ZFs to various chromatin sites. Our work paves the way for illustrating the molecular basis of the versatile DNA recognition by CTCF and has interesting implications for its conformational transition during DNA binding.
Affiliation(s)
- Difei Xu
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Rongsheng Ma
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Jiahai Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Zhijun Liu
- National Facility for Protein Science in Shanghai, ZhangJiang Lab, Shanghai Advanced Research Institute, Chinese Academy of Sciences, Shanghai 201210, P. R. China
| | - Bo Wu
- High Magnetic Field Laboratory, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei, Anhui 230031, P. R. China
| | - Junhui Peng
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Yanan Zhai
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, Institute of Systems Biomedicine, Collaborative Innovative Center of Systems Biomedicine, SCSB, State Key Laboratory of Oncogenes and Related Genes, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
| | - Qingguo Gong
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Yunyu Shi
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Jihui Wu
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Qiang Wu
- Center for Comparative Biomedicine, MOE Key Laboratory of Systems Biomedicine, Institute of Systems Biomedicine, Collaborative Innovative Center of Systems Biomedicine, SCSB, State Key Laboratory of Oncogenes and Related Genes, School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, Shanghai 200240, P. R. China
| | - Zhiyong Zhang
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| | - Ke Ruan
- Hefei National Laboratory for Physical Sciences at the Microscale, School of Life Sciences, University of Science and Technology of China, Hefei, Anhui 230027, P. R. China
| |
|
43
|
Li T, Jia L, Cao Y, Chen Q, Li C. OCEAN-C: mapping hubs of open chromatin interactions across the genome reveals gene regulatory networks. Genome Biol 2018; 19:54. [PMID: 29690904 PMCID: PMC5926533 DOI: 10.1186/s13059-018-1430-4] [Citation(s) in RCA: 35] [Impact Index Per Article: 5.0] [Received: 02/03/2018] [Accepted: 03/30/2018] [Indexed: 01/26/2023] Open
Abstract
We develop a method called open chromatin enrichment and network Hi-C (OCEAN-C) for antibody-independent mapping of global open chromatin interactions. By integrating FAIRE-seq and Hi-C, OCEAN-C detects open chromatin interactions enriched by active cis-regulatory elements. We identify more than 10,000 hubs of open chromatin interactions (HOCIs) in human cells, which are mainly active promoters and enhancers bound by many DNA-binding proteins and form interaction networks crucial for gene transcription. In addition to identifying large-scale topological structures, including topologically associated domains and A/B compartments, OCEAN-C can detect HOCI-mediated chromatin interactions that are strongly associated with gene expression, super-enhancers, and broad H3K4me3 domains.
Affiliation(s)
- Tingting Li
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, 100871, China; State Key Laboratory of Proteomics, National Center of Biomedical Analysis, Institute of Basic Medical Sciences, Beijing, 100850, China
| | - Lumeng Jia
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, 100871, China
| | - Yong Cao
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, 100871, China
| | - Qing Chen
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, 100871, China
| | - Cheng Li
- Peking-Tsinghua Center for Life Sciences, Academy for Advanced Interdisciplinary Studies; School of Life Sciences, Peking University, Beijing, 100871, China; Center for Statistical Science; Center for Bioinformatics, Peking University, Beijing, 100871, China.
| |
|
44
|
Ouimette JF, Rougeulle C, Veitia RA. Three-dimensional genome architecture in health and disease. Clin Genet 2018; 95:189-198. [PMID: 29377081 DOI: 10.1111/cge.13219] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 09/29/2017] [Revised: 01/15/2018] [Accepted: 01/23/2018] [Indexed: 11/29/2022]
Abstract
More than a decade of massive DNA sequencing efforts has generated a large body of genomic, transcriptomic and epigenomic information that has provided an increasingly detailed view of the functional elements and transactions within the human genome. Considerable efforts have also focused on linking these elements with one another by mapping their interactions and by establishing 3-dimensional (3D) genomic landscapes in various cell and tissue types. In parallel, multiple studies have associated genomic deletions, duplications and other rearrangements with human pathologies. In this review, we explore recent progress that has made it possible to connect disease-causing alterations with perturbations of the 3D genome organization.
Affiliation(s)
- J-F Ouimette
- Epigenetics and Cell Fate Center, UMR7216 CNRS, Université Paris Diderot, Paris, France; Université Paris Diderot, Paris, France
| | - C Rougeulle
- Epigenetics and Cell Fate Center, UMR7216 CNRS, Université Paris Diderot, Paris, France; Université Paris Diderot, Paris, France
| | - R A Veitia
- Université Paris Diderot, Paris, France; Institut Jacques Monod, Paris, France
| |
|
45
|
Zhang H, Li F, Jia Y, Xu B, Zhang Y, Li X, Zhang Z. Characteristic arrangement of nucleosomes is predictive of chromatin interactions at kilobase resolution. Nucleic Acids Res 2018; 45:12739-12751. [PMID: 29036650 PMCID: PMC5727446 DOI: 10.1093/nar/gkx885] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.9] [Received: 02/22/2017] [Accepted: 09/24/2017] [Indexed: 01/29/2023] Open
Abstract
High-throughput chromosome conformation capture (3C) technologies, such as Hi-C, have made it possible to survey 3D genome structure. However, obtaining 3D profiles at kilobase resolution at low cost remains a major challenge. We therefore present an algorithm for precise identification of chromatin interaction sites at kilobase resolution from MNase-seq data, termed chromatin interaction site detector (CISD), and a CISD-based chromatin loop predictor (CISD_loop) that predicts chromatin–chromatin interactions (CCIs) from low-resolution Hi-C data. We show that the predictions of CISD and CISD_loop overlap closely with chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) anchors and loops, respectively. The validity of CISD/CISD_loop was further supported by a 3C assay at about 5 kb resolution. Finally, we demonstrate that only modest amounts of MNase-seq and Hi-C data are sufficient to achieve ultrahigh-resolution CCI maps. Our results suggest that CCIs may result in characteristic nucleosome arrangement patterns flanking the interaction sites, and our algorithms may facilitate precise and systematic investigations of CCIs on a larger scale than has hitherto been possible.
Affiliation(s)
- Hui Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Feifei Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Yan Jia
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
| | - Bingxiang Xu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Yiqun Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Xiaoli Li
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| | - Zhihua Zhang
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China; University of Chinese Academy of Sciences, Beijing 100049, China
| |
Collapse
|
46
|
Kustatscher G, Grabowski P, Rappsilber J. Pervasive coexpression of spatially proximal genes is buffered at the protein level. Mol Syst Biol 2017; 13:937. [PMID: 28835372 PMCID: PMC5572396 DOI: 10.15252/msb.20177548] [Citation(s) in RCA: 61] [Impact Index Per Article: 7.6] [Indexed: 01/06/2023] Open
Abstract
Genes are not randomly distributed in the genome. In humans, 10% of protein-coding genes are transcribed from bidirectional promoters and many more are organised in larger clusters. Intriguingly, neighbouring genes are frequently coexpressed but rarely functionally related. Here we show that coexpression of bidirectional gene pairs, and closeby genes in general, is buffered at the protein level. Taking into account the 3D architecture of the genome, we find that co-regulation of spatially close, functionally unrelated genes is pervasive at the transcriptome level, but does not extend to the proteome. We present evidence that non-functional mRNA coexpression in human cells arises from stochastic chromatin fluctuations and direct regulatory interference between spatially close genes. Protein-level buffering likely reflects a lack of coordination of post-transcriptional regulation of functionally unrelated genes. Grouping human genes together along the genome sequence, or through long-range chromosome folding, is associated with reduced expression noise. Our results support the hypothesis that the selection for noise reduction is a major driver of the evolution of genome organisation.
Affiliation(s)
- Georg Kustatscher: Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK
- Piotr Grabowski: Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany
- Juri Rappsilber: Wellcome Trust Centre for Cell Biology, University of Edinburgh, Edinburgh, UK; Chair of Bioanalytics, Institute of Biotechnology, Technische Universität Berlin, Berlin, Germany
|
47
|
Connected Gene Communities Underlie Transcriptional Changes in Cornelia de Lange Syndrome. Genetics 2017; 207:139-151. [PMID: 28679547 DOI: 10.1534/genetics.117.202291] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.0] [Received: 03/24/2017] [Accepted: 06/28/2017] [Indexed: 12/25/2022] Open
Abstract
Cornelia de Lange syndrome (CdLS) is a complex multisystem developmental disorder caused by mutations in cohesin subunits and regulators. While its precise molecular mechanisms are not well defined, they point toward a global deregulation of the transcriptional gene expression program. Cohesin is associated with the boundaries of chromosome domains and with enhancer and promoter regions connecting the three-dimensional genome organization with transcriptional regulation. Here, we show that connected gene communities, structures emerging from the interactions of noncoding regulatory elements and genes in the three-dimensional chromosomal space, provide a molecular explanation for the pathoetiology of CdLS associated with mutations in the cohesin-loading factor NIPBL and the cohesin subunit SMC1A. NIPBL and cohesin are important constituents of connected gene communities that are centrally positioned at noncoding regulatory elements. Accordingly, genes deregulated in CdLS are positioned within reach of NIPBL- and cohesin-occupied regions through promoter-promoter interactions. Our findings suggest a dynamic model where NIPBL loads cohesin to connect genes in communities, offering an explanation for the gene expression deregulation in CdLS.
|
48
|
Du M, Bai L. 3D clustering of co-regulated genes and its effect on gene expression. Curr Genet 2017; 63:1017-1021. [PMID: 28551816 DOI: 10.1007/s00294-017-0712-9] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Received: 05/15/2017] [Revised: 05/22/2017] [Accepted: 05/24/2017] [Indexed: 01/29/2023]
Abstract
There are extensive long-distance chromosomal interactions in eukaryotic genomes, but to what extent these interactions affect gene expression is not clear. Recent works have identified several cases where clustering of co-regulated genes leads to enhanced gene expression in budding yeast. A similar phenomenon was also observed in mammalian cells. These results challenge widely held views of gene regulation in yeast and further our understanding of how the 3D organization of the genome contributes to gene regulation in eukaryotes.
Affiliation(s)
- Manyu Du: Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, State College, PA, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, State College, PA, USA
- Lu Bai: Department of Biochemistry and Molecular Biology, The Pennsylvania State University, University Park, State College, PA, USA; Center for Eukaryotic Gene Regulation, The Pennsylvania State University, University Park, State College, PA, USA; Department of Physics, The Pennsylvania State University, University Park, State College, PA, USA
|
49
|
Boulos RE, Tremblay N, Arneodo A, Borgnat P, Audit B. Multi-scale structural community organisation of the human genome. BMC Bioinformatics 2017; 18:209. [PMID: 28399820 PMCID: PMC5387268 DOI: 10.1186/s12859-017-1616-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 1.3] [Received: 08/23/2016] [Accepted: 03/28/2017] [Indexed: 12/11/2022] Open
Abstract
BACKGROUND: Structural interaction frequency matrices between all genome loci are now experimentally achievable thanks to high-throughput chromosome conformation capture technologies. This raises a new methodological challenge for computational biology: objectively extracting from these data the structural motifs characteristic of genome organisation. RESULTS: We deployed the fast multi-scale community mining algorithm based on spectral graph wavelets to characterise the networks of intra-chromosomal interactions in human cell lines. We observed that there exist structural domains of all sizes up to chromosome length and demonstrated that the set of structural communities forms a hierarchy of chromosome segments. Hence, at all scales, chromosome folding predominantly involves interactions between neighbouring sites rather than the formation of links between distant loci. CONCLUSIONS: Multi-scale structural decomposition of human chromosomes provides an original framework to question structural organisation and its relationship to functional regulation across the scales. By construction, the proposed methodology is independent of the precise assembly of the reference genome and is thus directly applicable to genomes whose assembly is not fully determined.
Affiliation(s)
- Rasha E Boulos: Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France; Present address: Montpellier Cancer Institute (ICM), Montpellier Cancer Research Institute (IRCM) Inserm U1194, University of Montpellier, Montpellier, France
- Nicolas Tremblay: Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France; Present address: CNRS, GIPSA-lab, Grenoble, France
- Alain Arneodo: Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France; Present address: LOMA, Université de Bordeaux, CNRS, UMR 5798, 51 Cours de la Libération, Talence, 33405, France
- Pierre Borgnat: Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France
- Benjamin Audit: Univ Lyon, Ens de Lyon, Univ Claude Bernard Lyon 1, CNRS, Laboratoire de Physique, F-69342, Lyon, France
|
50
|
Cai L, Chang H, Fang Y, Li G. A Comprehensive Characterization of the Function of LincRNAs in Transcriptional Regulation Through Long-Range Chromatin Interactions. Sci Rep 2016; 6:36572. [PMID: 27824113 PMCID: PMC5099911 DOI: 10.1038/srep36572] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 05/27/2016] [Accepted: 10/18/2016] [Indexed: 11/13/2022] Open
Abstract
LincRNAs are emerging as important regulators with various cellular functions. However, the mechanisms behind their role in transcriptional regulation have not yet been fully explored. In this report, we proposed to characterize the diverse functions of lincRNAs in transcription regulation through an examination of their long-range chromatin interactions. We found that the promoter regions of lincRNAs displayed two distinct patterns of chromatin states, promoter-like and enhancer-like, indicating different regulatory functions for lincRNAs. Notably, the chromatin interactions between lincRNA genes and other genes suggested a potential mechanism for lincRNAs in the regulation of other genes at the RNA level because the transcribed lincRNAs could function at local spaces on other genes that interact with the lincRNAs at the DNA level. These results represent a novel way to predict the functions of lincRNAs. The GWAS-identification of SNPs within the lincRNAs revealed that some lincRNAs were disease-associated, and the chromatin interactions with those lincRNAs suggested that they were potential target genes of these lincRNA-associated SNPs. Our study provides new insights into the roles that lincRNAs play in transcription regulation.
Affiliation(s)
- Liuyang Cai: National Key Laboratory of Crop Genetic Improvement, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Huidan Chang: National Key Laboratory of Crop Genetic Improvement, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Yaping Fang: National Key Laboratory of Crop Genetic Improvement, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, China
- Guoliang Li: National Key Laboratory of Crop Genetic Improvement, Agricultural Bioinformatics Key Laboratory of Hubei Province, College of Informatics, Huazhong Agricultural University, Wuhan, Hubei 430070, China
|