1
Gjoni K, Gunsalus LM, Kuang S, McArthur E, Pittman M, Capra JA, Pollard KS. Comparing chromatin contact maps at scale: methods and insights. Nat Methods 2025; 22:824-833. [PMID: 40108448 PMCID: PMC11978506 DOI: 10.1038/s41592-025-02630-5] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/20/2023] [Accepted: 02/14/2025] [Indexed: 03/22/2025]
Abstract
Comparing chromatin contact maps is an essential step in quantifying how three-dimensional (3D) genome organization shapes development, evolution, and disease. However, methods often disagree, and no gold standard exists for comparing pairs of maps. Here, we evaluate 25 ways to compare contact maps using Micro-C and Hi-C data from two cell types and in silico-generated contact maps. We identify similarities and differences between the methods and quantify their robustness to common sources of biological and technical variation, including losses and gains of CTCF-binding sites, changes in contact intensity or patterns, and noise. We find that global comparison methods, such as mean squared error, are suitable for initial screening; however, biologically informed methods are necessary for identifying how maps diverge and for proposing specific functional hypotheses. We provide a reference guide, codebase, and thorough evaluation for rapidly comparing chromatin contact maps at scale to enable biological insights into 3D genome organization.
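As a rough illustration of the kind of global comparison the abstract mentions, the sketch below computes the mean squared error between two contact maps stored as equally sized matrices. The function name `contact_map_mse` and the convention that unmappable bins are stored as NaN are assumptions made for this example; they are not taken from the authors' codebase.

```python
import numpy as np


def contact_map_mse(map_a, map_b):
    """Mean squared error between two contact maps of identical shape.

    Assumption for this sketch: bins that could not be measured are
    stored as NaN and are skipped in both maps.
    """
    a = np.asarray(map_a, dtype=float)
    b = np.asarray(map_b, dtype=float)
    if a.shape != b.shape:
        raise ValueError("contact maps must have the same shape")
    # Keep only bins that are defined in both maps.
    valid = ~(np.isnan(a) | np.isnan(b))
    if not valid.any():
        raise ValueError("no overlapping valid bins to compare")
    return float(np.mean((a[valid] - b[valid]) ** 2))


# Toy 2x2 maps that differ in a single bin.
reference = np.array([[1.0, 2.0], [2.0, 1.0]])
perturbed = np.array([[1.0, 2.0], [2.0, 3.0]])
score = contact_map_mse(reference, perturbed)
```

A single number like this can rank many map pairs quickly, but it does not say which feature changed, which is consistent with the abstract's point that biologically informed methods are needed to explain how maps diverge.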
Affiliation(s)
- Ketrin Gjoni
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
- Laura M Gunsalus
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
- Shuzhen Kuang
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
- Evonne McArthur
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
  - Vanderbilt Genetics Institute, Vanderbilt University Medical Center, Nashville, TN, USA
  - Department of Medicine, University of Washington, Seattle, WA, USA
- Maureen Pittman
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
- John A Capra
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
- Katherine S Pollard
  - Gladstone Institute of Data Science and Biotechnology, San Francisco, CA, USA
  - Department of Epidemiology & Biostatistics, University of California, San Francisco, CA, USA
  - Bakar Computational Health Sciences Institute, University of California, San Francisco, CA, USA
  - Chan Zuckerberg Biohub, San Francisco, CA, USA
2
Tan M, Sun S, Liu Y, Perreault AA, Phanstiel DH, Dou L, Pang B. Targeting the 3D genome by anthracyclines for chemotherapeutic effects. bioRxiv 2024:2024.10.15.614434. [PMID: 39463926 PMCID: PMC11507702 DOI: 10.1101/2024.10.15.614434] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/29/2024]
Abstract
Chromatin is folded into three-dimensional (3D) structures inside cells, and this folding coordinates the regulation of gene transcription by non-coding regulatory elements. Aberrant 3D chromatin folding has been shown in many diseases, such as acute myeloid leukemia (AML), and may contribute to tumorigenesis. Anthracycline topoisomerase II inhibitors can induce histone eviction and DNA damage. We performed genome-wide high-resolution mapping of the chemotherapeutic effects of various clinically used anthracycline drugs. ATAC-seq was used to profile the histone eviction effects of different anthracyclines. TOP2A ChIP-seq was used to profile the potential DNA damage regions. Integrated analyses show that different anthracyclines have distinct target selectivity on epigenomic regions, based on their respective ATAC-seq and ChIP-seq profiles. We identified an underlying molecular mechanism in which unique anthracycline variants selectively target chromatin looping anchors by disrupting CTCF binding, suggesting an additional potential therapeutic effect on the 3D genome. We further performed Hi-C experiments, and data from K562 cells treated with the selective anthracycline drugs indicate that the 3D chromatin organization is disrupted. Furthermore, AML patients receiving anthracycline drugs showed altered chromatin structures around potential looping anchors, which were linked to distinct clinical outcomes. Our data indicate that anthracyclines are potent and selective epigenomic targeting drugs and can target the 3D genome for anticancer therapy, which could be used for personalized medicine to treat tumors with aberrant 3D chromatin structures.
3
Puerto M, Shukla M, Bujosa P, Pérez-Roldán J, Torràs-Llort M, Tamirisa S, Carbonell A, Solé C, Puspo J, Cummings C, de Nadal E, Posas F, Azorín F, Rowley M. The zinc-finger protein Z4 cooperates with condensin II to regulate somatic chromosome pairing and 3D chromatin organization. Nucleic Acids Res 2024; 52:5596-5609. [PMID: 38520405 PMCID: PMC11162801 DOI: 10.1093/nar/gkae198] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/24/2023] [Revised: 02/16/2024] [Accepted: 03/07/2024] [Indexed: 03/25/2024]
Abstract
Chromosome pairing constitutes an important level of genome organization, yet the mechanisms that regulate pairing in somatic cells and the impact on 3D chromatin organization are still poorly understood. Here, we address these questions in Drosophila, an organism with robust somatic pairing. In Drosophila, pairing preferentially occurs at loci consisting of numerous architectural protein binding sites (APBSs), suggesting a role of architectural proteins (APs) in pairing regulation. Amongst these, the anti-pairing function of the condensin II subunit CAP-H2 is well established. However, the factors that regulate CAP-H2 localization and action at APBSs remain largely unknown. Here, we identify two factors that control CAP-H2 occupancy at APBSs and, therefore, regulate pairing. We show that Z4 interacts with CAP-H2 and is required for its localization at APBSs. We also show that hyperosmotic cellular stress induces fast and reversible unpairing in a Z4/CAP-H2-dependent manner. Moreover, by combining the opposite effects of Z4 depletion and osmostress, we show that pairing correlates with the strength of intrachromosomal 3D interactions, such as active (A) compartment interactions, intragenic gene-loops, and polycomb (Pc)-mediated chromatin loops. Altogether, our results reveal new players in CAP-H2-mediated pairing regulation and the intimate interplay between inter-chromosomal and intra-chromosomal 3D interactions.
Affiliation(s)
- Marta Puerto
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Mamta Shukla
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA
- Paula Bujosa
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Juan Pérez-Roldán
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Mònica Torràs-Llort
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Srividya Tamirisa
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Albert Carbonell
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- Carme Solé
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
  - Department of Medicine and Life Sciences (MELIS), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Joynob Akter Puspo
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA
- Eulàlia de Nadal
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
  - Department of Medicine and Life Sciences (MELIS), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Francesc Posas
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
  - Department of Medicine and Life Sciences (MELIS), Universitat Pompeu Fabra (UPF), Barcelona, Spain
- Fernando Azorín
  - Institute of Molecular Biology of Barcelona, IBMB, CSIC, Baldiri Reixac 4, 08028 Barcelona, Spain
  - Institute for Research in Biomedicine of Barcelona, IRB Barcelona. The Barcelona Institute of Science and Technology. Baldiri Reixac 10, 08028 Barcelona, Spain
- M Jordan Rowley
  - Department of Genetics, Cell Biology and Anatomy, University of Nebraska Medical Center, Omaha, NE, USA
4
Nakato R, Sakata T, Wang J, Nagai LAE, Nagaoka Y, Oba GM, Bando M, Shirahige K. Context-dependent perturbations in chromatin folding and the transcriptome by cohesin and related factors. Nat Commun 2023; 14:5647. [PMID: 37726281 PMCID: PMC10509244 DOI: 10.1038/s41467-023-41316-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/17/2023] [Accepted: 08/29/2023] [Indexed: 09/21/2023]
Abstract
Cohesin regulates gene expression through context-specific chromatin folding mechanisms such as enhancer-promoter looping and topologically associating domain (TAD) formation by cooperating with factors such as cohesin loaders and the insulation factor CTCF. We developed a computational workflow to explore how three-dimensional (3D) structure and gene expression are regulated collectively or individually by cohesin and related factors. The main component is CustardPy, by which multi-omics datasets are compared systematically. To validate our methodology, we generated 3D genome, transcriptome, and epigenome data before and after depletion of cohesin and related factors and compared the effects of depletion. We observed diverse effects on the 3D genome and transcriptome, and gene expression changes were correlated with the splitting of TADs caused by cohesin loss. We also observed variations in long-range interactions across TADs, which correlated with their epigenomic states. These computational tools and datasets will be valuable for 3D genome and epigenome studies.
Affiliation(s)
- Ryuichiro Nakato
  - Laboratory of Computational Genomics, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
- Toyonori Sakata
  - Laboratory of Genome Structure and Function, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
  - Karolinska Institutet, Department of Biosciences and Nutrition, Biomedicum, Quarter A6, 171 77, Stockholm, Sweden
  - Karolinska Institutet, Department of Cell and Molecular Biology, Biomedicum, Quarter A6, 171 77, Stockholm, Sweden
- Jiankang Wang
  - School of Biomedical Sciences, Hunan University, Changsha, China
- Luis Augusto Eijy Nagai
  - Laboratory of Computational Genomics, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
- Yuya Nagaoka
  - Laboratory of Computational Genomics, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
- Gina Miku Oba
  - Laboratory of Computational Genomics, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
- Masashige Bando
  - Laboratory of Genome Structure and Function, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
- Katsuhiko Shirahige
  - Laboratory of Genome Structure and Function, Institute for Quantitative Biosciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-Ku, Tokyo, 113-0032, Japan
  - Karolinska Institutet, Department of Biosciences and Nutrition, Biomedicum, Quarter A6, 171 77, Stockholm, Sweden
  - Karolinska Institutet, Department of Cell and Molecular Biology, Biomedicum, Quarter A6, 171 77, Stockholm, Sweden
5
Li K, Zhang P, Wang Z, Shen W, Sun W, Xu J, Wen Z, Li L. iEnhance: a multi-scale spatial projection encoding network for enhancing chromatin interaction data resolution. Brief Bioinform 2023; 24:bbad245. [PMID: 37381618 DOI: 10.1093/bib/bbad245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 03/21/2023] [Revised: 06/06/2023] [Accepted: 06/12/2023] [Indexed: 06/30/2023]
Abstract
Although sequencing-based high-throughput chromatin interaction data are widely used to uncover genome-wide three-dimensional chromatin architecture, their sparseness and low signal-to-noise ratio greatly restrict the precision of the obtained structural elements. To improve data quality, we here present iEnhance (chromatin interaction data resolution enhancement), a multi-scale spatial projection and encoding network, to predict high-resolution chromatin interaction matrices from low-resolution and noisy input data. Specifically, iEnhance projects the input data into matrix spaces to extract multi-scale global and local feature sets, then hierarchically fuses these features with an attention mechanism. After that, dense channel encoding and residual channel decoding are used to effectively infer robust chromatin interaction maps. iEnhance outperforms state-of-the-art Hi-C resolution enhancement tools in both visual and quantitative evaluation. Comprehensive analysis shows that, unlike other tools, iEnhance can recover both short-range structural elements and long-range interaction patterns precisely. More importantly, iEnhance can be transferred to data enhancement of other tissues or cell lines of unknown resolution. Furthermore, iEnhance performs robustly in enhancement of diverse chromatin interaction data, including those from single-cell Hi-C and Micro-C experiments.
Affiliation(s)
- Kai Li
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Ping Zhang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Zilin Wang
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Wei Shen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Weicheng Sun
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Jinsheng Xu
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Zi Wen
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
- Li Li
  - Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
  - Hubei Hongshan Laboratory, Huazhong Agricultural University, Wuhan 430070, China
6
Wang J, Nakato R. Comprehensive multiomics analyses reveal pervasive involvement of aberrant cohesin binding in transcriptional and chromosomal disorder of cancer cells. iScience 2023; 26:106908. [PMID: 37283809 PMCID: PMC10239702 DOI: 10.1016/j.isci.2023.106908] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/15/2023] [Revised: 02/27/2023] [Accepted: 05/12/2023] [Indexed: 06/08/2023]
Abstract
Chromatin organization, whose malfunction causes various diseases including cancer, is fundamentally controlled by cohesin. While cancer cells have been found with mutated or misexpressed cohesin genes, there is no comprehensive survey about the presence and role of abnormal cohesin binding in cancer cells. Here, we systematically identified ∼1% of cohesin-binding sites (701-2,633) as cancer-aberrant binding sites of cohesin (CASs). We integrated CASs with large-scale transcriptomics, epigenomics, 3D genomics, and clinical information. CASs represent tissue-specific epigenomic signatures enriched for cancer-dysregulated genes with functional and clinical significance. CASs exhibited alterations in chromatin compartments, loops within topologically associated domains, and cis-regulatory elements, indicating that CASs induce dysregulated genes through misguided chromatin structure. Cohesin depletion data suggested that cohesin binding at CASs actively regulates cancer-dysregulated genes. Overall, our comprehensive investigation suggests that aberrant cohesin binding is an essential epigenomic signature responsible for dysregulated chromatin structure and transcription in cancer cells.
Affiliation(s)
- Jiankang Wang
  - School of Biomedical Sciences, Hunan University, Changsha, China
  - Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
- Ryuichiro Nakato
  - Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Japan
7
Wang J, Nakato R. CohesinDB: a comprehensive database for decoding cohesin-related epigenomes, 3D genomes and transcriptomes in human cells. Nucleic Acids Res 2022; 51:D70-D79. [PMID: 36162821 PMCID: PMC9825609 DOI: 10.1093/nar/gkac795] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/08/2022] [Revised: 08/29/2022] [Accepted: 09/03/2022] [Indexed: 01/29/2023]
Abstract
Cohesin is a multifunctional protein responsible for transcriptional regulation and chromatin organization. Cohesin binds to chromatin at tens of thousands of distinct sites in a conserved or tissue-specific manner, whereas the function of cohesin varies greatly depending on the epigenetic properties of specific chromatin loci. Cohesin also extensively mediates cis-regulatory modules (CRMs) and chromatin loops. Even though next-generation sequencing technologies have provided a wealth of information on different aspects of cohesin, the integration and exploration of the resultant massive cohesin datasets are not straightforward. Here, we present CohesinDB (https://cohesindb.iqb.u-tokyo.ac.jp), a comprehensive multiomics cohesin database in human cells. CohesinDB includes 2043 epigenomics, transcriptomics and 3D genomics datasets from 530 studies involving 176 cell types. By integrating these large-scale data, CohesinDB summarizes three types of 'cohesin objects': 751 590 cohesin binding sites, 957 868 cohesin-related chromatin loops and 2 229 500 cohesin-related CRMs. Each cohesin object is annotated with locus, cell type, classification, function, 3D genomics and cis-regulatory information. CohesinDB features a user-friendly interface for browsing, searching, analyzing, visualizing and downloading the desired information. CohesinDB contributes a valuable resource for all researchers studying cohesin, epigenomics, transcriptional regulation and chromatin organization.
Affiliation(s)
- Jiankang Wang
  - Institute for Quantitative Biosciences, The University of Tokyo, Bunkyo-ku, Tokyo, Yayoi 1-1-1, Japan
  - Graduate School of Medicine, The University of Tokyo, Bunkyo-ku, Tokyo, Hongo 7-3-1, Japan
- Ryuichiro Nakato
  - To whom correspondence should be addressed. Tel: +81 3 5841 1471; Fax: +81 3 5841 7308
8
Wang J, Bando M, Shirahige K, Nakato R. Large-scale multi-omics analysis suggests specific roles for intragenic cohesin in transcriptional regulation. Nat Commun 2022; 13:3218. [PMID: 35680859 PMCID: PMC9184728 DOI: 10.1038/s41467-022-30792-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/15/2021] [Accepted: 05/14/2022] [Indexed: 12/19/2022]
Abstract
Cohesin, an essential protein complex for chromosome segregation, regulates transcription through a variety of mechanisms. Assigning these diverse cohesin functions is not a trivial task. Moreover, the context-specific roles of cohesin-mediated interactions, especially in intragenic regions, have not been thoroughly investigated. Here we perform a comprehensive characterization of cohesin binding sites in several human cell types. We integrate epigenomic, transcriptomic and chromatin interaction data to explore the context-specific functions of intragenic cohesin related to gene activation. We identify a specific subset of cohesin binding sites, decreased intragenic cohesin sites (DICs), which are negatively correlated with transcriptional regulation. A subgroup of DICs is enriched with enhancer markers and RNA polymerase II, while the others are more correlated with chromatin architecture. DICs are observed in various cell types, including cells from patients with cohesinopathy. We also apply machine learning to our data and identify genomic features for isolating DICs from all cohesin sites. These results suggest a previously unidentified function of cohesin in intragenic regions for transcriptional regulation.
Affiliation(s)
- Jiankang Wang
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
  - Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
- Masashige Bando
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
- Katsuhiko Shirahige
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
  - Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
  - Department of Biosciences and Nutrition, Karolinska Institutet, Stockholm, Sweden
- Ryuichiro Nakato
  - Institute for Quantitative Biosciences, The University of Tokyo, Tokyo, Japan
  - Graduate School of Medicine, The University of Tokyo, Tokyo, Japan
9
Gong W, Wee J, Wu MC, Sun X, Li C, Xia K. Persistent spectral simplicial complex-based machine learning for chromosomal structural analysis in cellular differentiation. Brief Bioinform 2022; 23:6583209. [PMID: 35536545 DOI: 10.1093/bib/bbac168] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 02/21/2022] [Revised: 04/12/2022] [Accepted: 03/13/2022] [Indexed: 11/13/2022]
Abstract
The three-dimensional (3D) chromosomal structure plays an essential role in all DNA-templated processes, including gene transcription, DNA replication and other cellular processes. Although chromosome conformation capture (3C) methods such as Hi-C can generate chromosomal contact data that characterize genome-wide chromosomal structural properties, an understanding of the 3D genome based on Hi-C data remains lacking. Here, we propose a persistent spectral simplicial complex (PerSpectSC) model to describe Hi-C data for the first time. Specifically, a filtration process is introduced to generate a series of nested simplicial complexes at different scales. For each of these simplicial complexes, its spectral information can be calculated from the corresponding Hodge Laplacian matrix. The PerSpectSC model describes the persistence and variation of the spectral information of the nested simplicial complexes during the filtration process. Different from all previous models, our PerSpectSC-based features provide a quantitative global-scale characterization of chromosome structures and topology. Our descriptors can successfully classify cell types and also cellular differentiation stages for all the 24 types of chromosomes simultaneously. In particular, persistent minimum best characterizes cell types and Dim (1) persistent multiplicity best characterizes cellular differentiation. These results demonstrate the great potential of our PerSpectSC-based models in polymeric data analysis.
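To make the filtration idea concrete, the following is a minimal sketch restricted to dimension 0, where the Hodge Laplacian reduces to the ordinary graph Laplacian. It assumes the contact data have already been converted to a pairwise distance matrix, and it tracks two simple spectral summaries at each filtration threshold: the number of zero eigenvalues and the smallest nonzero eigenvalue. The function names, the distance input, and the choice of summaries are assumptions for illustration only and are not the authors' PerSpectSC implementation.

```python
import numpy as np


def graph_laplacian(dist, threshold):
    """Dimension-0 Hodge Laplacian (graph Laplacian) at one filtration value.

    Two loci are joined by an edge when their distance is <= threshold.
    """
    d = np.asarray(dist, dtype=float)
    adj = (d <= threshold).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops
    degree = np.diag(adj.sum(axis=1))
    return degree - adj


def spectral_summary(dist, thresholds, tol=1e-9):
    """For each threshold, return (threshold, zero multiplicity, min nonzero eigenvalue)."""
    rows = []
    for t in thresholds:
        eig = np.linalg.eigvalsh(graph_laplacian(dist, t))
        # Zero eigenvalues count connected components of the graph.
        zero_count = int(np.sum(eig < tol))
        nonzero = eig[eig >= tol]
        min_nonzero = float(nonzero.min()) if nonzero.size else 0.0
        rows.append((t, zero_count, min_nonzero))
    return rows


# Hypothetical toy input: four loci placed on a line at positions 0, 1, 2, 10.
positions = np.array([0.0, 1.0, 2.0, 10.0])
toy_dist = np.abs(positions[:, None] - positions[None, :])
summary = spectral_summary(toy_dist, thresholds=[0.5, 1.0, 10.0])
```

As the threshold grows, edges are only ever added, so the graphs are nested and the zero multiplicity can only decrease. Higher-dimensional Laplacians, which the abstract also refers to, would need boundary matrices for triangles and beyond and are outside this sketch.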
Affiliation(s)
- Weikang Gong
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- JunJie Wee
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Min-Chun Wu
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371
- Xiaohan Sun
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Chunhua Li
  - Faculty of Environmental and Life Sciences, Beijing University of Technology, Beijing, China 100124
- Kelin Xia
  - Division of Mathematical Sciences, School of Physical and Mathematical Sciences, Nanyang Technological University, Singapore 637371