1
Wang X, Yue F. Hijacked enhancer-promoter and silencer-promoter loops in cancer. Curr Opin Genet Dev 2024; 86:102199. [PMID: 38669773 DOI: 10.1016/j.gde.2024.102199] [Received: 12/29/2023] [Revised: 03/19/2024] [Accepted: 04/07/2024] [Indexed: 04/28/2024]
Abstract
Recent work has shown that besides inducing fusion genes, structural variations (SVs) can also contribute to oncogenesis by disrupting the three-dimensional genome organization and dysregulating gene expression. At the chromatin-loop level, SVs can relocate enhancers or silencers from their original genomic loci to activate oncogenes or repress tumor suppressor genes. On a larger scale, different types of alterations in topologically associating domains (TADs) have been reported in cancer, such as TAD expansion, shuffling, and SV-induced neo-TADs. Furthermore, the transformation from normal cells to cancerous cells is usually coupled with active or repressive compartmental switches, and cancer-specific compartments have been proposed. This review discusses the sites, and the other latest advances in studying how SVs disrupt higher-order genome structure in cancer, which in turn leads to oncogene dysregulation. We also highlight the clinical implications of these changes and the challenges ahead in this field.
Affiliation(s)
- Xiaotao Wang
- Obstetrics and Gynecology Hospital, Institute of Reproduction and Development, Fudan University, Shanghai, China; Shanghai Key Laboratory of Reproduction and Development, Shanghai, China.
- Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois, USA; Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, Illinois, USA.
2
Kumar Halder A, Agarwal A, Jodkowska K, Plewczynski D. A systematic analyses of different bioinformatics pipelines for genomic data and its impact on deep learning models for chromatin loop prediction. Brief Funct Genomics 2024:elae009. [PMID: 38555493 DOI: 10.1093/bfgp/elae009] [Received: 11/28/2023] [Revised: 02/07/2024] [Accepted: 03/04/2024] [Indexed: 04/02/2024]
Abstract
Genomic data analysis has witnessed a surge in complexity and volume, primarily driven by the advent of high-throughput technologies. In particular, studying chromatin loops and structures has become pivotal in understanding gene regulation and genome organization. This systematic investigation explores the realm of specialized bioinformatics pipelines designed specifically for the analysis of chromatin loops and structures. Our investigation incorporates two protein (CTCF and Cohesin) factor-specific loop interaction datasets from six distinct pipelines, amassing a comprehensive collection of 36 diverse datasets. Through a meticulous review of existing literature, we offer a holistic perspective on the methodologies, tools and algorithms underpinning the analysis of this multifaceted genomic feature. We illuminate the vast array of approaches deployed, encompassing pivotal aspects such as data preparation pipeline, preprocessing, statistical features and modelling techniques. Beyond this, we rigorously assess the strengths and limitations inherent in these bioinformatics pipelines, shedding light on the interplay between data quality and the performance of deep learning models, ultimately advancing our comprehension of genomic intricacies.
Affiliation(s)
- Anup Kumar Halder
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Abhishek Agarwal
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Karolina Jodkowska
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
- Dariusz Plewczynski
- Laboratory of Bioinformatics and Computational Genomics, Faculty of Mathematics and Information Science, Warsaw University of Technology, Koszykowa 75, 00-662 Warsaw, Poland
- Laboratory of Functional and Structural Genomics, Centre of New Technologies, University of Warsaw, Banacha 2c, 02-097 Warsaw, Poland
3
Chowdhury HMAM, Boult T, Oluwadare O. Comparative study on chromatin loop callers using Hi-C data reveals their effectiveness. BMC Bioinformatics 2024; 25:123. [PMID: 38515011 PMCID: PMC10958853 DOI: 10.1186/s12859-024-05713-w] [Received: 12/15/2023] [Accepted: 02/19/2024] [Indexed: 03/23/2024]
Abstract
BACKGROUND: Chromosome is one of the most fundamental part of cell biology where DNA holds the hierarchical information. DNA compacts its size by forming loops, and these regions house various protein particles, including CTCF, SMC3, H3 histone. Numerous sequencing methods, such as Hi-C, ChIP-seq, and Micro-C, have been developed to investigate these properties. Utilizing these data, scientists have developed a variety of loop prediction techniques that have greatly improved their methods for characterizing loop prediction and related aspects. RESULTS: In this study, we categorized 22 loop calling methods and conducted a comprehensive study of 11 of them. Additionally, we have provided detailed insights into the methodologies underlying these algorithms for loop detection, categorizing them into five distinct groups based on their fundamental approaches. Furthermore, we have included critical information such as resolution, input and output formats, and parameters. For this analysis, we utilized the GM12878 Hi-C datasets at 5 KB, 10 KB, 100 KB and 250 KB resolutions. Our evaluation criteria encompassed various factors, including memory usages, running time, sequencing depth, and recovery of protein-specific sites such as CTCF, H3K27ac, and RNAPII. CONCLUSION: This analysis offers insights into the loop detection processes of each method, along with the strengths and weaknesses of each, enabling readers to effectively choose suitable methods for their datasets. We evaluate the capabilities of these tools and introduce a novel Biological, Consistency, and Computational robustness score (BCC score) to measure their overall robustness ensuring a comprehensive evaluation of their performance.
Affiliation(s)
- H M A Mohit Chowdhury
- Department of Computer Science, University of Colorado at Colorado Springs, 1420 Austin Bluffs Pkwy, Colorado Springs, CO, 80918, USA
- Terrance Boult
- Department of Computer Science, University of Colorado at Colorado Springs, 1420 Austin Bluffs Pkwy, Colorado Springs, CO, 80918, USA
- Oluwatosin Oluwadare
- Department of Computer Science, University of Colorado at Colorado Springs, 1420 Austin Bluffs Pkwy, Colorado Springs, CO, 80918, USA.
- Department of Biomedical Informatics, University of Colorado Anschutz Medical Campus, Aurora, CO, USA.
4
Xu X, Khunsriraksakul C, Eales JM, Rubin S, Scannali D, Saluja S, Talavera D, Markus H, Wang L, Drzal M, Maan A, Lay AC, Prestes PR, Regan J, Diwadkar AR, Denniff M, Rempega G, Ryszawy J, Król R, Dormer JP, Szulinska M, Walczak M, Antczak A, Matías-García PR, Waldenberger M, Woolf AS, Keavney B, Zukowska-Szczechowska E, Wystrychowski W, Zywiec J, Bogdanski P, Danser AHJ, Samani NJ, Guzik TJ, Morris AP, Liu DJ, Charchar FJ, Tomaszewski M. Genetic imputation of kidney transcriptome, proteome and multi-omics illuminates new blood pressure and hypertension targets. Nat Commun 2024; 15:2359. [PMID: 38504097 PMCID: PMC10950894 DOI: 10.1038/s41467-024-46132-y] [Received: 07/26/2023] [Accepted: 02/14/2024] [Indexed: 03/21/2024]
Abstract
Genetic mechanisms of blood pressure (BP) regulation remain poorly defined. Using kidney-specific epigenomic annotations and 3D genome information we generated and validated gene expression prediction models for the purpose of transcriptome-wide association studies in 700 human kidneys. We identified 889 kidney genes associated with BP of which 399 were prioritised as contributors to BP regulation. Imputation of kidney proteome and microRNAome uncovered 97 renal proteins and 11 miRNAs associated with BP. Integration with plasma proteomics and metabolomics illuminated circulating levels of myo-inositol, 4-guanidinobutanoate and angiotensinogen as downstream effectors of several kidney BP genes (SLC5A11, AGMAT, AGT, respectively). We showed that genetically determined reduction in renal expression may mimic the effects of rare loss-of-function variants on kidney mRNA/protein and lead to an increase in BP (e.g., ENPEP). We demonstrated a strong correlation (r = 0.81) in expression of protein-coding genes between cells harvested from urine and the kidney highlighting a diagnostic potential of urinary cell transcriptomics. We uncovered adenylyl cyclase activators as a repurposing opportunity for hypertension and illustrated examples of BP-elevating effects of anticancer drugs (e.g. tubulin polymerisation inhibitors). Collectively, our studies provide new biological insights into genetic regulation of BP with potential to drive clinical translation in hypertension.
Affiliation(s)
- Xiaoguang Xu
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- James M Eales
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Sebastien Rubin
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- David Scannali
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Sushant Saluja
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- David Talavera
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Havell Markus
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Lida Wang
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Maciej Drzal
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Akhlaq Maan
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Abigail C Lay
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Priscilla R Prestes
- Health Innovation and Transformation Centre, Federation University Australia, Ballarat, Australia
- Jeniece Regan
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Avantika R Diwadkar
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Matthew Denniff
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
- Grzegorz Rempega
- Department of Urology, Medical University of Silesia, Katowice, Poland
- Jakub Ryszawy
- Department of Urology, Medical University of Silesia, Katowice, Poland
- Robert Król
- Department of General, Vascular and Transplant Surgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland
- John P Dormer
- Department of Cellular Pathology, University Hospitals of Leicester, Leicester, UK
- Monika Szulinska
- Department of Obesity, Metabolic Disorders Treatment and Clinical Dietetics, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
- Marta Walczak
- Department of Internal Diseases, Metabolic Disorders and Arterial Hypertension, Poznan University of Medical Sciences, Poznan, Poland
- Andrzej Antczak
- Department of Urology and Uro-oncology, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
- Pamela R Matías-García
- Institute of Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- Research Unit Molecular Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Munich, Germany
- Melanie Waldenberger
- Institute of Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- Research Unit Molecular Epidemiology, Helmholtz Center Munich, Neuherberg, Germany
- German Research Center for Cardiovascular Disease (DZHK), partner site Munich Heart Alliance, Munich, Germany
- Adrian S Woolf
- Division of Cell Matrix Biology and Regenerative Medicine, Faculty of Biology, Medicine and Health, University of Manchester, Manchester, UK
- Royal Manchester Children's Hospital and Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust, Manchester, UK
- Bernard Keavney
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust Manchester, Manchester Royal Infirmary, Manchester, UK
- Wojciech Wystrychowski
- Department of General, Vascular and Transplant Surgery, Faculty of Medical Sciences in Katowice, Medical University of Silesia, Katowice, Poland
- Joanna Zywiec
- Department of Internal Medicine, Diabetology and Nephrology, Zabrze, Medical University of Silesia, Katowice, Poland
- Pawel Bogdanski
- Department of Obesity, Metabolic Disorders Treatment and Clinical Dietetics, Karol Marcinkowski University of Medical Sciences, Poznan, Poland
- A H Jan Danser
- Department of Internal Medicine, Division of Pharmacology and Vascular Medicine, Erasmus Medical Centre, Rotterdam, The Netherlands
- Nilesh J Samani
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
- NIHR Leicester Biomedical Research Centre, Glenfield Hospital, Leicester, UK
- Tomasz J Guzik
- Department of Internal Medicine, Jagiellonian University Medical College, Kraków, Poland
- Centre for Cardiovascular Sciences, Queen's Medical Research Institute, University of Edinburgh, Edinburgh, UK
- Center for Medical Genomics OMICRON, Jagiellonian University Medical College, Kraków, Poland
- Andrew P Morris
- Centre for Genetics and Genomics Versus Arthritis, Centre for Musculoskeletal Research, Division of Musculoskeletal & Dermatological Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK
- Dajiang J Liu
- Department of Public Health Sciences, Penn State College of Medicine, Hershey, PA, USA
- Fadi J Charchar
- Health Innovation and Transformation Centre, Federation University Australia, Ballarat, Australia
- Department of Cardiovascular Sciences, University of Leicester, Leicester, UK
- Department of Physiology, University of Melbourne, Melbourne, Australia
- Maciej Tomaszewski
- Division of Cardiovascular Sciences, Faculty of Medicine, Biology and Health, University of Manchester, Manchester, UK.
- Manchester Academic Health Science Centre, Manchester University NHS Foundation Trust Manchester, Manchester Royal Infirmary, Manchester, UK.
5
Fleck K, Luria V, Garag N, Karger A, Hunter T, Marten D, Phu W, Nam KM, Sestan N, O’Donnell-Luria AH, Erceg J. Functional associations of evolutionarily recent human genes exhibit sensitivity to the 3D genome landscape and disease. bioRxiv 2024:2024.03.17.585403. [PMID: 38559085 PMCID: PMC10980080 DOI: 10.1101/2024.03.17.585403] [Indexed: 04/04/2024]
Abstract
Genome organization is intricately tied to regulating genes and associated cell fate decisions. In this study, we examine the positioning and functional significance of human genes, grouped by their evolutionary age, within the 3D organization of the genome. We reveal that genes of different evolutionary origin have distinct positioning relationships with both domains and loop anchors, and remarkably consistent relationships with boundaries across cell types. While the functional associations of each group of genes are primarily cell type-specific, such associations of conserved genes maintain greater stability across 3D genomic features and disease than recently evolved genes. Furthermore, the expression of these genes across various tissues follows an evolutionary progression, such that RNA levels increase from young genes to ancient genes. Thus, the distinct relationships of gene evolutionary age, function, and positioning within 3D genomic features contribute to tissue-specific gene regulation in development and disease.
Affiliation(s)
- Katherine Fleck
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Victor Luria
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Systems Biology, Harvard Medical School, Boston, MA 02115
- Nitanta Garag
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Amir Karger
- IT-Research Computing, Harvard Medical School, Boston, MA 02115
- Trevor Hunter
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Daniel Marten
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- William Phu
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Kee-Myoung Nam
- Department of Molecular, Cellular and Developmental Biology, Yale University, New Haven, CT 06510
- Nenad Sestan
- Department of Neuroscience, Yale School of Medicine, New Haven, CT 06510
- Anne H. O’Donnell-Luria
- Division of Genetics and Genomics, Boston Children’s Hospital, Boston, MA 02115
- Program in Medical and Population Genetics, Broad Institute of Harvard and MIT, Cambridge, MA 02142
- Department of Pediatrics, Harvard Medical School, Boston, MA 02115
- Jelena Erceg
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, CT 06269
- Institute for Systems Genomics, University of Connecticut, Storrs, CT 06269
- Department of Genetics and Genome Sciences, University of Connecticut Health Center, Farmington, CT 06030
6
Szczepanski A, Tsuboyama N, Lyu H, Wang P, Beytullahoglu O, Zhang T, Singer BD, Yue F, Zhao Z, Wang L. A SWI/SNF-dependent transcriptional regulation mediated by POU2AF2/C11orf53 at enhancer. Nat Commun 2024; 15:2067. [PMID: 38453939 PMCID: PMC10920751 DOI: 10.1038/s41467-024-46492-5] [Received: 06/15/2023] [Accepted: 02/13/2024] [Indexed: 03/09/2024]
Abstract
Recent studies have identified a previously uncharacterized protein C11orf53 (now named POU2AF2/OCA-T1), which functions as a robust co-activator of POU2F3, the master transcription factor which is critical for both normal and neoplastic tuft cell identity and viability. Here, we demonstrate that POU2AF2 dictates opposing transcriptional regulation at distal enhancer elements. Loss of POU2AF2 leads to an inhibition of active enhancer nearby genes, such as tuft cell identity genes, and a derepression of Polycomb-dependent poised enhancer nearby genes, which are critical for cell viability and differentiation. Mechanistically, depletion of POU2AF2 results in a global redistribution of the chromatin occupancy of the SWI/SNF complex, leading to a significant 3D genome structure change and a subsequent transcriptional reprogramming. Our genome-wide CRISPR screen further demonstrates that POU2AF2 depletion or SWI/SNF inhibition leads to a PTEN-dependent cell growth defect, highlighting a potential role of POU2AF2-SWI/SNF axis in small cell lung cancer (SCLC) pathogenesis. Additionally, pharmacological inhibition of SWI/SNF phenocopies POU2AF2 depletion in terms of gene expression alteration and cell viability decrease in SCLC-P subtype cells. Therefore, impeding POU2AF2-mediated transcriptional regulation represents a potential therapeutic approach for human SCLC therapy.
Affiliation(s)
- Aileen Szczepanski
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Natsumi Tsuboyama
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Huijue Lyu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Ping Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Oguzhan Beytullahoglu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Te Zhang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Benjamin David Singer
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Division of Pulmonary and Critical Care Medicine, Department of Medicine, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA
- Robert H. Lurie Comprehensive Cancer Center, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Zibo Zhao
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
- Lu Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
- Simpson Querrey Center for Epigenetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, 60611, USA.
7
Wall BPG, Nguyen M, Harrell JC, Dozmorov MG. Machine and deep learning methods for predicting 3D genome organization. ArXiv 2024:arXiv:2403.03231v1. [PMID: 38495565 PMCID: PMC10942493] [Indexed: 03/19/2024]
Abstract
Three-Dimensional (3D) chromatin interactions, such as enhancer-promoter interactions (EPIs), loops, Topologically Associating Domains (TADs), and A/B compartments play critical roles in a wide range of cellular processes by regulating gene expression. Recent development of chromatin conformation capture technologies has enabled genome-wide profiling of various 3D structures, even with single cells. However, current catalogs of 3D structures remain incomplete and unreliable due to differences in technology, tools, and low data resolution. Machine learning methods have emerged as an alternative to obtain missing 3D interactions and/or improve resolution. Such methods frequently use genome annotation data (ChIP-seq, DNAse-seq, etc.), DNA sequencing information (k-mers, Transcription Factor Binding Site (TFBS) motifs), and other genomic properties to learn the associations between genomic features and chromatin interactions. In this review, we discuss computational tools for predicting three types of 3D interactions (EPIs, chromatin interactions, TAD boundaries) and analyze their pros and cons. We also point out obstacles of computational prediction of 3D interactions and suggest future research directions.
Affiliation(s)
- Brydon P. G. Wall
- Center for Biological Data Science, Virginia Commonwealth University, Richmond, VA, 23284, USA
- My Nguyen
- Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA
- J. Chuck Harrell
- Department of Pathology, Virginia Commonwealth University, Richmond, VA, 23284, USA
- Massey Comprehensive Cancer Center, Virginia Commonwealth University, Richmond, VA 23298, USA
- Center for Pharmaceutical Engineering, Virginia Commonwealth University, Richmond, VA 23298, USA
- Mikhail G. Dozmorov
- Department of Biostatistics, Virginia Commonwealth University, Richmond, VA, 23298, USA
- Department of Pathology, Virginia Commonwealth University, Richmond, VA, 23284, USA
8
Zhang Y, Boninsegna L, Yang M, Misteli T, Alber F, Ma J. Computational methods for analysing multiscale 3D genome organization. Nat Rev Genet 2024; 25:123-141. [PMID: 37673975 DOI: 10.1038/s41576-023-00638-1] [Accepted: 07/12/2023] [Indexed: 09/08/2023]
Abstract
Recent progress in whole-genome mapping and imaging technologies has enabled the characterization of the spatial organization and folding of the genome in the nucleus. In parallel, advanced computational methods have been developed to leverage these mapping data to reveal multiscale three-dimensional (3D) genome features and to provide a more complete view of genome structure and its connections to genome functions such as transcription. Here, we discuss how recently developed computational tools, including machine-learning-based methods and integrative structure-modelling frameworks, have led to a systematic, multiscale delineation of the connections among different scales of 3D genome organization, genomic and epigenomic features, functional nuclear components and genome function. However, approaches that more comprehensively integrate a wide variety of genomic and imaging datasets are still needed to uncover the functional role of 3D genome structure in defining cellular phenotypes in health and disease.
Affiliation(s)
- Yang Zhang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Lorenzo Boninsegna
- Department of Microbiology, Immunology and Molecular Genetics and Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, CA, USA
- Muyu Yang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA
- Tom Misteli
- Center for Cancer Research, National Cancer Institute, Bethesda, MD, USA.
- Frank Alber
- Department of Microbiology, Immunology and Molecular Genetics and Institute for Quantitative and Computational Biosciences, University of California Los Angeles, Los Angeles, CA, USA.
- Jian Ma
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA, USA.
9
Mathur R, Wang Q, Schupp PG, Nikolic A, Hilz S, Hong C, Grishanina NR, Kwok D, Stevers NO, Jin Q, Youngblood MW, Stasiak LA, Hou Y, Wang J, Yamaguchi TN, Lafontaine M, Shai A, Smirnov IV, Solomon DA, Chang SM, Hervey-Jumper SL, Berger MS, Lupo JM, Okada H, Phillips JJ, Boutros PC, Gallo M, Oldham MC, Yue F, Costello JF. Glioblastoma evolution and heterogeneity from a 3D whole-tumor perspective. Cell 2024; 187:446-463.e16. [PMID: 38242087 PMCID: PMC10832360 DOI: 10.1016/j.cell.2023.12.013] [Received: 04/06/2023] [Revised: 10/03/2023] [Accepted: 12/06/2023] [Indexed: 01/21/2024]
Abstract
Treatment failure for the lethal brain tumor glioblastoma (GBM) is attributed to intratumoral heterogeneity and tumor evolution. We utilized 3D neuronavigation during surgical resection to acquire samples representing the whole tumor mapped by 3D spatial coordinates. Integrative tissue and single-cell analysis revealed sources of genomic, epigenomic, and microenvironmental intratumoral heterogeneity and their spatial patterning. By distinguishing tumor-wide molecular features from those with regional specificity, we inferred GBM evolutionary trajectories from neurodevelopmental lineage origins and initiating events such as chromothripsis to emergence of genetic subclones and spatially restricted activation of differential tumor and microenvironmental programs in the core, periphery, and contrast-enhancing regions. Our work depicts GBM evolution and heterogeneity from a 3D whole-tumor perspective, highlights potential therapeutic targets that might circumvent heterogeneity-related failures, and establishes an interactive platform enabling 360° visualization and analysis of 3D spatial patterns for user-selected genes, programs, and other features across whole GBM tumors.
Affiliation(s)
- Radhika Mathur
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Qixuan Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Patrick G Schupp
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Ana Nikolic
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary, AB
| | - Stephanie Hilz
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Chibo Hong
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Nadia R Grishanina
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Darwin Kwok
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Nicholas O Stevers
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Qiushi Jin
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Mark W Youngblood
- Department of Neurological Surgery, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Lena Ann Stasiak
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Ye Hou
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Juan Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Takafumi N Yamaguchi
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
| | - Marisa Lafontaine
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Anny Shai
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Ivan V Smirnov
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - David A Solomon
- Department of Pathology, University of California San Francisco, San Francisco, CA, USA
| | - Susan M Chang
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Shawn L Hervey-Jumper
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Mitchel S Berger
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Janine M Lupo
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Hideho Okada
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Joanna J Phillips
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Paul C Boutros
- Department of Human Genetics, University of California, Los Angeles, Los Angeles, CA, USA
| | - Marco Gallo
- Department of Biochemistry & Molecular Biology, University of Calgary, Calgary, AB; Department of Pediatrics, Baylor College of Medicine, Texas Children's Hospital, Houston, TX, USA
| | - Michael C Oldham
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA; Robert H. Lurie Comprehensive Cancer Center, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
| | - Joseph F Costello
- Department of Neurological Surgery, University of California San Francisco, San Francisco, CA, USA.
| |
|
10
|
Dehkordi SR, Wong ITL, Ni J, Luebeck J, Zhu K, Prasad G, Krockenberger L, Xu G, Chowdhury B, Rajkumar U, Caplin A, Muliaditan D, Coruh C, Jin Q, Turner K, Teo SX, Pang AWC, Alexandrov LB, Chua CEL, Furnari FB, Paulson TG, Law JA, Chang HY, Yue F, DasGupta R, Zhao J, Mischel PS, Bafna V. Breakage fusion bridge cycles drive high oncogene copy number, but not intratumoral genetic heterogeneity or rapid cancer genome change. bioRxiv 2023:2023.12.12.571349. [PMID: 38168210 PMCID: PMC10760206 DOI: 10.1101/2023.12.12.571349] [Indexed: 01/05/2024]
Abstract
Oncogene amplification is a major driver of cancer pathogenesis. Breakage fusion bridge (BFB) cycles, like extrachromosomal DNA (ecDNA), can lead to high copy numbers of oncogenes, but their impact on intratumoral heterogeneity, treatment response, and patient survival is not well understood due to difficulty in detecting them by DNA sequencing. We describe a novel algorithm, called OM2BFB, that detects and reconstructs BFB amplifications using optical genome maps (OGMs). OM2BFB showed high precision (>93%) and recall (92%) in detecting BFB amplifications in cancer cell lines, PDX models and primary tumors. OM-based comparisons demonstrated that short-read BFB detection using our AmpliconSuite (AS) toolkit also achieved high precision, albeit with reduced sensitivity. We detected 371 BFB events using whole genome sequences from 2,557 primary tumors and cancer lines. BFB amplifications were preferentially found in cervical, head and neck, lung, and esophageal cancers, but rarely in brain cancers. BFB-amplified genes show lower variance of gene expression, with fewer options for regulatory rewiring relative to ecDNA-amplified genes. BFB-positive (BFB(+)) tumors showed reduced heterogeneity of amplicon structures, and delayed onset of resistance, relative to ecDNA(+) tumors. EcDNA and BFB amplifications represent contrasting mechanisms to increase the copy numbers of oncogenes, with markedly different characteristics that suggest different routes for intervention.
Affiliation(s)
- Siavash Raeisi Dehkordi
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Ivy Tsz-Lo Wong
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
| | - Jing Ni
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215 USA
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115 USA
| | - Jens Luebeck
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
- Bioinformatics and Systems Biology Graduate Program, University of California San Diego, San Diego, CA, USA
| | - Kaiyuan Zhu
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Gino Prasad
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Lena Krockenberger
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Guanghui Xu
- Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
| | - Biswanath Chowdhury
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Utkrisht Rajkumar
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Ann Caplin
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
| | - Daniel Muliaditan
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
| | - Ceyda Coruh
- Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- ClearNote Health, San Diego, CA 92121 USA
| | - Qiushi Jin
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | | | - Shu Xian Teo
- Singapore Nuclear Research and Safety Initiative, National University of Singapore
| | | | - Ludmil B Alexandrov
- Moores Cancer Center, UC San Diego Health, La Jolla, CA, USA
- Department of Cellular and Molecular Medicine, University of California at San Diego, La Jolla, CA, USA
- Department of Bioengineering, University of California at San Diego, La Jolla, CA, USA
| | | | - Frank B Furnari
- Department of Medicine, University of California at San Diego, La Jolla, CA, USA
| | - Thomas G Paulson
- Translational Science and Therapeutics Division, Fred Hutchinson Cancer Center, Seattle, WA, USA
| | - Julie A Law
- Plant Molecular and Cellular Biology Laboratory, Salk Institute for Biological Studies, La Jolla, CA, 92037, USA
- Division of Biological Sciences, University of California, San Diego, La Jolla, CA, 92093, USA
| | - Howard Y Chang
- Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA
- Howard Hughes Medical Institute, Stanford University School of Medicine, Stanford, CA, USA
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, IL, USA
| | - Ramanuj DasGupta
- Laboratory of Precision Oncology and Cancer Evolution, Genome Institute of Singapore, Agency for Science, Technology and Research, Singapore, Singapore
| | - Jean Zhao
- Department of Cancer Biology, Dana-Farber Cancer Institute, Boston, MA 02215 USA
- Department of Biological Chemistry and Molecular Pharmacology, Harvard Medical School, Boston, MA 02115 USA
| | - Paul S Mischel
- Department of Pathology, Stanford University School of Medicine, Stanford, CA, USA
- Sarafan ChEM-H, Stanford University, Stanford, CA, USA
| | - Vineet Bafna
- Department of Computer Science and Engineering, University of California San Diego, San Diego, CA, USA
- Halıcıoğlu Data Science Institute, University of California at San Diego, La Jolla, CA, USA
| |
|
11
|
Wu H, Zhou B, Zhou H, Zhang P, Wang M. Be-1DCNN: a neural network model for chromatin loop prediction based on bagging ensemble learning. Brief Funct Genomics 2023; 22:475-484. [PMID: 37133976 DOI: 10.1093/bfgp/elad015] [Received: 01/16/2023] [Revised: 03/10/2023] [Accepted: 03/29/2023] [Indexed: 05/04/2023]
Abstract
The chromatin loops in the three-dimensional (3D) structure of chromosomes are essential for the regulation of gene expression. Despite the fact that high-throughput chromatin capture techniques can identify the 3D structure of chromosomes, chromatin loop detection utilizing biological experiments is arduous and time-consuming. Therefore, a computational method is required to detect chromatin loops. Deep neural networks can form complex representations of Hi-C data and provide the possibility of processing biological datasets. Therefore, we propose a bagging ensemble one-dimensional convolutional neural network (Be-1DCNN) to detect chromatin loops from genome-wide Hi-C maps. First, to obtain accurate and reliable chromatin loops in genome-wide contact maps, the bagging ensemble learning method is utilized to synthesize the prediction results of multiple 1DCNN models. Second, each 1DCNN model consists of three 1D convolutional layers for extracting high-dimensional features from input samples and one dense layer for producing the prediction results. Finally, the prediction results of Be-1DCNN are compared to those of the existing models. The experimental results indicate that Be-1DCNN predicts high-quality chromatin loops and outperforms the state-of-the-art methods using the same evaluation metrics. The source code of Be-1DCNN is available for free at https://github.com/HaoWuLab-Bioinformatics/Be1DCNN.
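The bagging step described in this abstract can be illustrated with a minimal sketch. It is not the Be-1DCNN implementation: a trivial threshold classifier stands in for each 1D CNN, majority voting is assumed as the way the per-model predictions are combined, and every name below (`fit_bagging`, `bagging_predict`, `TRAIN`) is hypothetical.

```python
import random
from collections import Counter

# Toy training set of (score, label) pairs; label 1 = loop, 0 = non-loop.
# Purely illustrative values, not data from the paper.
TRAIN = [(0.1, 0), (0.2, 0), (0.3, 0), (0.7, 1), (0.8, 1), (0.9, 1)]


def bootstrap_sample(data, rng):
    """Draw len(data) items with replacement (one bagging replicate)."""
    return [rng.choice(data) for _ in data]


def train_threshold_model(samples):
    """Hypothetical stand-in for one 1D CNN: learn a single score threshold.

    The threshold is the midpoint between the two class means. If a
    replicate happens to contain only one class, fall back to the mean score.
    """
    pos = [s for s, y in samples if y == 1]
    neg = [s for s, y in samples if y == 0]
    if not pos or not neg:
        return sum(s for s, _ in samples) / len(samples)
    return (sum(pos) / len(pos) + sum(neg) / len(neg)) / 2


def fit_bagging(data, n_models=5, seed=0):
    """Train n_models base models, each on its own bootstrap replicate."""
    rng = random.Random(seed)
    return [train_threshold_model(bootstrap_sample(data, rng))
            for _ in range(n_models)]


def bagging_predict(thresholds, score):
    """Combine the base models' 0/1 predictions by majority vote.

    Use an odd number of models so the vote cannot tie.
    """
    votes = Counter(int(score > t) for t in thresholds)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    models = fit_bagging(TRAIN, n_models=5, seed=0)
    print(bagging_predict(models, 0.95), bagging_predict(models, 0.05))
```

Training each base model on a different bootstrap replicate is what makes the ensemble less sensitive to any single resampling of the data; a real model would replace the threshold rule with a trained network and might average probabilities instead of voting.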
Affiliation(s)
- Hao Wu
- College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
- School of Software, Shandong University, Jinan, 250101 Shandong, China
| | - Bing Zhou
- College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
| | - Haoru Zhou
- College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
| | - Pengyu Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
| | - Meili Wang
- College of Information Engineering, Northwest A&F University, Yangling, 712100 Shaanxi, China
| |
|
12
|
Wang F, Alinejad‐Rokny H, Lin J, Gao T, Chen X, Zheng Z, Meng L, Li X, Wong K. A Lightweight Framework For Chromatin Loop Detection at the Single-Cell Level. Adv Sci (Weinh) 2023; 10:e2303502. [PMID: 37816141 PMCID: PMC10667817 DOI: 10.1002/advs.202303502] [Received: 05/30/2023] [Revised: 08/10/2023] [Indexed: 10/12/2023]
Abstract
Single-cell Hi-C (scHi-C) has made it possible to analyze chromatin organization at the single-cell level. However, scHi-C experiments generate inherently sparse data, which poses a challenge for loop calling methods. The existing approach performs significance tests across the imputed dense contact maps, leading to substantial computational overhead and loss of information at the single-cell level. To overcome this limitation, a lightweight framework called scGSLoop is proposed, which sets a new paradigm for scHi-C loop calling by adapting the training and inferencing strategies of graph-based deep learning to leverage the sequence features and 1D positional information of genomic loci. With this framework, sparsity is no longer a challenge, but rather an advantage that the model leverages to achieve unprecedented computational efficiency. Compared to existing methods, scGSLoop makes more accurate predictions and is able to identify more loops that have the potential to play regulatory roles in genome functioning. Moreover, scGSLoop preserves single-cell information by identifying a distinct group of loops for each individual cell, which not only enables an understanding of the variability of chromatin looping states between cells, but also allows scGSLoop to be extended for the investigation of multi-connected hubs and their underlying mechanisms.
Affiliation(s)
- Fuzhou Wang
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
| | - Hamid Alinejad‐Rokny
- BioMedical Machine Learning Lab, Graduate School of Biomedical Engineering, University of New South Wales, Sydney 2052, Australia
| | - Jiecong Lin
- Molecular Pathology Unit, Center for Cancer Research, Massachusetts General Hospital, Department of Pathology, Harvard Medical School, Boston, MA 02129, USA
- Department of Computer Science, The University of Hong Kong, Pok Fu Lam, Hong Kong SAR
| | - Tingxiao Gao
- Department of Medical Biophysics, Faculty of Medicine, University of Toronto, Toronto, Ontario M5G 1L7, Canada
| | - Xingjian Chen
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
| | - Zetian Zheng
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
| | - Lingkuan Meng
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
| | - Xiangtao Li
- School of Artificial Intelligence, Jilin University, Changchun 130012, China
| | - Ka‐Chun Wong
- Department of Computer Science, City University of Hong Kong, Kowloon Tong, Hong Kong SAR
| |
|
13
|
Raffo A, Paulsen J. The shape of chromatin: insights from computational recognition of geometric patterns in Hi-C data. Brief Bioinform 2023; 24:bbad302. [PMID: 37646128 PMCID: PMC10516369 DOI: 10.1093/bib/bbad302] [Received: 05/18/2023] [Revised: 07/05/2023] [Accepted: 08/03/2023] [Indexed: 09/01/2023]
Abstract
The three-dimensional organization of chromatin plays a crucial role in gene regulation and cellular processes like deoxyribonucleic acid (DNA) transcription, replication and repair. Hi-C and related techniques provide detailed views of spatial proximities within the nucleus. However, data analysis is challenging partially due to a lack of well-defined, underpinning mathematical frameworks. Recently, recognizing and analyzing geometric patterns in Hi-C data has emerged as a powerful approach. This review provides a summary of algorithms for automatic recognition and analysis of geometric patterns in Hi-C data and their correspondence with chromatin structure. We classify existing algorithms on the basis of the data representation and pattern recognition paradigm they make use of. Finally, we outline some of the challenges ahead and promising future directions.
Affiliation(s)
- Andrea Raffo
- Department of Biosciences, University of Oslo, 0316 Oslo, Norway
| | - Jonas Paulsen
- Department of Biosciences, University of Oslo, 0316 Oslo, Norway
- Centre for Bioinformatics, Department of Informatics, University of Oslo, 0316 Oslo, Norway
| |
|
14
|
Abstract
Identification of chromatin interactions is crucial for advancing our knowledge of gene regulation. However, due to the limitations of high-throughput experimental techniques, there is an urgent need to develop computational methods for predicting chromatin interactions. In this study, we propose a novel attention-based deep learning model, termed IChrom-Deep, to identify chromatin interactions using sequence features and genomic features. The experimental results based on the datasets of three cell lines demonstrate that IChrom-Deep achieves satisfactory performance and is superior to the previous methods. We also investigate the effect of DNA sequence and associated features and genomic features on chromatin interactions, and highlight the applicable scenarios of some features, such as sequence conservation and distance. Moreover, we identify a few genomic features that are extremely important across different cell lines, and IChrom-Deep achieves comparable performance with only these significant genomic features versus using all genomic features. It is believed that IChrom-Deep can serve as a useful tool for future studies that seek to identify chromatin interactions.
|
15
|
Xu H, Yi X, Fan X, Wu C, Wang W, Chu X, Zhang S, Dong X, Wang Z, Wang J, Zhou Y, Zhao K, Yao H, Zheng N, Wang J, Chen Y, Plewczynski D, Sham PC, Chen K, Huang D, Li MJ. Inferring CTCF-binding patterns and anchored loops across human tissues and cell types. Patterns (N Y) 2023; 4:100798. [PMID: 37602215 PMCID: PMC10436006 DOI: 10.1016/j.patter.2023.100798] [Received: 07/12/2022] [Revised: 01/25/2023] [Accepted: 06/20/2023] [Indexed: 08/22/2023]
Abstract
CCCTC-binding factor (CTCF) is a transcription regulator with a complex role in gene regulation. The recognition and effects of CTCF on DNA sequences, chromosome barriers, and enhancer blocking are not well understood. Existing computational tools struggle to assess the regulatory potential of CTCF-binding sites and their impact on chromatin loop formation. Here we have developed a deep-learning model, DeepAnchor, to accurately characterize CTCF binding using high-resolution genomic/epigenomic features. This has revealed distinct chromatin and sequence patterns for CTCF-mediated insulation and looping. An optimized implementation of a previous loop model based on DeepAnchor score excels in predicting CTCF-anchored loops. We have established a compendium of CTCF-anchored loops across 52 human tissue/cell types, and this suggests that genomic disruption of these loops could be a general mechanism of disease pathogenesis. These computational models and resources can help investigate how CTCF-mediated cis-regulatory elements shape context-specific gene regulation in cell development and disease progression.
Affiliation(s)
- Hang Xu
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Singapore Immunology Network (SIgN), Agency for Science, Technology and Research (A∗STAR), Singapore 138648, Singapore
| | - Xianfu Yi
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xutong Fan
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Chengyue Wu
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Wei Wang
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Xinlei Chu
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Shijie Zhang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Xiaobao Dong
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Jianhua Wang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Yao Zhou
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Ke Zhao
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Hongcheng Yao
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, The University of Hong Kong, Hong Kong 999077, China
| | - Nan Zheng
- Department of Network Security and Informatization, Tianjin Medical University, Tianjin 300070, China
| | - Junwen Wang
- Department of Health Sciences Research and Center for Individualized Medicine, Mayo Clinic, Scottsdale, AZ 85259, USA
| | - Yupeng Chen
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| | - Dariusz Plewczynski
- Faculty of Mathematics and Information Science, Warsaw University of Technology, Warsaw, Poland
| | - Pak Chung Sham
- Centre for PanorOmic Sciences-Genomics and Bioinformatics Cores, The University of Hong Kong, Hong Kong 999077, China
| | - Kexin Chen
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
| | - Dandan Huang
- Wuxi School of Medicine, Jiangnan University, Wuxi 214122, China
| | - Mulin Jun Li
- Department of Epidemiology and Biostatistics, Key Laboratory of Prevention and Control of Human Major Diseases (Ministry of Education), National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
| |
|
16
|
Fu Y, Wang X, Yue F. EagleC Explorer: A desktop application for interactively detecting and visualizing SVs and enhancer hijacking on Hi-C contact maps. bioRxiv 2023:2023.08.07.552228. [PMID: 37609202 PMCID: PMC10441372 DOI: 10.1101/2023.08.07.552228] [Indexed: 08/24/2023]
Abstract
It has been shown that Hi-C can be used as a powerful tool to detect structural variations (SVs) and enhancer hijacking events. However, there have been no programs that can directly visualize and detect such events on a personal computer, which hinders the broad adoption of the technology for intuitive discovery in cancer studies. Here, we introduce EagleC Explorer, a desktop application specifically designed for exploring Hi-C and other chromatin contact data in cancer genomes. EagleC Explorer has a set of unique features, including 1) conveniently visualizing global and local Hi-C data; 2) interactively detecting SVs on a Hi-C map for any user-selected region on screen within seconds, using a deep-learning model; 3) reconstructing the local Hi-C map surrounding user-provided SVs and generating publication-quality figures; 4) detecting enhancer hijacking events for any user-suggested regions on screen. In addition, EagleC Explorer can incorporate other genomic tracks such as RNA-Seq or ChIP-Seq to facilitate integrative data analysis and novel discoveries.
|
17
|
Zhou T, Zhang R, Jia D, Doty RT, Munday AD, Gao D, Xin L, Abkowitz JL, Duan Z, Ma J. Concurrent profiling of multiscale 3D genome organization and gene expression in single mammalian cells. bioRxiv 2023:2023.07.20.549578. [PMID: 37546900 PMCID: PMC10401946 DOI: 10.1101/2023.07.20.549578] [Indexed: 08/08/2023]
Abstract
The organization of mammalian genomes within the nucleus features a complex, multiscale three-dimensional (3D) architecture. The functional significance of these 3D genome features, however, remains largely elusive due to limited single-cell technologies that can concurrently profile genome organization and transcriptional activities. Here, we report GAGE-seq, a highly scalable, robust single-cell co-assay that simultaneously measures 3D genome structure and transcriptome within the same cell. Employing GAGE-seq on mouse brain cortex and human bone marrow CD34+ cells, we comprehensively characterized the intricate relationships between 3D genome and gene expression. We found that these multiscale 3D genome features collectively inform cell type-specific gene expressions, hence contributing to defining cell identity at the single-cell level. Integration of GAGE-seq data with spatial transcriptomic data revealed in situ variations of the 3D genome in mouse cortex. Moreover, our observations of lineage commitment in normal human hematopoiesis unveiled notable discordant changes between 3D genome organization and gene expression, underscoring a complex, temporal interplay at the single-cell level that is more nuanced than previously appreciated. Together, GAGE-seq provides a powerful, cost-effective approach for interrogating genome structure and gene expression relationships at the single-cell level across diverse biological contexts.
Affiliation(s)
- Tianming Zhou
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| | - Ruochi Zhang
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Present address: Broad Institute of MIT and Harvard, Cambridge, MA 02142, USA
| | - Deyong Jia
- Department of Urology, University of Washington, Seattle, WA 98195, USA
| | - Raymond T. Doty
- Division of Hematology, Department of Medicine, University of Washington, Seattle, WA 98195, USA
| | - Adam D. Munday
- Division of Hematology, Department of Medicine, University of Washington, Seattle, WA 98195, USA
| | - Daniel Gao
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA 98109, USA
- Present address: Department of Chemistry, Pomona College, Claremont, CA 91711, USA
| | - Li Xin
- Department of Urology, University of Washington, Seattle, WA 98195, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA 98109, USA
| | - Janis L. Abkowitz
- Division of Hematology, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA 98109, USA
| | - Zhijun Duan
- Division of Hematology, Department of Medicine, University of Washington, Seattle, WA 98195, USA
- Institute for Stem Cell and Regenerative Medicine, University of Washington, Seattle, WA 98109, USA
| | - Jian Ma
- Computational Biology Department, School of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
| |
|
18
|
Liu T, Wang Z. DeepChIA-PET: Accurately predicting ChIA-PET from Hi-C and ChIP-seq with deep dilated networks. PLoS Comput Biol 2023; 19:e1011307. [PMID: 37440599 PMCID: PMC10368233 DOI: 10.1371/journal.pcbi.1011307] [Received: 01/20/2023] [Accepted: 06/26/2023] [Indexed: 07/15/2023]
Abstract
Chromatin interaction analysis by paired-end tag sequencing (ChIA-PET) can capture genome-wide chromatin interactions mediated by a specific DNA-associated protein. ChIA-PET experiments have been applied to explore the key roles of different protein factors in chromatin folding and transcription regulation. However, compared with widely available Hi-C and ChIP-seq data, there are not many ChIA-PET datasets available in the literature. A computational method that accurately predicts ChIA-PET interactions from Hi-C and ChIP-seq data is needed to save the effort of performing wet-lab experiments. Here we present DeepChIA-PET, a supervised deep learning approach that can accurately predict ChIA-PET interactions by learning the latent relationships between ChIA-PET and two widely used data types: Hi-C and ChIP-seq. We trained our deep models with CTCF-mediated ChIA-PET of GM12878 as ground truth, and the deep network contains 40 dilated residual convolutional blocks. We first showed that DeepChIA-PET with only Hi-C as input significantly outperforms Peakachu, another computational method for predicting ChIA-PET from Hi-C but using random forests. We next proved that adding ChIP-seq as one extra input does improve the classification performance of DeepChIA-PET, but Hi-C plays a more prominent role in DeepChIA-PET than ChIP-seq. Our evaluation results indicate that our learned models can accurately predict not only CTCF-mediated ChIA-PET in GM12878 and HeLa but also non-CTCF ChIA-PET interactions, including RNA polymerase II (RNAPII) ChIA-PET of GM12878, RAD21 ChIA-PET of GM12878, and RAD21 ChIA-PET of K562. In total, DeepChIA-PET is an accurate tool for predicting the ChIA-PET interactions mediated by various chromatin-associated proteins from different cell types.
Affiliation(s)
- Tong Liu
  - Department of Computer Science, University of Miami, Coral Gables, Florida, United States of America
- Zheng Wang
  - Department of Computer Science, University of Miami, Coral Gables, Florida, United States of America
|
19
|
Buyukcelebi K, Chen X, Abdula F, Elkafas H, Duval AJ, Ozturk H, Seker-Polat F, Jin Q, Yin P, Feng Y, Bulun SE, Wei JJ, Yue F, Adli M. Engineered MED12 mutations drive leiomyoma-like transcriptional and metabolic programs by altering the 3D genome compartmentalization. Nat Commun 2023; 14:4057. [PMID: 37429859 DOI: 10.1038/s41467-023-39684-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/01/2023] [Accepted: 06/26/2023] [Indexed: 07/12/2023] Open
Abstract
Nearly 70% of uterine fibroid (UF) tumors are driven by recurrent MED12 hotspot mutations. Unfortunately, no cellular models could be generated because the mutant cells have lower fitness in 2D culture conditions. To address this, we employ CRISPR to precisely engineer MED12 Gly44 mutations in UF-relevant myometrial smooth muscle cells. The engineered mutant cells recapitulate several UF-like cellular, transcriptional, and metabolic alterations, including altered tryptophan/kynurenine metabolism. The aberrant gene expression program in the mutant cells is, in part, driven by a substantial 3D genome compartmentalization switch. At the cellular level, the mutant cells gain enhanced proliferation rates in 3D spheres and form larger lesions in vivo with elevated production of collagen and extracellular matrix deposition. These findings indicate that the engineered cellular model faithfully models key features of UF tumors and provides a platform for the broader scientific community to characterize genomics of recurrent MED12 mutations.
Affiliation(s)
- Kadir Buyukcelebi
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Xintong Chen
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, IL, USA
- Fatih Abdula
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Hoda Elkafas
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Alexander James Duval
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Harun Ozturk
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Fidan Seker-Polat
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Qiushi Jin
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, IL, USA
- Ping Yin
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Yue Feng
  - Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Serdar E Bulun
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
- Jian Jun Wei
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
  - Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL, USA
- Feng Yue
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, IL, USA
- Mazhar Adli
  - Robert Lurie Comprehensive Cancer Center, Department of Obstetrics and Gynecology, Feinberg School of Medicine at Northwestern University, Chicago, IL, USA
|
20
|
Syed SA, Shqillo K, Nand A, Zhan Y, Dekker J, Imbalzano AN. Protein arginine methyltransferase 5 (Prmt5) localizes to chromatin loop anchors and modulates expression of genes at TAD boundaries during early adipogenesis. bioRxiv 2023:2023.06.13.544859. [PMID: 37398486 PMCID: PMC10312757 DOI: 10.1101/2023.06.13.544859] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 07/04/2023]
Abstract
Protein arginine methyltransferase 5 (Prmt5) is an essential regulator of embryonic development and adult progenitor cell functions. Prmt5 expression is mis-regulated in many cancers, and the development of Prmt5 inhibitors as cancer therapeutics is an active area of research. Prmt5 functions via effects on gene expression, splicing, DNA repair, and other critical cellular processes. We examined whether Prmt5 functions broadly as a genome-wide regulator of gene transcription and higher-order chromatin interactions during the initial stages of adipogenesis by applying ChIP-seq, RNA-seq, and Hi-C to 3T3-L1 cells, a frequently utilized model for adipogenesis. We observed robust genome-wide Prmt5 chromatin-binding at the onset of differentiation. Prmt5 localized to transcriptionally active genomic regions, acting as both a positive and a negative regulator. A subset of Prmt5 binding sites co-localized with mediators of chromatin organization at chromatin loop anchors. Prmt5 knockdown decreased insulation strength at the boundaries of topologically associating domains (TADs) adjacent to sites with Prmt5 and CTCF co-localization. Genes overlapping such weakened TAD boundaries showed transcriptional dysregulation. This study identifies Prmt5 as a broad regulator of gene expression, including regulation of early adipogenic factors, and reveals an unappreciated requirement for Prmt5 in maintaining strong insulation at TAD boundaries and overall chromatin organization.
Affiliation(s)
- Sabriya A Syed
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA USA
- Kristina Shqillo
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA USA
- Ankita Nand
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA USA
- Ye Zhan
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA USA
- Job Dekker
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA USA
  - Department of Systems Biology, University of Massachusetts Chan Medical School, Worcester, MA USA
  - Howard Hughes Medical Institute, Chevy Chase, MD USA
- Anthony N Imbalzano
  - Department of Biochemistry and Molecular Biotechnology, University of Massachusetts Chan Medical School, Worcester, MA USA
|
21
|
Liu T, Wang J, Yang H, Jin Q, Wang X, Fu Y, Luan Y, Wang Q, Youngblood MW, Lu X, Casadei L, Pollock R, Yue F. Enhancer Coamplification and Hijacking Promote Oncogene Expression in Liposarcoma. Cancer Res 2023; 83:1517-1530. [PMID: 36847778 PMCID: PMC10152236 DOI: 10.1158/0008-5472.can-22-1858] [Citation(s) in RCA: 6] [Impact Index Per Article: 6.0] [Received: 06/06/2022] [Revised: 12/29/2022] [Accepted: 02/22/2023] [Indexed: 03/01/2023]
Abstract
SIGNIFICANCE Comprehensive profiling of the enhancer landscape and 3D genome structure in liposarcoma identifies extensive enhancer-oncogene coamplification and enhancer hijacking events, deepening the understanding of how oncogenes are regulated in cancer.
Affiliation(s)
- Tingting Liu
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Juan Wang
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Hongbo Yang
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Qiushi Jin
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Xiaotao Wang
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Yihao Fu
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Yu Luan
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Qixuan Wang
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Mark W. Youngblood
  - Department of Neurosurgery, Feinberg School of Medicine Northwestern University, Chicago, Illinois
- Xinyan Lu
  - Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, Illinois
- Lucia Casadei
  - Program in Translational Therapeutics, The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
- Raphael Pollock
  - Program in Translational Therapeutics, The James Comprehensive Cancer Center, The Ohio State University, Columbus, Ohio
  - Department of Surgery, The Ohio State University, Columbus, Ohio
- Feng Yue
  - Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine Northwestern University, Chicago, Illinois
  - Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, Illinois
|
22
|
Sun Y, Xu X, Lin L, Xu K, Zheng Y, Ren C, Tao H, Wang X, Zhao H, Tu W, Bai X, Wang J, Huang Q, Li Y, Chen H, Li H, Bo X. A graph neural network-based interpretable framework reveals a novel DNA fragility-associated chromatin structural unit. Genome Biol 2023; 24:90. [PMID: 37095580 PMCID: PMC10124043 DOI: 10.1186/s13059-023-02916-x] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/19/2022] [Accepted: 03/22/2023] [Indexed: 04/26/2023] Open
Abstract
BACKGROUND: DNA double-strand breaks (DSBs) are among the most deleterious DNA lesions, and they can cause cancer if improperly repaired. Recent chromosome conformation capture techniques, such as Hi-C, have enabled the identification of relationships between the 3D chromatin structure and DSBs, but little is known about how to explain these relationships, especially from global contact maps, or their contributions to DSB formation. RESULTS: Here, we propose a framework that uses a graph neural network (GNN) to unravel the relationship between 3D chromatin structure and DSBs, interpreted with the GNNExplainer technique. We identify a new chromatin structural unit named the DNA fragility-associated chromatin interaction network (FaCIN). FaCIN is a bottleneck-like structure, and it helps to reveal a universal form of how the fragility of a piece of DNA might be affected by the whole genome through chromatin interactions. Moreover, we demonstrate that neck interactions in FaCIN can serve as chromatin structural determinants of DSB formation. CONCLUSIONS: Our study provides a more systematic and refined view that enables a better understanding of the mechanisms of DSB formation in the context of the 3D genome.
Affiliation(s)
- Yu Sun
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiang Xu
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Lin Lin
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Kang Xu
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Yang Zheng
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Chao Ren
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Huan Tao
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xu Wang
  - 4Paradigm Inc, Beijing, China
- Xuemei Bai
  - The First Affiliated Hospital of Harbin Medical University, Harbin, 150001, China
- Junting Wang
  - The First Affiliated Hospital of Harbin Medical University, Harbin, 150001, China
- Qiya Huang
  - State Key Laboratory of Cardiovascular Disease, Fuwai Hospital, National Center for Cardiovascular Diseases, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, China
- Yaru Li
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Hebing Chen
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Hao Li
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
- Xiaochen Bo
  - Institute of Health Service and Transfusion Medicine, Beijing, 100850, China
|
23
|
Wang N, Yu B, Jun G, Qi Q, Durazo-Arvizu RA, Lindstrom S, Morrison AC, Kaplan RC, Boerwinkle E, Chen H. StocSum: stochastic summary statistics for whole genome sequencing studies. bioRxiv 2023:2023.04.06.535886. [PMID: 37066281 PMCID: PMC10104122 DOI: 10.1101/2023.04.06.535886] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/19/2023]
Abstract
Genomic summary statistics, usually defined as single-variant test results from genome-wide association studies, have been widely used to advance the genetics field in a wide range of applications. Applications that involve multiple genetic variants also require their correlations or linkage disequilibrium (LD) information, often obtained from an external reference panel. In practice, it is usually difficult to find suitable external reference panels that represent the LD structure for underrepresented and admixed populations, or rare genetic variants from whole genome sequencing (WGS) studies, limiting the scope of applications for genomic summary statistics. Here we introduce StocSum, a novel reference-panel-free statistical framework for generating, managing, and analyzing stochastic summary statistics using random vectors. We develop various downstream applications using StocSum including single-variant tests, conditional association tests, gene-environment interaction tests, variant set tests, as well as meta-analysis and LD score regression tools. We demonstrate the accuracy and computational efficiency of StocSum using two cohorts from the Trans-Omics for Precision Medicine Program. StocSum will facilitate sharing and utilization of genomic summary statistics from WGS studies, especially for underrepresented and admixed populations.
Affiliation(s)
- Nannan Wang
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Bing Yu
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Goo Jun
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Qibin Qi
  - Department of Epidemiology & Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Ramon A. Durazo-Arvizu
  - The Saban Research Institute, Children’s Hospital Los Angeles, Los Angeles, California
  - Department of Pediatrics, Keck School of Medicine, University of Southern California, Los Angeles, CA, USA
- Sara Lindstrom
  - Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, WA, USA
  - Department of Epidemiology, School of Public Health, University of Washington, 3980 15th Ave NE, Seattle, WA, USA
- Alanna C. Morrison
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
- Robert C. Kaplan
  - Department of Epidemiology & Population Health, Albert Einstein College of Medicine, Bronx, NY, USA
- Eric Boerwinkle
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Human Genome Sequencing Center, Baylor College of Medicine, Houston, TX, USA
- Han Chen
  - Human Genetics Center, Department of Epidemiology, Human Genetics and Environmental Sciences, School of Public Health, The University of Texas Health Science Center at Houston, Houston, TX, USA
  - Center for Precision Health, School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX, USA
|
24
|
Richer S, Tian Y, Schoenfelder S, Hurst L, Murrell A, Pisignano G. Widespread allele-specific topological domains in the human genome are not confined to imprinted gene clusters. Genome Biol 2023; 24:40. [PMID: 36869353 PMCID: PMC9983196 DOI: 10.1186/s13059-023-02876-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/18/2022] [Accepted: 02/13/2023] [Indexed: 03/05/2023] Open
Abstract
BACKGROUND There is widespread interest in the three-dimensional chromatin conformation of the genome and its impact on gene expression. However, these studies frequently do not consider parent-of-origin differences, such as genomic imprinting, which result in monoallelic expression. In addition, genome-wide allele-specific chromatin conformation associations have not been extensively explored. There are few accessible bioinformatic workflows for investigating allelic conformation differences and these require pre-phased haplotypes which are not widely available. RESULTS We developed a bioinformatic pipeline, "HiCFlow," that performs haplotype assembly and visualization of parental chromatin architecture. We benchmarked the pipeline using prototype haplotype phased Hi-C data from GM12878 cells at three disease-associated imprinted gene clusters. Using Region Capture Hi-C and Hi-C data from human cell lines (1-7HB2, IMR-90, and H1-hESCs), we can robustly identify the known stable allele-specific interactions at the IGF2-H19 locus. Other imprinted loci (DLK1 and SNRPN) are more variable and there is no "canonical imprinted 3D structure," but we could detect allele-specific differences in A/B compartmentalization. Genome-wide, when topologically associating domains (TADs) are unbiasedly ranked according to their allele-specific contact frequencies, a set of allele-specific TADs could be defined. These occur in genomic regions of high sequence variation. In addition to imprinted genes, allele-specific TADs are also enriched for allele-specific expressed genes. We find loci that have not previously been identified as allele-specific expressed genes such as the bitter taste receptors (TAS2Rs). CONCLUSIONS This study highlights the widespread differences in chromatin conformation between heterozygous loci and provides a new framework for understanding allele-specific expressed genes.
Affiliation(s)
- Stephen Richer
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Yuan Tian
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
  - UCL Cancer Institute, University College London, Paul O'Gorman Building, London, UK
- Laurence Hurst
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Adele Murrell
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Giuseppina Pisignano
  - Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
|
25
|
Buyukcelebi K, Chen X, Abdula F, Duval A, Ozturk H, Seker-Polat F, Jin Q, Yin P, Feng Y, Wei JJ, Bulun S, Yue F, Adli M. Engineered MED12 mutations drive uterine fibroid-like transcriptional and metabolic programs by altering the 3D genome compartmentalization. Res Sq 2023:rs.3.rs-2537075. [PMID: 36798375 PMCID: PMC9934745 DOI: 10.21203/rs.3.rs-2537075/v1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/11/2023]
Abstract
Uterine fibroid (UF) tumors originate from a mutated smooth muscle cell (SMC). Nearly 70% of these tumors are driven by recurrent somatic hotspot mutations in the MED12 gene; however, there are no tractable genetic models to study the biology of UF tumors because, under culture conditions, the non-mutant fibroblasts outgrow the mutant SMCs, resulting in the conversion of the population to the WT phenotype. The lack of faithful cellular models has hampered our ability to delineate the molecular pathways downstream of MED12 mutations and identify therapeutics that may selectively target the mutant cells. To overcome this challenge, we employed CRISPR knock-in with a sensitive PCR-based screening strategy to precisely engineer cells with mutant MED12 Gly44, which constitutes 50% of MED12 exon two mutations. Critically, the engineered myometrial SMCs recapitulate several UF-like cellular, transcriptional and metabolic alterations, including enhanced proliferation rates in 3D spheres and altered tryptophan/kynurenine metabolism. Our transcriptomic analysis supported by DNA synthesis tracking reveals that MED12 mutant cells, like UF tumors, have heightened expression of DNA repair genes but reduced DNA synthesis rates. Consequently, these cells accumulate significantly higher rates of DNA damage and are selectively more sensitive to common DNA-damaging chemotherapy, indicating mutation-specific and therapeutically relevant vulnerabilities. Our high-resolution 3D chromatin interaction analysis demonstrates that the engineered MED12 mutations drive aberrant genomic activity due to a genome-wide chromatin compartmentalization switch. These findings indicate that the engineered cellular model faithfully models key features of UF tumors and provides a novel platform for the broader scientific community to characterize genomics of recurrent MED12 mutations and discover potential therapeutic targets.
|
26
|
Wu H, Liu M, Zhang P, Zhang H. iEnhancer-SKNN: a stacking ensemble learning-based method for enhancer identification and classification using sequence information. Brief Funct Genomics 2023; 22:302-311. [PMID: 36715222 DOI: 10.1093/bfgp/elac057] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 08/10/2022] [Revised: 12/01/2022] [Accepted: 12/13/2022] [Indexed: 01/31/2023] Open
Abstract
Enhancers, a class of distal cis-regulatory elements located in the non-coding region of DNA, play a key role in gene regulation. It is difficult to identify enhancers from DNA sequence data because enhancers are freely distributed in the non-coding region, have no specific sequence features, and lie far from their target promoters. Therefore, this study presents a stacking ensemble learning method to accurately identify enhancers and classify them into strong and weak enhancers. First, we obtain the fusion feature matrix by fusing four features: Kmer, PseDNC, PCPseDNC and Z-Curve9. Second, five K-Nearest Neighbor (KNN) models with different parameters are trained as the base models, and the Logistic Regression algorithm is used as the meta-model. Third, the stacking ensemble learning strategy is used to construct a two-layer model from the base models and meta-model, which is trained on the preprocessed feature sets. The proposed method, named iEnhancer-SKNN, is a two-layer prediction model, in which the first layer predicts whether the given DNA sequences are enhancers or non-enhancers, and the second layer distinguishes whether the predicted enhancers are strong or weak. The performance of iEnhancer-SKNN is evaluated on the independent testing dataset, and the results show that the proposed method performs better in predicting enhancers and their strength. In enhancer identification, iEnhancer-SKNN achieves an accuracy of 81.75%, an improvement of 1.35% to 8.75% compared with other predictors, and in enhancer classification, iEnhancer-SKNN achieves an accuracy of 80.50%, an improvement of 5.5% to 25.5% compared with other predictors. Moreover, we identify key transcription factor binding site motifs in the enhancer regions and further explore the biological functions of the enhancers and these key motifs.
Source code and data can be downloaded from https://github.com/HaoWuLab-Bioinformatics/iEnhancer-SKNN.
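As an illustration of the Kmer feature named in the abstract above, the following is a minimal Python sketch of k-mer frequency encoding for a DNA sequence. It is not the iEnhancer-SKNN implementation: the function name `kmer_frequencies`, the use of overlapping windows, and the normalization by the number of valid windows are all assumptions made here for illustration.

```python
from itertools import product


def kmer_frequencies(seq, k=3):
    """Return normalized overlapping k-mer counts (hypothetical helper).

    The output vector follows the lexicographic order of all 4**k k-mers
    over the alphabet A, C, G, T.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing N or other symbols
            counts[window] += 1
            total += 1
    if total == 0:
        # Sequence shorter than k, or no valid windows: return all zeros.
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


# Toy sequence: 7 overlapping 2-mers, so the vector has 4**2 = 16 entries.
features = kmer_frequencies("ACGTACGT", k=2)
```

A vector like this could be concatenated with other sequence descriptors before being passed to a classifier; how the published tool fuses its four feature types is not specified in the abstract.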
Affiliation(s)
- Hao Wu
  - College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
  - School of Software, Shandong University, Jinan, 250101, Shandong, China
- Mengdi Liu
  - College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Pengyu Zhang
  - College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
- Hongming Zhang
  - College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
|
27
|
Wang X, Yue F. HiCLift: A fast and efficient tool for converting chromatin interaction data between genome assemblies. bioRxiv 2023:2023.01.17.524475. [PMID: 36712087 PMCID: PMC9882170 DOI: 10.1101/2023.01.17.524475] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/22/2023]
Abstract
Motivation: With the continuous effort to improve the quality of the human reference genome and the generation of more and more personal genomes, the conversion of genomic coordinates between genome assemblies is critical in many integrative and comparative studies. While tools have been developed for this task for linear genome signals such as ChIP-Seq, no tool exists to convert chromatin interaction data between genome assemblies, despite the importance of three-dimensional (3D) genome organization in gene regulation and disease. Results: Here, we present HiCLift, a fast and efficient tool that can convert the genomic coordinates of chromatin contacts such as Hi-C and Micro-C from one assembly to another, including the latest T2T genome. Compared with the strategy of directly re-mapping raw reads to a different genome, HiCLift runs on average 42 times faster (hours vs. days), while producing nearly identical contact matrices. More importantly, as HiCLift does not need to re-map the raw reads, it can directly convert human patient sample data, for which the raw sequencing reads are sometimes hard to acquire or not available. Availability: HiCLift is publicly available at https://github.com/XiaoTaoWang/HiCLift.
|
28
|
Wang F, Moon W, Letsou W, Sapkota Y, Wang Z, Im C, Baedke JL, Robison L, Yasui Y. Genome-Wide Analysis of Rare Haplotypes Associated with Breast Cancer Risk. Cancer Res 2023; 83:332-345. [PMID: 36354368 PMCID: PMC9852031 DOI: 10.1158/0008-5472.can-22-1888] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/13/2022] [Revised: 09/09/2022] [Accepted: 11/08/2022] [Indexed: 11/12/2022]
Abstract
Numerous common genetic variants have been linked to breast cancer risk, but they only partially explain the total breast cancer heritability. Inference from Nordic population-based twin data indicates rare high-risk loci as the chief determinant of breast cancer risk. Here, we use haplotypes, rather than single variants, to identify rare high-risk loci for breast cancer. With computationally phased genotypes from 181,034 white British women in the UK Biobank, a genome-wide haplotype-breast cancer association analysis was conducted using sliding windows of 5 to 500 consecutive array-genotyped variants. In the discovery stage, haplotype-breast cancer associations were evaluated retrospectively in the prestudy-enrollment data including 5,487 breast cancer cases. Breast cancer hazard ratios (HR) for additive haplotypic effects were estimated using Cox regression. The replication analysis included a prospective cohort of women free of breast cancer at enrollment, of whom 3,524 later developed breast cancer. This two-stage analysis detected 13 rare loci (frequency <1%), each associated with an appreciable breast cancer-risk increase (discovery: HRs = 2.84-6.10, P < 5 × 10⁻⁸; replication: HRs = 2.08-5.61, P < 0.01). In contrast, the variants that formed these rare haplotypes individually exhibited much smaller effects. Functional annotation revealed extensive cis-regulatory DNA elements in breast cancer-related cells underlying the replicated rare haplotypes. Using phased, imputed genotypes from 30,064 cases and 25,282 controls in the DRIVE OncoArray case-control study, 6 of the 13 rare-loci associations were found generalizable (odds ratio estimates: 1.48-7.67, P < 0.05). This study demonstrates the complementary advantage of utilizing rare haplotypes to capture novel risk loci and suggests the potential for the discovery of more genetic elements contributing to cancer heritability as large data sets of germline whole-genome sequencing become available.
SIGNIFICANCE A genome-wide two-stage haplotype analysis identifies rare haplotypes associated with breast cancer risk and suggests that the rare risk haplotypes represent long-range interactions with regulatory consequences influencing cancer risk.
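To make the sliding-window idea in the abstract above concrete, the sketch below enumerates windows of consecutive variants over phased haplotypes and tabulates the frequency of each haplotype within every window. It is only a toy illustration under stated assumptions: the function name `window_haplotype_frequencies`, the encoding of haplotypes as strings of 0/1 alleles, and the example data are hypothetical and do not come from the study, and no association testing (such as the Cox regression mentioned above) is performed.

```python
from collections import Counter


def window_haplotype_frequencies(haplotypes, window_size):
    """Tabulate haplotype frequencies in each sliding window (hypothetical helper).

    haplotypes: list of equal-length strings, one per phased chromosome,
                with one allele character per variant (assumed encoding).
    Returns a list of (start_index, {haplotype: frequency}) tuples.
    """
    if not haplotypes:
        return []
    n_variants = len(haplotypes[0])
    results = []
    for start in range(n_variants - window_size + 1):
        # Slice the same window of consecutive variants from every chromosome.
        counts = Counter(h[start:start + window_size] for h in haplotypes)
        total = sum(counts.values())
        results.append((start, {hap: c / total for hap, c in counts.items()}))
    return results


# Four toy phased chromosomes over five variants (made-up data).
phased = ["00101", "00111", "10101", "00101"]
freqs = window_haplotype_frequencies(phased, window_size=3)
```

In a real analysis, haplotypes falling below a frequency threshold (the abstract uses <1%) would then be carried forward to association testing; that filtering step is omitted here.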
Collapse
Affiliation(s)
- Fan Wang
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA.,Corresponding authors Fan Wang, Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Mail Stop 735, Memphis, TN 38105, USA. Phone: +1-901-595-1110; Fax: 1-901-595-5845; .; Yutaka Yasui, Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, 262 Danny Thomas Place, Mail Stop 735, Memphis, TN 38105, USA. Phone: +1-901-595-5893; Fax: 1-901-595-5845;
| | - Wonjong Moon
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - William Letsou
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yadav Sapkota
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Zhaoming Wang
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Cindy Im
- School of Public Health, University of Alberta, Edmonton, Alberta T6G 1C9, Canada
| | - Jessica L. Baedke
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Leslie Robison
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA
| | - Yutaka Yasui
- Department of Epidemiology and Cancer Control, St. Jude Children's Research Hospital, Memphis, Tennessee 38105, USA; School of Public Health, University of Alberta, Edmonton, Alberta T6G 1C9, Canada. Corresponding author.
| |
|
29
|
Zhang Q, Teng P, Wang S, He Y, Cui Z, Guo Z, Liu Y, Yuan C, Liu Q, Huang DS. Computational prediction and characterization of cell-type-specific and shared binding sites. Bioinformatics 2022; 39:6885447. [PMID: 36484687 PMCID: PMC9825777 DOI: 10.1093/bioinformatics/btac798] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/09/2022] [Revised: 11/24/2022] [Accepted: 12/08/2022] [Indexed: 12/13/2022] Open
Abstract
MOTIVATION Cell-type-specific gene expression is maintained in large part by transcription factors (TFs) selectively binding to distinct sets of sites in different cell types. Recent studies have provided evidence that such cell-type-specific binding is determined by TFs' intrinsic sequence preferences, cooperative interactions with co-factors, cell-type-specific chromatin landscapes and 3D chromatin interactions. However, computational prediction and characterization of cell-type-specific and shared binding sites has rarely been studied. RESULTS In this article, we propose two computational approaches for predicting and characterizing cell-type-specific and shared binding sites by integrating multiple types of features: one is based on XGBoost and the other on a convolutional neural network (CNN). To validate the performance of our proposed approaches, ChIP-seq datasets of 10 binding factors were collected from the GM12878 (lymphoblastoid) and K562 (erythroleukemic) human hematopoietic cell lines, each of which was further categorized into cell-type-specific (GM12878- and K562-specific) and shared binding sites. Then, multiple types of features for these binding sites were integrated to train the XGBoost- and CNN-based models. Experimental results show that our proposed approaches significantly outperform other competing methods on three classification tasks. Moreover, we identified independent feature contributions for cell-type-specific and shared sites through SHAP values and explored the ability of the CNN-based model to predict cell-type-specific and shared binding sites by excluding or including DNase signals. Furthermore, we investigated the generalization ability of our proposed approaches to different binding factors in the same cellular environment. AVAILABILITY AND IMPLEMENTATION The source code is available at: https://github.com/turningpoint1988/CSSBS. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Qinhu Zhang
- Translational Medical Center for Stem Cell Therapy and Institute for Regenerative Medicine, Shanghai East Hospital, Bioinformatics Department, School of Life Sciences and Technology, Tongji University, Shanghai 200092, China
| | - Pengrui Teng
- School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
| | - Siguo Wang
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
| | - Ying He
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
| | - Zhen Cui
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
| | - Zhenghao Guo
- Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai 201804, China
| | - Yixin Liu
- School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
| | - Changan Yuan
- Big Data and Intelligent Computing Research Center, Guangxi Academy of Science, Nanning 530007, China
| | - Qi Liu
- To whom correspondence should be addressed.
| | | |
|
30
|
Zhang Y, Blanchette M. Reference panel guided topological structure annotation of Hi-C data. Nat Commun 2022; 13:7426. [PMID: 36460680 DOI: 10.1038/s41467-022-35231-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/10/2022] [Accepted: 11/22/2022] [Indexed: 12/03/2022] Open
Abstract
Accurately annotating topological structures (e.g., loops and topologically associating domains) from Hi-C data is critical for understanding the role of 3D genome organization in gene regulation. This is a challenging task, especially at high resolution, in part due to the limited sequencing coverage of Hi-C data. Current approaches focus on the analysis of individual Hi-C data sets of interest, without taking advantage of the facts that (i) several hundred Hi-C contact maps are publicly available, and (ii) the vast majority of topological structures are conserved across multiple cell types. Here, we present RefHiC, an attention-based deep learning framework that uses a reference panel of Hi-C datasets to facilitate topological structure annotation from a given study sample. We compare RefHiC against tools that do not use reference samples and find that RefHiC outperforms other programs at both topologically associating domain and loop annotation across different cell types, species, and sequencing depths.
|
31
|
Xu J, Song F, Lyu H, Kobayashi M, Zhang B, Zhao Z, Hou Y, Wang X, Luan Y, Jia B, Stasiak L, Wong JHY, Wang Q, Jin Q, Jin Q, Fu Y, Yang H, Hardison RC, Dovat S, Platanias LC, Diao Y, Yang Y, Yamada T, Viny AD, Levine RL, Claxton D, Broach JR, Zheng H, Yue F. Subtype-specific 3D genome alteration in acute myeloid leukaemia. Nature 2022; 611:387-398. [PMID: 36289338 PMCID: PMC10060167 DOI: 10.1038/s41586-022-05365-x] [Citation(s) in RCA: 38] [Impact Index Per Article: 19.0] [Received: 07/10/2020] [Accepted: 09/20/2022] [Indexed: 11/09/2022]
Abstract
Acute myeloid leukaemia (AML) represents a set of heterogeneous myeloid malignancies, and hallmarks include mutations in epigenetic modifiers, transcription factors and kinases (refs. 1-5). The extent to which mutations in AML drive alterations in chromatin 3D structure and contribute to myeloid transformation is unclear. Here we use Hi-C and whole-genome sequencing to analyse 25 samples from patients with AML and 7 samples from healthy donors. Recurrent and subtype-specific alterations in A/B compartments, topologically associating domains and chromatin loops were identified. RNA sequencing, ATAC with sequencing and CUT&Tag for CTCF, H3K27ac and H3K27me3 in the same AML samples also revealed extensive and recurrent AML-specific promoter-enhancer and promoter-silencer loops. We validated the role of repressive loops on their target genes by CRISPR deletion and interference. Structural variation-induced enhancer-hijacking and silencer-hijacking events were further identified in AML samples. Hijacked enhancers play a part in AML cell growth, as demonstrated by CRISPR screening, whereas hijacked silencers have a downregulating role, as evidenced by CRISPR-interference-mediated de-repression. Finally, whole-genome bisulfite sequencing of 20 AML and normal samples revealed the delicate relationship between DNA methylation, CTCF binding and 3D genome structure. Treatment of AML cells with a DNA hypomethylating agent and triple knockdown of DNMT1, DNMT3A and DNMT3B enabled the manipulation of DNA methylation to revert 3D genome organization and gene expression. Overall, this study provides a resource for leukaemia studies and highlights the role of repressive loops and hijacked cis elements in human diseases.
Affiliation(s)
- Jie Xu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Penn State University, Hershey, PA, USA
| | - Fan Song
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Bioinformatics and Genomics Graduate Program, Huck Institutes of Life Sciences, Penn State University, State College, PA, USA
| | - Huijue Lyu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Mikoto Kobayashi
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Baozhen Zhang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Key Laboratory of Carcinogenesis and Translational Research (Ministry of Education/Beijing), Division of Etiology, Peking University Cancer Hospital and Institute, Beijing, China
| | - Ziyu Zhao
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
| | - Ye Hou
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Xiaotao Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Yu Luan
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Bei Jia
- Department of Medicine, Division of Hematology and Oncology, Penn State Cancer Institute, Penn State University, Hershey, PA, USA
| | - Lena Stasiak
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Josiah Hiu-Yuen Wong
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Qixuan Wang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Qi Jin
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Qiushi Jin
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Yihao Fu
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Hongbo Yang
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
| | - Ross C Hardison
- Department of Biochemistry and Molecular Biology, Huck Institutes of Life Sciences, Penn State University, State College, PA, USA
| | - Sinisa Dovat
- Department of Medicine, Division of Hematology and Oncology, Penn State Cancer Institute, Penn State University, Hershey, PA, USA
| | - Leonidas C Platanias
- Robert H. Lurie Comprehensive Cancer Center, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA
- Department of Medicine, Jesse Brown Veterans Affairs Medical Center, Chicago, IL, USA
| | - Yarui Diao
- Department of Cell Biology, Duke University School of Medicine, Durham, NC, USA
| | - Yue Yang
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
| | - Tomoko Yamada
- Department of Neurobiology, Northwestern University, Evanston, IL, USA
| | - Aaron D Viny
- Division of Hematology/Oncology and Columbia Stem Cell Initiative, Columbia University Irving Medical Center, New York, NY, USA
| | - Ross L Levine
- Memorial Sloan Kettering Cancer Center, New York, NY, USA
| | - David Claxton
- Department of Medicine, Division of Hematology and Oncology, Penn State Cancer Institute, Penn State University, Hershey, PA, USA
| | - James R Broach
- Department of Biochemistry and Molecular Biology, Penn State College of Medicine, Penn State University, Hershey, PA, USA
| | - Hong Zheng
- Department of Medicine, Division of Hematology and Oncology, Penn State Cancer Institute, Penn State University, Hershey, PA, USA.
| | - Feng Yue
- Department of Biochemistry and Molecular Genetics, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
- Robert H. Lurie Comprehensive Cancer Center, Feinberg School of Medicine, Northwestern University, Chicago, IL, USA.
| |
|
32
|
Wang S, Zhang Q, He Y, Cui Z, Guo Z, Han K, Huang DS. DLoopCaller: A deep learning approach for predicting genome-wide chromatin loops by integrating accessible chromatin landscapes. PLoS Comput Biol 2022; 18:e1010572. [PMID: 36206320 DOI: 10.1371/journal.pcbi.1010572] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 05/04/2022] [Revised: 10/19/2022] [Accepted: 09/14/2022] [Indexed: 11/20/2022] Open
Abstract
In recent years, major advances have been made in various chromosome conformation capture technologies to further satisfy the needs of researchers for high-quality, high-resolution contact interactions. Discriminating the loops from genome-wide contact interactions is crucial for dissecting three-dimensional (3D) genome structure and function. Here, we present a deep learning method to predict genome-wide chromatin loops, called DLoopCaller, by combining accessible chromatin landscapes and raw Hi-C contact maps. Available orthogonal data (ChIA-PET/HiChIP and Capture Hi-C) were used to generate positive samples with a wider contact matrix, which makes it possible to find more potential genome-wide chromatin loops. The experimental results demonstrate that DLoopCaller effectively improves the accuracy of predicting genome-wide chromatin loops compared to the state-of-the-art method Peakachu. Moreover, compared to two of the most popular loop callers, HiCCUPS and Fit-Hi-C, DLoopCaller identifies some unique interactions. We conclude that a combination of chromatin landscapes on the one-dimensional genome contributes to understanding the 3D genome organization, and the identified chromatin loops reveal cell-type specificity and transcription factor motif co-enrichment across different cell lines and species.
|
33
|
Zhang P, Wu Y, Zhou H, Zhou B, Zhang H, Wu H. CLNN-loop: a deep learning model to predict CTCF-mediated chromatin loops in the different cell lines and CTCF-binding sites (CBS) pair types. Bioinformatics 2022; 38:4497-4504. [PMID: 35997565 DOI: 10.1093/bioinformatics/btac575] [Citation(s) in RCA: 14] [Impact Index Per Article: 7.0] [Received: 03/09/2022] [Revised: 06/28/2022] [Accepted: 08/22/2022] [Indexed: 12/24/2022] Open
Abstract
MOTIVATION Three-dimensional (3D) genome organization is of vital importance in gene regulation and disease mechanisms. Previous studies have shown that CTCF-mediated chromatin loops are crucial to studying the 3D structure of cells. Although various experimental techniques have been developed to detect chromatin loops, they have been found to be time-consuming and costly. Nowadays, various sequence-based computational methods can capture significant features of 3D genome organization and help predict chromatin loops. However, these methods have low performance and poor generalization ability in predicting chromatin loops. RESULTS Here, we propose a novel deep learning model, called CLNN-loop, to predict chromatin loops in different cell lines and CTCF-binding sites (CBS) pair types by fusing multiple sequence-based features. The analysis of a series of examinations based on the datasets in the previous study shows that CLNN-loop has satisfactory performance and is superior to the existing methods in terms of predicting chromatin loops. In addition, we apply the SHAP framework to interpret the predictions of different models, and find that CTCF motif and sequence conservation are important signs of chromatin loops in different cell lines and CBS pair types. AVAILABILITY AND IMPLEMENTATION The source code of CLNN-loop is freely available at https://github.com/HaoWuLab-Bioinformatics/CLNN-loop and the webserver of CLNN-loop is freely available at http://hwclnn.sdu.edu.cn. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Pengyu Zhang
- School of Software, Shandong University, Jinan, Shandong 250101, China; College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Yingfu Wu
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Haoru Zhou
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Bing Zhou
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Hongming Zhang
- College of Information Engineering, Northwest A&F University, Yangling, Shaanxi 712100, China
| | - Hao Wu
- School of Software, Shandong University, Jinan, Shandong 250101, China
| |
|
34
|
Zhang P, Zhang H, Wu H. iPro-WAEL: a comprehensive and robust framework for identifying promoters in multiple species. Nucleic Acids Res 2022; 50:10278-10289. [PMID: 36161334 PMCID: PMC9561371 DOI: 10.1093/nar/gkac824] [Citation(s) in RCA: 22] [Impact Index Per Article: 11.0] [Received: 07/18/2022] [Revised: 08/24/2022] [Accepted: 09/14/2022] [Indexed: 11/27/2022] Open
Abstract
Promoters are consensus DNA sequences located near the transcription start sites and they play an important role in transcription initiation. Because of their importance in biological processes, identifying promoters is essential for characterizing gene expression. Numerous computational methods have been proposed to predict promoters. However, it is difficult for these methods to achieve satisfactory performance in multiple species. In this study, we propose a novel weighted average ensemble learning model, termed iPro-WAEL, for identifying promoters in multiple species, including Human, Mouse, E. coli, Arabidopsis, B. amyloliquefaciens, B. subtilis and R. capsulatus. Extensive benchmarking experiments illustrate that iPro-WAEL has optimal performance and is superior to the current methods in promoter prediction. The experimental results also demonstrate a satisfactory prediction ability of iPro-WAEL across cell lines, on promoters annotated by other methods, and in distinguishing between promoters and enhancers. Moreover, we identify the most important transcription factor binding site (TFBS) motif in promoter regions to facilitate the study of important motifs in these regions. The source code of iPro-WAEL is freely available at https://github.com/HaoWuLab-Bioinformatics/iPro-WAEL.
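The weighted average ensemble named in this abstract can be illustrated with a minimal sketch. This is not the iPro-WAEL implementation: the function name, the two base models, their scores and the weights are all hypothetical, and the abstract does not say how iPro-WAEL chooses its weights.

```python
from typing import List, Sequence


def weighted_average_ensemble(
    probs: Sequence[Sequence[float]], weights: Sequence[float]
) -> List[float]:
    """Combine per-model promoter probabilities with a normalized weighted average.

    probs[m][i] is the probability that hypothetical base model m assigns to sequence i.
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per model")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    n_items = len(probs[0])
    return [
        sum(w * p[i] for w, p in zip(weights, probs)) / total
        for i in range(n_items)
    ]


# Hypothetical scores from two base models for three candidate sequences.
model_a = [0.9, 0.2, 0.6]
model_b = [0.7, 0.4, 0.1]

# Assumed weights: model_a is trusted three times as much as model_b.
combined = weighted_average_ensemble([model_a, model_b], weights=[3.0, 1.0])

# Call a sequence a promoter when the combined probability reaches 0.5.
labels = [p >= 0.5 for p in combined]
```

Dividing by the weight total keeps the combined score in the 0 to 1 range whatever scale the weights are on, so they can be raw validation scores or already normalized values.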
Affiliation(s)
- Pengyu Zhang
- School of Software, Shandong University, Jinan, 250101, Shandong, China; College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Hongming Zhang
- College of Information Engineering, Northwest A&F University, Yangling, 712100, Shaanxi, China
| | - Hao Wu
- School of Software, Shandong University, Jinan, 250101, Shandong, China
| |
|
35
|
Wang X, Yan J, Ye Z, Zhang Z, Wang S, Hao S, Shen B, Wei G. Reorganization of 3D chromatin architecture in doxorubicin-resistant breast cancer cells. Front Cell Dev Biol 2022; 10:974750. [PMID: 36003143 PMCID: PMC9393755 DOI: 10.3389/fcell.2022.974750] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 06/21/2022] [Accepted: 07/11/2022] [Indexed: 11/16/2022] Open
Abstract
Background: Doxorubicin resistance remains a major therapeutic challenge leading to poor survival prognosis and treatment failure in breast cancer. Although the massive changes that doxorubicin induces in the transcriptional landscape are well known, potential diagnostic or therapeutic targets associated with the reorganization of three-dimensional (3D) chromatin architecture have not yet been systematically investigated. Methods: Here we performed in situ high-throughput chromosome conformation capture (Hi-C) on parental and doxorubicin-resistant MCF7 (MCF7-DR) human breast cancer cells, followed by integrative analysis of Hi-C, ATAC-seq, RNA-seq and TCGA data. Results: The analysis revealed that A/B compartment switching was positively correlated with genome-wide differential gene expression. The genome of MCF7-DR cells was spatially reorganized into smaller topologically associating domains (TADs) and chromatin loops. We also revealed the contribution of increased chromatin accessibility and potential transcription factor families, including CTCF, AP-1 and bHLH, to gained TADs or loops. Intriguingly, we observed two condensed genomic regions (∼20 kb) with decreased chromatin accessibility flanking TAD boundaries, which might play a critical role in the formation or maintenance of TADs. Finally, combining data from TCGA, we identified a number of gained and lost enhancer-promoter interactions and their corresponding differentially expressed genes involved in chromatin organization and breast cancer signaling pathways, including FA2H, FOXA1 and JRKL, which might serve as potential treatment targets for breast cancer. Conclusion: These data uncovered a close connection among 3D genome reorganization, chromatin accessibility and gene transcription, and provide novel insights into the epigenomic mechanisms involved in doxorubicin resistance in breast cancer.
Affiliation(s)
- Xuelong Wang
- Department of General Surgery, Pancreatic Disease Center, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China; Research Institute of Pancreatic Diseases, Shanghai Jiaotong University School of Medicine, Shanghai, China
| | - Jizhou Yan
- Department of Developmental Biology, Institute for Marine Biosystem and Neurosciences, Shanghai Ocean University, Shanghai, China
| | - Zhao Ye
- Department of Endocrinology and Metabolism, Huashan Hospital, Shanghai Medical College, Fudan University, Shanghai, China
| | - Zhiqiang Zhang
- Institute of Translational Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Sheng Wang
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| | - Shuang Hao
- Key Laboratory of Breast Cancer in Shanghai, Department of Breast Surgery, Fudan University Shanghai Cancer Center, Shanghai, China
| | - Baiyong Shen
- Department of General Surgery, Pancreatic Disease Center, Ruijin Hospital, Shanghai Jiaotong University School of Medicine, Shanghai, China; Research Institute of Pancreatic Diseases, Shanghai Jiaotong University School of Medicine, Shanghai, China; Institute of Translational Medicine, Shanghai Jiaotong University, Shanghai, China
| | - Gang Wei
- CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai, China
| |
|
36
|
Abstract
BACKGROUND Chromatin loops are an essential factor in the structural organization of the genome; however, their detection in Hi-C interaction matrices is a challenging and compute-intensive task. The approach presented here, integrated into the HiCExplorer software, is a chromatin loop detection algorithm that applies a strict candidate selection based on continuous negative binomial distributions and performs a Wilcoxon rank-sum test to detect enriched Hi-C interactions. RESULTS HiCExplorer's loop detection has a high detection rate and accuracy. It is the fastest available CPU implementation and utilizes all threads offered by modern multicore platforms. CONCLUSIONS HiCExplorer's method to detect loops by using a continuous negative binomial function combined with the donut approach from HiCCUPS leads to reliable and fast computation of loops. All the loop-calling algorithms investigated provide differing results, which intersect by ~50% at most. The tested in situ Hi-C data contain a large amount of noise; achieving better agreement between loop calling algorithms will require cleaner Hi-C data and therefore future improvements to the experimental methods that generate the data.
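To make the rank-sum step mentioned in this abstract concrete, here is a minimal sketch of a one-sided Wilcoxon rank-sum test comparing a candidate peak neighbourhood against its surrounding donut. It is an illustration under stated assumptions, not HiCExplorer's code: the function name and the contact counts are hypothetical, the p-value uses a normal approximation, and the variance is not corrected for ties.

```python
from statistics import NormalDist


def rank_sum_pvalue(peak, background):
    """One-sided Wilcoxon rank-sum p-value (H1: peak values tend to be larger).

    Normal approximation; tied values receive midranks, but the variance is
    left without a tie correction to keep the sketch short.
    """
    n1, n2 = len(peak), len(background)
    pooled = sorted([(v, 0) for v in peak] + [(v, 1) for v in background])

    # Assign 1-based ranks, averaging over runs of equal values.
    ranks = [0.0] * len(pooled)
    i = 0
    while i < len(pooled):
        j = i
        while j + 1 < len(pooled) and pooled[j + 1][0] == pooled[i][0]:
            j += 1
        midrank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1

    rank_sum_peak = sum(r for r, (_, group) in zip(ranks, pooled) if group == 0)
    u_peak = rank_sum_peak - n1 * (n1 + 1) / 2
    mean_u = n1 * n2 / 2
    sd_u = (n1 * n2 * (n1 + n2 + 1) / 12) ** 0.5
    z = (u_peak - mean_u) / sd_u
    return 1.0 - NormalDist().cdf(z)


# Hypothetical contact counts: a 3x3 window around a candidate pixel and the
# 16 pixels of an assumed donut-shaped neighbourhood around that window.
peak_window = [30, 28, 35, 31, 29, 33, 32, 34, 36]
donut = [10, 12, 9, 11, 13, 8, 14, 10, 12, 11, 9, 13, 10, 12, 11, 14]

p_value = rank_sum_pvalue(peak_window, donut)
is_enriched = p_value < 0.01  # assumed significance threshold
```

A rank-based test is a natural fit here because it compares the two neighbourhoods without assuming a particular distribution for the noisy contact counts.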
Affiliation(s)
- Joachim Wolff
- Corresponding author. Friedrich Miescher Institute for Biomedical Research, Maulbeerstrasse 66, 4058 Basel, Switzerland
| | - Rolf Backofen
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
- Signalling Research Centres CIBSS, University of Freiburg, Schänzlestr. 18, 79104 Freiburg, Germany
| | - Björn Grüning
- Bioinformatics Group, Department of Computer Science, University of Freiburg, Georges-Köhler-Allee 106, 79110 Freiburg, Germany
| |
|
37
|
Alinejad-Rokny H, Ghavami Modegh R, Rabiee HR, Ramezani Sarbandi E, Rezaie N, Tam KT, Forrest ARR. MaxHiC: A robust background correction model to identify biologically relevant chromatin interactions in Hi-C and capture Hi-C experiments. PLoS Comput Biol 2022; 18:e1010241. [PMID: 35749574 PMCID: PMC9262194 DOI: 10.1371/journal.pcbi.1010241] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 08/10/2021] [Revised: 07/07/2022] [Accepted: 05/23/2022] [Indexed: 12/13/2022] Open
Abstract
Hi-C is a genome-wide chromosome conformation capture technology that detects interactions between pairs of genomic regions and exploits higher order chromatin structures. Conceptually, Hi-C data count interaction frequencies between every position in the genome and every other position. Biologically functional interactions are expected to occur more frequently than transient background and artefactual interactions. To identify biologically relevant interactions, several background models that take biases such as distance, GC content and mappability into account have been proposed. Here we introduce MaxHiC, a background correction tool that deals with these complex biases and robustly identifies statistically significant interactions in both Hi-C and capture Hi-C experiments. MaxHiC uses a negative binomial distribution model and a maximum likelihood technique to correct biases in both Hi-C and capture Hi-C libraries. We systematically benchmark MaxHiC against major Hi-C background correction tools including Hi-C significant interaction callers (SIC) and Hi-C loop callers using published Hi-C, capture Hi-C, and Micro-C datasets. Our results demonstrate that 1) interacting regions identified by MaxHiC have significantly greater levels of overlap with known regulatory features (e.g. active chromatin histone marks, CTCF binding sites, DNase sensitivity) and also disease-associated genome-wide association SNPs than those identified by currently existing models, 2) the pairs of interacting regions are more likely to be linked by eQTL pairs and 3) they are more likely to link known regulatory features, including known functional enhancer-promoter pairs validated by CRISPRi, than those from any of the existing methods. We also demonstrate that interactions between different genomic region types have distinct distance distributions only revealed by MaxHiC. MaxHiC is publicly available as a Python package for the analysis of Hi-C, capture Hi-C and Micro-C data.
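The idea of scoring an interaction against a negative binomial background can be sketched as follows. This is only an illustration, not MaxHiC's model or API: the function names are hypothetical, and the expected count `mu` and dispersion `r` are assumed to be given, whereas the abstract says MaxHiC estimates its model by maximum likelihood.

```python
import math


def nb_pmf(k: int, mu: float, r: float) -> float:
    """Negative binomial probability of count k, with mean mu and dispersion r."""
    log_p = (
        math.lgamma(k + r) - math.lgamma(k + 1) - math.lgamma(r)
        + r * math.log(r / (r + mu))
        + k * math.log(mu / (r + mu))
    )
    return math.exp(log_p)


def nb_upper_tail(observed: int, mu: float, r: float) -> float:
    """P(X >= observed) under the background model, used as a p-value."""
    below = sum(nb_pmf(k, mu, r) for k in range(observed))
    return max(0.0, 1.0 - below)


# Assumed background for one bin pair: 5 expected contacts, dispersion 2.
expected_contacts = 5.0
dispersion = 2.0

p_near = nb_upper_tail(6, expected_contacts, dispersion)   # close to the expectation
p_far = nb_upper_tail(30, expected_contacts, dispersion)   # far above the expectation
```

In this parameterization the variance is `mu + mu**2 / r`, so a smaller `r` models noisier counts and makes the same observed count less surprising.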
Affiliation(s)
- Hamid Alinejad-Rokny
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Australia
- Bio Medical Machine Learning Lab (BML), The Graduate School of Biomedical Engineering, UNSW Sydney, Sydney, Australia
- Health Data Analytics Program, AI-enabled Processes (AIP) Research Centre, Macquarie University, Sydney, Australia
- * Corresponding author
| | - Rassa Ghavami Modegh
- Bioinformatics and Computational Biology Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Hamid R. Rabiee
- Bioinformatics and Computational Biology Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Ehsan Ramezani Sarbandi
- Bioinformatics and Computational Biology Lab, Department of Computer Engineering, Sharif University of Technology, Tehran, Iran
| | - Narges Rezaie
- Center for Complex Biological Systems, University of California Irvine, Irvine, California, United States of America
| | - Kin Tung Tam
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Australia
| | - Alistair R. R. Forrest
- Harry Perkins Institute of Medical Research, QEII Medical Centre and Centre for Medical Research, The University of Western Australia, Perth, Australia
- * Corresponding author
| |
|
38
|
Liu E, Lyu H, Peng Q, Liu Y, Wang T, Han J. TADfit is a multivariate linear regression model for profiling hierarchical chromatin domains on replicate Hi-C data. Commun Biol 2022; 5:608. [PMID: 35725901 PMCID: PMC9209495 DOI: 10.1038/s42003-022-03546-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/10/2021] [Accepted: 05/31/2022] [Indexed: 11/26/2022] Open
Abstract
Topologically associating domains (TADs) are fundamental building blocks of the three-dimensional genome and are organized into complex hierarchies. Identifying hierarchical TADs from Hi-C data helps to understand the relationship between genome architecture and gene regulation. Herein we propose TADfit, a multivariate linear regression model for profiling hierarchical chromatin domains. It fits the interaction frequencies in a Hi-C contact matrix, with or without replicates, using all possible hierarchical TADs; the significant ones are determined from the regression coefficients, which are obtained with an online learning solver called Follow-The-Regularized-Leader (FTRL). Unlike existing methods, TADfit can handle multiple contact matrix replicates and find partially overlapping TADs on them, which helps to recover the comprehensive set of underlying TADs across replicates from different experiments. Comparative results indicate that TADfit has better accuracy and reproducibility, and that the hierarchical TADs it calls are biologically relevant. TADfit is a computational method that can identify hierarchical or partially overlapping TADs from Hi-C data, in part by using information from multiple replicates to improve detection power.
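The regression idea in this abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes that every contiguous bin interval is a candidate domain contributing an indicator feature to each bin pair it contains, it constrains coefficients to be non-negative, and it swaps the FTRL solver for plain projected gradient descent. All function names, the toy matrices, and the 0.5 threshold are hypothetical.

```python
from itertools import combinations_with_replacement


def candidate_domains(n_bins):
    """Every contiguous bin interval (start, end), inclusive, is a candidate domain."""
    return [(s, e) for s in range(n_bins) for e in range(s, n_bins)]


def build_problem(replicates):
    """Stack the upper triangle of each replicate into one regression problem.

    Feature k of bin pair (i, j) is 1.0 when both bins lie inside candidate
    domain k, so nested domains add up along the hierarchy (an assumption).
    """
    n_bins = len(replicates[0])
    domains = candidate_domains(n_bins)
    rows, targets = [], []
    for matrix in replicates:
        for i, j in combinations_with_replacement(range(n_bins), 2):
            rows.append([1.0 if s <= i and j <= e else 0.0 for s, e in domains])
            targets.append(matrix[i][j])
    return domains, rows, targets


def fit_nonnegative(rows, targets, n_iter=4000):
    """Least squares with coefficients >= 0 via projected gradient descent.

    This stands in for the FTRL solver named in the abstract.
    """
    n_features = len(rows[0])
    # 1 / (sum of squared entries) never exceeds 1 / Lipschitz constant.
    step = 1.0 / sum(x * x for row in rows for x in row)
    weights = [0.0] * n_features
    for _ in range(n_iter):
        grad = [0.0] * n_features
        for row, y in zip(rows, targets):
            residual = sum(w * x for w, x in zip(weights, row)) - y
            for k, x in enumerate(row):
                if x:
                    grad[k] += residual * x
        weights = [max(0.0, w - step * g) for w, g in zip(weights, grad)]
    return weights


def call_domains(replicates, threshold=0.5):
    """Return {domain: coefficient} for coefficients above a hypothetical threshold."""
    domains, rows, targets = build_problem(replicates)
    weights = fit_nonnegative(rows, targets)
    return {d: w for d, w in zip(domains, weights) if w > threshold}


# Toy data: an outer domain (bins 0-3, weight 1) containing two nested domains
# (bins 0-1, weight 2; bins 2-3, weight 3), with opposite noise in two replicates.
rep_a = [[3, 3, 1.2, 1], [3, 3, 1, 1], [1.2, 1, 4, 4], [1, 1, 4, 4]]
rep_b = [[3, 3, 0.8, 1], [3, 3, 1, 1], [0.8, 1, 4, 4], [1, 1, 4, 4]]
called = call_domains([rep_a, rep_b])
print(sorted(called))
```

Stacking both replicates into one problem means the fit is driven by their shared signal, which is the sense in which replicates are used jointly in this sketch.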
Affiliation(s)
- Erhu Liu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Shaanxi, 710049, China
- Hongqiang Lyu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Shaanxi, 710049, China
- Guangdong Artificial Intelligence and Digital Economy Laboratory, Guangdong, 510335, China
- Qinke Peng
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Shaanxi, 710049, China
- Yuan Liu
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Shaanxi, 710049, China
- Tian Wang
- Institute of Artificial Intelligence, Beihang University, Beijing, 100191, China
- Jiuqiang Han
- School of Automation Science and Engineering, Faculty of Electronic and Information Engineering, Xi'an Jiaotong University, Shaanxi, 710049, China
- Guangdong Artificial Intelligence and Digital Economy Laboratory, Guangdong, 510335, China
39
Yi X, Luo M, Feng X, Zhou Y, Wang J, Li MJ. 3DCoop: An approach for computational inference of cell-type-specific transcriptional regulators cooperation in 3D chromatin. STAR Protoc 2022; 3:101382. [PMID: 35600920 PMCID: PMC9114683 DOI: 10.1016/j.xpro.2022.101382] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
Abstract
Precise identification of context-specific transcriptional regulator (TR) cooperation facilitates the understanding of complex gene regulation. However, previous methods are highly reliant on the availability of ChIPped TRs. Here, we provide a protocol for running 3DCoop, a pipeline for computational inference of cell type-specific TR cooperation in 3D chromatin by integrating TR motifs, open chromatin profiles, gene expression, and chromatin loops. 3DCoop provides a feasible solution to study the potential interplay among TRs across multiple human or mouse tissue/cell types. For complete details on the use and execution of this protocol, please refer to Yi et al. (2021).
Highlights
- Inference of transcriptional regulator (TR) cooperation in 3D chromatin
- Integration of TR motifs, open chromatin, gene expression, and chromatin loops
- Identification of context-specific TR cooperation not relying on ChIPped factors
- 3DCoop facilitates TR cooperation study across multiple human/mouse cell types
Affiliation(s)
- Xianfu Yi
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, China
- Corresponding author
- Menghan Luo
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xiangling Feng
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Yao Zhou
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Jianhua Wang
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Mulin Jun Li
- Department of Bioinformatics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
- Corresponding author
40
Deshpande AS, Ulahannan N, Pendleton M, Dai X, Ly L, Behr JM, Schwenk S, Liao W, Augello MA, Tyer C, Rughani P, Kudman S, Tian H, Otis HG, Adney E, Wilkes D, Mosquera JM, Barbieri CE, Melnick A, Stoddart D, Turner DJ, Juul S, Harrington E, Imieliński M. Identifying synergistic high-order 3D chromatin conformations from genome-scale nanopore concatemer sequencing. Nat Biotechnol 2022; 40:1488-1499. [DOI: 10.1038/s41587-022-01289-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 04/12/2020] [Accepted: 03/16/2022] [Indexed: 12/28/2022]
41
Abstract
Chromatin has distinct three-dimensional (3D) architectures important in key biological processes, such as cell cycle, replication, differentiation, and transcription regulation. In turn, aberrant 3D structures play a vital role in developing abnormalities and diseases such as cancer. This review discusses key 3D chromatin structures (topologically associating domain, lamina-associated domain, and enhancer-promoter interactions) and corresponding structural protein elements mediating 3D chromatin interactions [CCCTC-binding factor, polycomb group protein, cohesin, and Brother of the Regulator of Imprinted Sites (BORIS) protein] with a highlight of their associations with cancer. We also summarise the recent development of technologies and bioinformatics approaches to study the 3D chromatin interactions in gene expression regulation, including crosslinking and proximity ligation methods in the bulk cell population (ChIA-PET and HiChIP) or single-molecule resolution (ChIA-drop), and methods other than proximity ligation, such as GAM, SPRITE, and super-resolution microscopy techniques.
Affiliation(s)
- Siwei Deng
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
- Yuliang Feng
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK
- Siim Pauklin
- Nuffield Department of Orthopaedics, Rheumatology and Musculoskeletal Sciences, Botnar Research Centre, University of Oxford, Old Road, Headington, Oxford, OX3 7LD, UK.
42
Galan S, Serra F, Marti-Renom MA. Identification of chromatin loops from Hi-C interaction matrices by CTCF-CTCF topology classification. NAR Genom Bioinform 2022; 4:lqac021. [PMID: 35274099 PMCID: PMC8903010 DOI: 10.1093/nargab/lqac021] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 10/26/2021] [Revised: 02/01/2022] [Accepted: 02/23/2022] [Indexed: 12/28/2022]
Abstract
Genome-wide profiling of long-range interactions has revealed that the CCCTC-binding factor (CTCF) often anchors chromatin loops and is enriched at boundaries of the so-called topologically associating domains, which suggests that CTCF is essential in the 3D organization of chromatin. However, the systematic topological classification of pairwise CTCF-CTCF interactions has not yet been explored. Here, we developed a computational pipeline that classifies all CTCF-CTCF pairs according to their chromatin interactions from Hi-C experiments. The interaction profiles of all CTCF-CTCF pairs were further structurally clustered using self-organizing feature maps, and their functionality was characterized by their epigenetic states. The resulting clusters were then input to a convolutional neural network aimed at de novo detection of chromatin loops from Hi-C interaction matrices. Our new method, called LOOPbit, automatically detects significant interactions with a higher proportion of enhancer-promoter loops compared to other callers. Our highly specific loop caller adds a new layer of detail to the link between chromatin structure and function.
Affiliation(s)
- Silvia Galan
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- François Serra
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
- Marc A Marti-Renom
- CNAG-CRG, Centre for Genomic Regulation (CRG), Barcelona Institute of Science and Technology (BIST), Baldiri i Reixac 4, 08028 Barcelona, Spain
43
Shen Y, Zhong Q, Liu T, Wen Z, Shen W, Li L. CharID: a two-step model for universal prediction of interactions between chromatin accessible regions. Brief Bioinform 2022; 23:6514800. [PMID: 35077535 DOI: 10.1093/bib/bbab602] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/25/2021] [Revised: 12/23/2021] [Accepted: 12/24/2021] [Indexed: 11/14/2022]
Abstract
Open chromatin regions (OCRs) allow direct interaction between cis-regulatory elements and trans-acting factors. Therefore, predicting all potential OCR-mediated loops is essential for deciphering the regulatory mechanisms of gene expression. However, existing loop prediction tools are restricted to specific anchor types. Here, we present CharID (Chromatin Accessible Region Interaction Detector), a two-step model that combines a neural network and ensemble learning to predict OCR-mediated loops. In the first step, CharID-Anchor, an attention-based hybrid CNN-BiGRU network, is constructed to discriminate between anchor and nonanchor OCRs. In the second step, CharID-Loop uses a gradient boosting decision tree with a chromosome-split strategy to predict the interactions between anchor OCRs. The performance was assessed in three human cell lines, and CharID showed superior prediction performance compared with other algorithms. In contrast to methods designed to predict a particular type of loop, CharID can detect a variety of chromatin loops, not limited to enhancer-promoter loops or architectural protein-mediated loops. We constructed the OCR-mediated interaction network using the predicted loops and identified hub anchors, which are highlighted by their proximity to housekeeping genes. By analyzing loops containing SNPs associated with cardiovascular disease, we identified an SNP-gene loop indicating the regulation mechanism of GFOD1. Taken together, CharID universally predicts diverse chromatin loops beyond other state-of-the-art methods, which are limited by anchor types, and experimental techniques, which are limited by sensitivities that decay drastically with the genomic distance between anchors. Finally, we hosted Peaksniffer, a user-friendly web server that provides online prediction, query and visualization of OCRs and associated loops.
Affiliation(s)
- Yin Shen
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- Quan Zhong
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- Tian Liu
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- Zi Wen
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- Wei Shen
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- Li Li
- Hubei Key Laboratory of Agricultural Bioinformatics, College of Informatics, Huazhong Agricultural University, Wuhan, 430070, P. R. China
- 3D Genomics Research Center, Huazhong Agricultural University, Wuhan, 430070, P. R. China
44
Fino J, Marques B, Dong Z, David D. SVInterpreter: A Comprehensive Topologically Associated Domain-Based Clinical Outcome Prediction Tool for Balanced and Unbalanced Structural Variants. Front Genet 2021; 12:757170. [PMID: 34925449 PMCID: PMC8671832 DOI: 10.3389/fgene.2021.757170] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 08/11/2021] [Accepted: 10/12/2021] [Indexed: 11/13/2022]
Abstract
With the advent of genomic sequencing, a number of balanced and unbalanced structural variants (SVs) can be detected per individual. Mainly due to the incompleteness and scattered nature of the available annotation data for the human genome, manual interpretation of an SV's clinical significance is laborious and cumbersome. Since bioinformatic tools developed for this task are limited, a comprehensive tool to assist clinical outcome prediction of SVs is warranted. Herein, we present SVInterpreter, a free Web application that analyzes both balanced and unbalanced SVs using topologically associated domains (TADs) as genome units. Among other data, gene-associated data (such as function and dosage sensitivity), phenotype similarity scores, and copy number variant (CNV) scoring metrics are retrieved for an informed SV interpretation. For evaluation, we retrospectively applied SVInterpreter to 97 balanced (translocations and inversions) and 125 unbalanced (deletions, duplications, and insertions) previously published SVs, and 145 SVs identified from 20 clinical samples. Our results showed the ability of SVInterpreter to support the evaluation of SVs by (1) confirming more than half of the predictions of the original studies, (2) reducing the number of variants of uncertain significance by 40%, and (3) indicating several potential position effect events. To our knowledge, SVInterpreter is the most comprehensive TAD-based tool to identify the possible disease-causing candidate genes and to assist prediction of the clinical outcome of SVs. SVInterpreter is available at http://dgrctools-insa.min-saude.pt/cgi-bin/SVInterpreter.py.
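Using TADs as genome units, as this abstract describes, amounts to an interval-overlap lookup: find the TADs an SV touches, then collect the genes inside those TADs. The sketch below illustrates that idea only. It is not SVInterpreter's code, and the `Interval` type, the function names, and all coordinates are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interval:
    """Half-open genomic interval [start, end) on one chromosome."""
    chrom: str
    start: int
    end: int
    name: str = ""


def overlaps(a, b):
    """True when two intervals share at least one base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def affected_tads(sv, tads):
    """TADs touched by an SV region or by a single breakpoint."""
    return [tad for tad in tads if overlaps(sv, tad)]


def candidate_genes(sv, tads, genes):
    """Names of genes located in any TAD that the SV touches."""
    hit = affected_tads(sv, tads)
    return sorted({g.name for g in genes for tad in hit if overlaps(g, tad)})


# Hypothetical toy annotation, not real genome coordinates.
tads = [
    Interval("chr1", 0, 1000, "TAD1"),
    Interval("chr1", 1000, 2500, "TAD2"),
    Interval("chr1", 2500, 4000, "TAD3"),
]
genes = [
    Interval("chr1", 100, 200, "A"),
    Interval("chr1", 1200, 1400, "B"),
    Interval("chr1", 2000, 2300, "C"),
    Interval("chr1", 3000, 3100, "D"),
]

deletion = Interval("chr1", 900, 1100, "DEL")        # unbalanced SV spanning a TAD boundary
breakpoint_ = Interval("chr1", 2600, 2601, "INV_bp")  # balanced SV modeled as a 1-bp breakpoint

print(candidate_genes(deletion, tads, genes))
print(candidate_genes(breakpoint_, tads, genes))
```

A deletion that crosses a TAD boundary pulls in genes from both neighboring TADs, which is why a TAD-based search can flag position-effect candidates that the SV does not physically disrupt.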
Affiliation(s)
- Joana Fino
- Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- Bárbara Marques
- Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
- Zirui Dong
- Department of Obstetrics and Gynaecology, The Chinese University of Hong Kong, Hong Kong, China
- Shenzhen Research Institute, The Chinese University of Hong Kong, Shenzhen, China
- Hong Kong Hub of Pediatric Excellence, The Chinese University of Hong Kong, Hong Kong, China
- Dezső David
- Department of Human Genetics, National Health Institute Doutor Ricardo Jorge, Lisbon, Portugal
45
Yi X, Zheng Z, Xu H, Zhou Y, Huang D, Wang J, Feng X, Zhao K, Fan X, Zhang S, Dong X, Wang Z, Shen Y, Cheng H, Shi L, Li MJ. Interrogating cell type-specific cooperation of transcriptional regulators in 3D chromatin. iScience 2021; 24:103468. [PMID: 34888502 PMCID: PMC8634045 DOI: 10.1016/j.isci.2021.103468] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/02/2021] [Revised: 09/23/2021] [Accepted: 11/12/2021] [Indexed: 12/14/2022]
Abstract
Context-specific activities of transcription regulators (TRs) in the nucleus modulate spatiotemporal gene expression precisely. Using the largest ChIP-seq data and chromatin loops in the human K562 cell line, we initially interrogated TR cooperation in 3D chromatin via a graphical model and revealed many known and novel TRs manipulating context-specific pathways. To explore TR cooperation across broad tissue/cell types, we systematically leveraged large-scale open chromatin profiles, computational footprinting, and high-resolution chromatin interactions to investigate tissue/cell type-specific TR cooperation. We first delineated a landscape of TR cooperation across 40 human tissue/cell types. Network modularity analyses uncovered the commonality and specificity of TR cooperation in different conditions. We also demonstrated that TR cooperation information can better interpret the disease-causal variants identified by genome-wide association studies and recapitulate cell states during neural development. Our study characterizes shared and unique patterns of TR cooperation associated with the cell type specificity of gene regulation in 3D chromatin.
Highlights
- Computational inference of transcriptional regulator (TR) cooperation in 3D chromatin
- A landscape of 3D TR cooperation across 40 human tissue/cell types
- TR cooperation can better interpret the disease-causal variants identified by GWAS
- Cooperation of certain TRs shapes context-specific gene regulation in cell development
Affiliation(s)
- Xianfu Yi
- School of Biomedical Engineering and Technology, Tianjin Medical University, Tianjin 300070, China
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Zhanye Zheng
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Hang Xu
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Yao Zhou
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Dandan Huang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Jianhua Wang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xiangling Feng
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Ke Zhao
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xutong Fan
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Shijie Zhang
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Xiaobao Dong
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Genetics, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Zhao Wang
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Yujun Shen
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Hui Cheng
- State Key Laboratory of Experimental Hematology, Chinese Academy of Medical Sciences, Tianjin 300070, China
- Lei Shi
- Department of Biochemistry and Molecular Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Mulin Jun Li
- Department of Bioinformatics, The Province and Ministry Co-sponsored Collaborative Innovation Center for Medical Epigenetics, Tianjin Medical University, Tianjin 300070, China
- Department of Pharmacology, Tianjin Key Laboratory of Inflammation Biology, School of Basic Medical Sciences, Tianjin Medical University, Tianjin 300070, China
- Department of Epidemiology and Biostatistics, Tianjin Key Laboratory of Molecular Cancer Epidemiology, National Clinical Research Center for Cancer, Tianjin Medical University Cancer Institute and Hospital, Tianjin Medical University, Tianjin 300070, China
46
Liu Y, Li H, Czajkowsky DM, Shao Z. Monocytic THP-1 cells diverge significantly from their primary counterparts: a comparative examination of the chromosomal conformations and transcriptomes. Hereditas 2021; 158:43. [PMID: 34740370 PMCID: PMC8569982 DOI: 10.1186/s41065-021-00205-w] [Citation(s) in RCA: 6] [Impact Index Per Article: 2.0] [Received: 04/30/2021] [Accepted: 10/11/2021] [Indexed: 11/25/2022]
Abstract
Immortalized cell lines have long been used as model systems to systematically investigate biological processes under controlled and reproducible conditions, providing insights that have greatly advanced cellular biology and medical sciences. Recently, the widely used monocytic leukemia cell line, THP-1, was comprehensively examined to understand mechanistic relationships between the 3D chromatin structure and transcription during the trans-differentiation of monocytes to macrophages. To corroborate these observations in primary cells, we analyze in situ Hi-C and RNA-seq data of human primary monocytes and their differentiated macrophages in comparison to that obtained from the monocytic/macrophagic THP-1 cells. Surprisingly, we find significant differences between the primary cells and the THP-1 cells at all levels of chromatin structure, from loops to topologically associated domains to compartments. Importantly, the compartment-level differences correlate significantly with transcription: those genes that are in A-compartments in the primary cells but are in B-compartments in the THP-1 cells exhibit a higher level of expression in the primary cells than in the THP-1 cells, and vice versa. Overall, the genes in these different compartments are enriched for a wide range of pathways, and, at least in the case of the monocytic cells, their altered expression in certain pathways in the THP-1 cells argues for a less immune cell-like phenotype, suggesting that immortalization or prolonged culturing of THP-1 caused a divergence of these cells from their primary counterparts. It is thus essential to reexamine phenotypic details observed in cell lines with their primary counterparts so as to ensure a proper understanding of functional cell states in vivo.
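The compartment-versus-expression comparison described in this abstract can be sketched as follows: group genes by how their A/B compartment label differs between the two cell types, then compare expression within each group. This is an illustrative sketch rather than the authors' analysis. The gene names, compartment calls, expression values, and the pseudocount are all hypothetical.

```python
import math
from statistics import mean


def switched_genes(comp_primary, comp_line):
    """Group genes whose compartment differs between primary cells and the cell line."""
    groups = {"A_to_B": [], "B_to_A": []}
    for gene in sorted(comp_primary.keys() & comp_line.keys()):
        pair = (comp_primary[gene], comp_line[gene])
        if pair == ("A", "B"):
            groups["A_to_B"].append(gene)   # A in primary cells, B in the cell line
        elif pair == ("B", "A"):
            groups["B_to_A"].append(gene)   # B in primary cells, A in the cell line
    return groups


def mean_log2_ratio(gene_list, expr_primary, expr_line, pseudocount=1.0):
    """Mean log2(primary / cell line) expression; positive means higher in primary cells."""
    return mean(
        math.log2((expr_primary[g] + pseudocount) / (expr_line[g] + pseudocount))
        for g in gene_list
    )


# Hypothetical toy inputs.
comp_primary = {"g1": "A", "g2": "A", "g3": "B", "g4": "A"}
comp_line = {"g1": "B", "g2": "B", "g3": "A", "g4": "A"}
expr_primary = {"g1": 15.0, "g2": 7.0, "g3": 1.0, "g4": 5.0}
expr_line = {"g1": 3.0, "g2": 1.0, "g3": 7.0, "g4": 5.0}

groups = switched_genes(comp_primary, comp_line)
a_to_b = mean_log2_ratio(groups["A_to_B"], expr_primary, expr_line)
b_to_a = mean_log2_ratio(groups["B_to_A"], expr_primary, expr_line)
print(groups, a_to_b, b_to_a)
```

In the toy data, genes that sit in an A-compartment only in primary cells have a positive mean log2 ratio and the reverse group has a negative one, which is the direction of association the abstract reports.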
Affiliation(s)
- Yulong Liu
- State Key Laboratory for Oncogenes & Related Genes and Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Hua Li
- State Key Laboratory for Oncogenes & Related Genes and Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
- Daniel M Czajkowsky
- State Key Laboratory for Oncogenes & Related Genes and Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China.
- Zhifeng Shao
- State Key Laboratory for Oncogenes & Related Genes and Bio-ID Center, School of Biomedical Engineering, Shanghai Jiao Tong University, Shanghai, 200240, China
47
Xu W, Zhong Q, Lin D, Zuo Y, Dai J, Li G, Cao G. CoolBox: a flexible toolkit for visual analysis of genomics data. BMC Bioinformatics 2021; 22:489. [PMID: 34629071 DOI: 10.1186/s12859-021-04408-w] [Citation(s) in RCA: 14] [Impact Index Per Article: 4.7] [Received: 06/02/2021] [Accepted: 09/27/2021] [Indexed: 01/20/2023]
Abstract
Background: Data visualization, especially of genome track plots, is crucial for genomics researchers to discover patterns in large-scale sequencing datasets. Although existing tools work well for producing a normal view of the input data, they are not convenient when users want to create customized data representations. This gap between visualization and data processing prevents users from uncovering more of the hidden structure of a dataset.
Results: We developed CoolBox, an open-source toolkit for visual analysis of genomics data. This user-friendly toolkit is highly compatible with the Python ecosystem and customizable with a well-designed user interface. It can be used in various visualization situations like a Swiss army knife. For example, it can produce high-quality genome track plots or fetch commonly used genomic data files from a Python script or the command line, and it can explore genomic data interactively within a Jupyter environment or web browser. Moreover, owing to the highly extensible Application Programming Interface design, users can customize their own tracks without difficulty, which greatly facilitates analytical, comparative genomic data visualization tasks.
Conclusions: CoolBox allows users to produce high-quality visualization plots and explore their data in a flexible, programmable and user-friendly way.
48
Ricker E, Manni M, Flores-Castro D, Jenkins D, Gupta S, Rivera-Correa J, Meng W, Rosenfeld AM, Pannellini T, Bachu M, Chinenov Y, Sculco PK, Jessberger R, Prak ETL, Pernis AB. Altered function and differentiation of age-associated B cells contribute to the female bias in lupus mice. Nat Commun 2021; 12:4813. [PMID: 34376664 PMCID: PMC8355159 DOI: 10.1038/s41467-021-25102-8] [Citation(s) in RCA: 39] [Impact Index Per Article: 13.0] [Received: 10/15/2020] [Accepted: 07/15/2021] [Indexed: 12/11/2022]
Abstract
Differences in immune responses to viruses and autoimmune diseases such as systemic lupus erythematosus (SLE) can show sexual dimorphism. Age-associated B cells (ABC) are a population of CD11c+T-bet+ B cells critical for antiviral responses and autoimmune disorders. Absence of DEF6 and SWAP-70, two homologous guanine nucleotide exchange factors, in double-knock-out (DKO) mice leads to a lupus-like syndrome in females marked by accumulation of ABCs. Here we demonstrate that DKO ABCs show sex-specific differences in cell number, upregulation of an ISG signature, and further differentiation. DKO ABCs undergo oligoclonal expansion and differentiate into both CD11c+ and CD11c- effector B cell populations with pathogenic and pro-inflammatory function as demonstrated by BCR sequencing and fate-mapping experiments. Tlr7 duplication in DKO males overrides the sex-bias and further augments the dissemination and pathogenicity of ABCs, resulting in severe pulmonary inflammation and early mortality. Thus, sexual dimorphism shapes the expansion, function and differentiation of ABCs that accompany TLR7-driven immunopathogenesis.
Affiliation(s)
- Edd Ricker
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, NY, USA
- Michela Manni
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Danny Flores-Castro
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Daniel Jenkins
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Sanjay Gupta
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Juan Rivera-Correa
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, NY, USA
- Wenzhao Meng
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, Philadelphia, PA, USA
- Aaron M Rosenfeld
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, Philadelphia, PA, USA
- Tania Pannellini
- Research Division and Precision Medicine Laboratory, Hospital for Special Surgery, New York, NY, USA
- Mahesh Bachu
- Arthritis and Tissue Degeneration Program, Hospital for Special Surgery, New York, NY, USA
| | - Yurii Chinenov
- David Z. Rosensweig Genomics Research Center, Hospital for Special Surgery, New York, NY, USA
| | - Peter K Sculco
- Department of Orthopedic Surgery, Hospital for Special Surgery, New York, NY, USA
| | - Rolf Jessberger
- Institute of Physiological Chemistry, Technische Universitat, Dresden, Germany
| | - Eline T Luning Prak
- Department of Pathology and Laboratory Medicine, Perelman School of Medicine, Philadelphia, PA, USA
| | - Alessandra B Pernis
- Autoimmunity and Inflammation Program, Hospital for Special Surgery, New York, NY, USA.
- Department of Microbiology and Immunology, Weill Cornell Medicine, New York, NY, USA.
- David Z. Rosensweig Genomics Research Center, Hospital for Special Surgery, New York, NY, USA.
- Department of Medicine, Weill Cornell Medicine, New York, NY, USA.
| |
49
Cheng F, Li H, Brooks BW, You J. Signposts for Aquatic Toxicity Evaluation in China: Text Mining using Event-Driven Taxonomy within and among Regions. Environ Sci Technol 2021; 55:8977-8986. [PMID: 34142809 DOI: 10.1021/acs.est.1c00152] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/12/2023]
Abstract
Selection of toxicity endpoints affects outcomes of risk assessment. Scientific decisions based on more holistic evidence are preferable to subjective selections for designing bioassay batteries, particularly when systems are poorly understood. Here, we propose a novel event-driven taxonomy (EDT)-based text mining tool to prioritize stressors likely to elicit water quality deterioration. The tool integrates automated literature collection, natural language processing using adverse outcome pathway-based toxicological terminologies, and machine learning to classify event drivers (EDs). From aquatic toxicity assessments within China over the past decade, we gathered over 14 000 sources of information. With a dictionary that included 1039 toxicological terms, 15 bioassay-related modes of action were mapped, yet fewer than half of the bioassays could be elucidated by available adverse outcome pathways. To fill these mechanistic knowledge gaps, we developed a Naïve Bayesian ED-classifier to annotate apical responses. The classifier reached 74% accuracy in 4-fold cross-validation and labeled 85% of bioassays as 26 EDs. Narcosis, estrogen receptor-, and aryl hydrocarbon receptor-mediators were the major EDs in aquatic systems across China, whereas individual regions had distinct ED fingerprints. The EDT-based tool provides a promising diagnostic strategy to inform region-specific bioassay design and selection for water quality assessments in a big data era.
Affiliation(s)
- Fei Cheng
  - Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China
- Huizhen Li
  - Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China
- Bryan W Brooks
  - Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China
  - Department of Environmental Science, Institute of Biomedical Studies, Center for Reservoir and Aquatic Systems Research, Baylor University, Waco, Texas 76798, United States
- Jing You
  - Guangdong Key Laboratory of Environmental Pollution and Health, School of Environment, Jinan University, Guangzhou 511443, China
50
Wang J, Huang TYT, Hou Y, Bartom E, Lu X, Shilatifard A, Yue F, Saratsis A. Epigenomic landscape and 3D genome structure in pediatric high-grade glioma. Sci Adv 2021; 7:eabg4126. [PMID: 34078608 PMCID: PMC10166578 DOI: 10.1126/sciadv.abg4126] [Citation(s) in RCA: 24] [Impact Index Per Article: 8.0] [Received: 01/04/2021] [Accepted: 04/16/2021] [Indexed: 05/10/2023]
Abstract
Pediatric high-grade gliomas (pHGGs), including glioblastoma multiforme (GBM) and diffuse intrinsic pontine glioma (DIPG), are morbid brain tumors. Even with treatment, survival is poor, making pHGG the number one cause of cancer death in children. Up to 80% of DIPGs harbor a somatic missense mutation (H3K27M) in genes encoding histone H3. To investigate whether H3K27M is associated with distinct chromatin structure that alters transcriptional regulation, we generated the first high-resolution Hi-C maps of pHGG cell lines and tumor tissue. By integrating transcriptome (RNA-seq), enhancer landscape (ChIP-seq), genome structure (Hi-C), and chromatin accessibility (ATAC-seq) datasets from H3K27M and wild-type specimens, we identified tumor-specific enhancers and regulatory networks for known oncogenes. We identified genomic structural variations that lead to potential enhancer hijacking and gene coamplification, including A2M, JAG2, and FLRT1. Together, our results imply that three-dimensional genome alterations may play a critical role in the pHGG epigenetic landscape and contribute to tumorigenesis.
Affiliation(s)
- Juan Wang
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Tina Yi-Ting Huang
  - Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Ye Hou
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Elizabeth Bartom
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Xinyan Lu
  - Department of Pathology, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Ali Shilatifard
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
- Feng Yue
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
  - Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, IL 60611, USA
- Amanda Saratsis
  - Department of Biochemistry and Molecular Genetics, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
  - Department of Neurological Surgery, Northwestern University Feinberg School of Medicine, Chicago, IL 60611, USA
  - Robert H. Lurie Comprehensive Cancer Center of Northwestern University, Chicago, IL 60611, USA
  - Division of Pediatric Neurosurgery, Department of Surgery, Ann & Robert H. Lurie Children's Hospital of Chicago, Chicago, IL 60611, USA