1
Zhao Y, Yang M, Gong F, Pan Y, Hu M, Peng Q, Lu L, Lyu X, Sun K. Accelerating 3D genomics data analysis with Microcket. Commun Biol 2024; 7:675. [PMID: 38824179 PMCID: PMC11144199 DOI: 10.1038/s42003-024-06382-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/27/2024] [Accepted: 05/24/2024] [Indexed: 06/03/2024] Open
Abstract
The three-dimensional (3D) organization of genome is fundamental to cell biology. To explore 3D genome, emerging high-throughput approaches have produced billions of sequencing reads, which is challenging and time-consuming to analyze. Here we present Microcket, a package for mapping and extracting interacting pairs from 3D genomics data, including Hi-C, Micro-C, and derivant protocols. Microcket utilizes a unique read-stitch strategy that takes advantage of the long read cycles in modern DNA sequencers; benchmark evaluations reveal that Microcket runs much faster than the current tools along with improved mapping efficiency, and thus shows high potential in accelerating and enhancing the biological investigations into 3D genome. Microcket is freely available at https://github.com/hellosunking/Microcket.
Affiliation(s)
- Yu Zhao
- Molecular Cancer Research Center, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, 518107, China
- Mengqi Yang
- Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China
- Department of Chemical and Biological Engineering, Division of Life Science, Hong Kong University of Science and Technology, Hong Kong SAR, 999077, China
- Fanglei Gong
- Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China
- Yuqi Pan
- Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China
- Department of Biology, School of Life Sciences, Southern University of Science and Technology, Shenzhen, 518055, China
- Minghui Hu
- Molecular Cancer Research Center, School of Medicine, Shenzhen Campus of Sun Yat-sen University, Sun Yat-sen University, Shenzhen, 518107, China
- Qin Peng
- Institute of Systems and Physical Biology, Shenzhen Bay Laboratory, Shenzhen, 518132, China
- Leina Lu
- Department of Genetics and Genome Sciences, School of Medicine, Case Western Reserve University, Cleveland, OH, 44106, USA
- Xiaowen Lyu
- State Key Laboratory of Cellular Stress Biology, Fujian Provincial Key Laboratory of Reproductive Health Research, Fujian Provincial Key Laboratory of Organ and Tissue Regeneration, School of Medicine, Faculty of Medicine and Life Sciences, Xiamen University, Xiamen, 361102, China
- Kun Sun
- Institute of Cancer Research, Shenzhen Bay Laboratory, Shenzhen, 518132, China.
2
Tapia Del Fierro A, den Hamer B, Benetti N, Jansz N, Chen K, Beck T, Vanyai H, Gurzau AD, Daxinger L, Xue S, Ly TTN, Wanigasuriya I, Iminitoff M, Breslin K, Oey H, Krom YD, van der Hoorn D, Bouwman LF, Johanson TM, Ritchie ME, Gouil QA, Reversade B, Prin F, Mohun T, van der Maarel SM, McGlinn E, Murphy JM, Keniry A, de Greef JC, Blewitt ME. SMCHD1 has separable roles in chromatin architecture and gene silencing that could be targeted in disease. Nat Commun 2023; 14:5466. [PMID: 37749075 PMCID: PMC10519958 DOI: 10.1038/s41467-023-40992-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 3.0] [Received: 05/10/2021] [Accepted: 08/07/2023] [Indexed: 09/27/2023] Open
Abstract
The interplay between 3D chromatin architecture and gene silencing is incompletely understood. Here, we report a novel point mutation in the non-canonical SMC protein SMCHD1 that enhances its silencing capacity at endogenous developmental targets. Moreover, it also results in enhanced silencing at the facioscapulohumeral muscular dystrophy associated macrosatellite-array, D4Z4, resulting in enhanced repression of DUX4 encoded by this repeat. Heightened SMCHD1 silencing perturbs developmental Hox gene activation, causing a homeotic transformation in mice. Paradoxically, the mutant SMCHD1 appears to enhance insulation against other epigenetic regulators, including PRC2 and CTCF, while depleting long range chromatin interactions akin to what is observed in the absence of SMCHD1. These data suggest that SMCHD1's role in long range chromatin interactions is not directly linked to gene silencing or insulating the chromatin, refining the model for how the different levels of SMCHD1-mediated chromatin regulation interact to bring about gene silencing in normal development and disease.
Affiliation(s)
- Andres Tapia Del Fierro
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bianca den Hamer
- Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
- Natalia Benetti
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Natasha Jansz
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Kelan Chen
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Tamara Beck
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- Hannah Vanyai
- Crick Advanced Light Microscopy Facility, The Francis Crick Institute, London, UK
- Alexandra D Gurzau
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Lucia Daxinger
- Queensland Institute of Medical Research, Brisbane, QLD, Australia
- Shifeng Xue
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Institute of Molecular and Cell Biology, A*STAR, Singapore, Singapore
- Thanh Thao Nguyen Ly
- Department of Biological Sciences, National University of Singapore, Singapore, Singapore
- Institute of Molecular and Cell Biology, A*STAR, Singapore, Singapore
- Iromi Wanigasuriya
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Megan Iminitoff
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Kelsey Breslin
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- Harald Oey
- Queensland Institute of Medical Research, Brisbane, QLD, Australia
- Yvonne D Krom
- Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
- Dinja van der Hoorn
- Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
- Linde F Bouwman
- Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
- Timothy M Johanson
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Matthew E Ritchie
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Quentin A Gouil
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Bruno Reversade
- Institute of Molecular and Cell Biology, A*STAR, Singapore, Singapore
- Genome Institute of Singapore, A*STAR, Singapore, Singapore
- Fabrice Prin
- Crick Advanced Light Microscopy Facility, The Francis Crick Institute, London, UK
- Timothy Mohun
- Crick Advanced Light Microscopy Facility, The Francis Crick Institute, London, UK
- Edwina McGlinn
- EMBL Australia, Monash University, Clayton, VIC, Australia
- Australian Regenerative Medicine Institute, Monash University, Clayton, VIC, Australia
- James M Murphy
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Drug Discovery Biology, Monash Institute of Pharmaceutical Sciences, Monash University, Parkville, VIC, Australia
- Andrew Keniry
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia
- Jessica C de Greef
- Department of Human Genetics, Leiden University Medical Center, Leiden, Netherlands
- Marnie E Blewitt
- The Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC, Australia.
- The Department of Medical Biology, University of Melbourne, Melbourne, VIC, Australia.
3
Tao X, Li S, Chen G, Wang J, Xu S. Approaches for Modes of Action Study of Long Non-Coding RNAs: From Single Verification to Genome-Wide Determination. Int J Mol Sci 2023; 24:5562. [PMID: 36982636 PMCID: PMC10054671 DOI: 10.3390/ijms24065562] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 02/14/2023] [Revised: 03/08/2023] [Accepted: 03/10/2023] [Indexed: 03/17/2023] Open
Abstract
Long non-coding RNAs (lncRNAs) are transcripts longer than 200 nucleotides (nt) that are not translated into known functional proteins. This broad definition covers a large collection of transcripts with diverse genomic origins, biogenesis, and modes of action. Thus, it is very important to choose appropriate research methodologies when investigating lncRNAs with biological significance. Multiple reviews to date have summarized the mechanisms of lncRNA biogenesis, their localization, their functions in gene regulation at multiple levels, and also their potential applications. However, little has been reviewed on the leading strategies for lncRNA research. Here, we generalize a basic and systemic mind map for lncRNA research and discuss the mechanisms and the application scenarios of ‘up-to-date’ techniques as applied to molecular function studies of lncRNAs. Taking advantage of documented lncRNA research paradigms as examples, we aim to provide an overview of the developing techniques for elucidating lncRNA interactions with genomic DNA, proteins, and other RNAs. In the end, we propose the future direction and potential technological challenges of lncRNA studies, focusing on techniques and applications.
Affiliation(s)
- Xiaoyuan Tao
- Xianghu Laboratory, Hangzhou 311231, China
- Central Laboratory, State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
- Sujuan Li
- Central Laboratory, State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
- Guang Chen
- Central Laboratory, State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
- Jian Wang
- Central Laboratory, State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
- Shengchun Xu
- Xianghu Laboratory, Hangzhou 311231, China
- Central Laboratory, State Key Laboratory for Managing Biotic and Chemical Threats to the Quality and Safety of Agro-Products, Zhejiang Academy of Agricultural Sciences, Hangzhou 310021, China
4
Richer S, Tian Y, Schoenfelder S, Hurst L, Murrell A, Pisignano G. Widespread allele-specific topological domains in the human genome are not confined to imprinted gene clusters. Genome Biol 2023; 24:40. [PMID: 36869353 PMCID: PMC9983196 DOI: 10.1186/s13059-023-02876-2] [Citation(s) in RCA: 4] [Impact Index Per Article: 4.0] [Received: 05/18/2022] [Accepted: 02/13/2023] [Indexed: 03/05/2023] Open
Abstract
BACKGROUND: There is widespread interest in the three-dimensional chromatin conformation of the genome and its impact on gene expression. However, these studies frequently do not consider parent-of-origin differences, such as genomic imprinting, which result in monoallelic expression. In addition, genome-wide allele-specific chromatin conformation associations have not been extensively explored. There are few accessible bioinformatic workflows for investigating allelic conformation differences and these require pre-phased haplotypes which are not widely available. RESULTS: We developed a bioinformatic pipeline, "HiCFlow," that performs haplotype assembly and visualization of parental chromatin architecture. We benchmarked the pipeline using prototype haplotype phased Hi-C data from GM12878 cells at three disease-associated imprinted gene clusters. Using Region Capture Hi-C and Hi-C data from human cell lines (1-7HB2, IMR-90, and H1-hESCs), we can robustly identify the known stable allele-specific interactions at the IGF2-H19 locus. Other imprinted loci (DLK1 and SNRPN) are more variable and there is no "canonical imprinted 3D structure," but we could detect allele-specific differences in A/B compartmentalization. Genome-wide, when topologically associating domains (TADs) are unbiasedly ranked according to their allele-specific contact frequencies, a set of allele-specific TADs could be defined. These occur in genomic regions of high sequence variation. In addition to imprinted genes, allele-specific TADs are also enriched for allele-specific expressed genes. We find loci that have not previously been identified as allele-specific expressed genes such as the bitter taste receptors (TAS2Rs). CONCLUSIONS: This study highlights the widespread differences in chromatin conformation between heterozygous loci and provides a new framework for understanding allele-specific expressed genes.
Affiliation(s)
- Stephen Richer
- Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Yuan Tian
- Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- UCL Cancer Institute, University College London, Paul O'Gorman Building, London, UK
- Laurence Hurst
- Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK
- Adele Murrell
- Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK.
- Giuseppina Pisignano
- Department of Life Sciences, University of Bath, Claverton Down, Bath, BA2 7AY, UK.
5
Ding T, Zhang H. Novel biological insights revealed from the investigation of multiscale genome architecture. Comput Struct Biotechnol J 2022; 21:312-325. [PMID: 36582436 PMCID: PMC9791078 DOI: 10.1016/j.csbj.2022.12.009] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/12/2022] [Revised: 12/06/2022] [Accepted: 12/06/2022] [Indexed: 12/13/2022] Open
Abstract
Gene expression and cell fate determination require precise and coordinated epigenetic regulation. The complex three-dimensional (3D) genome organization plays a critical role in transcription in myriad biological processes. A wide range of architectural features of the 3D genome, including chromatin loops, topologically associated domains (TADs), chromatin compartments, and phase separation, together regulate the chromatin state and transcriptional activity at multiple levels. With the help of 3D genome informatics, recent biochemistry and imaging approaches based on different strategies have revealed functional interactions among biomacromolecules, even at the single-cell level. Here, we review the occurrence, mechanistic basis, and functional implications of dynamic genome organization, and outline recent experimental and computational approaches for profiling multiscale genome architecture to provide robust tools for studying the 3D genome.
Affiliation(s)
- He Zhang
- Corresponding author at: School of Life Science and Technology, Tongji University, Shanghai 200092, PR China.
6
Garske KM, Comenho C, Pan DZ, Alvarez M, Mohlke K, Laakso M, Pietiläinen KH, Pajukanta P. Long-range chromosomal interactions increase and mark repressed gene expression during adipogenesis. Epigenetics 2022; 17:1849-1862. [PMID: 35746833 PMCID: PMC9665133 DOI: 10.1080/15592294.2022.2088145] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 01/09/2023] Open
Abstract
Obesity perturbs central functions of human adipose tissue, centred on differentiation of preadipocytes to adipocytes, i.e., adipogenesis. The large environmental component of obesity makes it important to elucidate epigenetic regulatory factors impacting adipogenesis. Promoter Capture Hi-C (pCHi-C) has been used to identify chromosomal interactions between promoters and associated regulatory elements. However, long range interactions (LRIs) greater than 1 Mb are often filtered out of pCHi-C datasets, due to technical challenges and their low prevalence. To elucidate the unknown role of LRIs in adipogenesis, we investigated preadipocyte differentiation to adipocytes using pCHi-C and bulk and single nucleus RNA-seq data. We first show that LRIs are reproducible between biological replicates, and they increase >2-fold in frequency across adipogenesis. We further demonstrate that genomic loci containing LRIs are more epigenetically repressed than regions without LRIs, corresponding to lower gene expression in the LRI regions. Accordingly, as preadipocytes differentiate into adipocytes, LRI regions are more likely to contain repressed preadipocyte marker genes; whereas these same LRI regions are depleted of actively expressed adipocyte marker genes. Finally, we show that LRIs can be used to restrict multiple testing of the long-range cis-eQTL analysis to identify variants that regulate genes via LRIs. We exemplify this by identifying a putative long range cis regulatory mechanism at the LYPLAL1/TGFB2 obesity locus. In summary, we identify LRIs that mark repressed regions of the genome, and these interactions increase across adipogenesis, pinpointing developmental regions that need to be repressed in a cell-type specific way for adipogenesis to proceed.
Affiliation(s)
- Kristina M. Garske
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Caroline Comenho
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- David Z. Pan
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA
- Marcus Alvarez
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Karen Mohlke
- Department of Genetics, University of North Carolina, Chapel Hill, NC, USA
- Markku Laakso
- Internal Medicine, Institute of Clinical Medicine, University of Eastern Finland and Kuopio University Hospital, Kuopio, Finland
- Kirsi H. Pietiläinen
- Obesity Research Unit, Research Program for Clinical and Molecular Metabolism, Faculty of Medicine, University of Helsinki, Helsinki, Finland; Obesity Center, Abdominal Center, Helsinki University Hospital and University of Helsinki, Helsinki, Finland
- Päivi Pajukanta
- Department of Human Genetics, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA; Bioinformatics Interdepartmental Program, UCLA, Los Angeles, CA, USA; Institute for Precision Health, David Geffen School of Medicine at UCLA, Los Angeles, CA, USA
- Contact: Päivi Pajukanta, Department of Human Genetics, David Geffen School of Medicine at UCLA
7
Integrating epigenetics and metabolomics to advance treatments for pulmonary arterial hypertension. Biochem Pharmacol 2022; 204:115245. [PMID: 36096239 DOI: 10.1016/j.bcp.2022.115245] [Citation(s) in RCA: 6] [Impact Index Per Article: 3.0] [Received: 07/19/2022] [Revised: 08/25/2022] [Accepted: 09/02/2022] [Indexed: 11/23/2022]
Abstract
Pulmonary arterial hypertension (PAH) is a devastating vascular disease with multiple etiologies. Emerging evidence supports a fundamental role for epigenetic machinery and metabolism in the initiation and progression of PAH. Here, we summarize emerging epigenetic mechanisms that have been identified as contributors to PAH evolution, specifically, DNA methylation, histone modifications, and microRNAs. Furthermore, the interplay between epigenetics with metabolism is explored while new crosstalk targets to be investigated in PAH are proposed that highlight multi-omics strategies including integrated epigenomics and metabolomics. Therapeutic opportunities and challenges associated with epigenetics and metabolomics in PAH are examined, highlighting the role that epigenetics and metabolomics have in facilitating early detection, personalized dietary plans, and advanced drug therapy for PAH.
8
Aljogol D, Thompson IR, Osborne CS, Mifsud B. Comparison of Capture Hi-C Analytical Pipelines. Front Genet 2022; 13:786501. [PMID: 35198004 PMCID: PMC8859814 DOI: 10.3389/fgene.2022.786501] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 09/30/2021] [Accepted: 01/03/2022] [Indexed: 11/13/2022] Open
Abstract
It is now evident that DNA forms an organized nuclear architecture, which is essential to maintain the structural and functional integrity of the genome. Chromatin organization can be systematically studied due to the recent boom in chromosome conformation capture technologies (e.g., 3C and its successors 4C, 5C and Hi-C), which is accompanied by the development of computational pipelines to identify biologically meaningful chromatin contacts in such data. However, not all tools are applicable to all experimental designs and all structural features. Capture Hi-C (CHi-C) is a method that uses an intermediate hybridization step to target and select predefined regions of interest in a Hi-C library, thereby increasing effective sequencing depth for those regions. It allows researchers to investigate fine chromatin structures at high resolution, for instance promoter-enhancer loops, but it introduces additional biases with the capture step, and therefore requires specialized pipelines. Here, we compare multiple analytical pipelines for CHi-C data analysis. We consider the effect of retaining multi-mapping reads and compare the efficiency of different statistical approaches in both identifying reproducible interactions and determining biologically significant interactions. At restriction fragment level resolution, the number of multi-mapping reads that could be rescued was negligible. The number of identified interactions varied widely, depending on the analytical method, indicating large differences in type I and type II error rates. The optimal pipeline depends on the project-specific tolerance level of false positive and false negative chromatin contacts.
Affiliation(s)
- Dina Aljogol
- College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
- I. Richard Thompson
- Qatar Biomedical Research Institute, Hamad Bin Khalifa University, Doha, Qatar
- Cameron S. Osborne
- Department of Medical and Molecular Genetics, King’s College London, London, United Kingdom
- Borbala Mifsud
- College of Health and Life Sciences, Hamad Bin Khalifa University, Doha, Qatar
- William Harvey Research Institute, Queen Mary University of London, London, United Kingdom
- *Correspondence: Borbala Mifsud
9
Orozco G, Schoenfelder S, Walker N, Eyre S, Fraser P. 3D genome organization links non-coding disease-associated variants to genes. Front Cell Dev Biol 2022; 10:995388. [PMID: 36340032 PMCID: PMC9631826 DOI: 10.3389/fcell.2022.995388] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/15/2022] [Accepted: 09/27/2022] [Indexed: 11/13/2022] Open
Abstract
Genome sequencing has revealed over 300 million genetic variations in human populations. Over 90% of variants are single nucleotide polymorphisms (SNPs); the remainder include short deletions or insertions, and small numbers of structural variants. Hundreds of thousands of these variants have been associated with specific phenotypic traits and diseases through genome wide association studies which link significant differences in variant frequencies with specific phenotypes among large groups of individuals. Only 5% of disease-associated SNPs are located in gene coding sequences, with the potential to disrupt gene expression or alter the function of encoded proteins. The remaining 95% of disease-associated SNPs are located in non-coding DNA sequences which make up 98% of the genome. The role of non-coding, disease-associated SNPs, many of which are located at considerable distances from any gene, was at first a mystery until the discovery that gene promoters regularly interact with distal regulatory elements to control gene expression. Disease-associated SNPs are enriched at the millions of gene regulatory elements that are dispersed throughout the non-coding sequences of the genome, suggesting they function as gene regulation variants. Assigning specific regulatory elements to the genes they control is not straightforward since they can be millions of base pairs apart. In this review we describe how understanding 3D genome organization can identify specific interactions between gene promoters and distal regulatory elements and how 3D genomics can link disease-associated SNPs to their target genes. Understanding which gene or genes contribute to a specific disease is the first step in designing rational therapeutic interventions.
Affiliation(s)
- Gisela Orozco
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom; NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
- Stefan Schoenfelder
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom; Epigenetics Programme, The Babraham Institute, Babraham Research Campus, CB22 3AT Cambridge, Cambridge, United Kingdom
- Stephan Eyre
- Centre for Genetics and Genomics Versus Arthritis, Division of Musculoskeletal and Dermatological Sciences, School of Biological Sciences, Faculty of Biology, Medicine and Health, The University of Manchester, Manchester Academic Health Science Centre, Manchester, United Kingdom; NIHR Manchester Biomedical Research Centre, Manchester University Foundation Trust, Manchester, United Kingdom
- Peter Fraser
- Enhanc3D Genomics Ltd., Cambridge, United Kingdom; Department of Biological Science, Florida State University, Tallahassee, FL, United States
10
Ray-Jones H, Spivakov M. Transcriptional enhancers and their communication with gene promoters. Cell Mol Life Sci 2021; 78:6453-6485. [PMID: 34414474 PMCID: PMC8558291 DOI: 10.1007/s00018-021-03903-w] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/12/2021] [Revised: 07/08/2021] [Accepted: 07/19/2021] [Indexed: 12/13/2022]
Abstract
Transcriptional enhancers play a key role in the initiation and maintenance of gene expression programmes, particularly in metazoa. How these elements control their target genes in the right place and time is one of the most pertinent questions in functional genomics, with wide implications for most areas of biology. Here, we synthesise classic and recent evidence on the regulatory logic of enhancers, including the principles of enhancer organisation, factors that facilitate and delimit enhancer-promoter communication, and the joint effects of multiple enhancers. We show how modern approaches building on classic insights have begun to unravel the complexity of enhancer-promoter relationships, paving the way towards a quantitative understanding of gene control.
Affiliation(s)
- Helen Ray-Jones
- MRC London Institute of Medical Sciences, London, W12 0NN, UK
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College, London, W12 0NN, UK
- Mikhail Spivakov
- MRC London Institute of Medical Sciences, London, W12 0NN, UK.
- Institute of Clinical Sciences, Faculty of Medicine, Imperial College, London, W12 0NN, UK.
11
Freire-Pritchett P, Ray-Jones H, Della Rosa M, Eijsbouts CQ, Orchard WR, Wingett SW, Wallace C, Cairns J, Spivakov M, Malysheva V. Detecting chromosomal interactions in Capture Hi-C data with CHiCAGO and companion tools. Nat Protoc 2021; 16:4144-4176. [PMID: 34373652 PMCID: PMC7612634 DOI: 10.1038/s41596-021-00567-5] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 09/10/2020] [Accepted: 04/28/2021] [Indexed: 11/10/2022]
Abstract
Capture Hi-C is widely used to obtain high-resolution profiles of chromosomal interactions involving, at least on one end, regions of interest such as gene promoters. Signal detection in Capture Hi-C data is challenging and cannot be adequately accomplished with tools developed for other chromosome conformation capture methods, including standard Hi-C. Capture Hi-C Analysis of Genomic Organization (CHiCAGO) is a computational pipeline developed specifically for Capture Hi-C analysis. It implements a statistical model accounting for biological and technical background components, as well as bespoke normalization and multiple testing procedures for this data type. Here we provide a step-by-step guide to the CHiCAGO workflow that is aimed at users with basic experience of the command line and R. We also describe more advanced strategies for tuning the key parameters for custom experiments and provide guidance on data preprocessing and downstream analysis using companion tools. In a typical experiment, CHiCAGO takes ~2-3 h to run, although pre- and postprocessing steps may take much longer.
Affiliation(s)
- Helen Ray-Jones
- Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, London, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Monica Della Rosa
- Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, London, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK
- Chris Q Eijsbouts
- Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK; Wellcome Centre for Human Genetics, University of Oxford, Oxford, UK
- Steven W Wingett
- Bioinformatics, The Babraham Institute, Cambridge, UK; Cell Biology Division, MRC Laboratory of Molecular Biology, Cambridge, UK
- Chris Wallace
- Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Cambridge, UK; MRC Biostatistics Unit, Cambridge Biomedical Campus, Cambridge Institute of Public Health, Forvie Site, Robinson Way, Cambridge, UK
- Mikhail Spivakov
- Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, London, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK.
- Valeriya Malysheva
- Functional Gene Control Group, Epigenetics Section, MRC London Institute of Medical Sciences, London, UK; Institute of Clinical Sciences, Faculty of Medicine, Imperial College London, London, UK.