1
Langley AR, Gräf S, Smith JC, Krude T. Genome-wide identification and characterisation of human DNA replication origins by initiation site sequencing (ini-seq). Nucleic Acids Res 2016; 44:10230-10247. [PMID: 27587586 PMCID: PMC5137433 DOI: 10.1093/nar/gkw760] [Citation(s) in RCA: 51] [Impact Index Per Article: 5.7] [Received: 06/21/2016] [Revised: 08/18/2016] [Accepted: 08/20/2016] [Indexed: 12/25/2022] Open
Abstract
Next-generation sequencing has enabled the genome-wide identification of human DNA replication origins. However, different approaches to mapping replication origins, namely (i) sequencing isolated small nascent DNA strands (SNS-seq); (ii) sequencing replication bubbles (bubble-seq) and (iii) sequencing Okazaki fragments (OK-seq), show only limited concordance. To address this controversy, we describe here an independent high-resolution origin mapping technique that we call initiation site sequencing (ini-seq). In this approach, newly replicated DNA is directly labelled with digoxigenin-dUTP near the sites of its initiation in a cell-free system. The labelled DNA is then immunoprecipitated and genomic locations are determined by DNA sequencing. Using this technique we identify >25,000 discrete origin sites at sub-kilobase resolution on the human genome, with high concordance between biological replicates. Most activated origins identified by ini-seq are found at transcriptional start sites and contain G-quadruplex (G4) motifs. They tend to cluster in early-replicating domains, providing a correlation between early replication timing and local density of activated origins. Origins identified by ini-seq show highest concordance with sites identified by SNS-seq, followed by OK-seq and bubble-seq. Furthermore, germline origins identified by positive nucleotide distribution skew jumps overlap with origins identified by ini-seq and OK-seq more frequently and more specifically than do sites identified by either SNS-seq or bubble-seq.
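The concordance comparisons described in this abstract can be illustrated with a minimal sketch, assuming origins are given as half-open `(chrom, start, end)` intervals in a shared coordinate system. The function names and the toy coordinates are hypothetical and are not taken from the paper; the published analysis may define overlap differently.

```python
def overlaps(a, b):
    """True if two half-open (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def overlap_fraction(query, reference):
    """Fraction of query intervals that overlap any reference interval."""
    if not query:
        return 0.0
    hits = sum(1 for q in query if any(overlaps(q, r) for r in reference))
    return hits / len(query)


# Toy, made-up coordinates purely for illustration.
ini_seq = [("chr1", 1000, 1500), ("chr1", 9000, 9400), ("chr2", 500, 900)]
sns_seq = [("chr1", 1400, 2000), ("chr2", 100, 400)]

print(overlap_fraction(ini_seq, sns_seq))
print(overlap_fraction(sns_seq, ini_seq))
```

The measure is asymmetric: the fraction depends on which set is treated as the query, which is one reason concordance figures between methods need to state the direction of comparison.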
Affiliation(s)
- Alexander R Langley
  - Francis Crick Institute, Mill Hill Laboratory, The Ridgeway, London NW7 1AA, UK
- Stefan Gräf
  - Department of Medicine, University of Cambridge, Cambridge CB2 0QQ, UK
  - Department of Haematology, University of Cambridge, Cambridge CB2 0PT, UK
- James C Smith
  - Francis Crick Institute, Mill Hill Laboratory, The Ridgeway, London NW7 1AA, UK
- Torsten Krude
  - Department of Zoology, University of Cambridge, Downing Street, Cambridge CB2 3EJ, UK
2
Petryk N, Kahli M, d'Aubenton-Carafa Y, Jaszczyszyn Y, Shen Y, Silvain M, Thermes C, Chen CL, Hyrien O. Replication landscape of the human genome. Nat Commun 2016; 7:10208. [PMID: 26751768 PMCID: PMC4729899 DOI: 10.1038/ncomms10208] [Citation(s) in RCA: 215] [Impact Index Per Article: 23.9] [Received: 01/22/2015] [Accepted: 11/13/2015] [Indexed: 12/21/2022] Open
Abstract
Despite intense investigation, human replication origins and termini remain elusive. Existing data have shown strong discrepancies. Here we sequenced highly purified Okazaki fragments from two cell types and, for the first time, quantitated replication fork directionality and delineated initiation and termination zones genome-wide. Replication initiates stochastically, primarily within non-transcribed, broad (up to 150 kb) zones that often abut transcribed genes, and terminates dispersively between them. Replication fork progression is significantly co-oriented with transcription. Initiation and termination zones are frequently contiguous, sometimes separated by regions of unidirectional replication. Initiation zones are enriched in open chromatin and enhancer marks, even when not flanked by genes, and often border ‘topologically associating domains’ (TADs). Initiation zones are enriched in origin recognition complex (ORC)-binding sites and align better with origins previously mapped using bubble-trap than with those mapped using λ-exonuclease. This novel panorama of replication reveals how chromatin and transcription modulate the initiation process to create cell-type-specific replication programs. The physical origin and termination sites of DNA replication in human cells have remained elusive. Here the authors use Okazaki fragment sequencing to reveal global replication patterns and show how chromatin and transcription modulate the process.
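As a rough illustration of how replication fork directionality (RFD) might be quantified from strand-specific Okazaki fragment counts, the sketch below assumes the commonly used convention RFD = (C - W) / (C + W), where C and W are read counts on the Crick and Watson strands in a genomic bin. The abstract does not state the formula, so the convention, the function name and the toy counts are all assumptions.

```python
def fork_directionality(crick_reads, watson_reads):
    """Per-bin RFD = (C - W) / (C + W); None for bins with no reads.

    The sign convention is an assumption for illustration, not taken
    from the abstract.
    """
    rfd = []
    for c, w in zip(crick_reads, watson_reads):
        total = c + w
        rfd.append((c - w) / total if total else None)
    return rfd


# Made-up counts for four consecutive bins.
crick = [10, 30, 45, 0]
watson = [40, 30, 5, 0]
print(fork_directionality(crick, watson))
```

Values lie between -1 and +1, and bins without coverage are left undefined rather than forced to zero, so they are not mistaken for regions with balanced fork directions.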
Affiliation(s)
- Nataliya Petryk
  - Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Malik Kahli
  - Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France
- Yves d'Aubenton-Carafa
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Yan Jaszczyszyn
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Yimin Shen
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Maud Silvain
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Claude Thermes
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Chun-Long Chen
  - Institute for Integrative Biology of the Cell (I2BC), CEA, CNRS, Université Paris-Sud, UMR 9198, FRC 3115, Avenue de la Terrasse, Bâtiment 24, Gif-sur-Yvette, Paris F-91198, France
- Olivier Hyrien
  - Ecole Normale Supérieure, Institut de Biologie de l'ENS (IBENS), and Inserm U1024, and CNRS UMR 8197, 46 rue d'Ulm, Paris F-75005, France
3
Liu F, Ren C, Li H, Zhou P, Bo X, Shu W. De novo identification of replication-timing domains in the human genome by deep learning. Bioinformatics 2015; 32:641-9. [PMID: 26545821 PMCID: PMC4795613 DOI: 10.1093/bioinformatics/btv643] [Citation(s) in RCA: 38] [Impact Index Per Article: 3.8] [Received: 09/07/2015] [Accepted: 10/27/2015] [Indexed: 12/31/2022] Open
Abstract
Motivation: The de novo identification of the initiation and termination zones (regions that replicate earlier or later than their upstream and downstream neighbours, respectively) remains a key challenge in DNA replication. Results: Building on advances in deep learning, we developed a novel hybrid architecture combining a pre-trained, deep neural network and a hidden Markov model (DNN-HMM) for the de novo identification of replication domains using replication timing profiles. Our results demonstrate that DNN-HMM can significantly outperform strong, discriminatively trained Gaussian mixture model-HMM (GMM-HMM) systems and six other reported methods that can be applied to this challenge. We applied our trained DNN-HMM to identify distinct replication domain types, namely the early replication domain (ERD), the down transition zone (DTZ), the late replication domain (LRD) and the up transition zone (UTZ), using newly replicated DNA sequencing (Repli-Seq) data across 15 human cells. A subsequent integrative analysis revealed that these replication domains harbour unique genomic and epigenetic patterns, transcriptional activity and higher-order chromosomal structure. Our findings support the ‘replication-domain’ model, which states (1) that ERDs and LRDs, connected by UTZs and DTZs, are spatially compartmentalized structural and functional units of higher-order chromosomal structure, (2) that the adjacent DTZ-UTZ pairs form chromatin loops and (3) that intra-interactions within ERDs and LRDs tend to be short-range and long-range, respectively. Our model reveals an important chromatin organizational principle of the human genome and represents a critical step towards understanding the mechanisms regulating replication timing. Availability and implementation: Our DNN-HMM method and three additional algorithms can be freely accessed at https://github.com/wenjiegroup/DNN-HMM. The replication domain regions identified in this study are available in GEO under the accession ID GSE53984. Contact: shuwj@bmi.ac.cn or boxc@bmi.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Feng Liu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine and
- Chao Ren
  - Department of Biotechnology, Beijing Institute of Radiation Medicine and
- Hao Li
  - Department of Biotechnology, Beijing Institute of Radiation Medicine and
- Pingkun Zhou
  - Department of Radiation Toxicology and Oncology, Beijing Institute of Radiation Medicine, Beijing 100850, China
- Xiaochen Bo
  - Department of Biotechnology, Beijing Institute of Radiation Medicine and
- Wenjie Shu
  - Department of Biotechnology, Beijing Institute of Radiation Medicine and
4
Boulos RE, Drillon G, Argoul F, Arneodo A, Audit B. Structural organization of human replication timing domains. FEBS Lett 2015; 589:2944-57. [PMID: 25912651 DOI: 10.1016/j.febslet.2015.04.015] [Citation(s) in RCA: 28] [Impact Index Per Article: 2.8] [Received: 02/23/2015] [Revised: 04/09/2015] [Accepted: 04/10/2015] [Indexed: 12/16/2022]
Abstract
Recent analyses of genome-wide epigenetic modification data, mean replication timing (MRT) profiles and chromosome conformation data in mammals have provided increasing evidence that flexibility in replication origin usage is regulated locally by the epigenetic landscape and over larger genomic distances by the 3D chromatin architecture. Here, we review the recent results establishing a link between replication domains and chromatin structural domains in pluripotent and various differentiated cell types in humans. We reconcile the originally proposed dichotomic picture of early and late constant timing regions that replicate by multiple rather synchronous origins in separated nuclear compartments of open and closed chromatin, with the U-shaped MRT domains bordered by "master" replication origins specified by a localized (∼200-300 kb) zone of open and transcriptionally active chromatin from which a replication wave likely initiates and propagates toward the domain center via a cascade of origin firing. We discuss the relationships between these MRT domains, topologically associated domains and lamina-associated domains. This review sheds new light on the epigenetically regulated global chromatin reorganization that underlies the loss of pluripotency and the determination of differentiation properties.
Affiliation(s)
- Rasha E Boulos
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Guénola Drillon
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Françoise Argoul
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Alain Arneodo
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Benjamin Audit
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
5
Drillon G, Audit B, Argoul F, Arneodo A. Ubiquitous human 'master' origins of replication are encoded in the DNA sequence via a local enrichment in nucleosome excluding energy barriers. J Phys Condens Matter 2015; 27:064102. [PMID: 25563930 DOI: 10.1088/0953-8984/27/6/064102] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.7] [Indexed: 06/04/2023]
Abstract
As the elementary building block of eukaryotic chromatin, the nucleosome is at the heart of the compromise between the necessity of compacting DNA in the cell nucleus and the required accessibility to regulatory proteins. The recent availability of genome-wide experimental maps of nucleosome positions for many different organisms and cell types has provided an unprecedented opportunity to elucidate to what extent the DNA sequence conditions the primary structure of chromatin and in turn participates in the chromatin-mediated regulation of nuclear functions, such as gene expression and DNA replication. In this study, we use in vivo and in vitro genome-wide nucleosome occupancy data together with the set of nucleosome-free regions (NFRs) predicted by a physical model of nucleosome formation based on sequence-dependent bending properties of the DNA double-helix, to investigate the role of intrinsic nucleosome occupancy in the regulation of the replication spatio-temporal programme in human. We focus our analysis on the so-called replication U/N-domains that were shown to cover about half of the human genome in the germline (skew-N domains) as well as in embryonic stem cells, somatic and HeLa cells (mean replication timing U-domains). The 'master' origins of replication (MaOris) that border these megabase-sized U/N-domains were found to be specified by a few hundred kb wide regions that are hyper-sensitive to DNase I cleavage, hypomethylated, and enriched in epigenetic marks involved in transcription regulation, the hallmarks of localized open chromatin structures. Here we show that replication U/N-domain borders that are conserved in all considered cell lines have an environment highly enriched in nucleosome-excluding-energy barriers, suggesting that these ubiquitous MaOris have been selected during evolution. 
In contrast, MaOris that are cell-type-specific are mainly regulated epigenetically and are no longer favoured by a local abundance of intrinsic NFRs encoded in the DNA sequence. At the smaller, few-hundred-bp scale of gene promoters, CpG-rich promoters of housekeeping genes found near ubiquitous MaOris and CpG-poor promoters of tissue-specific genes found near cell-type-specific MaOris both correspond to in vivo NFRs that are not coded as nucleosome-excluding-energy barriers. Whereas the former promoters are likely to correspond to high occupancy transcription factor binding regions, the latter are an illustration that gene regulation in humans is typically cell-type-specific.
Affiliation(s)
- Guénola Drillon
  - Université de Lyon, F-69000 Lyon, France. Laboratoire de Physique, CNRS UMR 5672, École Normale Supérieure de Lyon, F-69007 Lyon, France
6
Embryonic stem cell specific "master" replication origins at the heart of the loss of pluripotency. PLoS Comput Biol 2015; 11:e1003969. [PMID: 25658386 PMCID: PMC4319821 DOI: 10.1371/journal.pcbi.1003969] [Citation(s) in RCA: 19] [Impact Index Per Article: 1.9] [Received: 02/10/2014] [Accepted: 10/06/2014] [Indexed: 11/29/2022] Open
Abstract
Epigenetic regulation of the replication program during mammalian cell differentiation remains poorly understood. We performed an integrative analysis of eleven genome-wide epigenetic profiles at 100 kb resolution of Mean Replication Timing (MRT) data in six human cell lines. Compared to the organization in four chromatin states shared by the five somatic cell lines, embryonic stem cell (ESC) line H1 displays (i) a gene-poor but highly dynamic chromatin state (EC4) associated with histone variant H2AZ rather than an HP1-associated heterochromatin state (C4) and (ii) a mid-S accessible chromatin state with bivalent gene marks instead of a polycomb-repressed heterochromatin state. Plastic MRT regions (≲ 20% of the genome) are predominantly localized at the borders of U-shaped timing domains. Whereas somatic-specific U-domain borders are gene-dense GC-rich regions, 31.6% of H1-specific U-domain borders are early EC4 regions enriched in pluripotency transcription factors NANOG and OCT4 despite being GC poor and gene deserts. Silencing of these ESC-specific “master” replication initiation zones during differentiation corresponds to a loss of H2AZ and an enrichment in the H3K9me3 mark characteristic of late replicating C4 heterochromatin. These results shed new light on the epigenetically regulated global chromatin reorganization that underlies the loss of pluripotency and lineage commitment. During development, embryonic stem cells (ESCs) enter a program of cell differentiation eventually leading to all the necessary differentiated cell types. Understanding the mechanisms responsible for the underlying modifications of the gene expression program is of fundamental importance, as it will likely have a strong impact on the development of regenerative medicine. 
We show that besides some epigenetic regulation, ubiquitous master replication origins at replication timing U-domain borders shared by 6 human cell types are transcriptionally active open chromatin regions specified by a local enrichment in nucleosome free regions encoded in the DNA sequence suggesting that they have been selected during evolution. In contrast, ESC specific master replication origins bear a unique epigenetic signature (enrichment in CTCF, H2AZ, NANOG, OCT4, …) likely contributing to maintain ESC chromatin in a highly dynamic and accessible state that is refractory to polycomb and HP1 heterochromatin spreading. These ESC specific master origins thus appear as key genomic regions where epigenetic control of chromatin organization is at play to maintain pluripotency of stem cell lineages and to guide lineage commitment to somatic cell types.
7
Hyrien O. Peaks cloaked in the mist: the landscape of mammalian replication origins. J Cell Biol 2015; 208:147-60. [PMID: 25601401 PMCID: PMC4298691 DOI: 10.1083/jcb.201407004] [Citation(s) in RCA: 72] [Impact Index Per Article: 7.2] [Received: 07/01/2014] [Accepted: 12/16/2014] [Indexed: 12/23/2022] Open
Abstract
Replication of mammalian genomes starts at sites termed replication origins, which historically have been difficult to locate as a result of large genome sizes, limited power of genetic identification schemes, and rareness and fragility of initiation intermediates. However, origins are now mapped by the thousands using microarrays and sequencing techniques. Independent studies show modest concordance, suggesting that mammalian origins can form at any DNA sequence but are suppressed by read-through transcription or that they can overlap the 5' end or even the entire gene. These results require a critical reevaluation of whether origins form at specific DNA elements and/or epigenetic signals or require no such determinants.
Affiliation(s)
- Olivier Hyrien
  - Institut de Biologie de l'Ecole Normale Supérieure, Centre National de la Recherche Scientifique UMR8197 and Institut National de la Santé et de la Recherche Médicale U1024, 75005 Paris, France
8
Zaghloul L, Drillon G, Boulos RE, Argoul F, Thermes C, Arneodo A, Audit B. Large replication skew domains delimit GC-poor gene deserts in human. Comput Biol Chem 2014; 53 Pt A:153-65. [PMID: 25224847 DOI: 10.1016/j.compbiolchem.2014.08.020] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.5] [Accepted: 07/11/2014] [Indexed: 01/25/2023]
Abstract
Besides their large-scale organization in isochores, mammalian genomes display megabase-sized regions, spanning both genes and intergenes, where the strand nucleotide composition asymmetry decreases linearly, possibly due to replication activity. These so-called skew-N domains cover about a third of the human genome and are bordered by two skew upward jumps that were hypothesized to compose a subset of "master" replication origins active in the germline. Skew-N domains were shown to exhibit a particular gene organization. Genes with CpG-rich promoters likely expressed in the germline are overrepresented near the master replication origins, with large genes being co-oriented with replication fork progression, which suggests some coordination of replication and transcription. In this study, we describe another skew structure that covers ∼13% of the human genome and that is bordered by putative master replication origins similar to the ones flanking skew-N domains. These skew-split-N domains have a shape reminiscent of an N, but split in half, leaving in the center a region of null skew whose length increases with domain size. These central regions (median size ∼860 kb) have a homogeneous composition, i.e. both a null and constant skew and a constant and low GC content. They correspond to heterochromatin gene deserts found in low-GC isochores with an average gene density of 0.81 promoters/Mb as compared to 7.73 promoters/Mb genome wide. The analysis of epigenetic marks and replication timing data confirms that, in these late replicating heterochromatic regions, the initiation of replication is likely to be random. This contrasts with the transcriptionally active euchromatin state found around the bordering well-positioned master replication origins. Altogether skew-N domains and skew-split-N domains cover about 50% of the human genome.
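The strand nucleotide composition asymmetry ("skew") discussed above can be sketched as follows, assuming the widely used definition S = (T - A)/(T + A) + (G - C)/(G + C) computed in non-overlapping windows along one strand. The abstract does not give the formula, so this definition, the function names, the window size and the toy sequence are assumptions for illustration only.

```python
def skew(seq):
    """Total skew S = S_TA + S_GC for one window (assumed definition)."""
    seq = seq.upper()
    a, c, g, t = (seq.count(base) for base in "ACGT")
    s_ta = (t - a) / (t + a) if (t + a) else 0.0
    s_gc = (g - c) / (g + c) if (g + c) else 0.0
    return s_ta + s_gc


def windowed_skew(seq, window):
    """Skew in consecutive non-overlapping windows of the given length."""
    return [
        skew(seq[i:i + window])
        for i in range(0, len(seq) - window + 1, window)
    ]


# Toy sequence: an A/C-rich window followed by a T/G-rich window.
toy = "AAATCCCG" + "TTTAGGGC"
print(windowed_skew(toy, 8))
```

In this toy profile the skew switches from negative to positive between adjacent windows, which is the kind of upward jump the abstract associates with putative master replication origins. Real analyses use much larger windows and genome-scale sequence.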
Affiliation(s)
- Lamia Zaghloul
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Guénola Drillon
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Rasha E Boulos
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Françoise Argoul
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Claude Thermes
  - Centre de Génétique Moléculaire, CNRS UPR 3404, Gif-sur-Yvette, France
- Alain Arneodo
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
- Benjamin Audit
  - Université de Lyon, F-69000 Lyon, France; Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, F-69007 Lyon, France
9
Julienne H, Zoufir A, Audit B, Arneodo A. Human genome replication proceeds through four chromatin states. PLoS Comput Biol 2013; 9:e1003233. [PMID: 24130466 PMCID: PMC3794905 DOI: 10.1371/journal.pcbi.1003233] [Citation(s) in RCA: 47] [Impact Index Per Article: 3.9] [Received: 04/26/2013] [Accepted: 08/06/2013] [Indexed: 12/26/2022] Open
Abstract
Advances in genomic studies have led to significant progress in understanding the epigenetically controlled interplay between chromatin structure and nuclear functions. Epigenetic modifications were shown to play a key role in transcription regulation and genome activity during development and differentiation or in response to the environment. Paradoxically, the molecular mechanisms that regulate the initiation and the maintenance of the spatio-temporal replication program in higher eukaryotes, and in particular their links to epigenetic modifications, still remain elusive. By integrative analysis of the genome-wide distributions of thirteen epigenetic marks in the human cell line K562, at the 100 kb resolution of corresponding mean replication timing (MRT) data, we identify four major groups of chromatin marks with shared features. These states have different MRT: from early to late replicating, replication proceeds through a transcriptionally active euchromatin state (C1), a repressive type of chromatin (C2) associated with polycomb complexes, a silent state (C3) not enriched in any available marks, and a gene-poor HP1-associated heterochromatin state (C4). When mapping these chromatin states inside the megabase-sized U-domains (U-shaped MRT profile) covering about 50% of the human genome, we reveal that the associated replication fork polarity gradient corresponds to a directional path across the four chromatin states, from C1 at U-domain borders followed by C2, C3 and C4 at centers. Analysis of the other genome half is consistent with early and late replication loci occurring in separate compartments: the former correspond to gene-rich, high-GC domains of intermingled chromatin states C1 and C2, whereas the latter correspond to gene-poor, low-GC domains of alternating chromatin states C3 and C4 or long C4 domains. 
This new segmentation sheds new light on the epigenetic regulation of the spatio-temporal replication program in humans and provides a framework for further studies in different cell types, in both health and disease. Previous studies revealed spatially coherent and biologically meaningful chromatin mark combinations in human cells. Here, we analyze thirteen epigenetic mark maps in the human cell line K562 at 100 kb resolution of MRT data. The complexity of epigenetic data is reduced to four chromatin states that display remarkable similarities with those reported in fly, worm and plants. These states have different MRT: (C1) is transcriptionally active, early replicating, enriched in CTCF; (C2) is Polycomb repressed, mid-S replicating; (C3) lacks marks and replicates late and (C4) is a late-replicating gene-poor HP1 repressed heterochromatin state. When mapping these states inside the 876 replication U-domains of K562, the replication fork polarity gradient observed in these U-domains comes along with a remarkable epigenetic organization from C1 at U-domain borders to C2, C3 and ultimately C4 at centers. The remaining genome half displays early replicating, gene rich and high GC domains of intermingled C1 and C2 states segregating from late replicating, gene poor and low GC domains of concatenated C3 and/or C4 states. This constitutes the first evidence of epigenetic compartmentalization of the human genome into replication domains likely corresponding to autonomous units in the 3D chromatin architecture.
Affiliation(s)
- Hanna Julienne
  - Université de Lyon, Lyon, France
  - Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Azedine Zoufir
  - Université de Lyon, Lyon, France
  - Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Benjamin Audit
  - Université de Lyon, Lyon, France
  - Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
- Alain Arneodo
  - Université de Lyon, Lyon, France
  - Laboratoire de Physique, CNRS UMR 5672, Ecole Normale Supérieure de Lyon, Lyon, France
10
Hyrien O, Rappailles A, Guilbaud G, Baker A, Chen CL, Goldar A, Petryk N, Kahli M, Ma E, d'Aubenton-Carafa Y, Audit B, Thermes C, Arneodo A. From simple bacterial and archaeal replicons to replication N/U-domains. J Mol Biol 2013; 425:4673-89. [PMID: 24095859 DOI: 10.1016/j.jmb.2013.09.021] [Citation(s) in RCA: 23] [Impact Index Per Article: 1.9] [Received: 08/12/2013] [Revised: 09/15/2013] [Accepted: 09/19/2013] [Indexed: 10/26/2022]
Abstract
The Replicon Theory proposed 50 years ago has proven to apply to replicons of the three domains of life. Here, we review our knowledge of genome organization into single and multiple replicons in bacteria, archaea and eukarya. Bacterial and archaeal replicator/initiator systems are quite specific and efficient, whereas eukaryotic replicons show degenerate specificity and efficiency, allowing for complex regulation of origin firing time. We expand on recent evidence that ~50% of the human genome is organized as ~1,500 megabase-sized replication domains with a characteristic parabolic (U-shaped) replication timing profile and linear (N-shaped) gradient of replication fork polarity. These N/U-domains correspond to self-interacting segments of the chromatin fiber bordered by open chromatin zones and replicate by cascades of origin firing initiating at their borders and propagating to their center, possibly by fork-stimulated initiation. The conserved occurrence of this replication pattern in the germline of mammals has resulted over evolutionary times in the formation of megabase-sized domains with an N-shaped nucleotide compositional skew profile due to replication-associated mutational asymmetries. Overall, these results reveal an evolutionarily conserved but developmentally plastic organization of replication that is driving mammalian genome evolution.
Affiliation(s)
- Olivier Hyrien
  - Ecole Normale Supérieure, IBENS UMR8197 U1024, Paris 75005, France
11
Boulos RE, Arneodo A, Jensen P, Audit B. Revealing long-range interconnected hubs in human chromatin interaction data using graph theory. Phys Rev Lett 2013; 111:118102. [PMID: 24074120 DOI: 10.1103/physrevlett.111.118102] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.1] [Received: 03/19/2013] [Indexed: 06/02/2023]
Abstract
We use graph theory to analyze chromatin interaction (Hi-C) data in the human genome. We show that a key functional feature of the genome, "master" replication origins, corresponds to DNA loci of maximal network centrality. These loci form a set of interconnected hubs both within chromosomes and between different chromosomes. Our results open the way to a fruitful use of graph theory concepts to decipher DNA structural organization in relation to genome functions such as replication and transcription. This quantitative information should prove useful to discriminate between possible polymer models of nuclear organization.
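To make the notion of network centrality concrete, the sketch below treats a small symmetric contact matrix as a graph over genomic loci and ranks loci by eigenvector centrality, computed with plain power iteration. The abstract does not say which centrality measure was used, so eigenvector centrality is an assumed stand-in, and the matrix, function name and iteration count are hypothetical.

```python
def eigenvector_centrality(adj, iterations=200):
    """Eigenvector centrality by power iteration on a symmetric matrix.

    adj[i][j] is the (non-negative) interaction weight between loci i and j.
    Scores are rescaled so that the most central locus has score 1.0.
    """
    n = len(adj)
    scores = [1.0] * n
    for _ in range(iterations):
        updated = [sum(adj[i][j] * scores[j] for j in range(n)) for i in range(n)]
        norm = max(updated) or 1.0  # avoid division by zero on an empty graph
        scores = [value / norm for value in updated]
    return scores


# Toy contact graph: locus 0 touches every other locus; loci 1 and 2 also interact.
contacts = [
    [0, 1, 1, 1],
    [1, 0, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
]
scores = eigenvector_centrality(contacts)
hub = scores.index(max(scores))
print(hub, scores)
```

In this toy graph locus 0 comes out as the hub. On real Hi-C data the matrix would first need normalisation and thresholding, and power iteration can fail to settle on bipartite graphs, so a library implementation may be preferable in practice.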
Affiliation(s)
- R E Boulos
  - Université de Lyon, F-69000 Lyon, France and Laboratoire de Physique, ENS de Lyon, CNRS UMR5672, F-69007 Lyon, France
12
Julienne H, Zoufir A, Audit B, Arneodo A. Epigenetic regulation of the human genome: coherence between promoter activity and large-scale chromatin environment. Front Life Sci 2013. [DOI: 10.1080/21553769.2013.832706] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.3] [Indexed: 10/26/2022]
|
13
|
Audit B, Zaghloul L, Baker A, Arneodo A, Chen CL, d'Aubenton-Carafa Y, Thermes C. Megabase replication domains along the human genome: relation to chromatin structure and genome organisation. Subcell Biochem 2013; 61:57-80. [PMID: 23150246 DOI: 10.1007/978-94-007-4525-4_3] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.2] [Indexed: 01/03/2023]
Abstract
In higher eukaryotes, the absence of specific sequence motifs marking the origins of replication has been a serious hindrance to the understanding of (i) the mechanisms that regulate the spatio-temporal replication program, and (ii) the links between origin activation, chromatin structure and transcription. In this chapter, we review the partitioning of the human genome into megabase-sized replication domains delineated as N-shaped motifs in the strand compositional asymmetry profiles. They collectively span 28.3% of the genome and are bordered by more than 1,000 putative replication origins. We recapitulate the comparison of this partition of the human genome with high-resolution experimental data that confirms that replication domain borders are likely to be preferential replication initiation zones in the germline. In addition, we highlight the specific distribution of experimental and numerical chromatin marks along replication domains. Domain borders correspond to particular open chromatin regions, possibly encoded in the DNA sequence, and around which replication and transcription are highly coordinated. These regions also present a high evolutionary breakpoint density, suggesting that susceptibility to breakage might be linked to local open chromatin fiber state. Altogether, this chapter presents a compartmentalization of the human genome into replication domains that are landmarks of the human genome organization and are likely to play a key role in genome dynamics during evolution and in pathological situations.
|
14
|
Audit B, Baker A, Chen CL, Rappailles A, Guilbaud G, Julienne H, Goldar A, d'Aubenton-Carafa Y, Hyrien O, Thermes C, Arneodo A. Multiscale analysis of genome-wide replication timing profiles using a wavelet-based signal-processing algorithm. Nat Protoc 2012; 8:98-110. [PMID: 23237832 DOI: 10.1038/nprot.2012.145] [Citation(s) in RCA: 45] [Impact Index Per Article: 3.5] [Indexed: 12/31/2022]
Abstract
In this protocol, we describe the use of the LastWave open-source signal-processing command language (http://perso.ens-lyon.fr/benjamin.audit/LastWave/) for analyzing cellular DNA replication timing profiles. LastWave makes use of a multiscale, wavelet-based signal-processing algorithm that is based on a rigorous theoretical analysis linking timing profiles to fundamental features of the cell's DNA replication program, such as the average replication fork polarity and the difference between replication origin density and termination site density. We describe the flow of signal-processing operations to obtain interactive visual analyses of DNA replication timing profiles. We focus on procedures for exploring the space-scale map of apparent replication speeds to detect peaks in the replication timing profiles that represent preferential replication initiation zones, and for delimiting U-shaped domains in the replication timing profile. In comparison with the generally adopted approach that involves genome segmentation into regions of constant timing separated by timing transition regions, the present protocol enables the recognition of more complex patterns of the spatio-temporal replication program and has a broader range of applications. Completing the full procedure should not take more than 1 h, although learning the basics of the program can take a few hours and achieving full proficiency in the use of the software may take days.
|
15
|
Xia X. DNA replication and strand asymmetry in prokaryotic and mitochondrial genomes. Curr Genomics 2012; 13:16-27. [PMID: 22942672 PMCID: PMC3269012 DOI: 10.2174/138920212799034776] [Citation(s) in RCA: 43] [Impact Index Per Article: 3.3] [Received: 07/07/2011] [Revised: 09/26/2011] [Accepted: 10/02/2011] [Indexed: 11/22/2022] Open
Abstract
Different patterns of strand asymmetry have been documented in a variety of prokaryotic genomes as well as mitochondrial genomes. Because different replication mechanisms often lead to different patterns of strand asymmetry, much can be learned of replication mechanisms by examining strand asymmetry. Here I summarize the diverse patterns of strand asymmetry among different taxonomic groups to suggest that (1) single-origin replication may not be universal among bacterial species as the endosymbionts Wigglesworthia glossinidia, Wolbachia species, cyanobacterium Synechocystis 6803 and Mycoplasma pulmonis genomes all exhibit strand asymmetry patterns consistent with multiple origins of replication, (2) different replication origins in some archaeal genomes leave quite different patterns of strand asymmetry, suggesting that different replication origins in the same genome may be differentially used, (3) mitochondrial genomes from representative vertebrate species share one strand asymmetry pattern consistent with the strand-displacement replication documented in mammalian mtDNA, suggesting that the mtDNA replication mechanism in mammals may be shared among all vertebrate species, and (4) mitochondrial genomes from primitive forms of metazoans such as the sponge and hydra (representing Porifera and Cnidaria, respectively), as well as those from plants, have strand asymmetry patterns similar to single-origin or multi-origin replications observed in prokaryotes and are drastically different from mitochondrial genomes from other metazoans. This may explain why sponge and hydra mitochondrial genomes, as well as plant mitochondrial genomes, evolve much more slowly than those from other metazoans.
Affiliation(s)
- Xuhua Xia
- Department of Biology and Center for Advanced Research in Environmental Genomics, University of Ottawa, 30 Marie Curie, P.O. Box 450, Station A, Ottawa, Ontario, Canada
|
16
|
Arakawa K, Tomita M. Measures of compositional strand bias related to replication machinery and its applications. Curr Genomics 2012; 13:4-15. [PMID: 22942671 PMCID: PMC3269016 DOI: 10.2174/138920212799034749] [Citation(s) in RCA: 16] [Impact Index Per Article: 1.2] [Received: 07/23/2011] [Revised: 09/10/2011] [Accepted: 09/20/2011] [Indexed: 11/22/2022] Open
Abstract
The compositional asymmetry of complementary bases in nucleotide sequences implies the existence of a mutational or selectional bias in the two strands of the DNA duplex, which is commonly shaped by strand-specific mechanisms in transcription or replication. Such strand bias in genomes, frequently visualized by GC skew graphs, is used for the computational prediction of transcription start sites and replication origins, as well as for comparative evolutionary genomics studies. The use of measures of compositional strand bias in order to quantify the degree of strand asymmetry is crucial, as it is the basis for determining the applicability of compositional analysis and comparing the strength of the mutational bias in different biological machineries in various species. Here, we review the measures of strand bias that have been proposed to date, including the ∆GC skew, the B1 index, the predictability score of linear discriminant analysis for gene orientation, the signal-to-noise ratio of the oligonucleotide bias, and the GC skew index. These measures have been predominantly designed for and applied to the analysis of replication-related mutational processes in prokaryotes, but we also give research examples in eukaryotes.
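As a rough illustration of the GC skew graphs mentioned in this abstract, the sketch below computes the commonly used windowed skew (G - C) / (G + C) and its running sum. It is a minimal example and not the method of any paper listed here; the function names, the window size and the toy sequence are all assumptions made for illustration.

```python
from itertools import accumulate


def gc_skew(seq, window=1000):
    """GC skew (G - C) / (G + C) for consecutive non-overlapping windows.

    Hypothetical helper: windows with no G or C are reported as 0.0.
    """
    seq = seq.upper()
    skews = []
    for start in range(0, len(seq) - window + 1, window):
        chunk = seq[start:start + window]
        g, c = chunk.count("G"), chunk.count("C")
        skews.append((g - c) / (g + c) if g + c else 0.0)
    return skews


def cumulative_skew(skews):
    """Running sum of the windowed skew; its extrema mark sign switches."""
    return list(accumulate(skews))


# Toy sequence (an assumption): G-rich first half, C-rich second half.
toy = "GGGC" * 50 + "CCCG" * 50
skews = gc_skew(toy, window=100)
cumulative = cumulative_skew(skews)
print(skews)
print(cumulative)
```

On the toy sequence the skew changes sign halfway along, and the cumulative curve has its extremum at that switch. Which extremum corresponds to an origin or a terminus depends on the organism and strand convention, so that interpretation is left open here.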
Affiliation(s)
- Kazuharu Arakawa
- Institute for Advanced Biosciences, Keio University, Fujisawa 252-8520, Japan
|
17
|
Baker A, Julienne H, Chen CL, Audit B, d'Aubenton-Carafa Y, Thermes C, Arneodo A. Linking the DNA strand asymmetry to the spatio-temporal replication program. I. About the role of the replication fork polarity in genome evolution. The European Physical Journal. E, Soft Matter 2012; 35:92. [PMID: 23001787 DOI: 10.1140/epje/i2012-12092-y] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.4] [Received: 06/06/2012] [Revised: 08/08/2012] [Accepted: 08/21/2012] [Indexed: 06/01/2023]
Abstract
Two key cellular processes, namely transcription and replication, require the opening of the DNA double helix and act differently on the two DNA strands, generating different mutational patterns (mutational asymmetry) that may result, after long evolutionary time, in different nucleotide compositions on the two DNA strands (compositional asymmetry). We elaborate on the simplest model of neutral substitution rates that takes into account the strand asymmetries generated by the transcription and replication processes. Using perturbation theory, we then solve the time evolution of the DNA composition under strand-asymmetric substitution rates. In our minimal model, the compositional and substitutional asymmetries are predicted to decompose into transcription-associated and replication-associated components. The transcription-associated asymmetry increases in magnitude with transcription rate and changes sign with gene orientation, while the replication-associated asymmetry is proportional to the replication fork polarity. These results are confirmed experimentally in the human genome, using substitution rates obtained by aligning the human and chimpanzee genomes using macaca and orangutan as outgroups, and replication fork polarity determined in the HeLa cell line as estimated from the derivative of the mean replication timing. When further investigating the dynamics of compositional skew evolution, we show that it is not at equilibrium yet and that its evolution is an extremely slow process with characteristic time scales of several hundred Myrs.
Affiliation(s)
- A Baker
- Université de Lyon, Lyon, France
|
18
|
Moindrot B, Audit B, Klous P, Baker A, Thermes C, de Laat W, Bouvet P, Mongelard F, Arneodo A. 3D chromatin conformation correlates with replication timing and is conserved in resting cells. Nucleic Acids Res 2012; 40:9470-81. [PMID: 22879376 PMCID: PMC3479194 DOI: 10.1093/nar/gks736] [Citation(s) in RCA: 57] [Impact Index Per Article: 4.4] [Indexed: 12/15/2022] Open
Abstract
Although chromatin folding is known to be of functional importance to control the gene expression program, less is known regarding its interplay with DNA replication. Here, using Circular Chromatin Conformation Capture combined with high-throughput sequencing, we identified megabase-sized self-interacting domains in the nucleus of a human lymphoblastoid cell line, as well as in cycling and resting peripheral blood mononuclear cells (PBMC). Strikingly, the boundaries of those domains coincide with early-initiation zones in every cell type. Preferential interactions have been observed between consecutive early-initiation zones, but also between those separated by several tens of megabases. Thus, the 3D conformation of chromatin is strongly correlated with the replication timing along the whole chromosome. We furthermore provide direct clues that, in addition to the timing value per se, the shape of the timing profile at a given locus defines its set of genomic contacts. As this timing-related scheme of chromatin organization exists in lymphoblastoid cells and in resting and cycling PBMC, it appears to be maintained several weeks or months after the previous S-phase. Lastly, our work highlights that the major chromatin changes accompanying PBMC entry into the cell cycle occur while the long-range chromatin contacts remain largely unchanged.
Affiliation(s)
- Benoit Moindrot
- Laboratoire Joliot-Curie, Ecole Normale Supérieure de Lyon, CNRS, F-69007 Lyon, France
|
19
|
Baker A, Audit B, Yang SCH, Bechhoefer J, Arneodo A. Inferring where and when replication initiates from genome-wide replication timing data. Physical Review Letters 2012; 108:268101. [PMID: 23005017 DOI: 10.1103/physrevlett.108.268101] [Citation(s) in RCA: 12] [Impact Index Per Article: 0.9] [Received: 04/11/2012] [Indexed: 06/01/2023]
Abstract
Based on an analogy between DNA replication and one dimensional nucleation-and-growth processes, various attempts to infer the local initiation rate I(x,t) of DNA replication origins from replication timing data have been developed in the framework of phase transition kinetics theories. These works have all used curve-fit strategies to estimate I(x,t) from genome-wide replication timing data. Here, we show how to invert analytically the Kolmogorov-Johnson-Mehl-Avrami model and extract I(x,t) directly. Tests on both simulated and experimental budding-yeast data confirm the location and firing-time distribution of replication origins.
Affiliation(s)
- A Baker
- Université de Lyon, F-69000 Lyon, France, and Laboratoire de Physique, ENS de Lyon, CNRS, F-69007 Lyon, France
|
20
|
Baker A, Audit B, Chen CL, Moindrot B, Leleu A, Guilbaud G, Rappailles A, Vaillant C, Goldar A, Mongelard F, d'Aubenton-Carafa Y, Hyrien O, Thermes C, Arneodo A. Replication fork polarity gradients revealed by megabase-sized U-shaped replication timing domains in human cell lines. PLoS Comput Biol 2012; 8:e1002443. [PMID: 22496629 PMCID: PMC3320577 DOI: 10.1371/journal.pcbi.1002443] [Citation(s) in RCA: 65] [Impact Index Per Article: 5.0] [Received: 11/30/2011] [Accepted: 02/09/2012] [Indexed: 12/26/2022] Open
Abstract
In higher eukaryotes, replication program specification in different cell types remains to be fully understood. We show for seven human cell lines that about half of the genome is divided into domains that display a characteristic U-shaped replication timing profile with early initiation zones at borders and late replication at centers. Significant overlap is observed between U-domains of different cell lines and also with germline replication domains exhibiting an N-shaped nucleotide compositional skew. From the demonstration that the average fork polarity is directly reflected by both the compositional skew and the derivative of the replication timing profile, we argue that the fact that this derivative displays an N-shape in U-domains sustains the existence of large-scale gradients of replication fork polarity in somatic and germline cells. Analysis of chromatin interaction (Hi-C) and chromatin marker data reveals that U-domains correspond to high-order chromatin structural units. We discuss possible models for replication origin activation within U/N-domains. The compartmentalization of the genome into replication U/N-domains provides new insights on the organization of the replication program in the human genome.
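The link this abstract draws between a U-shaped timing profile and a linear gradient in its derivative can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the parabolic profile, the 2 Mb domain length, the sampling step and the helper name `central_derivative` are all hypothetical, and no sign or scaling convention from the paper is implied.

```python
def central_derivative(values, dx):
    """Central finite differences at the interior points of a uniform grid."""
    return [
        (values[i + 1] - values[i - 1]) / (2 * dx)
        for i in range(1, len(values) - 1)
    ]


# Toy U-shaped (parabolic) timing profile over an assumed 2 Mb domain.
domain_length_mb = 2.0
n_points = 21
dx = domain_length_mb / (n_points - 1)
positions = [i * dx for i in range(n_points)]
timing = [(x - domain_length_mb / 2) ** 2 for x in positions]

# The derivative of a parabola is a straight line through zero at the center.
slope = central_derivative(timing, dx)
increments = [b - a for a, b in zip(slope, slope[1:])]
print(slope[0], slope[len(slope) // 2], slope[-1])
```

The derivative changes sign at the domain center and grows by a constant amount per step, which is the linear gradient that, repeated across neighbouring domains, would appear as an N-shaped profile.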
Affiliation(s)
- Antoine Baker
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Laboratoire de Physique, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Benjamin Audit
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Laboratoire de Physique, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Chun-Long Chen
- Centre de Génétique Moléculaire UPR 3404, CNRS, Gif-sur-Yvette, France
- Benoit Moindrot
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Antoine Leleu
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Guillaume Guilbaud
- Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR8197, Inserm U1024, Paris, France
- Aurélien Rappailles
- Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR8197, Inserm U1024, Paris, France
- Cédric Vaillant
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Laboratoire de Physique, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Arach Goldar
- Commissariat à l'énergie atomique, iBiTecS, Gif-sur-Yvette, France
- Fabien Mongelard
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Laboratoire de Biologie Moléculaire de la Cellule, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Olivier Hyrien
- Institut de Biologie de l'Ecole Normale Supérieure, CNRS UMR8197, Inserm U1024, Paris, France
- Claude Thermes
- Centre de Génétique Moléculaire UPR 3404, CNRS, Gif-sur-Yvette, France
- Alain Arneodo
- Université de Lyon, Lyon, France
- Laboratoire Joliot-Curie, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
- Laboratoire de Physique, CNRS, Ecole Normale Supérieure de Lyon, Lyon, France
|
21
|
Chen CL, Duquenne L, Audit B, Guilbaud G, Rappailles A, Baker A, Huvet M, d'Aubenton-Carafa Y, Hyrien O, Arneodo A, Thermes C. Replication-associated mutational asymmetry in the human genome. Mol Biol Evol 2011; 28:2327-37. [PMID: 21368316 DOI: 10.1093/molbev/msr056] [Citation(s) in RCA: 60] [Impact Index Per Article: 4.3] [Indexed: 01/12/2023] Open
Abstract
During evolution, mutations occur at rates that can differ between the two DNA strands. In the human genome, nucleotide substitutions occur at different rates on the transcribed and non-transcribed strands that may result from transcription-coupled repair. These mutational asymmetries generate transcription-associated compositional skews. To date, the existence of such asymmetries associated with replication has not yet been established. Here, we compute the nucleotide substitution matrices around replication initiation zones identified as sharp peaks in replication timing profiles and associated with abrupt jumps in the compositional skew profile. We show that the substitution matrices computed in these regions fully explain the jumps in the compositional skew profile when crossing initiation zones. In intergenic regions, we observe mutational asymmetries measured as differences between complementary substitution rates; their sign changes when crossing initiation zones. These mutational asymmetries are unlikely to result from cryptic transcription but can be explained by a model based on replication errors and strand-biased repair. In transcribed regions, mutational asymmetries associated with replication superimpose on the previously described mutational asymmetries associated with transcription. We separate the substitution asymmetries associated with both mechanisms, which allows us to determine for the first time in eukaryotes, the mutational asymmetries associated with replication and to reevaluate those associated with transcription. Replication-associated mutational asymmetry may result from unequal rates of complementary base misincorporation by the DNA polymerases coupled with DNA mismatch repair (MMR) acting with different efficiencies on the leading and lagging strands. Replication, acting in germ line cells during long evolutionary times, contributed equally with transcription to produce the present abrupt jumps in the compositional skew. 
These results demonstrate that DNA replication is one of the major processes that shape human genome composition.
Affiliation(s)
- Chun-Long Chen
- Centre de Génétique Moléculaire, Centre National de la Recherche Scientifique (CNRS), Gif-sur-Yvette, France
|
22
|
Increased rate of human mutations where DNA and RNA polymerases collide. Trends Genet 2009; 25:523-7. [PMID: 19853958 DOI: 10.1016/j.tig.2009.10.002] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.4] [Received: 08/25/2009] [Revised: 10/05/2009] [Accepted: 10/05/2009] [Indexed: 12/27/2022]
Abstract
Gene density and orientation of genes in eukaryotes seem to be correlated with the replication origin and the mutation rate is greater in late replicating regions; however, the reason for these patterns is unknown. Here, we investigate predicted replication origins in the human genome and find that levels of polymorphism as well as divergence from the chimpanzee genome are greater in genes transcribed on the lagging strand than those on the leading strand. This might be caused by interference between RNA and DNA polymerases, and avoidance of collisions between these enzymes might be an evolutionary force shaping gene orientation and density surrounding replication start sites. Physical constraints might have a larger influence on genome evolution than previously thought.
|
23
|
Audit B, Zaghloul L, Vaillant C, Chevereau G, d'Aubenton-Carafa Y, Thermes C, Arneodo A. Open chromatin encoded in DNA sequence is the signature of 'master' replication origins in human cells. Nucleic Acids Res 2009; 37:6064-75. [PMID: 19671527 PMCID: PMC2764438 DOI: 10.1093/nar/gkp631] [Citation(s) in RCA: 47] [Impact Index Per Article: 2.9] [Indexed: 12/31/2022] Open
Abstract
For years, progress in elucidating the mechanisms underlying replication initiation and its coupling to transcriptional activities and to local chromatin structure has been hampered by the small number (approximately 30) of well-established origins in the human genome and more generally in mammalian genomes. Recent in silico studies of compositional strand asymmetries revealed a high level of organization of human genes around 1000 putative replication origins. Here, by comparing with recently experimentally identified replication origins, we provide further support that these putative origins are active in vivo. We show that regions approximately 300-kb wide surrounding most of these putative replication origins that replicate early in the S phase are hypersensitive to DNase I cleavage, hypomethylated and present a significant enrichment in genomic energy barriers that impair nucleosome formation (nucleosome-free regions). This suggests that these putative replication origins are specified by an open chromatin structure favored by the DNA sequence. We discuss how this distinctive attribute makes these origins, further qualified as 'master' replication origins, privileged loci for future research to decipher the human spatio-temporal replication program. Finally, we argue that these 'master' origins are likely to play a key role in genome dynamics during evolution and in pathological situations.
|
24
|
Lemaitre C, Zaghloul L, Sagot MF, Gautier C, Arneodo A, Tannier E, Audit B. Analysis of fine-scale mammalian evolutionary breakpoints provides new insight into their relation to genome organisation. BMC Genomics 2009; 10:335. [PMID: 19630943 PMCID: PMC2722678 DOI: 10.1186/1471-2164-10-335] [Citation(s) in RCA: 56] [Impact Index Per Article: 3.5] [Received: 02/26/2009] [Accepted: 07/24/2009] [Indexed: 11/21/2022] Open
Abstract
Background: The Intergenic Breakage Model, which is the current model of structural genome evolution, considers that evolutionary rearrangement breakages happen with a uniform propensity along the genome but are selected against in genes, their regulatory regions and in-between. However, a growing body of evidence shows that there exist regions along mammalian genomes that present a high susceptibility to breakage. We reconsidered this question taking advantage of a recently published methodology for the precise detection of rearrangement breakpoints based on pairwise genome comparisons. Results: We applied this methodology between the genome of human and those of five sequenced eutherian mammals, which allowed us to delineate evolutionary breakpoint regions along the human genome with a finer resolution (median size 26.6 kb) than obtained before. We investigated the distribution of these breakpoints with respect to genome organisation into domains of different activity. In agreement with the Intergenic Breakage Model, we observed that breakpoints are under-represented in genes. Surprisingly however, the density of breakpoints in small intergenes (1 per Mb) appears significantly higher than in gene deserts (0.1 per Mb). More generally, we found a heterogeneous distribution of breakpoints that follows the organisation of the genome into isochores (breakpoints are more frequent in GC-rich regions). We then discuss the hypothesis that regions with an enhanced susceptibility to breakage correspond to regions of high transcriptional activity and replication initiation. Conclusion: We propose a model to describe the heterogeneous distribution of evolutionary breakpoints along human chromosomes that combines natural selection and a mutational bias linked to local open chromatin state.
Affiliation(s)
- Claire Lemaitre
- Université de Bordeaux, Centre de Bioinformatique - Génomique Fonctionnelle Bordeaux, F-33000 Bordeaux, France.
|
25
|
Schwaiger M, Stadler MB, Bell O, Kohler H, Oakeley EJ, Schübeler D. Chromatin state marks cell-type- and gender-specific replication of the Drosophila genome. Genes Dev 2009; 23:589-601. [PMID: 19270159 DOI: 10.1101/gad.511809] [Citation(s) in RCA: 129] [Impact Index Per Article: 8.1] [Indexed: 01/15/2023]
Abstract
Duplication of eukaryotic genomes during S phase is coordinated in space and time. In order to identify zones of initiation and cell-type- as well as gender-specific plasticity of DNA replication, we profiled replication timing, histone acetylation, and transcription throughout the Drosophila genome. We observed two waves of replication initiation with many distinct zones firing in early-S phase and multiple, less defined peaks at the end of S phase, suggesting that initiation becomes more promiscuous in late-S phase. A comparison of different cell types revealed widespread plasticity of replication timing on autosomes. Most occur in large regions, but only half coincide with local differences in transcription. In contrast to confined autosomal differences, a global shift in replication timing occurs throughout the single male X chromosome. Unlike in females, the dosage-compensated X chromosome replicates almost exclusively early. This difference occurs at sites that are not transcriptionally hyperactivated, but show increased acetylation of Lys 16 of histone H4 (H4K16ac). This suggests a transcription-independent, yet chromosome-wide process related to chromatin. Importantly, H4K16ac is also enriched at initiation zones as well as early replicating regions on autosomes during S phase. Together, our study reveals novel organizational principles of DNA replication of the Drosophila genome and suggests that H4K16ac is more closely correlated with replication timing than is transcription.
Affiliation(s)
- Michaela Schwaiger
- Friedrich Miescher Institute for Biomedical Research, CH-4058 Basel, Switzerland
|
26
|
Benecke A. Gene regulatory network inference using out of equilibrium statistical mechanics. HFSP Journal 2008; 2:183-8. [PMID: 19404429 DOI: 10.2976/1.2957743] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.6] [Received: 06/24/2008] [Indexed: 11/19/2022]
Abstract
Spatiotemporal control of gene expression is fundamental to multicellular life. Despite prodigious efforts, the encoding of gene expression regulation in eukaryotes is not understood. Gene expression analyses nourish the hope to reverse engineer effector-target gene networks using inference techniques. Inference from noisy and circumstantial data relies on using robust models with few parameters for the underlying mechanisms. However, a systematic path to gene regulatory network reverse engineering from functional genomics data is still impeded by fundamental problems. Recently, Johannes Berg from the Theoretical Physics Institute of Cologne University has made two remarkable contributions that significantly advance the gene regulatory network inference problem. Berg, who uses gene expression data from yeast, has demonstrated a nonequilibrium regime for mRNA concentration dynamics and was able to map the gene regulatory process upon simple stochastic systems driven out of equilibrium. The impact of his demonstration is twofold, affecting both the understanding of the operational constraints under which transcription occurs and the capacity to extract relevant information from highly time-resolved expression data. Berg has used his observation to predict target genes of selected transcription factors, and thereby, in principle, demonstrated applicability of his out of equilibrium statistical mechanics approach to the gene network inference problem.
|