1
Soni V, Versoza CJ, Terbot JW, Jensen JD, Pfeifer SP. Inferring fine-scale mutation and recombination rate maps in aye-ayes (Daubentonia madagascariensis). bioRxiv 2024:2024.12.28.630620. [PMID: 39763842 PMCID: PMC11703150 DOI: 10.1101/2024.12.28.630620] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Indexed: 03/20/2025]
Abstract
The rate of input of new genetic mutations, and the rate at which that variation is reshuffled, are key evolutionary processes shaping genomic diversity. Importantly, these rates vary not just across populations and species, but also across individual genomes. Despite previous studies having demonstrated that failing to account for rate heterogeneity across the genome can bias the inference of both selective and neutral population genetic processes, mutation and recombination rate maps have to date only been generated for a relatively small number of organisms. Here, we infer such fine-scale maps for the aye-aye (Daubentonia madagascariensis), a highly endangered strepsirrhine that represents one of the earliest splits in the primate clade and thus stands as an important outgroup to the more commonly studied haplorrhines, utilizing a recently released fully annotated genome combined with high-quality population sequencing data. We compare our indirectly inferred rates to previous pedigree-based estimates, finding further evidence of relatively low mutation and recombination rates in aye-ayes compared to other primates.
2
Corzo G, Seeling-Branscomb CE, Seeling JM. Differential Synonymous Codon Selection in the B56 Gene Family of PP2A Regulatory Subunits. Int J Mol Sci 2023; 25:392. [PMID: 38203563 PMCID: PMC10778929 DOI: 10.3390/ijms25010392] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/20/2023] [Revised: 12/18/2023] [Accepted: 12/23/2023] [Indexed: 01/12/2024]
Abstract
Protein phosphatase 2A (PP2A) functions as a tumor suppressor and consists of a scaffolding, catalytic, and regulatory subunit. The B56 gene family of regulatory subunits imparts distinct functions onto PP2A. Codon usage bias (CUB) involves the selection of synonymous codons, which can affect gene expression by modulating processes such as transcription and translation. CUB can vary along the length of a gene, and differential use of synonymous codons can be important in the divergence of gene families. The sequence encoding the N-terminus of the B56α gene product possessed high CUB, high GC content at the third codon position (GC3), and high rare codon content. In addition, differential CUB was found in the sequence encoding two B56γ N-terminal splice forms. The sequence encoding the N-termini of B56γ/γ, relative to B56δ/γ, displayed CUB, utilized more frequent codons, and had higher GC3 content. B56α mRNA had stronger than predicted secondary structure at its 5' end, and the B56δ/γ splice variants had long regions of weaker than predicted secondary structure at their 5' end. The data suggest that B56α is expressed at relatively low levels as compared to the other B56 isoforms and that the B56δ/γ splice variant is expressed more highly than B56γ/γ.
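The GC3 measure mentioned in this abstract can be illustrated with a minimal Python sketch. It assumes an in-frame coding sequence given as a plain string; the function name `gc3_content` and the example sequences are hypothetical and are not taken from the study.

```python
def gc3_content(cds: str) -> float:
    """Return the fraction of codons whose third base is G or C.

    Assumes `cds` starts in frame; a trailing partial codon is ignored.
    """
    seq = cds.upper()
    usable = len(seq) - len(seq) % 3      # drop any incomplete final codon
    third_bases = seq[2:usable:3]         # every third base: positions 3, 6, 9, ...
    if not third_bases:
        raise ValueError("sequence must contain at least one complete codon")
    gc_count = sum(base in "GC" for base in third_bases)
    return gc_count / len(third_bases)


# Hypothetical toy sequences (not real B56 sequence data).
print(gc3_content("ATGGCCAAGTTC"))  # codons ATG GCC AAG TTC
print(gc3_content("ATGGCAAAATTT"))  # codons ATG GCA AAA TTT
```

Applying such a function to successive segments of a coding sequence is one simple way to see how GC3 varies along the length of a gene.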
Affiliation(s)
- Gabriel Corzo
  - Department of Biology, Hofstra University, Hempstead, NY 11549, USA
- Joni M. Seeling
  - Department of Biology, Hofstra University, Hempstead, NY 11549, USA
3
Liu Y, Liang N, Xian Q, Zhang W. GC heterogeneity reveals sequence-structures evolution of angiosperm ITS2. BMC Plant Biology 2023; 23:608. [PMID: 38036992 PMCID: PMC10691020 DOI: 10.1186/s12870-023-04634-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/21/2023] [Accepted: 11/26/2023] [Indexed: 12/02/2023]
Abstract
BACKGROUND: Although GC variation constitutes a fundamental element of genome and species diversity, the precise mechanisms driving it remain unclear. The abundant sequence data available for the ITS2, a commonly employed phylogenetic marker in plants, offers an exceptional resource for exploring the GC variation across angiosperms. RESULTS: A comprehensive selection of 8666 species, comprising 165 genera, 63 families, and 30 orders, was used for the analyses. The alignment of ITS2 sequence-structures and partitioning of secondary structures into paired and unpaired regions were performed using 4SALE. Substitution rates and frequencies among GC base-pairs in the paired regions of ITS2 were calculated using RNA-specific models in the PHASE package. The results showed that the distribution of ITS2 GC contents on the angiosperm phylogeny was heterogeneous, but their increase was generally associated with ITS2 sequence homogenization, thereby supporting the occurrence of GC-biased gene conversion (gBGC) during the concerted evolution of ITS2. Additionally, the GC content in the paired regions of the ITS2 secondary structure was significantly higher than that of the unpaired regions, indicating the selection of GC for thermodynamic stability. Furthermore, the RNA substitution models demonstrated that base-pair transformations favored both the elevation and fixation of GC in the paired regions, providing further support for gBGC. CONCLUSIONS: Our findings highlight the significance of secondary structure in GC investigation and demonstrate that both gBGC and structure-based selection are influential factors driving angiosperm ITS2 GC content.
Affiliation(s)
- Yubo Liu
  - Marine College, Shandong University, Weihai, 264209, China
  - Division of Physical Biology, CAS Key Laboratory of Interfacial Physics and Technology, Shanghai Institute of Applied Physics, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Shanghai, 201800, China
- Nan Liang
  - Marine College, Shandong University, Weihai, 264209, China
  - Allergy Department, State Key Laboratory of Complex Severe and Rare Diseases, Peking Union Medical College Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100730, China
- Qing Xian
  - Marine College, Shandong University, Weihai, 264209, China
- Wei Zhang
  - Marine College, Shandong University, Weihai, 264209, China
4
Heng J, Heng HH. Karyotype as code of codes: An inheritance platform to shape the pattern and scale of evolution. Biosystems 2023; 233:105016. [PMID: 37659678 DOI: 10.1016/j.biosystems.2023.105016] [Citation(s) in RCA: 9] [Impact Index Per Article: 4.5] [Received: 07/12/2023] [Revised: 08/27/2023] [Accepted: 08/28/2023] [Indexed: 09/04/2023]
Abstract
Organismal evolution displays complex dynamics in phase and scale which seem to trend towards increasing biocomplexity and diversity. For over a century, such amazing dynamics have been cleverly explained by the apparently straightforward mechanism of natural selection: all diversification, including speciation, results from the gradual accumulation of small beneficial or near-neutral alterations over long timescales. However, although this has been widely accepted, natural selection makes a crucial assumption that has not yet been validated. Specifically, the informational relationship between small microevolutionary alterations and large macroevolutionary changes in natural selection is unclear. To address the macroevolution-microevolution relationship, it is crucial to incorporate the concept of organic codes and particularly the "karyotype code" which defines macroevolutionary changes. This concept piece examines the karyotype from the perspective of two-phased evolution and four key components of information management. It offers insight into how the karyotype creates and preserves information that defines the scale and phase of macroevolution and, by extension, microevolution. We briefly describe the relationship between the karyotype code, the genetic code, and other organic codes in the context of generating evolutionary novelties in macroevolution and imposing constraints on them as biological routines in microevolution. Our analyses suggest that karyotype coding preserves many organic codes by providing system-level inheritance, and similar analyses are needed to classify and prioritize a large number of different organic codes based on the phases and scales of evolution. Finally, the importance of natural information self-creation is briefly discussed, leading to a call to integrate information and time into the relationship between matter and energy.
Affiliation(s)
- Julie Heng
  - Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, 02138, USA
- Henry H Heng
  - Molecular Medicine and Genomics, Wayne State University School of Medicine, Detroit, MI, 48201, USA; Department of Pathology, Wayne State University School of Medicine, Detroit, MI, 48201, USA
5
Genome Evolution and the Future of Phylogenomics of Non-Avian Reptiles. Animals (Basel) 2023; 13:ani13030471. [PMID: 36766360 PMCID: PMC9913427 DOI: 10.3390/ani13030471] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 12/16/2022] [Revised: 01/13/2023] [Accepted: 01/15/2023] [Indexed: 02/01/2023]
Abstract
Non-avian reptiles comprise a large proportion of amniote vertebrate diversity, with squamate reptiles (lizards and snakes) recently overtaking birds as the most species-rich tetrapod radiation. Despite displaying an extraordinary diversity of phenotypic and genomic traits, genomic resources in non-avian reptiles have accumulated more slowly than they have in mammals and birds, the remaining amniotes. Here we review the remarkable natural history of non-avian reptiles, with a focus on the physical traits, genomic characteristics, and sequence compositional patterns that comprise key axes of variation across amniotes. We argue that the high evolutionary diversity of non-avian reptiles can fuel a new generation of whole-genome phylogenomic analyses. A survey of phylogenetic investigations in non-avian reptiles shows that sequence capture-based approaches are the most commonly used, with studies of markers known as ultraconserved elements (UCEs) especially well represented. However, many other types of markers exist and are increasingly being mined from genome assemblies in silico, including some with greater information potential than UCEs for certain investigations. We discuss the importance of high-quality genomic resources and methods for bioinformatically extracting a range of marker sets from genome assemblies. Finally, we encourage herpetologists working in genomics, genetics, evolutionary biology, and other fields to work collectively towards building genomic resources for non-avian reptiles, especially squamates, that rival those already in place for mammals and birds. Overall, the development of this cross-amniote phylogenomic tree of life will help illuminate interesting dimensions of biodiversity across non-avian reptiles and broader amniotes.
6
Matoulek D, Ježek B, Vohnoutová M, Symonová R. Advances in Vertebrate (Cyto)Genomics Shed New Light on Fish Compositional Genome Evolution. Genes (Basel) 2023; 14:genes14020244. [PMID: 36833171 PMCID: PMC9956151 DOI: 10.3390/genes14020244] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2022] [Accepted: 01/05/2023] [Indexed: 01/19/2023]
Abstract
Cytogenetic and compositional studies considered fish genomes rather poor in guanine-cytosine content (GC%) because of a putative "sharp increase in genic GC% during the evolution of higher vertebrates". However, the available genomic data have not been exploited to confirm this viewpoint. In contrast, further misunderstandings in GC%, mostly of fish genomes, originated from a misapprehension of the current flood of data. Utilizing public databases, we calculated the GC% in animal genomes of three different, technically well-established fractions: DNA (entire genome), cDNA (complementary DNA), and cds (exons). Our results across chordates help set borders of GC% values that are still incorrect in literature and show: (i) fish in their immense diversity possess comparably GC-rich (or even GC-richer) genomes as higher vertebrates, and fish exons are GC-enriched among vertebrates; (ii) animal genomes generally show a GC-enrichment from the DNA, over cDNA, to the cds level (i.e., not only the higher vertebrates); (iii) fish and invertebrates show a broad(er) inter-quartile range in GC%, while avian and mammalian genomes are more constrained in their GC%. These results indicate no sharp increase in the GC% of genes during the transition to higher vertebrates, as stated and numerously repeated before. We present our results in 2D and 3D space to explore the compositional genome landscape and prepared an online platform to explore the AT/GC compositional genome evolution.
Affiliation(s)
- Dominik Matoulek
  - Department of Physics, Faculty of Science, University of Hradec Králové, 500 03 Hradec Králové, Czech Republic
- Bruno Ježek
  - Faculty of Informatics and Management, University of Hradec Králové, Rokitanského 62, 500 02 Hradec Králové, Czech Republic
- Marta Vohnoutová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
- Radka Symonová
  - Department of Computer Science, Faculty of Science, University of South Bohemia, Branišovská 1760, 370 05 České Budějovice, Czech Republic
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
  - Institute of Hydrobiology, Biology Centre of the Czech Academy of Sciences, 370 05 České Budějovice, Czech Republic
7
On the Base Composition of Transposable Elements. Int J Mol Sci 2022; 23:ijms23094755. [PMID: 35563146 PMCID: PMC9099904 DOI: 10.3390/ijms23094755] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 02/24/2022] [Revised: 04/22/2022] [Accepted: 04/23/2022] [Indexed: 01/27/2023]
Abstract
Transposable elements exhibit a base composition that is often different from the genomic average and from hosts’ genes. The most common compositional bias is towards Adenosine and Thymine, although this bias is not universal, and elements with drastically different base composition can coexist within the same genome. The AT-richness of transposable elements is apparently maladaptive because it results in poor transcription and sub-optimal translation of proteins encoded by the elements. The cause(s) of this unusual base composition remain unclear and have yet to be investigated. Here, I review what is known about the nucleotide content of transposable elements and how this content can affect the genome of their host as well as their own replication. The compositional bias of transposable elements could result from several non-exclusive processes including horizontal transfer, mutational bias, and selection. It appears that mutation alone cannot explain the high AT-content of transposons and that selection plays a major role in the evolution of the compositional bias. The reason why selection would favor a maladaptive nucleotide content remains however unexplained and is an area of investigation that clearly deserves attention.
8
Popova LV, Nagarajan P, Lovejoy CM, Sunkel B, Gardner M, Wang M, Freitas M, Stanton B, Parthun M. Epigenetic regulation of nuclear lamina-associated heterochromatin by HAT1 and the acetylation of newly synthesized histones. Nucleic Acids Res 2021; 49:12136-12151. [PMID: 34788845 PMCID: PMC8643632 DOI: 10.1093/nar/gkab1044] [Citation(s) in RCA: 12] [Impact Index Per Article: 3.0] [Received: 07/08/2021] [Revised: 09/20/2021] [Accepted: 10/14/2021] [Indexed: 12/15/2022]
Abstract
A central component of the epigenome is the pattern of histone post-translational modifications that play a critical role in the formation of specific chromatin states. Following DNA replication, nascent chromatin is a 1:1 mixture of parental and newly synthesized histones and the transfer of modification patterns from parental histones to new histones is a fundamental step in epigenetic inheritance. Here we report that loss of HAT1, which acetylates lysines 5 and 12 of newly synthesized histone H4 during replication-coupled chromatin assembly, results in the loss of accessibility of large domains of heterochromatin, termed HAT1-dependent Accessibility Domains (HADs). HADs are megabase-scale domains that comprise ∼10% of the mouse genome. HAT1 globally represses H3K9me3 levels and HADs correspond to the regions of the genome that display HAT1-dependent increases in H3K9me3 peak density. HADs display a high degree of overlap with a subset of Lamin-Associated Domains (LADs). HAT1 is required to maintain nuclear structure and integrity. These results indicate that HAT1 and the acetylation of newly synthesized histones may be critical regulators of the epigenetic inheritance of heterochromatin and suggest a new mechanism for the epigenetic regulation of nuclear lamina-heterochromatin interactions.
Affiliation(s)
- Liudmila V Popova
  - Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, OH 43210, USA
- Prabakaran Nagarajan
  - Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, OH 43210, USA
- Callie M Lovejoy
  - Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, OH 43210, USA
- Benjamin D Sunkel
  - Abigail Wexner Research Institute at Nationwide Children's, Center for Childhood Cancer and Blood Diseases, Columbus, OH 43205, USA
- Miranda L Gardner
  - Campus Chemical Instrument Center, Mass Spectrometry and Proteomics Facility, The Ohio State University, Columbus, OH 43210, USA
- Meng Wang
  - Abigail Wexner Research Institute at Nationwide Children's, Center for Childhood Cancer and Blood Diseases, Columbus, OH 43205, USA
- Michael A Freitas
  - Department of Cancer Biology and Genetics, The Ohio State University, Columbus, OH 43210, USA
- Benjamin Z Stanton
  - Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, OH 43210, USA
  - Abigail Wexner Research Institute at Nationwide Children's, Center for Childhood Cancer and Blood Diseases, Columbus, OH 43205, USA
- Mark R Parthun
  - Department of Biological Chemistry and Pharmacology, The Ohio State University, Columbus, OH 43210, USA
9
Simón D, Cristina J, Musto H. Nucleotide Composition and Codon Usage Across Viruses and Their Respective Hosts. Front Microbiol 2021; 12:646300. [PMID: 34262534 PMCID: PMC8274242 DOI: 10.3389/fmicb.2021.646300] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 12/26/2020] [Accepted: 06/04/2021] [Indexed: 11/13/2022]
Abstract
The genetic material of the three domains of life (Bacteria, Archaea, and Eukaryota) is always double-stranded DNA, and their GC content (molar content of guanine plus cytosine) varies between ≈ 13% and ≈ 75%. Nucleotide composition is the simplest way of characterizing genomes. Despite this simplicity, it has several implications. Indeed, it is the main factor that determines, among other features, dinucleotide frequencies, repeated short DNA sequences, and codon and amino acid usage. Which forces drive this strong variation is still a matter of controversy. For rather obvious reasons, most of the studies concerning this huge variation and its consequences have been done in free-living organisms. However, no recent comprehensive study of all known viruses has been done (that is, concerning all available sequences). Viruses, by far the most abundant biological entities on Earth, are the causative agents of many diseases. An overview of these entities is important also because their genetic material is not always double-stranded DNA: indeed, certain viruses have as genetic material single-stranded DNA, double-stranded RNA, or single-stranded RNA, and/or are retro-transcribing. Therefore, one may wonder if what we have learned about the evolution of GC content and its implications in prokaryotes and eukaryotes also applies to viruses. In this contribution, we attempt to describe compositional properties of ∼ 10,000 viral species: base composition (globally and according to Baltimore classification), correlations among non-coding regions and the three codon positions, and the relationship of the nucleotide frequencies and codon usage of viruses with the same feature of their hosts. This allowed us to determine how the base composition of phages strongly correlates with the value of their respective hosts, while that of eukaryotic viruses does not (with fungi and protists as exceptions). Finally, we discuss some of these results concerning codon usage: reinforcing previous results, we found that phages and hosts exhibit moderate to high correlations, while for eukaryotes and their viruses the correlations are weak or do not exist.
Affiliation(s)
- Diego Simón
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay
  - Laboratorio de Evolución Experimental de Virus, Institut Pasteur de Montevideo, Montevideo, Uruguay
- Juan Cristina
  - Laboratorio de Virología Molecular, Centro de Investigaciones Nucleares, Facultad de Ciencias, Universidad de la Republica, Montevideo, Uruguay
- Héctor Musto
  - Laboratorio de Genómica Evolutiva, Departamento de Biología Celular y Molecular, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
10
Abstract
Recombination increases the local GC-content in genomic regions through GC-biased gene conversion (gBGC). The recent discovery of a large genomic region with extreme GC-content in the fat sand rat Psammomys obesus provides a model to study the effects of gBGC on chromosome evolution. Here, we compare the GC-content and GC-to-AT substitution patterns across protein-coding genes of four gerbil species and two murine rodents (mouse and rat). We find that the known high-GC region is present in all the gerbils, and is characterized by high substitution rates for all mutational categories (AT-to-GC, GC-to-AT, and GC-conservative) both at synonymous and nonsynonymous sites. A higher AT-to-GC than GC-to-AT rate is consistent with the high GC-content. Additionally, we find more than 300 genes outside the known region with outlying values of AT-to-GC synonymous substitution rates in gerbils. Of these, over 30% are organized into at least 17 large clusters observable at the megabase-scale. The unusual GC-skewed substitution pattern suggests the evolution of genomic regions with very high recombination rates in the gerbil lineage, which can lead to a runaway increase in GC-content. Our results imply that rapid evolution of GC-content is possible in mammals, with gerbil species providing a powerful model to study the mechanisms of gBGC.
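The three mutational categories named in this abstract (AT-to-GC, GC-to-AT, and GC-conservative) can be illustrated with a small Python sketch. It is only a counting illustration under stated assumptions: the ancestral and derived sequences are already aligned, the ancestral state is known, and no correction for multiple hits or normalization by site counts is applied. The function names and toy sequences are hypothetical and do not reproduce the authors' pipeline.

```python
from collections import Counter

WEAK = set("AT")    # weakly pairing bases
STRONG = set("GC")  # strongly pairing bases


def classify(ancestral_base: str, derived_base: str):
    """Return the category of a single substitution, or None if unchanged."""
    if ancestral_base == derived_base:
        return None
    if ancestral_base in WEAK and derived_base in STRONG:
        return "AT-to-GC"
    if ancestral_base in STRONG and derived_base in WEAK:
        return "GC-to-AT"
    return "GC-conservative"  # A<->T or G<->C: GC content is unchanged


def count_substitutions(ancestral: str, derived: str) -> Counter:
    """Count substitution categories between two aligned sequences."""
    if len(ancestral) != len(derived):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter()
    for a, d in zip(ancestral.upper(), derived.upper()):
        if a not in "ACGT" or d not in "ACGT":
            continue  # skip gaps and ambiguous bases
        category = classify(a, d)
        if category is not None:
            counts[category] += 1
    return counts


# Hypothetical toy alignment; "-" marks a gap.
print(count_substitutions("AACGT-A", "GTCACGA"))
```

A higher AT-to-GC than GC-to-AT count in a region is the kind of GC-skewed pattern the abstract describes, although real analyses estimate rates rather than raw counts.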
Affiliation(s)
- Rodrigo Pracana
  - Department of Zoology, University of Oxford, Oxford, United Kingdom
- John F Mulley
  - School of Natural Sciences, Bangor University, Bangor, Gwynedd, United Kingdom
11
Borůvková V, Howell WM, Matoulek D, Symonová R. Quantitative Approach to Fish Cytogenetics in the Context of Vertebrate Genome Evolution. Genes (Basel) 2021; 12:genes12020312. [PMID: 33671814 PMCID: PMC7926999 DOI: 10.3390/genes12020312] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 11/23/2020] [Revised: 02/01/2021] [Accepted: 02/17/2021] [Indexed: 01/14/2023]
Abstract
Our novel Python-based tool EVANGELIST allows the visualization of GC and repeat percentages along chromosomes in sequenced genomes and has enabled us to perform quantitative large-scale analyses on the chromosome level in fish and other vertebrates. This is a different approach from the prevailing analyses, i.e., analyses of GC% in the coding sequences, which make up not more than 2% of the human genome. We identified GC content (GC%) elevations in microchromosomes in ancient fish lineages similar to avian microchromosomes and a large variability in the relationship between chromosome size and GC% across fish lineages. This raises the question of the extent to which chromosome size drives GC%, as posited by the currently accepted explanation based on the recombination rate. We ascribe the differences found across fishes to varying GC% of repetitive sequences. Generally, our results suggest that the GC% of repeats and the proportion of repeats are independent of chromosome size. This leaves room for another mechanism driving GC evolution in vertebrates.
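The idea of profiling GC% along a chromosome can be sketched in a few lines of Python. This is not the EVANGELIST tool or its interface; it is a minimal illustration that assumes the chromosome is available as a plain string, and the function name `windowed_gc`, its parameters, and the toy sequence are hypothetical.

```python
def windowed_gc(seq: str, window: int, step=None):
    """Return (start, GC%) pairs for fixed-size windows along a sequence.

    GC% is computed over A/C/G/T only, so N runs and other ambiguity codes
    do not dilute the value; a window without any A/C/G/T yields None.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    step = step or window                  # default: non-overlapping windows
    seq = seq.upper()
    profile = []
    for start in range(0, len(seq) - window + 1, step):
        chunk = seq[start:start + window]
        gc = chunk.count("G") + chunk.count("C")
        acgt = gc + chunk.count("A") + chunk.count("T")
        profile.append((start, 100.0 * gc / acgt if acgt else None))
    return profile


# Hypothetical toy sequence with three non-overlapping 4-bp windows.
for start, gc_percent in windowed_gc("GGCCAATTGCAT", window=4):
    print(start, gc_percent)
```

For real chromosomes a much larger window (for example tens of kilobases) would be used, and the resulting pairs can be passed to any plotting library to obtain a GC% track.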
Affiliation(s)
- Veronika Borůvková
  - Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- W. Mike Howell
  - Department of Biological and Environmental Sciences, Samford University, Birmingham, AL 35226, USA
- Dominik Matoulek
  - Faculty of Science, University of Hradec Kralove, 500 03 Hradec Kralove, Czech Republic
- Radka Symonová
  - Department of Bioinformatics, Wissenschaftszentrum Weihenstephan, Technische Universität München, 85354 Freising, Germany
12
Pantier R, Chhatbar K, Quante T, Skourti-Stathaki K, Cholewa-Waclaw J, Alston G, Alexander-Howden B, Lee HY, Cook AG, Spruijt CG, Vermeulen M, Selfridge J, Bird A. SALL4 controls cell fate in response to DNA base composition. Mol Cell 2021; 81:845-858.e8. [PMID: 33406384 PMCID: PMC7895904 DOI: 10.1016/j.molcel.2020.11.046] [Citation(s) in RCA: 24] [Impact Index Per Article: 6.0] [Received: 07/29/2020] [Revised: 10/23/2020] [Accepted: 11/25/2020] [Indexed: 12/30/2022]
Abstract
Mammalian genomes contain long domains with distinct average compositions of A/T versus G/C base pairs. In a screen for proteins that might interpret base composition by binding to AT-rich motifs, we identified the stem cell factor SALL4, which contains multiple zinc fingers. Mutation of the domain responsible for AT binding drastically reduced SALL4 genome occupancy and prematurely upregulated genes in proportion to their AT content. Inactivation of this single AT-binding zinc-finger cluster mimicked defects seen in Sall4 null cells, including precocious differentiation of embryonic stem cells (ESCs) and embryonic lethality in mice. In contrast, deletion of two other zinc-finger clusters was phenotypically neutral. Our data indicate that loss of pluripotency is triggered by downregulation of SALL4, leading to de-repression of a set of AT-rich genes that promotes neuronal differentiation. We conclude that base composition is not merely a passive byproduct of genome evolution and constitutes a signal that aids control of cell fate.
Affiliation(s)
- Raphaël Pantier
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Kashyap Chhatbar
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK; Informatics Forum, School of Informatics, University of Edinburgh, 10 Crichton Street, Edinburgh EH8 9AB, UK
- Timo Quante
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Konstantina Skourti-Stathaki
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Justyna Cholewa-Waclaw
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Grace Alston
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Beatrice Alexander-Howden
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Heng Yang Lee
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Atlanta G Cook
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Cornelia G Spruijt
  - Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
- Michiel Vermeulen
  - Department of Molecular Biology, Faculty of Science, Radboud Institute for Molecular Life Sciences, Oncode Institute, Radboud University Nijmegen, Nijmegen, the Netherlands
- Jim Selfridge
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
- Adrian Bird
  - The Wellcome Centre for Cell Biology, University of Edinburgh, Michael Swann Building, Max Born Crescent, The King's Buildings, Edinburgh EH9 3BF, UK
13
Carducci F, Barucca M, Canapa A, Carotti E, Biscotti MA. Mobile Elements in Ray-Finned Fish Genomes. Life (Basel) 2020; 10:E221. [PMID: 32992841 PMCID: PMC7599744 DOI: 10.3390/life10100221] [Citation(s) in RCA: 19] [Impact Index Per Article: 3.8] [Received: 07/29/2020] [Revised: 09/18/2020] [Accepted: 09/22/2020] [Indexed: 12/12/2022]
Abstract
Ray-finned fishes (Actinopterygii) are a very diverse group of vertebrates, encompassing species adapted to live in freshwater and marine environments, from the deep sea to high mountain streams. Genome sequencing offers a genetic resource for investigating the molecular bases of this phenotypic diversity and these adaptations to various habitats. The wide range of genome sizes observed in fishes is due to the role of transposable elements (TEs), which are powerful drivers of species diversity. Analyses performed to date provide evidence that class II DNA transposons are the most abundant component in most fish genomes and that compared to other vertebrate genomes, many TE superfamilies are present in actinopterygians. Moreover, specific TEs have been reported in ray-finned fishes as a possible result of an intricate relationship between TE evolution and the environment. The data summarized here underline the biological interest in Actinopterygii as a model group to investigate the mechanisms responsible for the high biodiversity observed in this taxon.
Affiliation(s)
| | | | | | | | - Maria Assunta Biscotti
- Dipartimento di Scienze della Vita e dell’Ambiente, Università Politecnica delle Marche, 60131 Ancona, Italy; (F.C.); (M.B.); (A.C.); (E.C.)
|
14
|
Ayad LAK, Dourou AM, Arhondakis S, Pissis SP. IsoXpressor: A Tool to Assess Transcriptional Activity within Isochores. Genome Biol Evol 2020; 12:1573-1578. [PMID: 32857856 DOI: 10.1093/gbe/evaa171] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Accepted: 08/07/2020] [Indexed: 01/20/2023] Open
Abstract
Genomes are characterized by large regions of homogeneous base compositions known as isochores. The latter are divided into GC-poor and GC-rich classes linked to distinct functional and structural properties. Several studies have addressed how isochores shape function and structure. To aid in this important subject, we present IsoXpressor, a tool designed for the analysis of the functional property of transcription within isochores. IsoXpressor allows users to process RNA-Seq data in relation to the isochores, and it can be employed to investigate any biological question of interest for any species. The results presented herein as proof of concept are focused on the preimplantation process in Homo sapiens (human) and Macaca mulatta (rhesus monkey).
Affiliation(s)
| | | | | | - Solon P Pissis
- CWI, Amsterdam, The Netherlands.,Vrije Universiteit, Amsterdam, The Netherlands
|
15
|
Savelev I, Myakishev-Rempel M. Evidence for DNA resonance signaling via longitudinal hydrogen bonds. PROGRESS IN BIOPHYSICS AND MOLECULAR BIOLOGY 2020; 156:14-19. [PMID: 32712047 DOI: 10.1016/j.pbiomolbio.2020.07.005] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 05/07/2020] [Revised: 07/14/2020] [Accepted: 07/19/2020] [Indexed: 12/22/2022]
Abstract
The theory of the morphogenic field suggests that chemical signaling is supplemented by electromagnetic signaling governing the structure and shape of tissues, organs and the body. The theory of DNA resonance suggests that the morphogenic field is created by the genomic DNA which sends and receives electromagnetic signals in a sequence-specific manner. Previously, the authors have proposed the existence of HIDERs, genomic elements that serve as antennas in resonance signaling and demonstrated that they occur nonrandomly and are conserved in evolution. Here, it is proposed that longitudinal hydrogen bonds exist in the double helix, that chains of these bonds form delocalized proton clouds, that the shapes of these clouds are sequence-specific and form the basis of sequence-specificity of resonance between HIDERs. Based on longitudinal hydrogen bonds, a proton DNA resonance code was devised and used to identify HIDERs which are enriched 20 fold in the genome and conserved in evolution. It was suggested that these HIDERs are the key elements responsible for DNA resonance signaling and the formation of the morphogenic field.
Affiliation(s)
| | - Max Myakishev-Rempel
- Localized Therapeutics, San Diego, CA, USA; DNA Resonance Lab, San Diego, CA, USA.
|
16
|
Arhondakis S, Milanesi M, Castrignanò T, Gioiosa S, Valentini A, Chillemi G. Evidence of distinct gene functional patterns in GC-poor and GC-rich isochores in Bos taurus. Anim Genet 2020; 51:358-368. [PMID: 32069522 DOI: 10.1111/age.12917] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Accepted: 01/20/2020] [Indexed: 01/10/2023]
Abstract
Vertebrate genomes are mosaics of megabase-size DNA segments with a fairly homogeneous base composition, called isochores. They are divided into five families characterized by different guanine-cytosine (GC) levels and linked to several functional and structural properties. The increased availability of fully sequenced genomes allows the investigation of isochores in several species, assessing their level of conservation across vertebrate genomes. In this work, we characterized the isochores in Bos taurus using the ARS-UCD1.2 genome version. The comparison of our results with the well-studied human isochores and those of other mammals revealed a large conservation in isochore families, in number, average GC levels and gene density. Exceptions to the established increase in gene density with increasing isochore GC% were observed for the following gene biotypes: tRNA, small nuclear RNA, small nucleolar RNA and pseudogenes, which have their maximum number in H2 and H1 isochores. Subsequently, we assessed the ontology of all gene biotypes looking for functional classes that are statistically over- or under-represented in each isochore. Receptor activity and sensory perception pathways were significantly over-represented in L1 and L2 (GC-poor) isochores. This was also validated for the horse genome. Our analysis of housekeeping genes confirmed a preferential localization in GC-rich isochores, as reported in other species. Finally, we assessed the SNP distribution of a bovine high-density SNP chip across the isochores, finding a higher density in the GC-rich families, reflecting a potential bias in the chip, which is widely used for genetic selection and biodiversity studies.
Affiliation(s)
- S Arhondakis
- Bioinformatics and Computational Science (BioCoS), Boniali 11-19, Chania, 73134, Crete, Greece
| | - M Milanesi
- Department of Support, Production and Animal Health, School of Veterinary Medicine, São Paulo State University, 16050-680 R. Clóvis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil.,International Atomic Energy Agency Collaborating Centre on Animal Genomics and Bioinformatics, 16050-680 R. Clóvis Pestana 793 - Dona Amelia, Araçatuba, SP, Brazil
| | - T Castrignanò
- SCAI - Super Computing Applications and Innovation Department, CINECA, Rome, Italy
| | - S Gioiosa
- SCAI - Super Computing Applications and Innovation Department, CINECA, Rome, Italy
| | - A Valentini
- Department for Innovation in Biological, Agro-food and Forest Systems, DIBAF, University of Tuscia, via S. Camillo de Lellis s.n.c, 01100, Viterbo, Italy
| | - G Chillemi
- Department for Innovation in Biological, Agro-food and Forest Systems, DIBAF, University of Tuscia, via S. Camillo de Lellis s.n.c, 01100, Viterbo, Italy.,Institute of Biomembranes, Bioenergetics and Molecular Biotechnologies, IBIOM, CNR, Bari, Italy
|
17
|
Huttener R, Thorrez L, In't Veld T, Granvik M, Snoeck L, Van Lommel L, Schuit F. GC content of vertebrate exome landscapes reveal areas of accelerated protein evolution. BMC Evol Biol 2019; 19:144. [PMID: 31311498 PMCID: PMC6636035 DOI: 10.1186/s12862-019-1469-1] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 03/18/2019] [Accepted: 06/26/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Rapid accumulation of vertebrate genome sequences renders comparative genomics a powerful approach to study macro-evolutionary events. The assessment of phylogenetic relationships between species routinely depends on the analysis of sequence homology at the nucleotide or protein level. RESULTS We analyzed mRNA GC content, codon usage and divergence of orthologous proteins in 55 vertebrate genomes. Data were visualized in genome-wide landscapes using a sliding window approach. Landscapes of GC content reveal both evolutionary conservation of clustered genes, and lineage-specific changes, so that it was possible to construct a phylogenetic tree that closely matched the classic "tree of life". Landscapes of GC content also strongly correlated to landscapes of amino acid usage: positive correlation with glycine, alanine, arginine and proline and negative correlation with phenylalanine, tyrosine, methionine, isoleucine, asparagine and lysine. Peaks of GC content correlated strongly with increased protein divergence. CONCLUSIONS Landscapes of base and amino acid composition of the coding genome open a new approach in comparative genomics, allowing identification of discrete regions in which protein evolution accelerated over deep evolutionary time. Insight into the evolution of genome structure may spur novel studies assessing the evolutionary benefit of genes in particular genomic regions.
Affiliation(s)
- R Huttener
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - L Thorrez
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium.,Tissue Engineering Laboratory, Dept of Development and Regeneration, KU Leuven, Kortrijk, Belgium
| | - T In't Veld
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - M Granvik
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - L Snoeck
- Tissue Engineering Laboratory, Dept of Development and Regeneration, KU Leuven, Kortrijk, Belgium
| | - L Van Lommel
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium
| | - F Schuit
- Gene Expression Unit, Dept of Cellular and Molecular Medicine, KU Leuven, Leuven, Belgium.
|
18
|
Jabbari K, Wirtz J, Rauscher M, Wiehe T. A common genomic code for chromatin architecture and recombination landscape. PLoS One 2019; 14:e0213278. [PMID: 30865674 PMCID: PMC6415826 DOI: 10.1371/journal.pone.0213278] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 08/29/2018] [Accepted: 02/18/2019] [Indexed: 12/14/2022] Open
Abstract
Recent findings established a link between DNA sequence composition and interphase chromatin architecture and explained the evolutionary conservation of TADs (Topologically Associated Domains) and LADs (Lamina Associated Domains) in mammals. This prompted us to analyse conformation capture and recombination rate data to study the relationship between chromatin architecture and the recombination landscape of human and mouse genomes. The results reveal that: (1) low recombination domains and blocks of elevated linkage disequilibrium tend to coincide with TADs and isochores, indicating co-evolving regulatory elements and genes in insulated neighbourhoods; (2) double strand break (DSB) and recombination frequencies increase in the short loops of GC-rich TADs, whereas recombination cold spots are typical of LADs; and (3) the binding and loading of proteins that are critical for DSB and meiotic recombination (SPO11, DMC1, H3K4me3 and PRDM9) are higher in GC-rich TADs. One explanation for these observations is that the occurrence of DSB and recombination in meiotic cells is associated with compositional and epigenetic features (genomic code) that influence DNA stiffness/flexibility and appear to be similar to those guiding the chromatin architecture in the interphase nucleus of pre-leptotene cells.
Affiliation(s)
- Kamel Jabbari
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
| | - Johannes Wirtz
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
| | - Martina Rauscher
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
| | - Thomas Wiehe
- Institute for Genetics, Biocenter Cologne, University of Cologne, Köln, Germany
|
19
|
Gul IS, Staal J, Hulpiau P, De Keuckelaere E, Kamm K, Deroo T, Sanders E, Staes K, Driege Y, Saeys Y, Beyaert R, Technau U, Schierwater B, van Roy F. GC Content of Early Metazoan Genes and Its Impact on Gene Expression Levels in Mammalian Cell Lines. Genome Biol Evol 2018; 10:909-917. [PMID: 29608715 PMCID: PMC5952964 DOI: 10.1093/gbe/evy040] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Accepted: 02/20/2018] [Indexed: 01/20/2023] Open
Abstract
With genomes available for many animal clades, including the early-branching metazoans, one can readily study the functional conservation of genes across a diversity of animal lineages. Ectopic expression of an animal protein in, for instance, a mammalian cell line is a commonly used strategy in structure–function analysis. However, this can be problematic in the case of distantly related species. Here we analyzed the GC content of the coding sequences of basal animals and show its impact on gene expression levels in human cell lines and, importantly, how this expression efficiency can be improved. Optimization of the GC3 content in the coding sequences of cadherin, alpha-catenin, and paracaspase of Trichoplax adhaerens dramatically increased the expression of these basal animal genes in human cell lines.
Affiliation(s)
- Ismail Sahin Gul
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Jens Staal
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Paco Hulpiau
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Evi De Keuckelaere
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Kai Kamm
- Institut für Tierökologie und Zellbiologie (ITZ), Division of Ecology and Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany
| | - Tom Deroo
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Ellen Sanders
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Katrien Staes
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Yasmine Driege
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Yvan Saeys
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Belgium
| | - Rudi Beyaert
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
| | - Ulrich Technau
- Department of Molecular Evolution and Development, Faculty of Life Sciences, University of Vienna, Austria
| | - Bernd Schierwater
- Institut für Tierökologie und Zellbiologie (ITZ), Division of Ecology and Evolution, Stiftung Tieraerztliche Hochschule Hannover, Hannover, Germany
| | - Frans van Roy
- Center for Inflammation Research, Flanders Institute for Biotechnology (VIB), Ghent, Belgium.,Department of Biomedical Molecular Biology, Ghent University, Belgium
|
20
|
Zhang D, Hu P, Liu T, Wang J, Jiang S, Xu Q, Chen L. GC bias lead to increased small amino acids and random coils of proteins in cold-water fishes. BMC Genomics 2018; 19:315. [PMID: 29720106 PMCID: PMC5930961 DOI: 10.1186/s12864-018-4684-z] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Received: 11/06/2017] [Accepted: 04/16/2018] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Temperature adaptation of biological molecules is fundamental in evolutionary studies but remains unsolved. Fishes living in cold water are adapted to low temperatures through adaptive modification of their biological molecules, which enables their functioning in extreme cold. To study nucleotide and amino acid preference in cold-water fishes, we investigated the substitution asymmetry of codons and amino acids in protein-coding DNA sequences between cold-water fishes and tropical fishes. The former include two Antarctic fishes, Dissostichus mawsoni (Antarctic toothfish) and Gymnodraco acuticeps (Antarctic dragonfish), and two temperate fishes, Gadus morhua (Atlantic cod) and Gasterosteus aculeatus (stickleback); the latter include three tropical fishes: Danio rerio (zebrafish), Oreochromis niloticus (Nile tilapia) and Xiphophorus maculatus (platyfish). RESULTS Cold-water fishes showed a preference for guanines and cytosines (GCs) in both synonymous and nonsynonymous codon substitutions when compared with tropical fishes. Amino acids coded by GC-rich codons are favored in the temperate fishes, while those coded by AT-rich codons are disfavored. Similar trends were discovered in Antarctic fishes but were statistically weaker. The preference for GC-rich codons in nonsynonymous substitutions tends to increase the ratio of small amino acids in proteins, which was demonstrated by biased small amino acid substitutions in the cold-water species when compared with the tropical species, especially in the temperate species. Prediction and comparison of the secondary structure of the proteomes showed that the frequency of random coils is significantly higher in the cold-water fish proteomes than in those of the tropical fishes.
CONCLUSIONS Our results suggest that natural selection at cold temperatures might favor biased GC content in the coding DNA sequences, which leads to an increased frequency of small amino acids and consequently more random coils in the proteomes of cold-water fishes.
Affiliation(s)
- Dongsheng Zhang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
| | - Peng Hu
- Department of Genetics, University of Pennsylvania, Philadelphia, USA
| | - Taigang Liu
- College of Informatics, Shanghai Ocean University, Shanghai, People's Republic of China
| | - Jian Wang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
| | - Shouwen Jiang
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China
| | - Qianghua Xu
- College of Marine Sciences, Shanghai Ocean University, Shanghai, People's Republic of China
| | - Liangbiao Chen
- Key Laboratory of Exploration and Utilization of Aquatic Genetic Resources, Shanghai Ocean University, Ministry of Education, National Demonstration Center for Experimental Fisheries Science Education (Shanghai Ocean University), Shanghai, People's Republic of China.
|
21
|
Berná L, Rodriguez M, Chiribao ML, Parodi-Talice A, Pita S, Rijo G, Alvarez-Valin F, Robello C. Expanding an expanded genome: long-read sequencing of Trypanosoma cruzi. Microb Genom 2018; 4. [PMID: 29708484 PMCID: PMC5994713 DOI: 10.1099/mgen.0.000177] [Citation(s) in RCA: 82] [Impact Index Per Article: 11.7] [Indexed: 01/18/2023] Open
Abstract
Although the genome of Trypanosoma cruzi, the causative agent of Chagas disease, was first made available in 2005, with additional strains reported later, the intrinsic genome complexity of this parasite (the abundance of repetitive sequences and genes organized in tandem) has traditionally hindered high-quality genome assembly and annotation. This also limits diverse types of analyses that require high degrees of precision. Long reads generated by third-generation sequencing technologies are particularly suitable to address the challenges associated with T. cruzi’s genome since they permit direct determination of the full sequence of large clusters of repetitive sequences without collapsing them. This, in turn, not only allows accurate estimation of gene copy numbers but also circumvents assembly fragmentation. Here, we present the analysis of the genome sequences of two T. cruzi clones: the hybrid TCC (TcVI) and the non-hybrid Dm28c (TcI), determined by PacBio Single Molecule Real-Time (SMRT) technology. The improved assemblies obtained here permitted us to accurately estimate gene copy numbers and the abundance and distribution of repetitive sequences (including satellites and retroelements). We found that the genome of T. cruzi is composed of a ‘core compartment’ and a ‘disruptive compartment’ which exhibit opposite GC content and gene composition. Novel tandem and dispersed repetitive sequences were identified, including some located inside coding sequences. Additionally, homologous chromosomes were separately assembled, allowing us to retrieve haplotypes as separate contigs instead of a unique mosaic sequence. Finally, manual annotation of surface multigene families, mucins and trans-sialidases now allows a better overview of these complex groups of genes.
Affiliation(s)
- Luisa Berná
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay
| | - Matias Rodriguez
- Sección Biomatemática - Unidad de Genómica Evolutiva, Facultad de Ciencias-UDELAR, Montevideo, Uruguay
| | - María Laura Chiribao
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay.,Departamento de Bioquímica, Facultad de Medicina-UDELAR, Montevideo, Uruguay
| | - Adriana Parodi-Talice
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay.,Sección Genética, Facultad de Ciencias-UDELAR, Montevideo, Uruguay
| | - Sebastián Pita
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay.,Sección Genética, Facultad de Ciencias-UDELAR, Montevideo, Uruguay
| | - Gastón Rijo
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay
| | - Fernando Alvarez-Valin
- Sección Biomatemática - Unidad de Genómica Evolutiva, Facultad de Ciencias-UDELAR, Montevideo, Uruguay
| | - Carlos Robello
- Laboratory of Host Pathogen Interactions-UBM, Institut Pasteur de Montevideo, Montevideo, Uruguay.,Departamento de Bioquímica, Facultad de Medicina-UDELAR, Montevideo, Uruguay
|
22
|
Lisachov AP, Trifonov VA, Giovannotti M, Ferguson-Smith MA, Borodin PM. Immunocytological analysis of meiotic recombination in two anole lizards (Squamata, Dactyloidae). COMPARATIVE CYTOGENETICS 2017; 11:129-141. [PMID: 28919954 PMCID: PMC5599703 DOI: 10.3897/compcytogen.v11i1.10916] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 10/25/2016] [Accepted: 01/16/2017] [Indexed: 05/13/2023]
Abstract
Although the evolutionary importance of meiotic recombination is not disputed, the significance of interspecies differences in recombination rates and recombination landscapes remains under-appreciated. Recombination rates and the distribution of chiasmata have been examined cytologically in many mammalian species, whereas data on other vertebrates are scarce. Immunolocalization of the synaptonemal complex protein SYCP3, centromere proteins, and the mismatch-repair protein MLH1, which is associated with the most common type of recombination nodules, was used to analyze the pattern of meiotic recombination in males of two species of iguanian lizards, Anolis carolinensis Voigt, 1832 and Deiroptyx coelestinus (Cope, 1862). These species are separated by a relatively long evolutionary history, although they retain the ancestral iguanian karyotype. In both species, similar and extremely uneven distributions of MLH1 foci along the macrochromosome bivalents were detected: approximately 90% of crossovers were located in the distal 20% of the chromosome arm length. Almost total suppression of recombination in the intermediate and proximal regions of the chromosome arms contradicts the hypothesis that "homogenous recombination" is responsible for the low variation in GC content across the anole genome. It also leads to strong linkage disequilibrium between the genes located in these regions, which may benefit conservation of co-adaptive gene arrays responsible for the ecological adaptations of the anoles.
Affiliation(s)
- Artem P. Lisachov
- Institute of Cytology and Genetics, Russian Academy of Sciences, Siberian Branch, Novosibirsk 630090, Russia
| | - Vladimir A. Trifonov
- Institute of Molecular and Cellular Biology, Russian Academy of Sciences, Siberian Branch, Novosibirsk 630090, Russia
- Novosibirsk State University, Novosibirsk 630090, Russia
| | - Massimo Giovannotti
- Dipartimento di Scienze della Vita e dell’Ambiente, Università Politecnica delle Marche, via Brecce Bianche, 60131 Ancona, Italy
| | - Malcolm A. Ferguson-Smith
- Cambridge Resource Centre for Comparative Genomics, Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
|
23
|
Costantini M, Musto H. The Isochores as a Fundamental Level of Genome Structure and Organization: A General Overview. J Mol Evol 2017; 84:93-103. [PMID: 28243687 DOI: 10.1007/s00239-017-9785-9] [Citation(s) in RCA: 25] [Impact Index Per Article: 3.1] [Received: 10/21/2016] [Accepted: 02/15/2017] [Indexed: 11/30/2022]
Abstract
The recent availability of a number of fully sequenced genomes (including those of marine organisms) has allowed isochores to be mapped very precisely from DNA sequences, confirming the results obtained before genome sequencing by ultracentrifugation in CsCl. In fact, the analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. In this review, we concentrate on some general genome features regarding the compositional organization of different organisms and their evolution, ranging from vertebrates and invertebrates to unicellular organisms. Since isochores are tightly linked to biological properties such as gene density, replication timing, and recombination, the new level of detail provided by the isochore map has helped the understanding of genome structure, function, and evolution. All the findings reported here confirm the idea that the isochores can be considered as a "fundamental level of genome structure and organization." We stress that we do not discuss in this review the origin of isochores, which is still a matter of controversy, but focus on well-established structural and physiological aspects.
Affiliation(s)
- Maria Costantini
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121, Napoli, Italy.
| | - Héctor Musto
- Laboratorio de Organización y Evolución del Genoma, Unidad de Genómica Evolutiva, Facultad de Ciencias, 11400, Montevideo, Uruguay
|
24
|
Jabbari K, Bernardi G. An Isochore Framework Underlies Chromatin Architecture. PLoS One 2017; 12:e0168023. [PMID: 28060840 PMCID: PMC5218411 DOI: 10.1371/journal.pone.0168023] [Citation(s) in RCA: 46] [Impact Index Per Article: 5.8] [Received: 09/28/2016] [Accepted: 11/24/2016] [Indexed: 01/03/2023] Open
Abstract
A recent investigation showed the existence of correlations between the architectural features of mammalian interphase chromosomes and the compositional properties of isochores. This result prompted us to compare maps of the Topologically Associating Domains (TADs) and of the Lamina Associated Domains (LADs) with the corresponding isochore maps of mouse and human chromosomes. This approach revealed that: 1) TADs and LADs correspond to isochores, i.e., isochores are the genomic units that underlie chromatin domains; 2) the conservation of TADs and LADs in mammalian genomes is explained by the evolutionary conservation of isochores; 3) chromatin domains corresponding to GC-poor isochores (e.g., LADs) show not only self-interactions but also intrachromosomal interactions with other domains also corresponding to GC-poor isochores even if located far away; in contrast, chromatin domains corresponding to GC-rich isochores (e.g., TADs) show more localized chromosomal interactions, many of which are inter-chromosomal. In conclusion, this investigation establishes a link between DNA sequences and chromatin architecture, explains the evolutionary conservation of TADs and LADs and provides new information on the spatial distribution of GC-poor/gene-poor and GC-rich/gene-rich chromosomal regions in the interphase nucleus.
Affiliation(s)
- Kamel Jabbari
- Max Planck Institute for Biology of Ageing, Joseph-Stelzmann-Straße 9B, Köln, Germany
| | - Giorgio Bernardi
- Science Department, Roma Tre University, Viale Marconi, Rome, Italy, and Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, Italy
|
25
|
Symonová R, Majtánová Z, Arias-Rodriguez L, Mořkovský L, Kořínková T, Cavin L, Pokorná MJ, Doležálková M, Flajšhans M, Normandeau E, Ráb P, Meyer A, Bernatchez L. Genome Compositional Organization in Gars Shows More Similarities to Mammals than to Other Ray-Finned Fish. JOURNAL OF EXPERIMENTAL ZOOLOGY PART B-MOLECULAR AND DEVELOPMENTAL EVOLUTION 2016; 328:607-619. [DOI: 10.1002/jez.b.22719] [Citation(s) in RCA: 19] [Impact Index Per Article: 2.1] [Received: 05/17/2016] [Revised: 11/13/2016] [Accepted: 11/22/2016] [Indexed: 12/12/2022]
Affiliation(s)
- Radka Symonová
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
- Department of Zoology; Faculty of Science; Charles University; Prague 2 Czech Republic
- Research Institute for Limnology; University of Innsbruck; Mondsee Austria
| | - Zuzana Majtánová
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
- Department of Zoology; Faculty of Science; Charles University; Prague 2 Czech Republic
| | - Lenin Arias-Rodriguez
- División Académica de Ciencias Biológicas; Universidad Juárez Autónoma de Tabasco (UJAT); Villahermosa Tabasco México
| | - Libor Mořkovský
- Department of Zoology; Faculty of Science; Charles University; Prague 2 Czech Republic
| | - Tereza Kořínková
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
| | - Lionel Cavin
- Muséum d'Histoire Naturelle; Geneva 6 Switzerland
| | - Martina Johnson Pokorná
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
- Department of Ecology; Faculty of Science; Charles University; Prague 2 Czech Republic
| | - Marie Doležálková
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
- Department of Zoology; Faculty of Science; Charles University; Prague 2 Czech Republic
| | - Martin Flajšhans
- Faculty of Fisheries and Protection of Waters; South Bohemian Research Centre of Aquaculture and Biodiversity of Hydrocenoses; University of South Bohemia in České Budějovice; Vodňany Czech Republic
| | - Eric Normandeau
- IBIS, Department of Biology, University Laval, Pavillon Charles-Eugène-Marchand; Avenue de la Médecine Quebec City; Canada
| | - Petr Ráb
- Laboratory of Fish Genetics; Institute of Animal Physiology and Genetics; The Czech Academy of Sciences; Liběchov Czech Republic
| | - Axel Meyer
- Chair in Zoology and Evolutionary Biology; Department of Biology; University of Konstanz; Konstanz Germany
| | - Louis Bernatchez
- IBIS, Department of Biology, University Laval, Pavillon Charles-Eugène-Marchand; Avenue de la Médecine Quebec City; Canada
|
26
|
Sizova TV, Karpova OI. The length of chromatin loops in meiotic prophase I of warm-blooded vertebrates depends on the DNA compositional organization. Russ J Genet 2016. [DOI: 10.1134/s1022795416110144] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/23/2022]
|
27
|
Lamolle G, Protasio AV, Iriarte A, Jara E, Simón D, Musto H. An Isochore-Like Structure in the Genome of the Flatworm Schistosoma mansoni. Genome Biol Evol 2016; 8:2312-8. [PMID: 27435793 PMCID: PMC5010904 DOI: 10.1093/gbe/evw170] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.7] [Indexed: 11/14/2022] Open
Abstract
Eukaryotic genomes are compositionally heterogeneous, that is, composed of regions that differ in guanine-cytosine (GC) content (isochores). The best-documented case is that of vertebrates (mainly mammals), although it has also been noted among unicellular eukaryotes and invertebrates. In the human genome, regarded as that of a typical mammal, this heterogeneity is associated with several features. Specifically, genes located in the GC-richest regions are the GC3-richest, display CpG islands and have shorter introns. Furthermore, these genes are more heavily expressed and tend to be located at the extremes of the chromosomes. Although the compositional heterogeneity seems to be widespread among eukaryotes, the associated properties noted in the human genome and other mammals have not been investigated in depth in other taxa. Here we provide evidence that the genome of the parasitic flatworm Schistosoma mansoni is compositionally heterogeneous and exhibits an isochore-like structure, displaying some features associated, until now, only with the human and other vertebrate genomes, with the exception of gene concentration.
Affiliation(s)
- Guillermo Lamolle
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
| | - Anna V Protasio
- Wellcome Trust Genome Campus, Wellcome Trust Sanger Institute, Cambridge, United Kingdom
| | - Andrés Iriarte
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay; Dpto. de Desarrollo Biotecnológico, Facultad de Medicina, Instituto de Higiene, Udelar, Montevideo, Uruguay
| | - Eugenio Jara
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
| | - Diego Simón
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
| | - Héctor Musto
- Laboratorio de Organización y Evolución del Genoma, Facultad de Ciencias, Udelar, Montevideo, Uruguay
|
28
|
Barton C, Iliopoulos CS, Pissis SP, Arhondakis S. Transcriptome activity of isochores during preimplantation process in human and mouse. FEBS Lett 2016; 590:2297-306. [PMID: 27279593 DOI: 10.1002/1873-3468.12245] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 02/29/2016] [Revised: 05/27/2016] [Accepted: 06/03/2016] [Indexed: 12/17/2022]
Abstract
This work investigates the role of isochores during the preimplantation process. Using RNA-seq data from human and mouse preimplantation stages, we created the spatio-temporal transcriptional profiles of the isochores during preimplantation. We found that, from early to late stages, GC-rich isochores increase their expression while GC-poor ones decrease it. Network analysis revealed that modules with few coexpressed isochores are GC-poorer than medium-large ones, and that the two show opposite expression trends as preimplantation advances, decreasing and increasing respectively. Our results reveal a functional contribution of the isochores, supporting the presence of structural-functional interactions during maturation and early-embryonic development.
Affiliation(s)
- Carl Barton
- The Blizard Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, UK
| | - Stilianos Arhondakis
- Institute of Molecular Biology and Biotechnology (IMBB), Foundation for Research and Technology - Hellas (FORTH), Heraklion, Crete, Greece
|
29
|
Costantini M, Greif G, Alvarez-Valin F, Bernardi G. The Anolis Lizard Genome: An Amniote Genome without Isochores? Genome Biol Evol 2016; 8:1048-55. [PMID: 26992416 PMCID: PMC4860688 DOI: 10.1093/gbe/evw056] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 12/12/2022] Open
Abstract
Two articles published 5 years ago concluded that the genome of the lizard Anolis carolinensis is an amniote genome without isochores. This claim was apparently contradicting previous results on the general presence of an isochore organization in all vertebrate genomes tested (including Anolis). In this investigation, we demonstrate that the Anolis genome is indeed heterogeneous in base composition, since its macrochromosomes comprise isochores mainly from the L2 and H1 families (a moderately GC-poor and a moderately GC-rich family, respectively), and since the majority of the sequenced microchromosomes consists of H1 isochores. These families are associated with different features of genome structure, including gene density and compositional correlations (e.g., GC3 vs flanking sequence GC and intron GC), as in the case of mammalian and avian genomes. Moreover, the assembled Anolis chromosomes have an enormous number of gaps, which could be due to sequencing problems in GC-rich regions of the genome. In conclusion, the Anolis genome is no exception to the general rule of an isochore organization in the genomes of vertebrates (and other eukaryotes).
Affiliation(s)
- Maria Costantini
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Naples, Italy
| | - Gonzalo Greif
- Unidad de Biología Molecular, Instituto Pasteur de Montevideo, Montevideo, Uruguay
| | - Fernando Alvarez-Valin
- Sección Biomatemática, Facultad de Ciencias, Universidad de la República, Montevideo, Uruguay
| | - Giorgio Bernardi
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Naples, Italy; Science Department, Roma Tre University, Rome, Italy
|
30
|
Bernardi G. Genome Organization and Chromosome Architecture. Cold Spring Harbor Symposia on Quantitative Biology 2016; 80:83-91. [PMID: 26801160 DOI: 10.1101/sqb.2015.80.027318] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Indexed: 11/24/2022]
Abstract
How the same DNA sequences can function in the three-dimensional architecture of interphase nucleus, fold in the very compact structure of metaphase chromosomes, and go precisely back to the original interphase architecture in the following cell cycle remains an unresolved question to this day. The solution to this question presented here rests on the correlations that were found to hold between the isochore organization of the genome and the architecture of chromosomes from interphase to metaphase. The key points are the following: (1) The transition from the looped domains and subdomains of interphase chromatin to the 30-nm fiber loops of early prophase chromosomes goes through their unfolding into an extended chromatin structure (probably a 10-nm "beads-on-a-string" structure); (2) the architectural proteins of interphase chromatin, such as CTCF and cohesin subunits, are retained in mitosis and are part of the discontinuous protein scaffold of mitotic chromosomes; and (3) the conservation of the link between architectural proteins and their binding sites on DNA through the cell cycle explains the reversibility of the interphase to mitosis process and the "mitotic memory" of interphase architecture.
Affiliation(s)
- Giorgio Bernardi
- Science Department, Roma Tre University, 00146 Rome, Italy; Stazione Zoologica Anton Dohrn, 80121 Naples, Italy
|
31
|
Abstract
How the same DNA sequences can function in the three-dimensional architecture of interphase nucleus, fold in the very compact structure of metaphase chromosomes and go precisely back to the original interphase architecture in the following cell cycle remains an unresolved question to this day. The strategy used to address this issue was to analyze the correlations between chromosome architecture and the compositional patterns of DNA sequences spanning a size range from a few hundred to a few thousand kilobases. This is a critical range that encompasses isochores, interphase chromatin domains and boundaries, and chromosomal bands. The solution rests on the following key points: 1) the transition from the looped domains and sub-domains of interphase chromatin to the 30-nm fiber loops of early prophase chromosomes goes through the unfolding into an extended chromatin structure (probably a 10-nm "beads-on-a-string" structure); 2) the architectural proteins of interphase chromatin, such as CTCF and cohesin sub-units, are retained in mitosis and are part of the discontinuous protein scaffold of mitotic chromosomes; 3) the conservation of the link between architectural proteins and their binding sites on DNA through the cell cycle explains the "mitotic memory" of interphase architecture and the reversibility of the interphase to mitosis process. The results presented here also lead to a general conclusion concerning the existence of correlations between the isochore organization of the genome and the architecture of chromosomes from interphase to metaphase.
Affiliation(s)
- Giorgio Bernardi
- Science Department, Roma Tre University, Marconi, Rome, Italy
- Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, Italy
|
32
|
Cozzi P, Milanesi L, Bernardi G. Segmenting the Human Genome into Isochores. Evol Bioinform Online 2015; 11:253-61. [PMID: 26640363 PMCID: PMC4662427 DOI: 10.4137/ebo.s27693] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.3] [Received: 04/15/2015] [Revised: 08/25/2015] [Accepted: 08/31/2015] [Indexed: 02/06/2023] Open
Abstract
The human genome is a mosaic of isochores, which are long (>200 kb) DNA sequences that are fairly homogeneous in base composition and can be assigned to five families spanning 33%–59% GC. Although the compartmentalized organization of the mammalian genome has been investigated for more than 40 years, no satisfactory automatic procedure for segmenting the genome into isochores has been available so far. We present a critical discussion of the currently available methods and a new approach called isoSegmenter, which segments the genome into isochores in a fast and completely automatic manner. This approach relies on two types of experimentally defined parameters: the compositional boundaries of isochore families and an optimal window size of 100 kb. The approach represents an improvement over the existing methods, is ideally suited for investigating long-range features of sequenced and assembled genomes, and is publicly available at https://github.com/bunop/isoSegmenter.
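The windowing idea described in this abstract can be illustrated with a minimal sketch. It is not the isoSegmenter code and does not use its API. The family boundaries below (37%, 41%, 46% and 53% GC) are an assumption taken from values commonly quoted in the isochore literature; the abstract itself does not list them. The function names `gc_percent`, `classify_family` and `segment_isochores` are hypothetical.

```python
from typing import List, Optional, Tuple

# ASSUMPTION: upper GC% bounds for families L1, L2, H1, H2; anything above is H3.
# These values are not given in the abstract and are used for illustration only.
FAMILY_UPPER_BOUNDS = [(37.0, "L1"), (41.0, "L2"), (46.0, "H1"), (53.0, "H2")]


def gc_percent(seq: str) -> Optional[float]:
    """Return GC% over unambiguous bases, or None if the window has none."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else None


def classify_family(gc: Optional[float]) -> str:
    """Map a GC% value to an isochore family label ('gap' if undefined)."""
    if gc is None:
        return "gap"
    for upper, name in FAMILY_UPPER_BOUNDS:
        if gc < upper:
            return name
    return "H3"


def segment_isochores(seq: str, window: int = 100_000) -> List[Tuple[int, int, str]]:
    """Classify non-overlapping windows and merge adjacent windows of one family.

    Returns (start, end, family) tuples with half-open coordinates.
    A trailing partial window is ignored in this sketch.
    """
    segments: List[Tuple[int, int, str]] = []
    for start in range(0, len(seq) - window + 1, window):
        end = start + window
        family = classify_family(gc_percent(seq[start:end]))
        if segments and segments[-1][2] == family:
            # Extend the previous segment instead of opening a new one.
            segments[-1] = (segments[-1][0], end, family)
        else:
            segments.append((start, end, family))
    return segments
```

On a real assembly one would call `segment_isochores(chromosome_sequence)` with the default 100 kb window; a small window is only useful for toy inputs.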
Affiliation(s)
- Paolo Cozzi
- National Research Council, Institute for Biomedical Technologies, Segrate, Milan, Italy; Parco Tecnologico Padano, Lodi, Italy
| | - Luciano Milanesi
- National Research Council, Institute for Biomedical Technologies, Segrate, Milan, Italy
| | - Giorgio Bernardi
- National Research Council, Institute for Biomedical Technologies, Segrate, Milan, Italy; Science Department, Rome 3 University, Rome, Italy
|
33
|
Costantini M. An overview on genome organization of marine organisms. Mar Genomics 2015; 24 Pt 1:3-9. [PMID: 25899406 DOI: 10.1016/j.margen.2015.03.015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2015] [Revised: 03/17/2015] [Accepted: 03/17/2015] [Indexed: 11/16/2022]
Abstract
In this review we will concentrate on some general genome features of marine organisms and their evolution, ranging from vertebrates to invertebrates to unicellular organisms. Before genome sequencing, ultracentrifugation in CsCl achieved a high-resolution view of mammalian DNA (without looking at the sequence). The analytical profile of human DNA showed that the vertebrate genome is a mosaic of isochores, typically megabase-size DNA segments that belong to a small number of families characterized by different GC levels. The recent availability of a number of fully sequenced genomes has allowed the isochores to be mapped very precisely, based on DNA sequences. Since isochores are tightly linked to biological properties such as gene density, replication timing and recombination, the new level of detail provided by the isochore map has helped the understanding of genome structure, function and evolution. This led to the current level of knowledge and to further insights.
Affiliation(s)
- Maria Costantini
- Department of Biology and Evolution of Marine Organisms, Stazione Zoologica Anton Dohrn, Villa Comunale, 80121 Naples, Italy.
|
34
|
Blazie SM, Babb C, Wilky H, Rawls A, Park JG, Mangone M. Comparative RNA-Seq analysis reveals pervasive tissue-specific alternative polyadenylation in Caenorhabditis elegans intestine and muscles. BMC Biol 2015; 13:4. [PMID: 25601023 PMCID: PMC4343181 DOI: 10.1186/s12915-015-0116-6] [Citation(s) in RCA: 61] [Impact Index Per Article: 6.1] [Received: 09/29/2014] [Accepted: 01/12/2015] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Tissue-specific RNA plasticity broadly impacts the development, tissue identity and adaptability of all organisms, but changes in composition, expression levels and its impact on gene regulation in different somatic tissues are largely unknown. Here we developed a new method, polyA-tagging and sequencing (PAT-Seq) to isolate high-quality tissue-specific mRNA from Caenorhabditis elegans intestine, pharynx and body muscle tissues and study changes in their tissue-specific transcriptomes and 3'UTRomes. RESULTS We have identified thousands of novel genes and isoforms differentially expressed between these three tissues. The intestine transcriptome is expansive, expressing over 30% of C. elegans mRNAs, while muscle transcriptomes are smaller but contain characteristic unique gene signatures. Active promoter regions in all three tissues reveal both known and novel enriched tissue-specific elements, along with putative transcription factors, suggesting novel tissue-specific modes of transcription initiation. We have precisely mapped approximately 20,000 tissue-specific polyadenylation sites and discovered that about 30% of transcripts in somatic cells use alternative polyadenylation in a tissue-specific manner, with their 3'UTR isoforms significantly enriched with microRNA targets. CONCLUSIONS For the first time, PAT-Seq allowed us to directly study tissue specific gene expression changes in an in vivo setting and compare these changes between three somatic tissues from the same organism at single-base resolution within the same experiment. We pinpoint precise tissue-specific transcriptome rearrangements and for the first time link tissue-specific alternative polyadenylation to miRNA regulation, suggesting novel and unexplored tissue-specific post-transcriptional regulatory networks in somatic cells.
Affiliation(s)
- Stephen M Blazie
- Molecular and Cellular Biology Graduate Program, Arizona State University, Tempe, AZ, USA.
- Virginia G. Piper Center for Personalized Diagnostics, The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA.
| | - Cody Babb
- Virginia G. Piper Center for Personalized Diagnostics, The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA.
| | - Henry Wilky
- Barrett Honors College, Arizona State University, 751 E Lemon Mall, 1282 Tempe, AZ, USA.
| | - Alan Rawls
- Molecular and Cellular Biology Graduate Program, Arizona State University, Tempe, AZ, USA.
- Barrett Honors College, Arizona State University, 751 E Lemon Mall, 1282 Tempe, AZ, USA.
| | - Jin G Park
- Virginia G. Piper Center for Personalized Diagnostics, The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA.
| | - Marco Mangone
- Molecular and Cellular Biology Graduate Program, Arizona State University, Tempe, AZ, USA.
- Virginia G. Piper Center for Personalized Diagnostics, The Biodesign Institute at Arizona State University, 1001 S McAllister Ave, Tempe, AZ, USA.
- Barrett Honors College, Arizona State University, 751 E Lemon Mall, 1282 Tempe, AZ, USA.
|
35
|
Figuet E, Ballenghien M, Romiguier J, Galtier N. Biased gene conversion and GC-content evolution in the coding sequences of reptiles and vertebrates. Genome Biol Evol 2014; 7:240-50. [PMID: 25527834 PMCID: PMC4316630 DOI: 10.1093/gbe/evu277] [Citation(s) in RCA: 56] [Impact Index Per Article: 5.1] [Indexed: 12/15/2022] Open
Abstract
Mammalian and avian genomes are characterized by a substantial spatial heterogeneity of GC-content, which is often interpreted as reflecting the effect of local GC-biased gene conversion (gBGC), a meiotic repair bias that favors G and C over A and T alleles in high-recombining genomic regions. Surprisingly, the first fully sequenced nonavian sauropsid (i.e., reptile), the green anole Anolis carolinensis, revealed a highly homogeneous genomic GC-content landscape, suggesting the possibility that gBGC might not be at work in this lineage. Here, we analyze GC-content evolution at third-codon positions (GC3) in 44 vertebrate species, including eight newly sequenced transcriptomes, with a specific focus on nonavian sauropsids. We report that reptiles, including the green anole, have a genome-wide distribution of GC3 similar to that of mammals and birds, and we infer a strong GC3-heterogeneity to be already present in the tetrapod ancestor. We further show that the dynamic of coding sequence GC-content is largely governed by karyotypic features in vertebrates, notably in the green anole, in agreement with the gBGC hypothesis. The discrepancy between third-codon positions and noncoding DNA regarding GC-content dynamics in the green anole could not be explained by the activity of transposable elements or selection on codon usage. This analysis highlights the unique value of third-codon positions as an insertion/deletion-free marker of nucleotide substitution biases that ultimately affect the evolution of proteins.
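GC3, the statistic at the centre of this abstract, is the fraction of G or C bases at third-codon positions of a coding sequence. The following is a minimal sketch of that definition, assuming an in-frame coding sequence that starts at codon position 1. The function name `gc3` is hypothetical, and the abstract does not state how the authors handled stop codons, ambiguous bases or partial codons, so the choices made here are assumptions.

```python
def gc3(cds: str) -> float:
    """Fraction of G/C among unambiguous bases at third-codon positions.

    ASSUMPTIONS: the sequence is in frame, a trailing partial codon is
    dropped, and third positions that are not A, C, G or T are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # drop a trailing partial codon
    third_positions = cds[2:usable:3]         # bases at indices 2, 5, 8, ...
    counted = [base for base in third_positions if base in "ACGT"]
    if not counted:
        raise ValueError("no unambiguous third-codon positions in sequence")
    return sum(base in "GC" for base in counted) / len(counted)
```

For example, `gc3("ATGGCA")` looks at the third positions G and A and gives 0.5. Comparing the distribution of this value across the genes of a species is one way to describe the GC3 heterogeneity the abstract refers to.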
Affiliation(s)
- Emeric Figuet
- CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
| | - Marion Ballenghien
- CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
| | - Jonathan Romiguier
- CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France; Department of Ecology and Evolution, Biophore, University of Lausanne, Switzerland
| | - Nicolas Galtier
- CNRS, Université Montpellier 2, UMR 5554, Institut des Sciences de l'Evolution de Montpellier, France
|
36
|
Šmarda P, Bureš P, Horová L, Leitch IJ, Mucina L, Pacini E, Tichý L, Grulich V, Rotreklová O. Ecological and evolutionary significance of genomic GC content diversity in monocots. Proc Natl Acad Sci U S A 2014; 111:E4096-102. [PMID: 25225383 PMCID: PMC4191780 DOI: 10.1073/pnas.1321152111] [Citation(s) in RCA: 180] [Impact Index Per Article: 16.4] [Indexed: 01/20/2023] Open
Abstract
Genomic DNA base composition (GC content) is predicted to significantly affect genome functioning and species ecology. Although several hypotheses have been put forward to address the biological impact of GC content variation in microbial and vertebrate organisms, the biological significance of GC content diversity in plants remains unclear because of a lack of sufficiently robust genomic data. Using flow cytometry, we report genomic GC contents for 239 species representing 70 of 78 monocot families and compare them with genomic characters, a suite of life history traits and climatic niche data using phylogeny-based statistics. GC content of monocots varied between 33.6% and 48.9%, with several groups exceeding the GC content known for any other vascular plant group, highlighting their unusual genome architecture and organization. GC content showed a quadratic relationship with genome size, with the decreases in GC content in larger genomes possibly being a consequence of the higher biochemical costs of GC base synthesis. Dramatic decreases in GC content were observed in species with holocentric chromosomes, whereas increased GC content was documented in species able to grow in seasonally cold and/or dry climates, possibly indicating an advantage of GC-rich DNA during cell freezing and desiccation. We also show that genomic adaptations associated with changing GC content might have played a significant role in the evolution of the Earth's contemporary biota, such as the rise of grass-dominated biomes during the mid-Tertiary. One of the major selective advantages of GC-rich DNA is hypothesized to be facilitating more complex gene regulation.
Affiliation(s)
- Petr Šmarda
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
| | - Petr Bureš
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
| | - Lucie Horová
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
| | - Ilia J Leitch
- Jodrell Laboratory, Royal Botanic Gardens, Kew, Surrey TW9 3DS, United Kingdom
| | - Ladislav Mucina
- School of Plant Biology, University of Western Australia, Perth, WA 6009, Australia; Centre for Geographic Analysis, Department of Geography and Environmental Studies, Stellenbosch University, Stellenbosch 7600, South Africa
| | - Ettore Pacini
- Department of Life Sciences, Siena University, 53100 Siena, Italy
| | - Lubomír Tichý
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
| | - Vít Grulich
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
| | - Olga Rotreklová
- Department of Botany and Zoology, Masaryk University, CZ-61137 Brno, Czech Republic
|
37
|
Panda A, Podder S, Chakraborty S, Ghosh TC. GC-made protein disorder sheds new light on vertebrate evolution. Genomics 2014; 104:530-7. [PMID: 25240915 DOI: 10.1016/j.ygeno.2014.09.003] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 03/05/2014] [Revised: 08/05/2014] [Accepted: 09/10/2014] [Indexed: 10/24/2022]
Abstract
At the emergence of endothermic vertebrates, GC-rich regions of the ectothermic ancestral genomes underwent a significant GC increase. Such an increase was previously postulated to increase the thermodynamic and structural stability of proteins through a selective increase of protein hydrophobicity. Here, we found that an increase in GC content promotes a higher content of disorder-promoting amino acids in the proteins of endothermic vertebrates, and that the increase in hydrophobicity is mainly due to a higher content of the small disorder-promoting amino acid alanine. In endothermic vertebrates, the prevalence of disordered residues was found to promote functional diversity of proteins encoded by GC-rich genes. A higher fraction of disordered residues in this group of proteins was also found to minimize their aggregation tendency. Thus, we propose that the GC transition has favored disordered residues to promote functional diversity in GC-rich genes, and to protect them against functional loss by protein misfolding.
Affiliation(s)
- Arup Panda
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
| | - Soumita Podder
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
| | - Sandip Chakraborty
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India
| | - Tapash Chandra Ghosh
- Bioinformatics Centre, Bose Institute, P 1/12, C.I.T. Scheme VII M, Kolkata 700 054, India.
|
38
|
Sizova TV, Karpova OI. Evolution conservatively of SCAR DNA localization in genome isochores of warm-blooded vertebrates. Mol Biol 2014. [DOI: 10.1134/s0026893314030194] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 11/23/2022]
|
39
|
Implications of human genome structural heterogeneity: functionally related genes tend to reside in organizationally similar genomic regions. BMC Genomics 2014; 15:252. [PMID: 24684786 PMCID: PMC4234528 DOI: 10.1186/1471-2164-15-252] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 08/15/2012] [Accepted: 03/21/2014] [Indexed: 01/30/2023] Open
Abstract
BACKGROUND In an earlier study, we hypothesized that genomic segments with different sequence organization patterns (OPs) might display functional specificity despite their similar GC content. Here we tested this hypothesis by dividing the human genome into 100 kb segments, classifying these segments into five compositional groups according to GC content, and then characterizing each segment within the five groups by oligonucleotide counting (k-mer analysis; also referred to as compositional spectrum analysis, or CSA), to examine the distribution of sequence OPs in the segments. We performed the CSA on the entire DNA, i.e., its coding and non-coding parts, the latter being much more abundant in the genome than the former. RESULTS We identified 38 OP-type clusters of segments that differ in their compositional spectrum (CS) organization. Many of the segments that shared the same OP type were enriched with genes related to the same biological processes (developmental, signaling, etc.), components of biochemical complexes, or organelles. Thirteen OP-type clusters showed significant enrichment in genes connected to specific gene-ontology terms. Some of these clusters seemed to reflect certain events during periods of horizontal gene transfer and genome expansion, and subsequent evolution of genomic regions requiring coordinated regulation. CONCLUSIONS There may be a tendency for genes that are involved in the same biological process, complex or organelle to use the same OP, even at a distance of ~100 kb from the genes. Although the intergenic DNA is non-coding, the general pattern of sequence organization (e.g., reflected in over-represented oligonucleotide "words") may be important and may have been protected, to some extent, in the course of evolution.
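The oligonucleotide counting step described in this abstract can be sketched as follows. This is a simplified illustration using exact, overlapping k-mer counts normalized to frequencies; it is not the authors' CSA implementation, whose word length, matching rules and clustering method are not given in the abstract. The word length `k=4` and the function names `kmer_counts`, `compositional_spectrum` and `segment_spectra` are assumptions made for the example.

```python
from collections import Counter
from typing import Dict, List

VALID_BASES = set("ACGT")


def kmer_counts(seq: str, k: int) -> Counter:
    """Count overlapping k-mers, skipping words with ambiguous bases."""
    seq = seq.upper()
    counts: Counter = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= VALID_BASES:
            counts[word] += 1
    return counts


def compositional_spectrum(seq: str, k: int) -> Dict[str, float]:
    """Relative k-mer frequencies for one segment (empty dict if no words)."""
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()} if total else {}


def segment_spectra(genome: str, segment_len: int = 100_000, k: int = 4) -> List[Dict[str, float]]:
    """Spectrum of each non-overlapping segment; a trailing partial segment is dropped."""
    return [
        compositional_spectrum(genome[start:start + segment_len], k)
        for start in range(0, len(genome) - segment_len + 1, segment_len)
    ]
```

Segments could then be grouped by GC content and their spectra compared or clustered with any standard distance measure; that step is left out because the abstract does not specify it.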
|
40
|
Guariniello S, Colonna G, Raucci R, Costantini M, Di Bernardo G, Bergantino F, Castello G, Costantini S. Structure-function relationship and evolutionary history of the human selenoprotein M (SelM) found over-expressed in hepatocellular carcinoma. Biochimica et Biophysica Acta 2014; 1844:447-456. [PMID: 24332979 DOI: 10.1016/j.bbapap.2013.12.001] [Citation(s) in RCA: 13] [Impact Index Per Article: 1.2] [Received: 07/27/2013] [Revised: 11/28/2013] [Accepted: 12/02/2013] [Indexed: 11/17/2022]
Abstract
Humans have 25 known selenoproteins, which play important roles in redox regulation, detoxification, immune-system protection and viral suppression. In particular, selenoprotein M (SelM) may function as a thiol disulfide oxidoreductase that participates in the formation of disulfide bonds, and may be implicated in calcium responses. However, it presents a redox motif (CXXU), where U is a selenocysteine, and may also function as a redox regulator because its decreased or increased expression, regulated by dietary selenium, alters redox homeostasis. The literature reports its involvement in neurodegenerative diseases but not in cancer. In this paper we evaluated SelM expression in two hepatoma cell lines, HepG2 and Huh7, compared to normal hepatocytes. The results suggested its involvement in hepatocellular carcinoma (HCC), as well as its possible use as a putative marker to follow the progression of this cancer. The aim of this study was to analyze the structure-function relationships of SelM. First, we studied the evolutionary history of this protein by phylogenetic analysis and by the GC content of genes from various species. We then modeled the three-dimensional structure of human SelM, evaluating its energetic stability by molecular dynamics simulations. Moreover, we modeled some of its mutants to obtain structural information helpful for structure-based drug design.
Affiliation(s)
- Stefano Guariniello
- Biochemistry, Biophysics and General Pathology Department and Computational Biology Doctorate, Second University of Naples, Naples, Italy
| | - Giovanni Colonna
- Biochemistry, Biophysics and General Pathology Department and Computational Biology Doctorate, Second University of Naples, Naples, Italy
| | - Raffaele Raucci
- Biochemistry, Biophysics and General Pathology Department and Computational Biology Doctorate, Second University of Naples, Naples, Italy
| | - Gianni Di Bernardo
- Department of Experimental Medicine, Section of Biotechnology and Molecular Biology, Faculty of Medicine, Second University of Naples, Naples, Italy
| | - Francesca Bergantino
- Pharmacogenomic Laboratory, Oncology Research Center of Mercogliano (CROM), Istituto Nazionale Per Lo Studio E La Cura Dei Tumori "Fondazione Giovanni Pascale", IRCCS, Italy
| | - Giuseppe Castello
- Istituto Nazionale Per Lo Studio E La Cura Dei Tumori "Fondazione Giovanni Pascale", IRCCS, Italy
| | - Susan Costantini
- Istituto Nazionale Per Lo Studio E La Cura Dei Tumori "Fondazione Giovanni Pascale", IRCCS, Italy.
| |
Collapse
|
41
|
Costantini M, Alvarez-Valin F, Costantini S, Cammarano R, Bernardi G. Compositional patterns in the genomes of unicellular eukaryotes. BMC Genomics 2013; 14:755. [PMID: 24188247 PMCID: PMC4007698 DOI: 10.1186/1471-2164-14-755] [Citation(s) in RCA: 10] [Impact Index Per Article: 0.8] [Received: 11/01/2012] [Accepted: 10/31/2013] [Indexed: 11/29/2022] Open
Abstract
Background: The genomes of multicellular eukaryotes are compartmentalized in mosaics of isochores, large and fairly homogeneous stretches of DNA that belong to a small number of families characterized by different average GC levels, by different gene concentration (that increase with GC), different chromatin structures, different replication timing in the cell cycle, and other different properties. A question raised by these basic results concerns how far back in evolution the compartmentalized organization of the eukaryotic genomes arose. Results: In the present work we approached this problem by studying the compositional organization of the genomes from the unicellular eukaryotes for which full sequences are available, the sample used being representative. The average GC levels of the genomes from unicellular eukaryotes cover an extremely wide range (19%-60% GC) and the compositional patterns of individual genomes are extremely different but all genomes tested show a compositional compartmentalization. Conclusions: The average GC range of the genomes of unicellular eukaryotes is very broad (as broad as that of prokaryotes) and individual compositional patterns cover a very broad range from very narrow to very complex. Both features are not surprising for organisms that are very far from each other both in terms of phylogenetic distances and of environmental life conditions. Most importantly, all genomes tested, a representative sample of all supergroups of unicellular eukaryotes, are compositionally compartmentalized, a major difference with prokaryotes.
Affiliation(s)
- Maria Costantini
- Laboratory of Animal Physiology and Evolution, Stazione Zoologica Anton Dohrn, Villa Comunale, Naples 80121, Italy.
|
42
|
Wan QH, Pan SK, Hu L, Zhu Y, Xu PW, Xia JQ, Chen H, He GY, He J, Ni XW, Hou HL, Liao SG, Yang HQ, Chen Y, Gao SK, Ge YF, Cao CC, Li PF, Fang LM, Liao L, Zhang S, Wang MZ, Dong W, Fang SG. Genome analysis and signature discovery for diving and sensory properties of the endangered Chinese alligator. Cell Res 2013; 23:1091-105. [PMID: 23917531 PMCID: PMC3760627 DOI: 10.1038/cr.2013.104] [Citation(s) in RCA: 90] [Impact Index Per Article: 7.5] [Received: 04/15/2013] [Revised: 06/20/2013] [Accepted: 07/08/2013] [Indexed: 12/27/2022] Open
Abstract
Crocodilians are diving reptiles that can hold their breath under water for long periods of time and are crepuscular animals with excellent sensory abilities. They comprise a sister lineage of birds and have no sex chromosome. Here we report the genome sequence of the endangered Chinese alligator (Alligator sinensis) and describe its unique features. The next-generation sequencing generated 314 Gb of raw sequence, yielding a genome size of 2.3 Gb. A total of 22 200 genes were predicted in Alligator sinensis using a de novo, homology- and RNA-based combined model. The genetic basis of long-diving behavior includes duplication of the bicarbonate-binding hemoglobin gene, co-functioning of routine phosphate-binding and special bicarbonate-binding oxygen transport, and positively selected energy metabolism, ammonium bicarbonate excretion and cardiac muscle contraction. Further, we elucidated the robust Alligator sinensis sensory system, including a significantly expanded olfactory receptor repertoire, rapidly evolving nerve-related cellular components and visual perception, and positive selection of the night vision-related opsin and sound detection-associated otopetrin. We also discovered a well-developed immune system with a considerable number of lineage-specific antigen-presentation genes for adaptive immunity as well as expansion of the tripartite motif-containing C-type lectin and butyrophilin genes for innate immunity and expression of antibacterial peptides. Multifluorescence in situ hybridization showed that alligator chromosome 3, which encodes DMRT1, exhibits significant synteny with chicken chromosome Z. Finally, population history analysis indicated population admixture 0.60-1.05 million years ago, when the Qinghai-Tibetan Plateau was uplifted.
Affiliation(s)
- Qiu-Hong Wan
- The Key Laboratory of Conservation Biology for Endangered Wildlife of the Ministry of Education, State Conservation Center for Gene Resources of Endangered Wildlife, College of Life Sciences, Zhejiang University, Hangzhou, Zhejiang 310058, China
|
43
|
Abstract
Patterns of replication within eukaryotic genomes correlate with gene expression, chromatin structure, and genome evolution. Recent advances in genome-scale mapping of replication kinetics have allowed these correlations to be explored in many species, cell types, and growth conditions, and these large data sets have allowed quantitative and computational analyses. One striking new correlation to emerge from these analyses is between replication timing and the three-dimensional structure of chromosomes. This correlation, which is significantly stronger than with any single histone modification or chromosome-binding protein, suggests that replication timing is controlled at the level of chromosomal domains. This conclusion dovetails with parallel work on the heterogeneity of origin firing and the competition between origins for limiting activators to suggest a model in which the stochastic probability of individual origin firing is modulated by chromosomal domain structure to produce patterns of replication. Whether these patterns have inherent biological functions or simply reflect higher-order genome structure is an open question.
Affiliation(s)
- Nicholas Rhind
- Department of Biochemistry and Molecular Pharmacology, University of Massachusetts Medical School, Worcester, Massachusetts 01605, USA.
|
44
|
Qiu R, Chen C, Jiang H, Shen L, Wu M, Liu C. Large genomic region free of GWAS-based common variants contains fertility-related genes. PLoS One 2013; 8:e61917. [PMID: 23613972 PMCID: PMC3629113 DOI: 10.1371/journal.pone.0061917] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.3] [Received: 10/21/2012] [Accepted: 03/15/2013] [Indexed: 02/01/2023] Open
Abstract
DNA variants, such as single nucleotide polymorphisms (SNPs) and copy number variants (CNVs), are unevenly distributed across the human genome. Currently, dbSNP contains more than 6 million human SNPs, and whole-genome genotyping arrays can assay more than 4 million of them simultaneously. In our study, we first questioned whether published genome-wide association study (GWAS) assays cover all regions of the genome well. Using dbSNP build 135 data, we identified 50 genomic regions longer than 100 Kb that do not contain any common SNPs, i.e., those with minor allele frequency (MAF) ≥ 1%. Secondly, because conserved regions are generally of functional importance, we tested genes in those large genomic regions without common SNPs. We found 97 genes, which were enriched for reproductive function. In addition, we further filtered out regions with CNVs listed in the Database of Genomic Variants (DGV), segmental duplications from Human Genome Project and common variants identified by personal genome sequencing (UCSC). No region survived this filtering. Our analysis suggests that, while there may not be many large genomic regions free of common variants, there are still some "holes" in the current human genomic map for common SNPs. Because GWAS only focused on common SNPs, interpretation of GWAS results should take this limitation into account. Particularly, two recent GWAS of fertility may be incomplete due to the map deficit. Additional SNP discovery efforts should pay close attention to these regions.
Affiliation(s)
- Rong Qiu
- School of Information Science and Engineering, Central South University, Changsha, China
- Hunan Engineering Laboratory for Advanced Control and Intelligent Automation, Changsha, China
- Chao Chen
- Department of Psychiatry, University of Illinois at Chicago, Chicago, United States of America
- Institute of Human Genetics, University of Illinois at Chicago, Chicago, United States of America
- Hong Jiang
- Department of Neurology, Xiangya Hospital, Central South University, Changsha, China
- Libing Shen
- School of Life Science, Fudan University, Shanghai, China
- Min Wu
- School of Information Science and Engineering, Central South University, Changsha, China
- Hunan Engineering Laboratory for Advanced Control and Intelligent Automation, Changsha, China
- Chunyu Liu
- Department of Psychiatry, University of Illinois at Chicago, Chicago, United States of America
- Institute of Human Genetics, University of Illinois at Chicago, Chicago, United States of America
- State Key Laboratory of Medical Genetics of China, Central South University, Changsha, China
|
45
|
Mugal CF, Arndt PF, Ellegren H. Twisted signatures of GC-biased gene conversion embedded in an evolutionary stable karyotype. Mol Biol Evol 2013; 30:1700-12. [PMID: 23564940 PMCID: PMC3684855 DOI: 10.1093/molbev/mst067] [Citation(s) in RCA: 33] [Impact Index Per Article: 2.8] [Indexed: 12/14/2022] Open
Abstract
The genomes of many vertebrates show a characteristic heterogeneous distribution of GC content, the so-called GC isochore structure. The origin of isochores has been explained via the mechanism of GC-biased gene conversion (gBGC). However, although the isochore structure is declining in many mammalian genomes, the heterogeneity in GC content is being reinforced in the avian genome. Despite this discrepancy, which remains unexplained, examinations of individual substitution frequencies in mammals and birds are both consistent with the gBGC model of isochore evolution. On the other hand, a negative correlation between substitution and recombination rate found in the chicken genome is inconsistent with the gBGC model. It should therefore be important to consider along with gBGC other consequences of recombination on the origin and fate of mutations, as well as to account for relationships between recombination rate and other genomic features. We therefore developed an analytical model to describe the substitution patterns found in the chicken genome, and further investigated the relationships between substitution patterns and several genomic features in a rigorous statistical framework. Our analysis indicates that GC content itself, either directly or indirectly via interrelations to other genomic features, has an impact on the substitution pattern. Further, we suggest that this phenomenon is particularly visible in avian genomes due to their unusually low rate of chromosomal evolution. Because of this, interrelations between GC content and other genomic features are being reinforced, and are as such more pronounced in avian genomes as compared with other vertebrate genomes with a less stable karyotype.
Affiliation(s)
- Carina F Mugal
- Department of Evolutionary Biology, Evolutionary Biology Centre, Uppsala University, Uppsala, Sweden
|
46
|
Chambers EV, Bickmore WA, Semple CA. Divergence of mammalian higher order chromatin structure is associated with developmental loci. PLoS Comput Biol 2013; 9:e1003017. [PMID: 23592965 PMCID: PMC3617018 DOI: 10.1371/journal.pcbi.1003017] [Citation(s) in RCA: 34] [Impact Index Per Article: 2.8] [Received: 10/08/2012] [Accepted: 02/18/2013] [Indexed: 02/03/2023] Open
Abstract
Several recent studies have examined different aspects of mammalian higher order chromatin structure - replication timing, lamina association and Hi-C inter-locus interactions - and have suggested that most of these features of genome organisation are conserved over evolution. However, the extent of evolutionary divergence in higher order structure has not been rigorously measured across the mammalian genome, and until now little has been known about the characteristics of any divergent loci present. Here, we generate a dataset combining multiple measurements of chromatin structure and organisation over many embryonic cell types for both human and mouse that, for the first time, allows a comprehensive assessment of the extent of structural divergence between mammalian genomes. Comparison of orthologous regions confirms that all measurable facets of higher order structure are conserved between human and mouse, across the vast majority of the detectably orthologous genome. This broad similarity is observed in spite of many loci possessing cell type specific structures. However, we also identify hundreds of regions (from 100 Kb to 2.7 Mb in size) showing consistent evidence of divergence between these species, constituting at least 10% of the orthologous mammalian genome and encompassing many hundreds of human and mouse genes. These regions show unusual shifts in human GC content, are unevenly distributed across both genomes, and are enriched in human subtelomeric regions. Divergent regions are also relatively enriched for genes showing divergent expression patterns between human and mouse ES cells, implying these regions cause divergent regulation. Particular divergent loci are strikingly enriched in genes implicated in vertebrate development, suggesting important roles for structural divergence in the evolution of mammalian developmental programmes. These data suggest that, though relatively rare in the mammalian genome, divergence in higher order chromatin structure has played important roles during evolution.
Affiliation(s)
- Emily V. Chambers
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Wendy A. Bickmore
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
- Colin A. Semple
- MRC Human Genetics Unit, MRC Institute of Genetics and Molecular Medicine, University of Edinburgh, Western General Hospital, Edinburgh, United Kingdom
|
47
|
Nabiyouni M, Prakash A, Fedorov A. Vertebrate codon bias indicates a highly GC-rich ancestral genome. Gene 2013; 519:113-9. [PMID: 23376453 DOI: 10.1016/j.gene.2013.01.033] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.3] [Received: 11/06/2012] [Revised: 01/10/2013] [Accepted: 01/17/2013] [Indexed: 11/16/2022]
Abstract
Two factors are thought to have contributed to the origin of codon usage bias in eukaryotes: 1) genome-wide mutational forces that shape overall GC-content and create context-dependent nucleotide bias, and 2) positive selection for codons that maximize efficient and accurate translation. Particularly in vertebrates, these two explanations contradict each other and cloud the origin of codon bias in the taxon. On the one hand, mutational forces fail to explain GC-richness (~60%) of third codon positions, given the GC-poor overall genomic composition among vertebrates (~40%). On the other hand, positive selection cannot easily explain strict regularities in codon preferences. Large-scale bioinformatic assessment, of nucleotide composition of coding and non-coding sequences in vertebrates and other taxa, suggests a simple possible resolution for this contradiction. Specifically, we propose that the last common vertebrate ancestor had a GC-rich genome (~65% GC). The data suggest that whole-genome mutational bias is the major driving force for generating codon bias. As the bias becomes prominent, it begins to affect translation and can result in positive selection for optimal codons. The positive selection can, in turn, significantly modulate codon preferences.
Affiliation(s)
- Maryam Nabiyouni
- Program in Bioinformatics and Proteomics/Genomics, University of Toledo, Health Science Campus, Toledo, OH 43614, USA.
|
48
|
Frousios K, Iliopoulos CS, Tischler G, Kossida S, Pissis SP, Arhondakis S. Transcriptome map of mouse isochores in embryonic and neonatal cortex. Genomics 2012. [PMID: 23195409 DOI: 10.1016/j.ygeno.2012.11.006] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.2] [Indexed: 12/23/2022]
Abstract
Several studies on adult tissues agree on the presence of a positive effect of the genomic and genic base composition on mammalian gene expression. Recent literature supports the idea that during developmental processes GC-poor genomic regions are preferentially implicated. We investigate the relationship between the compositional properties of the isochores and of the genes with their respective expression activity during developmental processes. Using RNA-seq data from two distinct developmental stages of the mouse cortex, embryonic day 18 (E18) and postnatal day 7 (P7), we established for the first time a developmental-related transcriptome map of the mouse isochores. Additionally, for each stage we estimated the correlation between isochores' GC level and their expression activity, and the genes' expression patterns for each isochore family. Our analyses add evidence supporting the idea that during development GC-poor isochores are preferentially implicated, and confirm the positive effect of genes' GC level on their expression activity.
Affiliation(s)
- Kimon Frousios
- Department of Informatics, King's College London, The Strand, London WC2R 2LS, UK
- Costas S Iliopoulos
- Department of Informatics, King's College London, The Strand, London WC2R 2LS, UK; School of Mathematics and Statistics, University of Western Australia, 35 Stirling Highway, Crawley, Perth WA 6009, Australia
- German Tischler
- Lehrstuhl für Informatik 2, Universität Würzburg, Am Hubland, 97074 Würzburg, Germany
- Sophia Kossida
- Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou, Athens 115 27, Greece
- Solon P Pissis
- Florida Museum of Natural History, University of Florida, 1659 Museum Road, Gainesville, FL 32611, USA; Heidelberg Institute for Theoretical Studies, 35 Schloss-Wolfsbrunnenweg, Heidelberg D-69118, Germany
- Stilianos Arhondakis
- Biomedical Research Foundation of the Academy of Athens, 4 Soranou Ephessiou, Athens 115 27, Greece.
|
49
|
Matsubara K, Kuraku S, Tarui H, Nishimura O, Nishida C, Agata K, Kumazawa Y, Matsuda Y. Intra-genomic GC heterogeneity in sauropsids: evolutionary insights from cDNA mapping and GC(3) profiling in snake. BMC Genomics 2012; 13:604. [PMID: 23140509 PMCID: PMC3549455 DOI: 10.1186/1471-2164-13-604] [Citation(s) in RCA: 37] [Impact Index Per Article: 2.8] [Received: 01/28/2012] [Accepted: 10/24/2012] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND: Extant sauropsids (reptiles and birds) are divided into two major lineages, the lineage of Testudines (turtles) and Archosauria (crocodilians and birds) and the lineage of Lepidosauria (tuatara, lizards, worm lizards and snakes). Karyotypes of these sauropsidan groups generally consist of macrochromosomes and microchromosomes. In chicken, microchromosomes exhibit a higher GC-content than macrochromosomes. To examine the pattern of intra-genomic GC heterogeneity in lepidosaurian genomes, we constructed a cytogenetic map of the Japanese four-striped rat snake (Elaphe quadrivirgata) with 183 cDNA clones by fluorescence in situ hybridization, and examined the correlation between the GC-content of exonic third codon positions (GC3) of the genes and the size of chromosomes on which the genes were localized. RESULTS: Although GC3 distribution of snake genes was relatively homogeneous compared with those of the other amniotes, microchromosomal genes showed significantly higher GC3 than macrochromosomal genes as in chicken. Our snake cytogenetic map also identified several conserved segments between the snake macrochromosomes and the chicken microchromosomes. Cross-species comparisons revealed that GC3 of most snake orthologs in such macrochromosomal segments were GC-poor (GC3 < 50%) whereas those of chicken orthologs in microchromosomes were relatively GC-rich (GC3 ≥ 50%). CONCLUSION: Our results suggest that the chromosome size-dependent GC heterogeneity had already occurred before the lepidosaur-archosaur split, 275 million years ago. This character was probably present in the common ancestor of lepidosaurs but lost in the lineage leading to Anolis during the diversification of lepidosaurs. We also identified several genes whose GC-content might have been influenced by the size of the chromosomes on which they were harbored over the course of sauropsid evolution.
Affiliation(s)
- Kazumi Matsubara
- Department of Information and Biological Sciences, Graduate School of Natural Sciences, Nagoya City University, 1 Yamanohata, Mizuho-cho, Mizuho-ku, Nagoya, Aichi 467-8501, Japan.
|
50
|
Meuleman W, Peric-Hupkes D, Kind J, Beaudry JB, Pagie L, Kellis M, Reinders M, Wessels L, van Steensel B. Constitutive nuclear lamina-genome interactions are highly conserved and associated with A/T-rich sequence. Genome Res 2012; 23:270-80. [PMID: 23124521 PMCID: PMC3561868 DOI: 10.1101/gr.141028.112] [Citation(s) in RCA: 336] [Impact Index Per Article: 25.8] [Indexed: 11/24/2022]
Abstract
In metazoans, the nuclear lamina is thought to play an important role in the spatial organization of interphase chromosomes, by providing anchoring sites for large genomic segments named lamina-associated domains (LADs). Some of these LADs are cell-type specific, while many others appear constitutively associated with the lamina. Constitutive LADs (cLADs) may contribute to a basal chromosome architecture. By comparison of mouse and human lamina interaction maps, we find that the sizes and genomic positions of cLADs are strongly conserved. Moreover, cLADs are depleted of synteny breakpoints, pointing to evolutionary selective pressure to keep cLADs intact. Paradoxically, the overall sequence conservation is low for cLADs. Instead, cLADs are universally characterized by long stretches of DNA of high A/T content. Cell-type specific LADs also tend to adhere to this “A/T rule” in embryonic stem cells, but not in differentiated cells. This suggests that the A/T rule represents a default positioning mechanism that is locally overruled during lineage commitment. Analysis of paralogs suggests that during evolution changes in A/T content have driven the relocation of genes to and from the nuclear lamina, in tight association with changes in expression level. Taken together, these results reveal that the spatial organization of mammalian genomes is highly conserved and tightly linked to local nucleotide composition.
Affiliation(s)
- Wouter Meuleman
- Division of Gene Regulation, Netherlands Cancer Institute, 1066 CX Amsterdam, The Netherlands
|