1
Zhang W, Yang Z, Wang W, Sun Q. Primase promotes the competition between transcription and replication on the same template strand resulting in DNA damage. Nat Commun 2024; 15:73. [PMID: 38168108 PMCID: PMC10761990 DOI: 10.1038/s41467-023-44443-0] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/20/2023] [Accepted: 12/13/2023] [Indexed: 01/05/2024] Open
Abstract
Transcription-replication conflicts (TRCs), especially head-on TRCs (HO-TRCs), can introduce R-loops and DNA damage; however, the underlying mechanisms are still largely unclear. We previously identified a chloroplast-localized RNase H1 protein, AtRNH1C, that can remove R-loops and relax HO-TRCs for genome integrity. Through a mutagenesis screen, we identify a mutation in chloroplast-localized primase ATH that weakens the binding affinity for the DNA template and reduces the activities of RNA primer synthesis and delivery. This slows down DNA replication and reduces transcription-replication competition, thus rescuing the developmental defects of atrnh1c. Strand-specific DNA damage sequencing reveals that HO-TRCs cause DNA damage at the end of the transcription unit in the lagging strand, and overexpression of ATH can boost HO-TRCs and exacerbate DNA damage. Furthermore, mutation of plastid DNA polymerase Pol1A can similarly rescue the defects in atrnh1c mutants. Taken together, these results illustrate a potentially conserved mechanism among organisms, in which primase activity can promote the occurrence of transcription-replication conflicts, leading to HO-TRCs and genome instability.
Affiliation(s)
- Weifeng Zhang
- Center for Plant Biology, School of Life Sciences, Tsinghua University, 100084, Beijing, China
- Tsinghua-Peking Center for Life Sciences, 100084, Beijing, China
- Zhuo Yang
- Center for Plant Biology, School of Life Sciences, Tsinghua University, 100084, Beijing, China
- Tsinghua-Peking Center for Life Sciences, 100084, Beijing, China
- Wenjie Wang
- Center for Plant Biology, School of Life Sciences, Tsinghua University, 100084, Beijing, China
- Tsinghua-Peking Center for Life Sciences, 100084, Beijing, China
- Qianwen Sun
- Center for Plant Biology, School of Life Sciences, Tsinghua University, 100084, Beijing, China.
- Tsinghua-Peking Center for Life Sciences, 100084, Beijing, China.
2
Genomic stability of mouse spermatogonial stem cells in vitro. Sci Rep 2021; 11:24199. [PMID: 34921203 PMCID: PMC8683475 DOI: 10.1038/s41598-021-03658-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/24/2021] [Accepted: 12/08/2021] [Indexed: 11/08/2022] Open
Abstract
Germline mutations underlie genetic diversity and species evolution. Previous studies have assessed the theoretical mutation rates and spectra in germ cells mostly by analyzing genetic markers and reporter genes in populations and pedigrees. This study reported the direct measurement of germline mutations by whole-genome sequencing of cultured spermatogonial stem cells in mice, namely germline stem (GS) cells, together with multipotent GS (mGS) cells that spontaneously dedifferentiated from GS cells. GS cells produce functional sperm that can generate offspring by transplantation into seminiferous tubules, whereas mGS cells contribute to germline chimeras by microinjection into blastocysts in a manner similar to embryonic stem cells. The estimated mutation rate of GS and mGS cells was approximately 0.22 × 10⁻⁹ and 1.0 × 10⁻⁹ per base per cell population doubling, respectively, indicating that GS cells have a lower mutation rate compared to mGS cells. GS and mGS cells also showed distinct mutation patterns, with C-to-T transition as the most frequent in GS cells and C-to-A transversion as the most predominant in mGS cells. By karyotype analysis, GS cells showed recurrent trisomy of chromosomes 15 and 16, whereas mGS cells frequently exhibited chromosomes 1, 6, 8, and 11 amplifications, suggesting that distinct chromosomal abnormalities confer a selective growth advantage for each cell type in vitro. These data provide the basis for studying germline mutations and a foundation for the future utilization of GS cells for reproductive technology and clinical applications.
3
Khrustalev VV, Giri R, Khrustaleva TA, Kapuganti SK, Stojarov AN, Poboinev VV. Translation-Associated Mutational U-Pressure in the First ORF of SARS-CoV-2 and Other Coronaviruses. Front Microbiol 2020; 11:559165. [PMID: 33072018 PMCID: PMC7536284 DOI: 10.3389/fmicb.2020.559165] [Citation(s) in RCA: 9] [Impact Index Per Article: 2.3] [Received: 05/05/2020] [Accepted: 08/31/2020] [Indexed: 12/17/2022] Open
Abstract
Within 4 months of the ongoing COVID-19 pandemic caused by SARS-CoV-2, more than 250 nucleotide mutations have been detected in ORF1ab of the virus isolated from infected persons from different parts of the globe. These observations raise an obvious question about the rate and direction of mutational pressure, which matters for further vaccine and therapeutic design. In this study, we did a comparative analysis of ORF1a and ORF1b by using the first isolate (Wuhan strain) as the parent sequence. We observed that most of the nucleotide mutations are C to U transitions. The rate of synonymous C to U transitions is significantly higher than the rate of non-synonymous ones, indicating negative selection on amino acid substitutions. Further, trends in nucleotide usage bias have been investigated in 49 coronavirus species. A strong bias in nucleotide usage in fourfold degenerate sites toward uracil residues is seen in ORF1ab of all the studied coronaviruses: both in ORF1a and in ORF1b, which is translated thanks to programmed ribosomal frameshifting that has an efficiency of 14–45% in different species. A more substantial mutational U-pressure is observed in ORF1a than in ORF1b, perhaps because ORF1a is translated more frequently than ORF1b. Mutational U-pressure is present even in ORFs that are not translated from genomic RNA plus strands, but the bias is weaker than in ORF1ab. Unlike other nucleotide mutations, mutational U-pressure caused by cytosine deamination, mostly occurring during RNA plus strand replication and also translation, cannot be corrected by the proof-reading machinery of coronaviruses. The knowledge generated on the mutational U-pressure that becomes stronger during translation of viral RNA plus strands has implications for vaccine and nucleoside analog development for treating COVID-19 and other coronavirus infections.
Affiliation(s)
- Rajanish Giri
- School of Basic Sciences, Indian Institute of Technology Mandi, Mandi, India
- Tatyana Aleksandrovna Khrustaleva
- Biochemical Group of Multidisciplinary Diagnostic Laboratory, Institute of Physiology of the National Academy of Sciences of Belarus, Minsk, Belarus
4
Whittle CA, Kulkarni A, Extavour CG. Evidence of multifaceted functions of codon usage in translation within the model beetle Tribolium castaneum. DNA Res 2020; 26:473-484. [PMID: 31922535 PMCID: PMC6993815 DOI: 10.1093/dnares/dsz025] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/23/2019] [Accepted: 01/07/2020] [Indexed: 01/06/2023] Open
Abstract
Synonymous codon use is non-random. Codons most used in highly transcribed genes, often called optimal codons, typically have high gene counts of matching tRNA genes (tRNA abundance) and promote accurate and/or efficient translation. Non-optimal codons, those least used in highly expressed genes, may also affect translation. In multicellular organisms, codon optimality may vary among tissues. At present, however, tissue specificity of codon use remains poorly understood. Here, we studied codon usage of genes highly transcribed in germ line (testis and ovary) and somatic tissues (gonadectomized males and females) of the beetle Tribolium castaneum. The results demonstrate that: (i) the majority of optimal codons were organism-wide, the same in all tissues, and had numerous matching tRNA gene copies (Opt-codon↑tRNAs), consistent with translational selection; (ii) some optimal codons varied among tissues, suggesting tissue-specific tRNA populations; (iii) wobble tRNA were required for translation of certain optimal codons (Opt-codonwobble), possibly allowing precise translation and/or protein folding; and (iv) remarkably, some non-optimal codons had abundant tRNA genes (Nonopt-codon↑tRNAs), and genes using those codons were tightly linked to ribosomal and stress-response functions. Thus, Nonopt-codon↑tRNAs codons may regulate translation of specific genes. Together, the evidence suggests that codon use and tRNA genes regulate multiple translational processes in T. castaneum.
Affiliation(s)
- Cassandra G Extavour
- Department of Organismic and Evolutionary Biology; Department of Molecular and Cellular Biology, Harvard University, Cambridge, MA, USA
5
Exploration of the Germline Genome of the Ciliate Chilodonella uncinata through Single-Cell Omics (Transcriptomics and Genomics). mBio 2018; 9:mBio.01836-17. [PMID: 29317511 PMCID: PMC5760741 DOI: 10.1128/mbio.01836-17] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Indexed: 11/30/2022] Open
Abstract
Separate germline and somatic genomes are found in numerous lineages across the eukaryotic tree of life, often separated into distinct tissues (e.g., in plants, animals, and fungi) or distinct nuclei sharing a common cytoplasm (e.g., in ciliates and some foraminifera). In ciliates, germline-limited (i.e., micronuclear-specific) DNA is eliminated during the development of a new somatic (i.e., macronuclear) genome in a process that is tightly linked to large-scale genome rearrangements, such as deletions and reordering of protein-coding sequences. Most studies of germline genome architecture in ciliates have focused on the model ciliates Oxytricha trifallax, Paramecium tetraurelia, and Tetrahymena thermophila, for which the complete germline genome sequences are known. Outside of these model taxa, only a few dozen germline loci have been characterized from a limited number of cultivable species, which is likely due to difficulties in obtaining sufficient quantities of “purified” germline DNA in these taxa. Combining single-cell transcriptomics and genomics, we have overcome these limitations and provide the first insights into the structure of the germline genome of the ciliate Chilodonella uncinata, a member of the understudied class Phyllopharyngea. Our analyses reveal the following: (i) large gene families contain a disproportionate number of genes from scrambled germline loci; (ii) germline-soma boundaries in the germline genome are demarcated by substantial shifts in GC content; (iii) single-cell omics techniques provide large-scale quality germline genome data with limited effort, at least for ciliates with extensively fragmented somatic genomes. Our approach provides an efficient means to understand better the evolution of genome rearrangements between germline and soma in ciliates. 
Our understanding of the distinctions between germline and somatic genomes in ciliates has largely relied on studies of a few model genera (e.g., Oxytricha, Paramecium, Tetrahymena). We have used single-cell omics to explore germline-soma distinctions in the ciliate Chilodonella uncinata, which likely diverged from the better-studied ciliates ~700 million years ago. The analyses presented here indicate that developmentally regulated genome rearrangements between germline and soma are demarcated by rapid transitions in local GC composition and lead to diversification of protein families. The approaches used here provide the basis for future work aimed at discerning the evolutionary impacts of germline-soma distinctions among diverse ciliates.
6
Assaf ZJ, Tilk S, Park J, Siegal ML, Petrov DA. Deep sequencing of natural and experimental populations of Drosophila melanogaster reveals biases in the spectrum of new mutations. Genome Res 2017; 27:1988-2000. [PMID: 29079675 PMCID: PMC5741049 DOI: 10.1101/gr.219956.116] [Citation(s) in RCA: 31] [Impact Index Per Article: 4.4] [Received: 12/19/2016] [Accepted: 10/20/2017] [Indexed: 11/25/2022]
Abstract
Mutations provide the raw material of evolution, and thus our ability to study evolution depends fundamentally on having precise measurements of mutational rates and patterns. We generate a data set for this purpose using (1) de novo mutations from mutation accumulation experiments and (2) extremely rare polymorphisms from natural populations. The first, mutation accumulation (MA) lines are the product of maintaining flies in tiny populations for many generations, therefore rendering natural selection ineffective and allowing new mutations to accrue in the genome. The second, rare genetic variation from natural populations allows the study of mutation because extremely rare polymorphisms are relatively unaffected by the filter of natural selection. We use both methods in Drosophila melanogaster, first generating our own novel data set of sequenced MA lines and performing a meta-analysis of all published MA mutations (∼2000 events) and then identifying a high quality set of ∼70,000 extremely rare (≤0.1%) polymorphisms that are fully validated with resequencing. We use these data sets to precisely measure mutational rates and patterns. Highlights of our results include: a high rate of multinucleotide mutation events at both short (∼5 bp) and long (∼1 kb) genomic distances, showing that mutation drives GC content lower in already GC-poor regions, and using our precise context-dependent mutation rates to predict long-term evolutionary patterns at synonymous sites. We also show that de novo mutations from independent MA experiments display similar patterns of single nucleotide mutation and well match the patterns of mutation found in natural populations.
Affiliation(s)
- Zoe June Assaf
- Department of Genetics, Stanford University, Stanford, California 94305, USA; Department of Biology, Stanford University, Stanford, California 94305, USA
- Susanne Tilk
- Department of Biology, Stanford University, Stanford, California 94305, USA
- Jane Park
- Department of Biology, Stanford University, Stanford, California 94305, USA
- Mark L Siegal
- Department of Biology, New York University, New York, New York 10003, USA
- Dmitri A Petrov
- Department of Biology, Stanford University, Stanford, California 94305, USA
7
Chen WH, Lu G, Bork P, Hu S, Lercher MJ. Energy efficiency trade-offs drive nucleotide usage in transcribed regions. Nat Commun 2016; 7:11334. [PMID: 27098217 PMCID: PMC4844684 DOI: 10.1038/ncomms11334] [Citation(s) in RCA: 58] [Impact Index Per Article: 7.3] [Received: 10/09/2015] [Accepted: 03/16/2016] [Indexed: 01/29/2023] Open
Abstract
Efficient nutrient usage is a trait under universal selection. A substantial part of cellular resources is spent on making nucleotides. We thus expect preferential use of cheaper nucleotides especially in transcribed sequences, which are often amplified thousand-fold compared with genomic sequences. To test this hypothesis, we derive a mutation-selection-drift equilibrium model for nucleotide skews (strand-specific usage of 'A' versus 'T' and 'G' versus 'C'), which explains nucleotide skews across 1,550 prokaryotic genomes as a consequence of selection on efficient resource usage. Transcription-related selection generally favours the cheaper nucleotides 'U' and 'C' at synonymous sites. However, the information encoded in mRNA is further amplified through translation. Due to unexpected trade-offs in the codon table, cheaper nucleotides encode on average energetically more expensive amino acids. These trade-offs apply to both strand-specific nucleotide usage and GC content, causing a universal bias towards the more expensive nucleotides 'A' and 'G' at non-synonymous coding sites.
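As a rough illustration of the quantity this abstract refers to, the sketch below computes AT and GC skews for a single strand using the commonly used normalized-difference form, (A − T)/(A + T) and (G − C)/(G + C). This form is an assumption made here for illustration; the abstract does not state the exact definition used in the paper, and the function name `nucleotide_skews` is hypothetical.

```python
def nucleotide_skews(seq: str) -> dict:
    """Return AT and GC skews for one strand of a nucleotide sequence.

    Assumed definition (not taken from the abstract):
        AT skew = (A - T) / (A + T)
        GC skew = (G - C) / (G + C)
    A skew is reported as 0.0 when neither base of the pair is present.
    """
    s = seq.upper()
    a, t, g, c = (s.count(base) for base in "ATGC")
    at_skew = (a - t) / (a + t) if (a + t) else 0.0
    gc_skew = (g - c) / (g + c) if (g + c) else 0.0
    return {"AT": at_skew, "GC": gc_skew}


if __name__ == "__main__":
    # Toy sequence chosen only to exercise the function.
    print(nucleotide_skews("AAATGGGC"))
```

Positive values indicate an excess of A over T (or G over C) on the strand that was passed in; the reverse complement gives skews of opposite sign.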
Affiliation(s)
- Wei-Hua Chen
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Structural and Computational Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Guanting Lu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Peer Bork
- Structural and Computational Unit, European Molecular Biology Laboratory, Heidelberg 69117, Germany
- Bioinformatics department, Max Delbrück Centre for Molecular Medicine, Berlin 13125, Germany
- Songnian Hu
- CAS Key Laboratory of Genome Sciences and Information, Beijing Institute of Genomics, Chinese Academy of Sciences, Beijing 100101, China
- Martin J Lercher
- Institute for Computer Science and Cluster of Excellence on Plant Sciences, Heinrich Heine University, Düsseldorf 40225, Germany
8
Haradhvala NJ, Polak P, Stojanov P, Covington KR, Shinbrot E, Hess JM, Rheinbay E, Kim J, Maruvka YE, Braunstein LZ, Kamburov A, Hanawalt PC, Wheeler DA, Koren A, Lawrence MS, Getz G. Mutational Strand Asymmetries in Cancer Genomes Reveal Mechanisms of DNA Damage and Repair. Cell 2016; 164:538-49. [PMID: 26806129 DOI: 10.1016/j.cell.2015.12.050] [Citation(s) in RCA: 271] [Impact Index Per Article: 33.9] [Received: 11/13/2015] [Revised: 12/21/2015] [Accepted: 12/24/2015] [Indexed: 12/20/2022]
Abstract
Mutational processes constantly shape the somatic genome, leading to immunity, aging, cancer, and other diseases. When cancer is the outcome, we are afforded a glimpse into these processes by the clonal expansion of the malignant cell. Here, we characterize a less explored layer of the mutational landscape of cancer: mutational asymmetries between the two DNA strands. Analyzing whole-genome sequences of 590 tumors from 14 different cancer types, we reveal widespread asymmetries across mutagenic processes, with transcriptional ("T-class") asymmetry dominating UV-, smoking-, and liver-cancer-associated mutations and replicative ("R-class") asymmetry dominating POLE-, APOBEC-, and MSI-associated mutations. We report a striking phenomenon of transcription-coupled damage (TCD) on the non-transcribed DNA strand and provide evidence that APOBEC mutagenesis occurs on the lagging-strand template during DNA replication. As more genomes are sequenced, studying and classifying their asymmetries will illuminate the underlying biological mechanisms of DNA damage and repair.
Affiliation(s)
- Nicholas J Haradhvala
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Paz Polak
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA; Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Petar Stojanov
- Carnegie Mellon University School of Computer Science, 5000 Forbes Avenue, Pittsburgh, PA 15213, USA
- Kyle R Covington
- Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Eve Shinbrot
- Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Julian M Hess
- Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Esther Rheinbay
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Jaegil Kim
- Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Yosef E Maruvka
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Lior Z Braunstein
- Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA
- Atanas Kamburov
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA; Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA
- Philip C Hanawalt
- Stanford University Department of Biology, 450 Serra Mall, Stanford, CA 94305, USA
- David A Wheeler
- Baylor College of Medicine, 1 Baylor Plaza, Houston, TX 77030, USA
- Amnon Koren
- Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA; Cornell University Department of Molecular Biology and Genetics, 526 Campus Road, Ithaca, NY 14853, USA
- Michael S Lawrence
- Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA.
- Gad Getz
- Massachusetts General Hospital Cancer Center and Department of Pathology, 55 Fruit Street, Boston, MA 02114, USA; Broad Institute of Harvard and MIT, 415 Main Street, Cambridge, MA 02142, USA; Harvard Medical School, 25 Shattuck Street, Boston, MA 02115, USA.
9
Haerty W, Ponting CP. Unexpected selection to retain high GC content and splicing enhancers within exons of multiexonic lncRNA loci. RNA (New York, N.Y.) 2015; 21:333-46. [PMID: 25589248 PMCID: PMC4338330 DOI: 10.1261/rna.047324.114] [Citation(s) in RCA: 64] [Impact Index Per Article: 7.1] [Received: 07/16/2014] [Accepted: 11/25/2014] [Indexed: 06/04/2023]
Abstract
If sequencing was possible only for genomes, and not for RNAs or proteins, then functional protein-coding exons would be recognizable by their unusual patterns of nucleotide composition, specifically a high GC content across the body of exons, and an unusual nucleotide content near their edges. RNAs and proteins can, of course, be sequenced, but the extent of functionality of intergenic long noncoding RNAs (lncRNAs) remains under question owing to their low nucleotide conservation. Inspired by the nucleotide composition patterns of protein-coding exons, we sought evidence for functionality across lncRNA loci from diverse species. We found that such patterns across multiexonic lncRNA loci mirror those of protein-coding genes, although to a lesser degree: specifically, compared with introns, lncRNA exons are GC rich. Additionally, we report evidence for the action of purifying selection to preserve exonic splicing enhancers within human multiexonic lncRNAs and nucleotide composition in fruit fly lncRNAs. Our findings provide evidence for selection for more efficient rates of transcription and splicing within lncRNA loci. Despite only a minor proportion of their RNA bases being constrained, multiexonic intergenic lncRNAs appear to require accurate splicing of their exons to transact their function.
10
Scala G, Affinito O, Miele G, Monticelli A, Cocozza S. Evidence for evolutionary and nonevolutionary forces shaping the distribution of human genetic variants near transcription start sites. PLoS One 2014; 9:e114432. [PMID: 25474578 PMCID: PMC4256220 DOI: 10.1371/journal.pone.0114432] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.4] [Received: 08/07/2014] [Accepted: 11/09/2014] [Indexed: 11/19/2022] Open
Abstract
The regions surrounding transcription start sites (TSSs) of genes play a critical role in the regulation of gene expression. At the same time, current evidence indicates that these regions are particularly stressed by transcription-related mutagenic phenomena. In this work we performed a genome-wide analysis of the distribution of single nucleotide polymorphisms (SNPs) inside the 10 kb region flanking human TSSs by dividing SNPs into four classes according to their frequency (rare, two intermediate classes, and common). We found that, in this 10 kb region, the distribution of variants depends on their frequency and on their localization relative to the TSS. We found that the distribution of variants is generally different for TSSs located inside or outside of CpG islands. We found a significant relationship between the distribution of rare variants and nucleosome occupancy scores. Furthermore, our analysis suggests that evolutionary (purifying selection) and nonevolutionary (biased gene conversion) forces both play a role in determining the relative SNP frequency around TSSs. Finally, we analyzed the potential pathogenicity of each class of variant using the Combined Annotation Dependent Depletion score. In conclusion, this study provides a novel and detailed view of the distribution of genomic variants around TSSs, providing insight into the forces that instigate and maintain variability in such critical regions.
Affiliation(s)
- Giovanni Scala
- Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Dipartimento di Fisica, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy
- Ornella Affinito
- Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Istituto di Endocrinologia ed Oncologia Sperimentale (IEOS), CNR, Naples, Italy
- Gennaro Miele
- Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Dipartimento di Fisica, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Istituto Nazionale di Fisica Nucleare, Sezione di Napoli, Naples, Italy
- Antonella Monticelli
- Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Istituto di Endocrinologia ed Oncologia Sperimentale (IEOS), CNR, Naples, Italy
- Sergio Cocozza
- Gruppo Interdipartimentale di Bioinformatica e Biologia Computazionale, Università degli Studi di Napoli “Federico II”, Naples, Italy
- Dipartimento di Medicina Molecolare e Biotecnologie Mediche, Università degli Studi di Napoli “Federico II”, Naples, Italy
11
Lenz C, Haerty W, Golding GB. Increased substitution rates surrounding low-complexity regions within primate proteins. Genome Biol Evol 2014; 6:655-65. [PMID: 24572016 PMCID: PMC3971593 DOI: 10.1093/gbe/evu042] [Citation(s) in RCA: 14] [Impact Index Per Article: 1.4] [Indexed: 12/30/2022] Open
Abstract
Previous studies have found that DNA flanking low-complexity regions (LCRs) has an increased substitution rate. Here, the substitution rate was confirmed to increase in the vicinity of LCRs in several primate species, including humans. This effect was also found among human sequences from the 1000 Genomes Project. A strong correlation was found between average substitution rate per site and distance from the LCR, as well as between the proportion of genes with gaps in the alignment at each site and distance from the LCR. Along with substitution rates, dN/dS ratios were also determined for each site, and the proportion of sites undergoing negative selection was found to have a negative relationship with distance from the LCR.
Affiliation(s)
- Carolyn Lenz
- Department of Biology, McMaster University, Hamilton, Ontario, Canada
12
Abstract
Transcription requires unwinding complementary DNA strands, generating torsional stress, and sensitizing the exposed single strands to chemical reactions and endogenous damaging agents. In addition, transcription can occur concomitantly with the other major DNA metabolic processes (replication, repair, and recombination), creating opportunities for either cooperation or conflict. Genetic modifications associated with transcription are a global issue in the small genomes of microorganisms in which noncoding sequences are rare. Transcription likewise becomes significant when one considers that most of the human genome is transcriptionally active. In this review, we focus specifically on the mutagenic consequences of transcription. Mechanisms of transcription-associated mutagenesis in microorganisms are discussed, as is the role of transcription in somatic instability of the vertebrate immune system.
Affiliation(s)
- Sue Jinks-Robertson
- Department of Molecular Genetics and Microbiology, Duke University Medical Center, Durham, North Carolina 27710
13
Khrustalev VV, Barkovsky EV, Khrustaleva TA, Lelevich SV. Intragenic isochores (intrachores) in the platelet phosphofructokinase gene of Passeriform birds. Gene 2014; 546:16-24. [PMID: 24861647 DOI: 10.1016/j.gene.2014.05.045] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.8] [Received: 02/18/2014] [Revised: 05/09/2014] [Accepted: 05/21/2014] [Indexed: 10/25/2022]
Abstract
Total GC-content in the platelet phosphofructokinase gene of Zebra Finch (Taeniopygia guttata) is low (37.53 ± 0.51%), while there are short areas (about 300 nucleotides in length) with increased GC-content overlapping its exon 4 and exon 17. GC-content in third codon positions (3GC) of those two exons is equal to 88.42 and 80.00%, respectively, while overall 3GC of the coding region is equal to 49.9%. A similar distribution of GC-content has been found in platelet phosphofructokinase genes of other birds from the order Passeriformes. According to the results of phylogenetic analysis, formation of those areas with high G+C started from 91.4 to 47.1 million years ago, since there are no such peaks of GC-content in homologous genes of other birds and reptiles. There are clusters of transcription factor binding sites in those areas with higher GC-content, as well as microRNA precursors conserved in Zebra Finch and Flycatcher genes. According to our hypothesis, those intragenic isochores (intrachores) may be consequences of autonomous microRNA precursor transcription at certain period(s) of embryogenesis and gametogenesis, when the platelet phosphofructokinase gene itself is not expressed. Transcription-associated mutational pressure existing during those periods may cause the increase in rates of AT to GC mutations in those genes which are transcribed.
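For readers unfamiliar with the 3GC statistic quoted in this abstract, the following minimal sketch shows one way GC-content in third codon positions could be computed for an in-frame coding sequence. It assumes the sequence starts at codon position 1 and ignores any trailing incomplete codon; the function name `gc3_percent` is hypothetical and this is not the authors' pipeline.

```python
def gc3_percent(coding_seq: str) -> float:
    """Percentage of G or C at third codon positions (3GC).

    Assumes the sequence is in frame (first base = codon position 1).
    A trailing partial codon is ignored; returns 0.0 if no full codon exists.
    """
    s = coding_seq.upper()
    usable = len(s) - len(s) % 3      # drop an incomplete final codon
    third_positions = s[2:usable:3]   # bases at positions 3, 6, 9, ...
    if not third_positions:
        return 0.0
    gc = sum(base in "GC" for base in third_positions)
    return 100.0 * gc / len(third_positions)


if __name__ == "__main__":
    # Toy codons ATG, GCC, AAA: third positions are G, C, A.
    print(round(gc3_percent("ATGGCCAAA"), 2))
```

Applying the same function to a single exon versus the whole coding region would give the kind of per-exon and overall comparison the abstract describes, provided each input is trimmed to the correct reading frame.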
Affiliation(s)
- Sergey Vladimirovich Lelevich
- Department of Clinical Laboratory Diagnostics, Allergology and Immunology, Grodno State Medical University, Gorkogo 80, Grodno, Belarus
|
14
|
Abstract
The mammalian genome is extensively transcribed, a large fraction of which is divergent transcription from promoters and enhancers that is tightly coupled with active gene transcription. Here, we propose that divergent transcription may shape the evolution of the genome by new gene origination.
Affiliation(s)
- Xuebing Wu
- David H. Koch Institute for Integrative Cancer Research, Massachusetts Institute of Technology, Cambridge, MA 02139, USA; Computational and Systems Biology Graduate Program, Massachusetts Institute of Technology, Cambridge, MA 02139, USA
|
15
|
De Maio N, Schlötterer C, Kosiol C. Linking great apes genome evolution across time scales using polymorphism-aware phylogenetic models. Mol Biol Evol 2013; 30:2249-62. [PMID: 23906727 PMCID: PMC3773373 DOI: 10.1093/molbev/mst131] [Citation(s) in RCA: 59] [Impact Index Per Article: 5.4] [Indexed: 12/17/2022] Open
Abstract
The genomes of related species contain valuable information on the history of the considered taxa. Great apes in particular exhibit variation of evolutionary patterns along their genomes. However, the great ape data also bring new challenges, such as the presence of incomplete lineage sorting and ancestral shared polymorphisms. Previous methods for genome-scale analysis are restricted to very few individuals or cannot disentangle the contribution of mutation rates and fixation biases. This represents a limitation both for the understanding of these forces as well as for the detection of regions affected by selection. Here, we present a new model designed to estimate mutation rates and fixation biases from genetic variation within and between species. We relax the assumption of instantaneous substitutions, modeling substitutions as mutational events followed by a gradual fixation. Hence, we straightforwardly account for shared ancestral polymorphisms and incomplete lineage sorting. We analyze genome-wide synonymous site alignments of human, chimpanzee, and two orangutan species. From each taxon, we include data from several individuals. We estimate mutation rates and GC-biased gene conversion intensity. We find that both mutation rates and biased gene conversion vary with GC content. We also find lineage-specific differences, with weaker fixation biases in orangutan species, suggesting a reduced historical effective population size. Finally, our results are consistent with directional selection acting on coding sequences in relation to exonic splicing enhancers.
Affiliation(s)
- Nicola De Maio
- Institut für Populationsgenetik, Vetmeduni Vienna, Wien, Austria
|
16
|
McLean MA, Tirosh I. Opposite GC skews at the 5' and 3' ends of genes in unicellular fungi. BMC Genomics 2011; 12:638. [PMID: 22208287 PMCID: PMC3315797 DOI: 10.1186/1471-2164-12-638] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.5] [Received: 09/15/2011] [Accepted: 12/30/2011] [Indexed: 11/24/2022] Open
Abstract
Background: GC-skews have previously been linked to transcription in some eukaryotes. They have been associated with transcription start sites, with the coding strand G-biased in mammals and C-biased in fungi and invertebrates. Results: We show a consistent and highly significant pattern of GC-skew within genes of almost all unicellular fungi. The pattern of GC-skew is asymmetrical: the coding strand of genes is typically C-biased at the 5' ends but G-biased at the 3' ends, with intermediate skews at the middle of genes. Thus, the initiation, elongation, and termination phases of transcription are associated with different skews. This pattern influences the encoded proteins by generating differential usage of amino acids at the 5' and 3' ends of genes. These biases also affect fourfold-degenerate positions and extend into promoters and 3' UTRs, indicating that skews cannot be accounted for by selection for protein function or translation. Conclusions: We propose two explanations: the mutational pressure hypothesis and the adaptive hypothesis. The mutational pressure hypothesis is that different co-factors bind to RNA pol II at different phases of transcription, producing different mutational regimes. The adaptive hypothesis is that cytidine triphosphate deficiency may lead to C-avoidance at the 3' ends of transcripts to control the flow of RNA pol II molecules and reduce their frequency of collisions.
Affiliation(s)
- Malcolm A McLean
- Department of Molecular Genetics, Weizmann Institute of Science, Rehovot, Israel
|