401
Consiglio A, Mencar C, Grillo G, Marzano F, Caratozzolo MF, Liuni S. A fuzzy method for RNA-Seq differential expression analysis in presence of multireads. BMC Bioinformatics 2016; 17:345. [PMID: 28185579 PMCID: PMC5123383 DOI: 10.1186/s12859-016-1195-2] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.3] [Indexed: 01/17/2023]
Abstract
Background: When the reads obtained from high-throughput RNA sequencing are mapped against a reference database, a significant proportion of them, known as multireads, can map to more than one reference sequence. These multireads originate from gene duplications, repetitive regions or overlapping genes. Removing the multireads from the mapping results in RNA-Seq analyses causes an underestimation of the read counts, while estimating the real read count can lead to false positives during the detection of differentially expressed sequences.
Results: We present an innovative approach to deal with multireads and evaluate differential expression events, entirely based on fuzzy set theory. Since multireads cause uncertainty in the estimation of read counts during gene expression computation, they can also influence the reliability of differential expression analysis results by producing false positives. Our method manages the uncertainty in gene expression estimation by defining fuzzy read counts, and it evaluates the possibility that a gene is differentially expressed with three fuzzy concepts: over-expression, same-expression and under-expression. The output of the method is a list of differentially expressed genes enriched with information about the uncertainty of the results due to the presence of multireads. We have tested the method on RNA-Seq data designed for case-control studies and compared the results with those of other existing tools for read count estimation and differential expression analysis.
Conclusions: Managing multireads with fuzzy sets makes it possible to obtain a list of differential expression events that takes into account the uncertainty in the results caused by the presence of multireads. Biologists can use this additional information when selecting the most relevant differential expression events to validate with laboratory assays. Our method can be used to compute reliable differential expression events and to highlight possible false positives in the lists of differentially expressed genes computed with other tools.
Electronic supplementary material: The online version of this article (doi:10.1186/s12859-016-1195-2) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Arianna Consiglio
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Corrado Mencar
  - Department of Informatics, University of Bari Aldo Moro, Bari, 70121, Italy
- Giorgio Grillo
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Flaviana Marzano
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
- Sabino Liuni
  - Institute for Biomedical Technologies of Bari - ITB, National Research Council, Bari, 70126, Italy
402
A Novel Analytical Strategy to Identify Fusion Transcripts between Repetitive Elements and Protein Coding-Exons Using RNA-Seq. PLoS One 2016; 11:e0159028. [PMID: 27415830 PMCID: PMC4945064 DOI: 10.1371/journal.pone.0159028] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 04/10/2016] [Accepted: 06/24/2016] [Indexed: 12/27/2022]
Abstract
Repetitive elements (REs) comprise 40-60% of the mammalian genome and have been shown to epigenetically influence the expression of genes through the formation of fusion transcripts (FTs). We previously showed that an intracisternal A particle forms an FT with the agouti gene in mice, causing obesity/type 2 diabetes. To determine the frequency of FTs genome-wide, we developed a TopHat-Fusion-based analytical pipeline to identify FTs with high specificity. We applied it to an RNA-seq dataset from the nucleus accumbens (NAc) of mice repeatedly exposed to cocaine. Cocaine was previously shown to increase the expression of certain REs in this brain region. Using this pipeline, which can be applied to single- or paired-end reads, we identified 438 genes expressing 813 different FTs in the NAc. Although all types of studied repeats were present in FTs, simple sequence repeats were underrepresented. Most importantly, reverse-transcription and quantitative PCR validated the expression of selected FTs in an independent cohort of animals, and also revealed that, for some genes, FTs are the prominent isoforms expressed in the NAc. In other RNA-seq datasets, developmental expression as well as tissue specificity of some FTs differed from their corresponding non-fusion counterparts. Finally, in silico analysis predicted changes in the structure of proteins encoded by some FTs, potentially resulting in gain or loss of function. Collectively, these results indicate the robustness of our pipeline in detecting these new isoforms of genes, which we believe provides a valuable tool to aid in better understanding the broad role of REs in mammalian cellular biology.
403
Du J, Leung A, Trac C, Lee M, Parks BW, Lusis AJ, Natarajan R, Schones DE. Chromatin variation associated with liver metabolism is mediated by transposable elements. Epigenetics Chromatin 2016; 9:28. [PMID: 27398095 PMCID: PMC4939004 DOI: 10.1186/s13072-016-0078-0] [Citation(s) in RCA: 28] [Impact Index Per Article: 3.1] [Received: 04/18/2016] [Accepted: 06/29/2016] [Indexed: 01/23/2023]
Abstract
Background: Functional regulatory regions in eukaryotic genomes are characterized by the disruption of nucleosomes leading to accessible chromatin. The modulation of chromatin accessibility is one of the key mediators of transcriptional regulation, and variation in chromatin accessibility across individuals has been linked to complex traits and disease susceptibility. While mechanisms responsible for chromatin variation across individuals have been investigated, the overwhelming majority of chromatin variation remains unexplained. Furthermore, the processes through which the variation of chromatin accessibility contributes to phenotypic diversity remain poorly understood.
Results: We profiled chromatin accessibility in liver from seven strains of mice with phenotypic diversity in response to a high-fat/high-sucrose (HF/HS) diet and identified reproducible chromatin variation across the individuals. We found that sites of variable chromatin accessibility were more likely to coincide with particular classes of transposable elements (TEs) than sites with common chromatin signatures. Evolutionarily younger long interspersed nuclear elements (LINEs) are particularly likely to harbor variable chromatin sites. These younger LINEs are enriched for binding sites of immune-associated transcription factors, whereas older LINEs are enriched for liver-specific transcription factors. Genomic region enrichment analysis indicates that variable chromatin sites at TEs may function to regulate liver metabolic pathways. CRISPR-Cas9 deletion of a number of variable chromatin sites at TEs altered expression of nearby metabolic genes. Finally, we show that polymorphism of TEs and differential DNA methylation at TEs can both influence chromatin variation.
Conclusions: Our results demonstrate that specific classes of TEs show variable chromatin accessibility across strains of mice that display phenotypic diversity in response to an HF/HS diet. These results indicate that chromatin variation at TEs is an important contributor to phenotypic variation among populations.
Electronic supplementary material: The online version of this article (doi:10.1186/s13072-016-0078-0) contains supplementary material, which is available to authorized users.
Affiliation(s)
- Juan Du
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA; Irell & Manella Graduate School of Biological Sciences, City of Hope, Duarte, CA, USA
- Amy Leung
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA
- Candi Trac
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA
- Michael Lee
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA; Irell & Manella Graduate School of Biological Sciences, City of Hope, Duarte, CA, USA
- Brian W Parks
  - Department of Nutritional Sciences, University of Wisconsin-Madison, Madison, WI, USA
- Aldons J Lusis
  - Department of Medicine, University of California, Los Angeles, CA, USA
- Rama Natarajan
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA; Irell & Manella Graduate School of Biological Sciences, City of Hope, Duarte, CA, USA
- Dustin E Schones
  - Department of Diabetes Complications and Metabolism, Beckman Research Institute, City of Hope, Duarte, CA, USA; Irell & Manella Graduate School of Biological Sciences, City of Hope, Duarte, CA, USA
404
Arkhipova IR, Rice PA. Mobile genetic elements: in silico, in vitro, in vivo. Mol Ecol 2016; 25:1027-31. [PMID: 26822117 DOI: 10.1111/mec.13543] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.2] [Received: 11/30/2015] [Accepted: 01/15/2016] [Indexed: 11/27/2022]
Abstract
Mobile genetic elements (MGEs), also called transposable elements (TEs), represent universal components of most genomes and are intimately involved in nearly all aspects of genome organization, function and evolution. However, there is currently a gap between the fast pace of TE discovery in silico, driven by the exponential growth of comparative genomic studies, and a limited number of experimental models amenable to more traditional in vitro and in vivo studies of structural, mechanistic and regulatory properties of diverse MGEs. Experimental and computational scientists came together to bridge this gap at a recent conference, 'Mobile Genetic Elements: in silico, in vitro, in vivo', held at the Marine Biological Laboratory (MBL) in Woods Hole, MA, USA.
Affiliation(s)
- Irina R Arkhipova
  - Josephine Bay Paul Center for Comparative Molecular Biology and Evolution, Marine Biological Laboratory, 7 MBL Street, Woods Hole, MA, 02543, USA
- Phoebe A Rice
  - Department of Biochemistry and Molecular Biology, The University of Chicago, 929 E. 57th Street, Chicago, IL, 60637, USA
405
Yu Y, Gu J, Jin Y, Luo Y, Preall JB, Ma J, Czech B, Hannon GJ. Panoramix enforces piRNA-dependent cotranscriptional silencing. Science 2015; 350:339-42. [PMID: 26472911 PMCID: PMC4722808 DOI: 10.1126/science.aab0700] [Citation(s) in RCA: 144] [Impact Index Per Article: 14.4] [Indexed: 12/11/2022]
Abstract
The Piwi-interacting RNA (piRNA) pathway is a small RNA-based innate immune system that defends germ cell genomes against transposons. In Drosophila ovaries, the nuclear Piwi protein is required for transcriptional silencing of transposons, though the precise mechanisms by which this occurs are unknown. Here we show that the CG9754 protein is a component of Piwi complexes that functions downstream of Piwi and its binding partner, Asterix, in transcriptional silencing. Enforced tethering of CG9754 to nascent messenger RNA transcripts causes cotranscriptional silencing of the source locus and the deposition of repressive chromatin marks. We have named CG9754 "Panoramix," and we propose that this protein could act as an adaptor, scaffolding interactions between the piRNA pathway and the general silencing machinery that it recruits to enforce transcriptional repression.
Affiliation(s)
- Yang Yu
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Jiaqi Gu
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA. State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biochemistry, School of Life Sciences, Fudan University, Shanghai, China
- Ying Jin
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA
- Yicheng Luo
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Jonathan B Preall
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA
- Jinbiao Ma
  - State Key Laboratory of Genetic Engineering, Collaborative Innovation Center of Genetics and Development, Department of Biochemistry, School of Life Sciences, Fudan University, Shanghai, China
- Benjamin Czech
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA. Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK
- Gregory J Hannon
  - Watson School of Biological Sciences, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY 11724, USA. Howard Hughes Medical Institute, Chevy Chase, MD 20815, USA. Cancer Research UK Cambridge Institute, Li Ka Shing Centre, University of Cambridge, Cambridge, UK. The New York Genome Center, 101 Avenue of the Americas, New York, NY 10013, USA