1
|
Lee S, Barbour JA, Tam YM, Yang H, Huang Y, Wong JWH. LocusMasterTE: integrating long-read RNA sequencing improves locus-specific quantification of transposable element expression. Genome Biol 2025; 26:72. [PMID: 40140852 PMCID: PMC11948968 DOI: 10.1186/s13059-025-03522-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/02/2023] [Accepted: 02/28/2025] [Indexed: 03/28/2025]
Abstract
Transposable elements (TEs) can influence human diseases by disrupting genome integrity, yet their quantification has been challenging due to the repetitive nature of these sequences across the genome. We develop LocusMasterTE, a method that integrates long-read with short-read RNA-seq to increase the accuracy of TE expression quantification. By incorporating fractional transcript per million values from long-read sequencing data into an expectation-maximization algorithm, LocusMasterTE reassigns multi-mapped reads, enhancing accuracy in short-read-based TE quantification. We validate the method with simulated and human datasets. LocusMasterTE may give new insights into TE functions through precise quantification.
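The reassignment idea in this abstract can be illustrated with a minimal sketch. It assumes the long-read information enters the expectation-maximization (EM) loop as pseudo-counts proportional to each locus's fraction of long-read TPM; the function name `em_reassign`, the `prior_weight` parameter, and the input layout are hypothetical and are not taken from the LocusMasterTE implementation.

```python
from collections import defaultdict


def em_reassign(read_alignments, long_read_tpm_fraction, n_iter=50, prior_weight=1.0):
    """Fractionally assign multi-mapped reads to TE loci with a simple EM loop.

    read_alignments: list of lists; each inner list holds the locus ids
        that one short read aligns to (one id for a uniquely mapped read).
    long_read_tpm_fraction: dict mapping locus id -> fraction of long-read TPM
        (hypothetical input format, used here as a prior).
    Returns (theta, expected_counts).
    """
    loci = sorted({locus for aln in read_alignments for locus in aln})
    # Start from uniform relative abundances.
    theta = {locus: 1.0 / len(loci) for locus in loci}
    counts = defaultdict(float)

    for _ in range(n_iter):
        counts = defaultdict(float)
        # E-step: split each read across its candidate loci in proportion
        # to the current abundance estimates.
        for aln in read_alignments:
            total = sum(theta[locus] for locus in aln)
            for locus in aln:
                share = theta[locus] / total if total > 0 else 1.0 / len(aln)
                counts[locus] += share
        # M-step: re-estimate abundances, adding long-read pseudo-counts
        # (an assumption of this sketch, not the published update rule).
        pseudo = {
            locus: prior_weight * long_read_tpm_fraction.get(locus, 0.0)
            for locus in loci
        }
        norm = sum(counts[locus] + pseudo[locus] for locus in loci)
        theta = {locus: (counts[locus] + pseudo[locus]) / norm for locus in loci}

    return theta, dict(counts)
```

Calling `em_reassign` with a mix of uniquely mapped and multi-mapped reads returns relative abundances that sum to one, and the multi-mapped reads shift toward loci favoured by both the unique reads and the long-read fractions.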
Affiliation(s)
- Sojung Lee
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Jayne A Barbour
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China
| | - Yee Man Tam
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
| | - Haocheng Yang
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
| | - Yuanhua Huang
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China
- Center for Translational Stem Cell Biology, Hong Kong Science and Technology Park, Hong Kong SAR, China
| | - Jason W H Wong
- School of Biomedical Sciences, Li Ka Shing Faculty of Medicine, University of Hong Kong, Hong Kong SAR, China.
- Centre for Oncology and Immunology, Hong Kong Science Park, Hong Kong SAR, China.
|
2
|
Netschitailo O, Raub S, Kaftanoglu O, Beye M. Sexual diversification of splicing regulation during embryonic development in honeybees (Apis mellifera), A haplodiploid system. Insect Molecular Biology 2022; 31:170-176. [PMID: 34773317 DOI: 10.1111/imb.12748] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/04/2021] [Revised: 10/23/2021] [Accepted: 11/09/2021] [Indexed: 06/13/2023]
Abstract
The honeybee is a haplodiploid organism in which sexual development is determined by the complementary sex determiner (csd) gene and realized by sex-specific splicing processes involving the feminizer (fem) gene. We used high throughput transcriptome sequencing (RNA-Seq) to characterize the transcriptional differences between the sexes caused by the fertilization and sex determination processes in honeybee (Apis mellifera) embryos. We identified 758, 372 and 43 differentially expressed genes (DEGs) and 58, 176 and 233 differentially spliced genes (DSGs) in 10-15-h-old, 25-40-h-old and 55-70-h-old female and male embryos, respectively. The early difference in male and female embryos in response to the fertilization and non-fertilization processes resulted mainly in differential expression of genes (758 DEGs vs. 58 DSGs). In the latest sampled embryonic stage, the transcriptional differences between the sexes were dominated by alternative splicing of transcripts (43 DEGs vs. 233 DSGs). Interestingly, differentially spliced transcripts that encode RNA-binding properties were overrepresented in 55-70-h-old embryos, indicating a more diverse regulation via alternative splicing than previous work on the sex determination pathway suggested. These stage- and sex-specific transcriptome data from honeybee embryos provide a comprehensive resource for examining the roles of fertilization and sex determination in developmental programming in a haplodiploid system.
Affiliation(s)
- Oksana Netschitailo
- Institute of Evolutionary Genetics, Heinrich-Heine University Duesseldorf, Duesseldorf, Germany
| | - Stephan Raub
- Center for Scientific Computing and Storage, Heinrich-Heine University Duesseldorf, Duesseldorf, Germany
| | - Osman Kaftanoglu
- School of Life Sciences, Arizona State University, Phoenix, Arizona, USA
| | - Martin Beye
- Institute of Evolutionary Genetics, Heinrich-Heine University Duesseldorf, Duesseldorf, Germany
|
3
|
Huang Y, Sanguinetti G. BRIE2: computational identification of splicing phenotypes from single-cell transcriptomic experiments. Genome Biol 2021; 22:251. [PMID: 34452629 PMCID: PMC8393734 DOI: 10.1186/s13059-021-02461-5] [Citation(s) in RCA: 17] [Impact Index Per Article: 4.3] [Received: 11/19/2020] [Accepted: 08/10/2021] [Indexed: 02/06/2023]
Abstract
RNA splicing is an important driver of heterogeneity in single cells through the expression of alternative transcripts and as a determinant of transcriptional kinetics. However, the intrinsic coverage limitations of scRNA-seq technologies make it challenging to associate specific splicing events to cell-level phenotypes. BRIE2 is a scalable computational method that resolves these issues by regressing single-cell transcriptomic data against cell-level features. We show that BRIE2 effectively identifies differential disease-associated alternative splicing events and allows a principled selection of genes that capture heterogeneity in transcriptional kinetics and improve RNA velocity analyses, enabling the identification of splicing phenotypes associated with biological changes.
Affiliation(s)
- Yuanhua Huang
- School of Biomedical Sciences, University of Hong Kong, Hong Kong SAR, Pok Fu Lam, China.
- Department of Statistics and Actuarial Science, University of Hong Kong, Hong Kong SAR, Pok Fu Lam, China.
| | - Guido Sanguinetti
- School of Informatics, University of Edinburgh, Edinburgh, UK.
- SISSA, International School of Advanced Studies, Trieste, Italy.
|
4
|
Liu S, Zhou B, Wu L, Sun Y, Chen J, Liu S. Single-cell differential splicing analysis reveals high heterogeneity of liver tumor-infiltrating T cells. Sci Rep 2021; 11:5325. [PMID: 33674641 PMCID: PMC7935992 DOI: 10.1038/s41598-021-84693-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.5] [Received: 12/07/2020] [Accepted: 02/19/2021] [Indexed: 11/12/2022]
Abstract
Recent advances in single-cell RNA sequencing (scRNA-seq) have improved our understanding of the association between tumor-infiltrating lymphocyte (TIL) heterogeneity and cancer initiation and progression. However, studies investigating alternative splicing (AS) as an important regulatory factor of heterogeneity remain limited. Here, we developed a new computational tool, DESJ-detection, which accurately detects differentially expressed splicing junctions (DESJs) between cell groups at the single-cell level. We analyzed 5063 T cells of hepatocellular carcinoma (HCC) and identified 1176 DESJs across 11 T cell subtypes. Interestingly, DESJs were enriched in UTRs and have putative effects on heterogeneity. Cell subtypes with a similar function clustered closely together at the AS level. Meanwhile, we identified a novel cell state, pre-activation, with the isoform marker ARHGAP15-205. In summary, we present a comprehensive investigation of alternative splicing differences, which provides novel insights into T cell heterogeneity and can be applied to other full-length scRNA-seq datasets.
Affiliation(s)
- Shang Liu
- BGI Education Center, University of Chinese Academy of Sciences (UCAS), Shenzhen, 518083, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China
- Shenzhen Key Laboratory of Single-Cell Omics, China National GeneBank, Shenzhen, 518120, China
| | - Biaofeng Zhou
- BGI Education Center, University of Chinese Academy of Sciences (UCAS), Shenzhen, 518083, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China
- Shenzhen Key Laboratory of Single-Cell Omics, China National GeneBank, Shenzhen, 518120, China
| | - Liang Wu
- BGI Education Center, University of Chinese Academy of Sciences (UCAS), Shenzhen, 518083, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China
- Shenzhen Key Laboratory of Single-Cell Omics, China National GeneBank, Shenzhen, 518120, China
| | - Yan Sun
- BGI Education Center, University of Chinese Academy of Sciences (UCAS), Shenzhen, 518083, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China
- Shenzhen Key Laboratory of Single-Cell Omics, China National GeneBank, Shenzhen, 518120, China
| | - Jie Chen
- BGI Education Center, University of Chinese Academy of Sciences (UCAS), Shenzhen, 518083, China
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China
| | - Shiping Liu
- BGI-Shenzhen, Beishan Industrial Zone, Shenzhen, 518083, China.
- Shenzhen Key Laboratory of Single-Cell Omics, China National GeneBank, Shenzhen, 518120, China.
|
5
|
Zhang Z, Fu T, Liu Z, Wang X, Xun H, Li G, Ding B, Dong Y, Lin X, Sanguinet KA, Liu B, Wu Y, Gong L. Extensive changes in gene expression and alternative splicing due to homoeologous exchange in rice segmental allopolyploids. TAG. Theoretical and Applied Genetics. Theoretische und Angewandte Genetik 2019; 132:2295-2308. [PMID: 31098756 DOI: 10.1007/s00122-019-03355-8] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Received: 05/11/2018] [Accepted: 04/26/2019] [Indexed: 06/09/2023]
Abstract
We report rampant homoeologous exchanges in progenies of a newly synthesized rice segmental allotetraploid and demonstrate their consequences for gene expression and alternative splicing. Allopolyploidization is recurrent across the tree of angiosperms and known as a driving evolutionary force in both plants and animals. A salient feature of allopolyploidization is the induction of homoeologous exchange (HE) events between the constituent subgenomes, which may in turn cause changes in gene expression, transcript alternative splicing, and phenotypic novelty. However, this issue has been poorly studied, largely because of the lack of a system in which the exact parentage donating the subgenomes is known and the HE events are occurring in real time. Here, we employed whole-genome re-sequencing and RNA-seq-based transcriptome profiling in four randomly chosen progeny individuals (at the 10th-selfed generation) of segmental allotetraploids that were constructed by colchicine-mediated whole-genome doubling of F1 hybrids between the two subspecies (japonica and indica) of Asian cultivated Oryza sativa. We show that rampant HE events occurred in these tetraploid individuals, which converted most of the otherwise heterozygous genomic regions into a homogenized state of one parental subgenome. We demonstrate that genes within these homogenized genomic regions in the tetraploids showed high frequencies of altered expression and enhanced alternative splicing relative to their counterparts in the corresponding diploid parents in the embryo tissue. Intriguingly, limited overlaps between the differentially expressed genes and the differentially alternatively spliced genes were identified, which were partitioned to distinctly enriched gene ontology terms. Together, our results indicate that HE is a major mechanism to rapidly generate novelty in gene expression and transcriptome diversity, which may facilitate phenotypic innovation in nascent allopolyploids and is relevant to allopolyploid crop breeding.
Affiliation(s)
- Zhibin Zhang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Tiansi Fu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Zhijian Liu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Xutong Wang
- Department of Agronomy, Purdue University, West Lafayette, IN, 47907, USA
| | - Hongwei Xun
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Guo Li
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Baoxu Ding
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Yuzhu Dong
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Xiuyun Lin
- Jilin Academy of Agricultural Sciences (JAAS), Changchun, 136100, China
| | - Karen A Sanguinet
- Department of Crop and Soil Sciences, Washington State University, Pullman, WA, 99164, USA
| | - Bao Liu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China
| | - Ying Wu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China.
| | - Lei Gong
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, China.
|
6
|
Yu X, Meng X, Liu Y, Wang X, Wang TJ, Zhang A, Li N, Qi X, Liu B, Xu ZY. The chromatin remodeler ZmCHB101 impacts alternative splicing contexts in response to osmotic stress. Plant Cell Reports 2019; 38:131-145. [PMID: 30443733 DOI: 10.1007/s00299-018-2354-x] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Received: 09/02/2018] [Accepted: 11/07/2018] [Indexed: 05/16/2023]
Abstract
A maize SWI3-type chromatin remodeler impacts alternative splicing contexts in response to osmotic stress by altering nucleosome density and affecting transcriptional elongation rate. Alternative splicing (AS) is commonly found in higher eukaryotes and is an important posttranscriptional regulatory mechanism to generate transcript diversity. AS has been widely accepted as playing essential roles in different biological processes including growth, development, signal transduction and responses to biotic and abiotic stresses in plants. However, whether and how chromatin remodeling complexes function in AS in plants under osmotic stress remains unknown. Here, we show that a maize SWI3D protein, ZmCHB101, impacts AS contexts in response to osmotic stress. Genome-wide analysis of mRNA contexts in response to osmotic stress using ZmCHB101-RNAi lines reveals that ZmCHB101 impacts alternative splicing contexts of a subset of osmotic stress-responsive genes. Intriguingly, ZmCHB101-mediated regulation of gene expression and AS is largely uncoupled, pointing to diverse molecular functions of ZmCHB101 in transcriptional and posttranscriptional regulation. We further found that ZmCHB101 impacts the alternative splicing contexts by influencing alteration of chromatin and histone modification status as well as transcriptional elongation rates mediated by RNA polymerase II. Taken together, our findings provide novel insight into how plant chromatin remodeling complexes impact AS under osmotic stress.
Affiliation(s)
- Xiaoming Yu
- School of Agronomy, Jilin Agricultural Science and Technology University, Jilin, 132301, People's Republic of China
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Xinchao Meng
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Yutong Liu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Xutong Wang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
- Department of Agronomy, Purdue University, West Lafayette, USA
| | - Tian-Jing Wang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Ai Zhang
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Ning Li
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China
| | - Xin Qi
- Department of Agronomy, Jilin Agricultural University, Changchun, 130118, People's Republic of China
| | - Bao Liu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China.
| | - Zheng-Yi Xu
- Key Laboratory of Molecular Epigenetics of the Ministry of Education (MOE), Northeast Normal University, Changchun, 130024, People's Republic of China.
|
7
|
Yang H, Jaime M, Polihronakis M, Kanegawa K, Markow T, Kaneshiro K, Oliver B. Re-annotation of eight Drosophila genomes. Life Sci Alliance 2018; 1:e201800156. [PMID: 30599046 PMCID: PMC6305970 DOI: 10.26508/lsa.201800156] [Citation(s) in RCA: 38] [Impact Index Per Article: 5.4] [Received: 08/13/2018] [Revised: 12/15/2018] [Accepted: 12/16/2018] [Indexed: 12/11/2022]
Abstract
The sequenced genomes of the Drosophila phylogeny are a central resource for comparative work supporting the understanding of the Drosophila melanogaster non-mammalian model system. These have also facilitated evolutionary studies on the selected and random differences that distinguish the thousands of extant species of Drosophila. However, full utility has been hampered by uneven genome annotation. We have generated a large expression profile dataset for nine species of Drosophila and trained a transcriptome assembly approach on D. melanogaster that best matched the extensively curated annotation. We then applied this to the other species to add more than 10000 transcript models per species. We also developed new orthologs to facilitate cross-species comparisons. We validated the new annotation of the distantly related Drosophila grimshawi with an extensive collection of newly sequenced cDNAs. This re-annotation will facilitate understanding both the core commonalities and the species differences in this important group of model organisms, and suggests a strategy for annotating the many forthcoming genomes covering the tree of life.
Affiliation(s)
- Haiwang Yang
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Maria Jaime
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
| | - Maxi Polihronakis
- Drosophila Species Stock Center, Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kelvin Kanegawa
- Hawaiian Drosophila Research Stock Center, Pacific Biosciences Research Center, University of Hawai'i at Manoa, Honolulu, HI, USA
| | - Therese Markow
- National Laboratory of Genomics for Biodiversity (LANGEBIO), Irapuato, Guanajuato, Mexico
- Drosophila Species Stock Center, Division of Biological Sciences, University of California San Diego, La Jolla, CA, USA
| | - Kenneth Kaneshiro
- Hawaiian Drosophila Research Stock Center, Pacific Biosciences Research Center, University of Hawai'i at Manoa, Honolulu, HI, USA
| | - Brian Oliver
- Section of Developmental Genomics, Laboratory of Cellular and Developmental Biology, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD, USA
|
8
|
Mapleson D, Venturini L, Kaithakottil G, Swarbreck D. Efficient and accurate detection of splice junctions from RNA-seq with Portcullis. Gigascience 2018; 7:5173486. [PMID: 30418570 PMCID: PMC6302956 DOI: 10.1093/gigascience/giy131] [Citation(s) in RCA: 80] [Impact Index Per Article: 11.4] [Received: 04/06/2018] [Accepted: 10/25/2018] [Indexed: 12/14/2022]
Abstract
Next-generation sequencing technologies enable rapid and cheap genome-wide transcriptome analysis, providing vital information about gene structure, transcript expression, and alternative splicing. Key to this is the accurate identification of exon-exon junctions from RNA sequenced (RNA-seq) reads. A number of RNA-seq aligners capable of splitting reads across these splice junctions (SJs) have been developed; however, it has been shown that while they correctly identify most genuine SJs available in a given sample, they also often produce large numbers of incorrect SJs. Here, we describe the extent of this problem using popular RNA-seq mapping tools and present a new method, called Portcullis, to rapidly filter false SJs derived from spliced alignments. We show that Portcullis distinguishes between genuine and false-positive junctions to a high degree of accuracy across different species, samples, expression levels, error profiles, and read lengths. Portcullis is portable, efficient, and, to our knowledge, currently the only SJ prediction tool that reliably scales for use with large RNA-seq datasets and large, highly fragmented genomes, while delivering accurate SJs.
Affiliation(s)
- Daniel Mapleson
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
| | - Luca Venturini
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
| | - Gemy Kaithakottil
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
| | - David Swarbreck
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
|
9
|
Arango D, Sturgill D, Alhusaini N, Dillman AA, Sweet TJ, Hanson G, Hosogane M, Sinclair WR, Nanan KK, Mandler MD, Fox SD, Zengeya TT, Andresson T, Meier JL, Coller J, Oberdoerffer S. Acetylation of Cytidine in mRNA Promotes Translation Efficiency. Cell 2018; 175:1872-1886.e24. [PMID: 30449621 DOI: 10.1016/j.cell.2018.10.030] [Citation(s) in RCA: 510] [Impact Index Per Article: 72.9] [Received: 03/09/2018] [Revised: 06/11/2018] [Accepted: 10/12/2018] [Indexed: 01/27/2023]
Abstract
Generation of the "epitranscriptome" through post-transcriptional ribonucleoside modification embeds a layer of regulatory complexity into RNA structure and function. Here, we describe N4-acetylcytidine (ac4C) as an mRNA modification that is catalyzed by the acetyltransferase NAT10. Transcriptome-wide mapping of ac4C revealed discretely acetylated regions that were enriched within coding sequences. Ablation of NAT10 reduced ac4C detection at the mapped mRNA sites and was globally associated with target mRNA downregulation. Analysis of mRNA half-lives revealed a NAT10-dependent increase in stability in the cohort of acetylated mRNAs. mRNA acetylation was further demonstrated to enhance substrate translation in vitro and in vivo. Codon content analysis within ac4C peaks uncovered a biased representation of cytidine within wobble sites that was empirically determined to influence mRNA decoding efficiency. These findings expand the repertoire of mRNA modifications to include an acetylated residue and establish a role for ac4C in the regulation of mRNA translation.
Affiliation(s)
- Daniel Arango
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - David Sturgill
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Najwa Alhusaini
- Center for RNA Science and Therapeutics, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Allissa A Dillman
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Thomas J Sweet
- Center for RNA Science and Therapeutics, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Gavin Hanson
- Center for RNA Science and Therapeutics, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Masaki Hosogane
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Wilson R Sinclair
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, MD 21702, USA
| | - Kyster K Nanan
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Mariana D Mandler
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA
| | - Stephen D Fox
- Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, MD 21701, USA
| | - Thomas T Zengeya
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, MD 21702, USA
| | - Thorkell Andresson
- Protein Characterization Laboratory, Cancer Research Technology Program, Frederick National Laboratory for Cancer Research, Leidos Biomedical Research, Inc., Frederick, MD 21701, USA
| | - Jordan L Meier
- Chemical Biology Laboratory, Center for Cancer Research, National Cancer Institute, NIH, Frederick, MD 21702, USA
| | - Jeffery Coller
- Center for RNA Science and Therapeutics, Case Western Reserve University, Cleveland, OH 44106, USA
| | - Shalini Oberdoerffer
- Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, NIH, Bethesda, MD 20892, USA.
|
10
|
Event Analysis: Using Transcript Events To Improve Estimates of Abundance in RNA-seq Data. G3: Genes, Genomes, Genetics 2018; 8:2923-2940. [PMID: 30021829 PMCID: PMC6118309 DOI: 10.1534/g3.118.200373] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.6] [Indexed: 02/07/2023]
Abstract
Alternative splicing leverages genomic content by allowing the synthesis of multiple transcripts and, by implication, protein isoforms, from a single gene. However, estimating the abundance of transcripts produced in a given tissue from short sequencing reads is difficult and can result in both the construction of transcripts that do not exist and the failure to identify true transcripts. An alternative approach is to catalog the events that make up isoforms (splice junctions and exons). We present here the Event Analysis (EA) approach, where we project transcripts onto the genome and identify overlapping/unique regions and junctions. In addition, all possible logical junctions are assembled into a catalog. Transcripts are filtered before quantitation based on simple measures: the proportion of the events detected, and the coverage. We find that mapping to a junction catalog is more efficient at detecting novel junctions than mapping in a splice-aware manner. We identify 99.8% of true transcripts while iReckon identifies 82% of the true transcripts and creates more transcripts not included in the simulation than were initially used in the simulation. Using PacBio Iso-seq data from a mouse neural progenitor cell model, EA detects 60% of the novel junctions that are combinations of existing exons while only 43% are detected by STAR. EA further detects ∼5,000 annotated junctions missed by STAR. Filtering transcripts based on the proportion of the transcript detected and the number of reads on average supporting that transcript captures 95% of the PacBio transcriptome. Filtering the reference transcriptome before quantitation results in a more stable estimate of isoform abundance, with improved correlation between replicates. This was particularly evident when EA was applied to an RNA-seq study of type 1 diabetes (T1D), where the coefficient of variation among subjects (n = 81) in the transcript abundance estimates was substantially reduced compared to the estimation using the full reference. EA focuses on individual transcriptional events. These events can be quantitated and analyzed directly or used to identify the probable set of expressed transcripts. Simple rules based on detected events and coverage used in filtering result in a dramatic improvement in isoform estimation without the use of ancillary data (e.g., ChIP, long reads) that may not be available for many studies.
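The filtering rule described in this abstract (keep a transcript when enough of its events are detected and its average coverage is high enough) can be sketched as follows. The thresholds `min_prop` and `min_mean_cov`, the function name, and the input dictionaries are hypothetical choices for illustration; the abstract does not state the cut-offs used by EA.

```python
def filter_transcripts(transcript_events, event_coverage,
                       min_prop=0.75, min_mean_cov=5.0):
    """Keep transcripts whose events (exonic regions, junctions) are well supported.

    transcript_events: dict mapping transcript id -> list of event ids.
    event_coverage: dict mapping event id -> average read coverage.
    min_prop, min_mean_cov: illustrative thresholds (assumptions of this sketch).
    """
    kept = []
    for tx_id, events in transcript_events.items():
        if not events:
            continue  # nothing to evaluate for this transcript
        coverages = [event_coverage.get(event, 0.0) for event in events]
        # Proportion of this transcript's events with any read support.
        prop_detected = sum(cov > 0 for cov in coverages) / len(events)
        # Average coverage across all events of the transcript.
        mean_cov = sum(coverages) / len(coverages)
        if prop_detected >= min_prop and mean_cov >= min_mean_cov:
            kept.append(tx_id)
    return kept
```

The retained transcript ids would then define the reduced reference passed to a quantifier, which is the step the abstract credits with more stable isoform estimates.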
|
11
|
Venturini L, Caim S, Kaithakottil GG, Mapleson DL, Swarbreck D. Leveraging multiple transcriptome assembly methods for improved gene structure annotation. Gigascience 2018; 7:5057872. [PMID: 30052957 PMCID: PMC6105091 DOI: 10.1093/gigascience/giy093] [Citation(s) in RCA: 98] [Impact Index Per Article: 14.0] [Received: 01/12/2018] [Accepted: 07/20/2018] [Indexed: 01/22/2023]
Abstract
Background: The performance of RNA sequencing (RNA-seq) aligners and assemblers varies greatly across different organisms and experiments, and often the optimal approach is not known beforehand. Results: Here, we show that the accuracy of transcript reconstruction can be boosted by combining multiple methods, and we present a novel algorithm to integrate multiple RNA-seq assemblies into a coherent transcript annotation. Our algorithm can remove redundancies and select the best transcript models according to user-specified metrics, while solving common artifacts such as erroneous transcript chimerisms. Conclusions: We have implemented this method in an open-source Python3 and Cython program, Mikado, available on GitHub.
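Selecting the best transcript model per locus according to user-specified metrics can be pictured with a small sketch. This is not Mikado's API or scoring scheme; the weighted-sum score, the function name `pick_best_models`, and the metric names in the usage note are assumptions made only for illustration.

```python
def pick_best_models(candidates, weights):
    """Choose the highest-scoring transcript model for each locus.

    candidates: list of dicts with keys "locus", "id", plus numeric metrics.
    weights: dict mapping metric name -> weight (user-specified; a weighted
        sum is an assumption of this sketch).
    Returns a dict mapping locus -> id of the selected model.
    """
    best = {}
    for tx in candidates:
        # Missing metrics contribute nothing to the score.
        score = sum(w * tx.get(metric, 0.0) for metric, w in weights.items())
        current = best.get(tx["locus"])
        if current is None or score > current[0]:
            best[tx["locus"]] = (score, tx["id"])
    return {locus: tx_id for locus, (_, tx_id) in best.items()}
```

With hypothetical metrics such as `cds_fraction` and `junction_support`, candidate models from several assemblers can be pooled into `candidates`, and the weights then decide which model represents each locus.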
Affiliation(s)
- Luca Venturini
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
| | - Shabhonam Caim
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
- Quadram Institute Biosciences, Norwich Research Park, NR47UA, Norwich, United Kingdom
| | - David Swarbreck
- Earlham Institute, Norwich Research Park, NR47UZ, Norwich, United Kingdom
|
12
|
The Y Chromosome Modulates Splicing and Sex-Biased Intron Retention Rates in Drosophila. Genetics 2017; 208:1057-1067. [PMID: 29263027 DOI: 10.1534/genetics.117.300637] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 10/06/2017] [Accepted: 12/18/2017] [Indexed: 01/01/2023]
Abstract
The Drosophila Y chromosome is a 40-Mb segment of mostly repetitive DNA; it harbors a handful of protein-coding genes and a disproportionate amount of satellite repeats, transposable elements, and multicopy DNA arrays. Intron retention (IR) is a type of alternative splicing (AS) event by which one or more introns remain within the mature transcript. IR recently emerged as a deliberate cellular mechanism to modulate gene expression levels and has been implicated in multiple biological processes. However, the extent of sex differences in IR and the contribution of the Y chromosome to the modulation of AS and IR rates have not been addressed. Here we showed pervasive IR in the fruit fly Drosophila melanogaster with thousands of novel IR events, hundreds of which displayed extensive sex bias. The data also revealed an unsuspected role for the Y chromosome in the modulation of AS and IR. The majority of sex-biased IR events introduced premature termination codons and the magnitude of sex bias was associated with gene expression differences between the sexes. Surprisingly, an extra Y chromosome in males (X^YY genotype) or the presence of a Y chromosome in females (X^XY genotype) significantly modulated IR and recapitulated natural differences in IR between the sexes. Our results highlight the significance of sex-biased IR in tuning sex differences and the role of the Y chromosome as a source of variable IR rates between the sexes. Modulation of splicing and IR rates across the genome represents a new and unexpected outcome of the Drosophila Y chromosome.
13
Peng H, Yang Y, Zhe S, Wang J, Gribskov M, Qi Y. DEIsoM: a hierarchical Bayesian model for identifying differentially expressed isoforms using biological replicates. Bioinformatics 2017; 33:3018-3027. [PMID: 28595376 PMCID: PMC5870796 DOI: 10.1093/bioinformatics/btx357] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/16/2016] [Accepted: 06/02/2017] [Indexed: 11/18/2022]
Abstract
Motivation: High-throughput mRNA sequencing (RNA-Seq) is a powerful tool for quantifying gene expression. Identification of transcript isoforms that are differentially expressed in different conditions, such as in patients and healthy subjects, can provide insights into the molecular basis of diseases. Current transcript quantification approaches, however, do not take advantage of the shared information in the biological replicates, potentially decreasing sensitivity and accuracy. Results: We present a novel hierarchical Bayesian model called Differentially Expressed Isoform detection from Multiple biological replicates (DEIsoM) for identifying differentially expressed (DE) isoforms from multiple biological replicates representing two conditions, e.g. multiple samples from healthy and diseased subjects. DEIsoM first estimates isoform expression within each condition by (1) capturing common patterns from sample replicates while allowing individual differences, and (2) modeling the uncertainty introduced by ambiguous read mapping in each replicate. Specifically, we introduce a Dirichlet prior distribution to capture the common expression pattern of replicates from the same condition, and treat the isoform expression of individual replicates as samples from this distribution. Ambiguous read mapping is modeled as a multinomial distribution, and ambiguous reads are assigned to the most probable isoform in each replicate. Additionally, DEIsoM couples an efficient variational inference and a post-analysis method to improve the accuracy and speed of identification of DE isoforms over alternative methods. Application of DEIsoM to a hepatocellular carcinoma (HCC) dataset identifies biologically relevant DE isoforms. The relevance of these genes/isoforms to HCC is supported by principal component analysis (PCA), read coverage visualization, and the biological literature.
Availability and implementation: The software is available at https://github.com/hao-peng/DEIsoM. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yifan Yang
  - Department of Computer Science; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Jian Wang
  - Eli Lilly and Company, Indianapolis, IN 46285, USA
- Michael Gribskov
  - Department of Computer Science; Department of Biological Sciences, Purdue University, West Lafayette, IN 47907, USA
- Yuan Qi
  - Department of Computer Science; Department of Statistics, Purdue University, West Lafayette, IN 47907, USA
14
Evolutionarily Conserved Alternative Splicing Across Monocots. Genetics 2017; 207:465-480. [PMID: 28839042 DOI: 10.1534/genetics.117.300189] [Citation(s) in RCA: 23] [Impact Index Per Article: 2.9] [Received: 03/27/2017] [Accepted: 08/11/2017] [Indexed: 12/22/2022]
Abstract
One difficulty when identifying alternative splicing (AS) events in plants is distinguishing functional AS from splicing noise. One way to add confidence to the validity of a splice isoform is to observe that it is conserved across evolutionarily related species. We use a high-throughput method to identify junction-based conserved AS events from RNA-Seq data across nine plant species, including five grass monocots (maize, sorghum, rice, Brachypodium, and foxtail millet), plus two nongrass monocots (banana and African oil palm), the eudicot Arabidopsis, and the basal angiosperm Amborella. In total, 9804 AS events were found to be conserved between two or more species studied. In grasses containing large regions of conserved synteny, the frequency of conserved AS events is twice that observed for genes outside of conserved synteny blocks. In plant-specific RS and RS2Z subfamilies of the serine/arginine (SR) splice-factor proteins, we observe both conservation and divergence of AS events after the whole genome duplication in maize. In addition, plant-specific RS and RS2Z splice-factor subfamilies are highly connected with R2R3-MYB in STRING functional protein association networks built using genes exhibiting conserved AS. Furthermore, we discovered that functional protein association networks constructed around genes harboring conserved AS events are enriched for phosphatases, kinases, and ubiquitylation genes, which suggests that AS may participate in regulating signaling pathways. These data lay the foundation for identifying and studying conserved AS events in the monocots, particularly across grass species, and this conserved AS resource identifies an additional layer between genotype and phenotype that may impact future crop improvement efforts.
15
Huang Y, Sanguinetti G. BRIE: transcriptome-wide splicing quantification in single cells. Genome Biol 2017; 18:123. [PMID: 28655331 PMCID: PMC5488362 DOI: 10.1186/s13059-017-1248-5] [Citation(s) in RCA: 60] [Impact Index Per Article: 7.5] [Received: 01/23/2017] [Accepted: 05/30/2017] [Indexed: 11/12/2022]
Abstract
Single-cell RNA-seq (scRNA-seq) provides a comprehensive measurement of stochasticity in transcription, but the limitations of the technology have prevented its application to dissect variability in RNA processing events such as splicing. Here, we present BRIE (Bayesian regression for isoform estimation), a Bayesian hierarchical model that resolves these problems by learning an informative prior distribution from sequence features. We show that BRIE yields reproducible estimates of exon inclusion ratios in single cells and provides an effective tool for differential isoform quantification between scRNA-seq data sets. BRIE, therefore, expands the scope of scRNA-seq experiments to probe the stochasticity of RNA processing.
Affiliation(s)
- Yuanhua Huang
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
- Guido Sanguinetti
  - School of Informatics, University of Edinburgh, Edinburgh, EH8 9AB, UK
  - Centre for Synthetic and Systems Biology (SynthSys), University of Edinburgh, Edinburgh, EH9 3BF, UK
16
Coskran TM, Jiang Z, Klaunig JE, Mager DL, Obert L, Robertson A, Tsinoremas N, Wang Z, Gosink M. Induction of endogenous retroelements as a potential mechanism for mouse-specific drug-induced carcinogenicity. PLoS One 2017; 12:e0176768. [PMID: 28472135 PMCID: PMC5417610 DOI: 10.1371/journal.pone.0176768] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.4] [Received: 11/22/2016] [Accepted: 04/17/2017] [Indexed: 11/23/2022]
Abstract
A number of chemical compounds have been shown to induce liver tumors in mice but not in other species. While several mechanisms for this species-specific tumorigenicity have been proposed, no definitive mechanism has been established. We examined the effects of the nongenotoxic rodent hepatic carcinogen, WY-14,643, in male mice from a high liver tumor susceptible strain (C3H/HeJ), and from a low tumor susceptible strain (C57BL/6). WY-14,643, a PPARα activator, induced widespread increases in the expression of some endogenous retroelements, namely members of LTR and LINE elements, in both strains. The expression of a number of known retroviral defense genes was also elevated. We also demonstrated that basal immune-mediated viral defense was elevated in C57BL/6 mice (the resistant strain) and that WY-14,643 further activated those immuno-defense processes. We propose that the previously reported >100X activity of retroelements in mice drives mouse-specific tumorigenicity. We also propose that C57BL/6's competent immune response to retroviral activation allows it to remove cells before the activation of these elements can result in significant chromosomal insertions and mutation. Finally, we showed that WY-14,643 treatment induced gene signatures of DNA recombination in the sensitive C3H/HeJ strain.
Affiliation(s)
- Timothy M. Coskran
  - Drug Safety Research & Development, Pfizer Inc., Groton, Connecticut, United States of America
- Zhijie Jiang
  - Department of Computer Science, University of Miami, Miami, Florida, United States of America
- James E. Klaunig
  - Environmental Health, Indiana University, Bloomington, Indiana, United States of America
- Dixie L. Mager
  - Terry Fox Laboratory, BC Cancer Agency, Vancouver, British Columbia, Canada
- Leslie Obert
  - GlaxoSmithKline plc, King of Prussia, Pennsylvania, United States of America
- Andrew Robertson
  - Drug Safety Research & Development, Pfizer Inc., Groton, Connecticut, United States of America
- Nicholas Tsinoremas
  - Department of Computer Science, University of Miami, Miami, Florida, United States of America
- Zemin Wang
  - Environmental Health, Indiana University, Bloomington, Indiana, United States of America
- Mark Gosink
  - Drug Safety Research & Development, Pfizer Inc., Groton, Connecticut, United States of America
17
Conn VM, Hugouvieux V, Nayak A, Conos SA, Capovilla G, Cildir G, Jourdain A, Tergaonkar V, Schmid M, Zubieta C, Conn SJ. A circRNA from SEPALLATA3 regulates splicing of its cognate mRNA through R-loop formation. NATURE PLANTS 2017; 3:17053. [PMID: 28418376 DOI: 10.1038/nplants.2017.53] [Citation(s) in RCA: 427] [Impact Index Per Article: 53.4] [Received: 08/24/2016] [Accepted: 03/20/2017] [Indexed: 05/03/2023]
Abstract
Circular RNAs (circRNAs) are a diverse and abundant class of hyper-stable, non-canonical RNAs that arise through a form of alternative splicing (AS) called back-splicing. These single-stranded, covalently-closed circRNA molecules have been identified in all eukaryotic kingdoms of life [1], yet their functions have remained elusive. Here, we report that circRNAs can be used as bona fide biomarkers of functional, exon-skipped AS variants in Arabidopsis, including in the homeotic MADS-box transcription factor family. Furthermore, we demonstrate that circRNAs derived from exon 6 of the SEPALLATA3 (SEP3) gene increase abundance of the cognate exon-skipped AS variant (SEP3.3 which lacks exon 6), in turn driving floral homeotic phenotypes. Toward demonstrating the underlying mechanism, we show that the SEP3 exon 6 circRNA can bind strongly to its cognate DNA locus, forming an RNA:DNA hybrid, or R-loop, whereas the linear RNA equivalent bound significantly more weakly to DNA. R-loop formation results in transcriptional pausing, which has been shown to coincide with splicing factor recruitment and AS [2-4]. This report presents a novel mechanistic insight for how at least a subset of circRNAs probably contribute to increased splicing efficiency of their cognate exon-skipped messenger RNA and provides the first evidence of an organismal-level phenotype mediated by circRNA manipulation.
Affiliation(s)
- Vanessa M Conn
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
  - Centre for Cancer Biology, SA Pathology and the University of South Australia, Adelaide, SA 5000, Australia
- Véronique Hugouvieux
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
- Aditya Nayak
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
- Stephanie A Conos
  - Centre for Cancer Biology, SA Pathology and the University of South Australia, Adelaide, SA 5000, Australia
- Giovanna Capovilla
  - Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen 72076, Germany
- Gökhan Cildir
  - Centre for Cancer Biology, SA Pathology and the University of South Australia, Adelaide, SA 5000, Australia
- Agnès Jourdain
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
- Vinay Tergaonkar
  - Centre for Cancer Biology, SA Pathology and the University of South Australia, Adelaide, SA 5000, Australia
  - Laboratory of NF-κB Signaling, Institute of Molecular and Cell Biology (IMCB), 61 Biopolis Drive, Proteos, Singapore 138673, Singapore
  - Department of Biochemistry, Yong Loo Lin School of Medicine, National University of Singapore (NUS), Singapore 117597, Singapore
- Markus Schmid
  - Department of Molecular Biology, Max Planck Institute for Developmental Biology, Tübingen 72076, Germany
  - Umeå Plant Science Centre, Department of Plant Physiology, Umeå University, SE-901 87 Umeå, Sweden
- Chloe Zubieta
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
- Simon J Conn
  - Laboratoire de Physiologie Cellulaire and Végétale, CNRS, CEA, INRA, Université Grenoble-Alpes, BIG, UMR 5168, Grenoble 38000, France
  - Centre for Cancer Biology, SA Pathology and the University of South Australia, Adelaide, SA 5000, Australia
18
Papastamoulis P, Rattray M. A Bayesian model selection approach for identifying differentially expressed transcripts from RNA sequencing data. J R Stat Soc Ser C Appl Stat 2017; 67:3-23. [PMID: 29353941 PMCID: PMC5763373 DOI: 10.1111/rssc.12213] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Indexed: 01/10/2023]
Abstract
Recent advances in molecular biology allow the quantification of the transcriptome and scoring transcripts as differentially or equally expressed between two biological conditions. Although these two tasks are closely linked, the available inference methods treat them separately: a primary model is used to estimate expression and its output is post-processed by using a differential expression model. In the paper, both issues are simultaneously addressed by proposing the joint estimation of expression levels and differential expression: the unknown relative abundance of each transcript can either be equal or not between two conditions. A hierarchical Bayesian model builds on the BitSeq framework and the posterior distribution of transcript expression and differential expression is inferred by using Markov chain Monte Carlo sampling. It is shown that the model proposed enjoys conjugacy for fixed dimension variables; thus the full conditional distributions are analytically derived. Two samplers are constructed, a reversible jump Markov chain Monte Carlo sampler and a collapsed Gibbs sampler, and the latter is found to perform better. A cluster representation of the aligned reads to the transcriptome is introduced, allowing parallel estimation of the marginal posterior distribution of subsets of transcripts under reasonable computing time. Under a fixed prior probability of differential expression the clusterwise sampler has the same marginal posterior distributions as the raw sampler, but a more general prior structure is also employed. The algorithm proposed is benchmarked against alternative methods by using synthetic data sets and applied to real RNA sequencing data. Source code is available online from https://github.com/mqbssppe/cjBitSeq.
19
Mei W, Liu S, Schnable JC, Yeh CT, Springer NM, Schnable PS, Barbazuk WB. A Comprehensive Analysis of Alternative Splicing in Paleopolyploid Maize. FRONTIERS IN PLANT SCIENCE 2017; 8:694. [PMID: 28539927 PMCID: PMC5423905 DOI: 10.3389/fpls.2017.00694] [Citation(s) in RCA: 41] [Impact Index Per Article: 5.1] [Received: 11/15/2016] [Accepted: 04/18/2017] [Indexed: 05/19/2023]
Abstract
Identifying and characterizing alternative splicing (AS) enables our understanding of the biological role of transcript isoform diversity. This study describes the use of publicly available RNA-Seq data to identify and characterize the global diversity of AS isoforms in maize using the inbred lines B73 and Mo17, and a related species, sorghum. Identification and characterization of AS within maize tissues revealed that genes expressed in seed exhibit the largest differential AS relative to other tissues examined. Additionally, differences in AS between the two genotypes B73 and Mo17 are greatest within genes expressed in seed. We demonstrate that changes in the level of alternatively spliced transcripts (intron retention and exon skipping) do not solely reflect differences in total transcript abundance, and we present evidence that intron retention may act to fine-tune gene expression across seed development stages. Furthermore, we have identified temperature-sensitive AS in maize and demonstrate that drought-induced changes in AS involve distinct sets of genes in reproductive and vegetative tissues. Examining our identified AS isoforms within B73 × Mo17 recombinant inbred lines (RILs) identified splicing QTL (sQTL). Of the cis-sQTL-regulated junctions, 43.3% are identified as alternatively spliced junctions in our analysis, while 10 Mb windows on each side of 48.2% of trans-sQTLs overlap with splicing-related genes. Using sorghum as an out-group enabled direct examination of loss or conservation of AS between homeologous genes representing the two subgenomes of maize. We identify several instances where AS isoforms that are conserved between one maize homeolog and its sorghum ortholog are absent from the second maize homeolog, suggesting that these AS isoforms may have been lost after the maize whole genome duplication event. This comprehensive analysis provides new insights into the complexity of AS in maize.
Affiliation(s)
- Wenbin Mei
  - Department of Biology, University of Florida, Gainesville, FL, USA
- Sanzhen Liu
  - Department of Agronomy, Iowa State University, Ames, IA, USA
  - Department of Plant Pathology, Kansas State University, Manhattan, KS, USA
- James C. Schnable
  - Department of Agronomy and Horticulture, University of Nebraska–Lincoln, Lincoln, NE, USA
- Cheng-Ting Yeh
  - Department of Agronomy, Iowa State University, Ames, IA, USA
- Nathan M. Springer
  - Department of Plant Biology, Microbial and Plant Genomics Institute, University of Minnesota, Saint Paul, MN, USA
- Patrick S. Schnable
  - Department of Agronomy, Iowa State University, Ames, IA, USA
  - Center for Plant Genomics, Iowa State University, Ames, IA, USA
- William B. Barbazuk
  - Department of Biology, University of Florida, Gainesville, FL, USA
  - Genetics Institute, University of Florida, Gainesville, FL, USA
  - *Correspondence: William B. Barbazuk,
20
Haussmann IU, Bodi Z, Sanchez-Moran E, Mongan NP, Archer N, Fray RG, Soller M. m6A potentiates Sxl alternative pre-mRNA splicing for robust Drosophila sex determination. Nature 2016; 540:301-304. [DOI: 10.1038/nature20577] [Citation(s) in RCA: 359] [Impact Index Per Article: 39.9] [Received: 03/23/2016] [Accepted: 10/25/2016] [Indexed: 12/28/2022]
21
Vleurinck C, Raub S, Sturgill D, Oliver B, Beye M. Linking Genes and Brain Development of Honeybee Workers: A Whole-Transcriptome Approach. PLoS One 2016; 11:e0157980. [PMID: 27490820 PMCID: PMC4973980 DOI: 10.1371/journal.pone.0157980] [Citation(s) in RCA: 18] [Impact Index Per Article: 2.0] [Received: 10/30/2015] [Accepted: 06/08/2016] [Indexed: 01/12/2023]
Abstract
Honeybees live in complex societies whose capabilities far exceed those of the sum of their single members. This social synergism is achieved mainly by the worker bees, which form a female caste. The worker bees display diverse collaborative behaviors and engage in different behavioral tasks, which are controlled by the central nervous system (CNS). The development of the worker brain is determined by the female sex and the worker caste determination signal. Here, we report on genes that are controlled by sex or by caste during differentiation of the worker's pupal brain. We sequenced and compared transcriptomes from the pupal brains of honeybee workers, queens and drones. We detected 333 genes that are differentially expressed and 519 genes that are differentially spliced between the sexes, and 1760 genes that are differentially expressed and 692 genes that are differentially spliced between castes. We further found that 403 genes are differentially regulated by both the sex and caste signals, providing evidence of the integration of both signals through differential gene regulation. In this gene set, we found that the molecular processes of restructuring the cell shape and cell-to-cell signaling are overrepresented. Our approach identified candidate genes that may be involved in the brain differentiation that ensures the various social worker behaviors.
Affiliation(s)
- Christina Vleurinck
  - Institute of Evolutionary Genetics, Heinrich-Heine University, Düsseldorf, Germany
- Stephan Raub
  - Centre for Information and Media Technology, Heinrich-Heine University, Düsseldorf, Germany
- David Sturgill
  - Laboratory of Cellular and Developmental Biology, NIDDK, Bethesda, Maryland, United States of America
- Brian Oliver
  - Laboratory of Cellular and Developmental Biology, NIDDK, Bethesda, Maryland, United States of America
- Martin Beye
  - Institute of Evolutionary Genetics, Heinrich-Heine University, Düsseldorf, Germany
22
Huang Y, Sanguinetti G. Statistical modeling of isoform splicing dynamics from RNA-seq time series data. Bioinformatics 2016; 32:2965-72. [PMID: 27318208 DOI: 10.1093/bioinformatics/btw364] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.2] [Received: 02/19/2016] [Accepted: 06/05/2016] [Indexed: 01/08/2023]
Abstract
Motivation: Isoform quantification is an important goal of RNA-seq experiments, yet it remains problematic for genes with low expression or several isoforms. These difficulties may in principle be ameliorated by exploiting correlated experimental designs, such as time series or dosage response experiments. Time series RNA-seq experiments, in particular, are becoming increasingly popular, yet there are no methods that explicitly leverage the experimental design to improve isoform quantification. Results: Here, we present DICEseq, the first isoform quantification method tailored to correlated RNA-seq experiments. DICEseq explicitly models the correlations between different RNA-seq experiments to aid the quantification of isoforms across experiments. Numerical experiments on simulated datasets show that DICEseq yields more accurate results than state-of-the-art methods, an advantage that can become considerable at low coverage levels. On real datasets, our results show that DICEseq provides substantially more reproducible and robust quantifications, increasing the correlation of estimates from replicate datasets by up to 10% on genes with low or moderate expression levels (bottom third of all genes). Furthermore, DICEseq makes it possible to quantify the trade-off between temporal sampling of RNA and depth of sequencing, frequently an important choice when planning experiments. Our results have strong implications for the design of RNA-seq experiments, and offer a novel tool for improved analysis of such datasets. Availability and implementation: Python code is freely available at http://diceseq.sf.net. Contact: G.Sanguinetti@ed.ac.uk. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yuanhua Huang
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
- Guido Sanguinetti
  - School of Informatics, University of Edinburgh, Edinburgh EH8 9AB, UK
  - Centre for Synthetic and Systems Biology (SynthSys), University of Edinburgh, Edinburgh EH9 3BF, UK
23
Wang Q, Sawyer IA, Sung MH, Sturgill D, Shevtsov SP, Pegoraro G, Hakim O, Baek S, Hager GL, Dundr M. Cajal bodies are linked to genome conformation. Nat Commun 2016; 7:10966. [PMID: 26997247 PMCID: PMC4802181 DOI: 10.1038/ncomms10966] [Citation(s) in RCA: 119] [Impact Index Per Article: 13.2] [Received: 12/04/2015] [Accepted: 02/07/2016] [Indexed: 12/12/2022]
Abstract
The mechanisms underlying nuclear body (NB) formation and their contribution to genome function are unknown. Here we examined the non-random positioning of Cajal bodies (CBs), major NBs involved in spliceosomal snRNP assembly and their role in genome organization. CBs are predominantly located at the periphery of chromosome territories at a multi-chromosome interface. Genome-wide chromosome conformation capture analysis (4C-seq) using CB-interacting loci revealed that CB-associated regions are enriched with highly expressed histone genes and U small nuclear or nucleolar RNA (sn/snoRNA) loci that form intra- and inter-chromosomal clusters. In particular, we observed a number of CB-dependent gene-positioning events on chromosome 1. RNAi-mediated disassembly of CBs disrupts the CB-targeting gene clusters and suppresses the expression of U sn/snoRNA and histone genes. This loss of spliceosomal snRNP production results in increased splicing noise, even in CB-distal regions. Therefore, we conclude that CBs contribute to genome organization with global effects on gene expression and RNA splicing fidelity. Nuclear bodies can nucleate at sites of active transcription and are beneficial for efficient gene expression. Here, the authors show that Cajal bodies, a prominent type of nuclear body, contribute to genome organization with global effects on gene expression and RNA splicing fidelity.
Affiliation(s)
- Qiuyan Wang
  - Department of Cell Biology, Rosalind Franklin University of Medicine and Science, Chicago Medical School, North Chicago, 60064 Illinois, USA
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Iain A Sawyer
  - Department of Cell Biology, Rosalind Franklin University of Medicine and Science, Chicago Medical School, North Chicago, 60064 Illinois, USA
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Myong-Hee Sung
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- David Sturgill
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Sergey P Shevtsov
  - Department of Cell Biology, Rosalind Franklin University of Medicine and Science, Chicago Medical School, North Chicago, 60064 Illinois, USA
- Gianluca Pegoraro
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
  - High-Throughput Imaging Facility (HiTIF), Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Ofir Hakim
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Songjoon Baek
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Gordon L Hager
  - Laboratory of Receptor Biology and Gene Expression, Center for Cancer Research, National Cancer Institute, National Institutes of Health, Bethesda, 20892 Maryland, USA
- Miroslav Dundr
  - Department of Cell Biology, Rosalind Franklin University of Medicine and Science, Chicago Medical School, North Chicago, 60064 Illinois, USA
24
Christinat Y, Pawłowski R, Krek W. jSplice: a high-performance method for accurate prediction of alternative splicing events and its application to large-scale renal cancer transcriptome data. Bioinformatics 2016; 32:2111-9. [PMID: 27153587 DOI: 10.1093/bioinformatics/btw145] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Received: 03/05/2015] [Accepted: 03/11/2016] [Indexed: 01/01/2023]
Abstract
Motivation: Alternative splicing represents a prime mechanism of post-transcriptional gene regulation whose misregulation is associated with a broad range of human diseases. Despite the vast availability of transcriptome data from different cell types and diseases, bioinformatics-based surveys of alternative splicing patterns remain a major challenge due to limited availability of analytical tools that combine high accuracy and rapidity. Results: We describe here a novel junction-centric method, jSplice, that enables de novo extraction of alternative splicing events from RNA-sequencing data with high accuracy, reliability and speed. Application to clear cell renal carcinoma (ccRCC) cell lines and 65 ccRCC patients revealed experimentally validatable alternative splicing changes and signatures able to prognosticate ccRCC outcome. In the aggregate, our results propose jSplice as a key analytic tool for the derivation of cell context-dependent alternative splicing patterns from large-scale RNA-sequencing datasets. Availability and implementation: jSplice is a standalone Python application freely available at http://www.mhs.biol.ethz.ch/research/krek/jsplice. Contact: wilhelm.krek@biol.ethz.ch. Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Yann Christinat
- Institute of Molecular Health Sciences, ETH Zurich, Zurich 8093, Switzerland
| | - Rafał Pawłowski
- Institute of Molecular Health Sciences, ETH Zurich, Zurich 8093, Switzerland
| | - Wilhelm Krek
- Institute of Molecular Health Sciences, ETH Zurich, Zurich 8093, Switzerland
| |
|
25
|
Marina RJ, Sturgill D, Bailly MA, Thenoz M, Varma G, Prigge MF, Nanan KK, Shukla S, Haque N, Oberdoerffer S. TET-catalyzed oxidation of intragenic 5-methylcytosine regulates CTCF-dependent alternative splicing. EMBO J 2015; 35:335-55. [PMID: 26711177 DOI: 10.15252/embj.201593235] [Citation(s) in RCA: 74] [Impact Index Per Article: 7.4] [Received: 10/09/2015] [Accepted: 11/25/2015] [Indexed: 01/09/2023] Open
Abstract
Intragenic 5-methylcytosine and CTCF mediate opposing effects on pre-mRNA splicing: CTCF promotes inclusion of weak upstream exons through RNA polymerase II pausing, whereas 5-methylcytosine evicts CTCF, leading to exon exclusion. However, the mechanisms governing dynamic DNA methylation at CTCF-binding sites were unclear. Here, we reveal the methylcytosine dioxygenases TET1 and TET2 as active regulators of CTCF-mediated alternative splicing through conversion of 5-methylcytosine to its oxidation derivatives. 5-hydroxymethylcytosine and 5-carboxylcytosine are enriched at an intragenic CTCF-binding site in the CD45 model gene and are associated with alternative exon inclusion. Reduced TET levels culminate in increased 5-methylcytosine, resulting in CTCF eviction and exon exclusion. In vitro analyses establish that the oxidation derivatives are not sufficient to stimulate splicing, but efficiently promote CTCF association. We further show genomewide that reciprocal exchange of 5-hydroxymethylcytosine and 5-methylcytosine at downstream CTCF-binding sites is a general feature of alternative splicing in naïve and activated CD4(+) T cells. These findings significantly expand our current concept of the pre-mRNA "splicing code" to include dynamic intragenic DNA methylation catalyzed by the TET proteins.
Affiliation(s)
- Ryan J Marina
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - David Sturgill
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Marc A Bailly
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Morgan Thenoz
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Garima Varma
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Maria F Prigge
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Kyster K Nanan
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Sanjeev Shukla
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Nazmul Haque
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| | - Shalini Oberdoerffer
- Center for Cancer Research, Laboratory of Receptor Biology and Gene Expression, National Cancer Institute, Bethesda, MD, USA
| |
|
26
|
Sxl-Dependent, tra/tra2-Independent Alternative Splicing of the Drosophila melanogaster X-Linked Gene found in neurons. G3-GENES GENOMES GENETICS 2015; 5:2865-74. [PMID: 26511498 PMCID: PMC4683657 DOI: 10.1534/g3.115.023721] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.2] [Indexed: 01/20/2023]
Abstract
Somatic sexual determination and behavior in Drosophila melanogaster are under the control of a genetic cascade initiated by Sex lethal (Sxl). In the female soma, SXL RNA-binding protein regulates the splicing of transformer (tra) transcripts into a female-specific form. The RNA-binding protein TRA and its cofactor TRA2 function in concert in females, whereas SXL, TRA, and TRA2 are thought to not function in males. To better understand sex-specific regulation of gene expression, we analyzed male and female head transcriptome datasets for expression levels and splicing, quantifying sex-biased gene expression via RNA-Seq and qPCR. Our data uncouple the effects of Sxl and tra/tra2 in females in the sex-biased alternative splicing of head transcripts from the X-linked locus found in neurons (fne), encoding a pan-neuronal RNA-binding protein of the ELAV family. We show that FNE protein levels are downregulated by Sxl in female heads, also independently of tra/tra2. We argue that this regulation may have important sexually dimorphic consequences for the regulation of nervous system development or function.
|
27
|
Hensman J, Papastamoulis P, Glaus P, Honkela A, Rattray M. Fast and accurate approximate inference of transcript expression from RNA-seq data. Bioinformatics 2015; 31:3881-9. [PMID: 26315907 PMCID: PMC4673974 DOI: 10.1093/bioinformatics/btv483] [Citation(s) in RCA: 21] [Impact Index Per Article: 2.1] [Received: 01/23/2015] [Accepted: 08/07/2015] [Indexed: 11/25/2022] Open
Abstract
Motivation: Assigning RNA-seq reads to their transcript of origin is a fundamental task in transcript expression estimation. Where ambiguities in assignments exist due to transcripts sharing sequence, e.g. alternative isoforms or alleles, the problem can be solved through probabilistic inference. Bayesian methods have been shown to provide accurate transcript abundance estimates compared with competing methods. However, exact Bayesian inference is intractable and approximate methods such as Markov chain Monte Carlo and Variational Bayes (VB) are typically used. While providing a high degree of accuracy and modelling flexibility, standard implementations can be prohibitively slow for large datasets and complex transcriptome annotations. Results: We propose a novel approximate inference scheme based on VB and apply it to an existing model of transcript expression inference from RNA-seq data. Recent advances in VB algorithmics are used to improve the convergence of the algorithm beyond the standard Variational Bayes Expectation Maximization algorithm. We apply our algorithm to simulated and biological datasets, demonstrating a significant increase in speed with only very small loss in accuracy of expression level estimation. We carry out a comparative study against seven popular alternative methods and demonstrate that our new algorithm provides excellent accuracy and inter-replicate consistency while remaining competitive in computation time. Availability and implementation: The methods were implemented in R and C++, and are available as part of the BitSeq project at github.com/BitSeq. The method is also available through the BitSeq Bioconductor package. The source code to reproduce all simulation results can be accessed via github.com/BitSeq/BitSeqVB_benchmarking. Contact: james.hensman@sheffield.ac.uk or panagiotis.papastamoulis@manchester.ac.uk or Magnus.Rattray@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- James Hensman
- Sheffield Institute for Translational Neuroscience (SITraN), Sheffield, UK
| | | | - Peter Glaus
- School of Computer Science, The University of Manchester, Manchester, UK
| | - Antti Honkela
- Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland
| | | |
|
28
|
Abstract
The modENCODE (Model Organism Encyclopedia of DNA Elements) Consortium aimed to map functional elements (including transcripts, chromatin marks, regulatory factor binding sites, and origins of DNA replication) in the model organisms Drosophila melanogaster and Caenorhabditis elegans. During its five-year span, the consortium conducted more than 2,000 genome-wide assays in developmentally staged animals, dissected tissues, and homogeneous cell lines. Analysis of these data sets provided foundational insights into genome, epigenome, and transcriptome structure and the evolutionary turnover of regulatory pathways. These studies facilitated a comparative analysis with similar data types produced by the ENCODE Consortium for human cells. Genome organization differs drastically in these distant species, and yet quantitative relationships among chromatin state, transcription, and cotranscriptional RNA processing are deeply conserved. Of the many biological discoveries of the modENCODE Consortium, we highlight insights that emerged from integrative studies. We focus on operational and scientific lessons that may aid future projects of similar scale or aims in other, emerging model systems.
Affiliation(s)
- James B Brown
- Department of Statistics, University of California, Berkeley, California 94720
| | | |
|
29
|
X Chromosome and Autosome Dosage Responses in Drosophila melanogaster Heads. G3-GENES GENOMES GENETICS 2015; 5:1057-63. [PMID: 25850426 PMCID: PMC4478536 DOI: 10.1534/g3.115.017632] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.5] [Indexed: 12/12/2022]
Abstract
X chromosome dosage compensation is required for male viability in Drosophila. Dosage compensation relative to autosomes is two-fold, but this is likely to be due to a combination of homeostatic gene-by-gene regulation and chromosome-wide regulation. We have baseline values for gene-by-gene dosage compensation on autosomes, but not for the X chromosome. Given the evolutionary history of sex chromosomes, these baseline values could differ. We used a series of deficiencies on the X and autosomes, along with mutations in the sex-determination gene transformer-2, to carefully measure the sex-independent X-chromosome response to gene dosage in adult heads by RNA sequencing. We observed modest and indistinguishable dosage compensation for both X chromosome and autosome genes, suggesting that the X chromosome is neither inherently more robust nor sensitive to dosage change.
|
30
|
Sex- and tissue-specific functions of Drosophila doublesex transcription factor target genes. Dev Cell 2015; 31:761-73. [PMID: 25535918 DOI: 10.1016/j.devcel.2014.11.021] [Citation(s) in RCA: 91] [Impact Index Per Article: 9.1] [Received: 03/27/2014] [Revised: 10/02/2014] [Accepted: 11/13/2014] [Indexed: 11/20/2022]
Abstract
Primary sex-determination "switches" evolve rapidly, but Doublesex (DSX)-related transcription factors (DMRTs) act downstream of these switches to control sexual development in most animal species. Drosophila dsx encodes female- and male-specific isoforms (DSX(F) and DSX(M)), but little is known about how dsx controls sexual development, whether DSX(F) and DSX(M) bind different targets, or how DSX proteins direct different outcomes in diverse tissues. We undertook genome-wide analyses to identify DSX targets using in vivo occupancy, binding site prediction, and evolutionary conservation. We find that DSX(F) and DSX(M) bind thousands of the same targets in multiple tissues in both sexes, yet these targets have sex- and tissue-specific functions. Interestingly, DSX targets show considerable overlap with targets identified for mouse DMRT1. DSX targets include transcription factors and signaling pathway components providing for direct and indirect regulation of sex-biased expression.
|
31
|
Chen ZX, Sturgill D, Qu J, Jiang H, Park S, Boley N, Suzuki AM, Fletcher AR, Plachetzki DC, FitzGerald PC, Artieri CG, Atallah J, Barmina O, Brown JB, Blankenburg KP, Clough E, Dasgupta A, Gubbala S, Han Y, Jayaseelan JC, Kalra D, Kim YA, Kovar CL, Lee SL, Li M, Malley JD, Malone JH, Mathew T, Mattiuzzo NR, Munidasa M, Muzny DM, Ongeri F, Perales L, Przytycka TM, Pu LL, Robinson G, Thornton RL, Saada N, Scherer SE, Smith HE, Vinson C, Warner CB, Worley KC, Wu YQ, Zou X, Cherbas P, Kellis M, Eisen MB, Piano F, Kionte K, Fitch DH, Sternberg PW, Cutter AD, Duff MO, Hoskins RA, Graveley BR, Gibbs RA, Bickel PJ, Kopp A, Carninci P, Celniker SE, Oliver B, Richards S. Comparative validation of the D. melanogaster modENCODE transcriptome annotation. Genome Res 2015; 24:1209-23. [PMID: 24985915 PMCID: PMC4079975 DOI: 10.1101/gr.159384.113] [Citation(s) in RCA: 111] [Impact Index Per Article: 11.1] [Indexed: 01/08/2023]
Abstract
Accurate gene model annotation of reference genomes is critical for making them useful. The modENCODE project has improved the D. melanogaster genome annotation by using deep and diverse high-throughput data. Since transcriptional activity that has been evolutionarily conserved is likely to have an advantageous function, we have performed large-scale interspecific comparisons to increase confidence in predicted annotations. To support comparative genomics, we filled in divergence gaps in the Drosophila phylogeny by generating draft genomes for eight new species. For comparative transcriptome analysis, we generated mRNA expression profiles on 81 samples from multiple tissues and developmental stages of 15 Drosophila species, and we performed cap analysis of gene expression in D. melanogaster and D. pseudoobscura. We also describe conservation of four distinct core promoter structures composed of combinations of elements at three positions. Overall, each type of genomic feature shows a characteristic divergence rate relative to neutral models, highlighting the value of multispecies alignment in annotating a target genome that should prove useful in the annotation of other high priority genomes, especially human and other mammalian genomes that are rich in noncoding sequences. We report that the vast majority of elements in the annotation are evolutionarily conserved, indicating that the annotation will be an important springboard for functional genetic testing by the Drosophila community.
Affiliation(s)
- Zhen-Xia Chen
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - David Sturgill
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Jiaxin Qu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Huaiyang Jiang
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Soo Park
- Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
| | - Nathan Boley
- Department of Statistics, University of California, Berkeley, California 94720, USA
| | - Ana Maria Suzuki
- Technology Development Group, RIKEN Omics Science Center and RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama City, Kanagawa, Japan 230-0045
| | - Anthony R Fletcher
- Division of Computational Bioscience, Center For Information Technology, National Institutes of Health, Bethesda, Maryland 20814, USA
| | - David C Plachetzki
- Department of Evolution and Ecology, University of California, Davis, California 95616, USA
| | - Peter C FitzGerald
- National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Carlo G Artieri
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Joel Atallah
- Department of Evolution and Ecology, University of California, Davis, California 95616, USA
| | - Olga Barmina
- Department of Evolution and Ecology, University of California, Davis, California 95616, USA
| | - James B Brown
- Department of Statistics, University of California, Berkeley, California 94720, USA
| | - Kerstin P Blankenburg
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Emily Clough
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Abhijit Dasgupta
- Clinical Trials and Outcomes Branch, National Institute of Arthritis and Musculoskeletal and Skin Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Sai Gubbala
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Yi Han
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Joy C Jayaseelan
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Divya Kalra
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Yoo-Ah Kim
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Christie L Kovar
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Sandra L Lee
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Mingmei Li
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - James D Malley
- Division of Computational Bioscience, Center For Information Technology, National Institutes of Health, Bethesda, Maryland 20814, USA
| | - John H Malone
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Tittu Mathew
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Nicolas R Mattiuzzo
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Mala Munidasa
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Donna M Muzny
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Fiona Ongeri
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Lora Perales
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Teresa M Przytycka
- National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Ling-Ling Pu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Garrett Robinson
- Department of Statistics, University of California, Berkeley, California 94720, USA
| | - Rebecca L Thornton
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Nehad Saada
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Steven E Scherer
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Harold E Smith
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Charles Vinson
- National Cancer Institute, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Crystal B Warner
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Kim C Worley
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Yuan-Qing Wu
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Xiaoyan Zou
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Peter Cherbas
- Department of Biology, Indiana University, Bloomington, Indiana 47405, USA
| | - Manolis Kellis
- Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
| | - Michael B Eisen
- Molecular and Cell Biology, University of California, Berkeley, California 94720, USA
| | - Fabio Piano
- Department of Biology, New York University, New York, New York 10003, USA
| | - Karin Kionte
- Department of Biology, New York University, New York, New York 10003, USA
| | - David H Fitch
- Department of Biology, New York University, New York, New York 10003, USA
| | - Paul W Sternberg
- HHMI and Division of Biology, California Institute of Technology, Pasadena, California 91125, USA
| | - Asher D Cutter
- Department of Ecology and Evolutionary Biology, University of Toronto, Toronto, M5S 3B2, Canada
| | - Michael O Duff
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, Connecticut 06030-6403, USA
| | - Roger A Hoskins
- Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
| | - Brenton R Graveley
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, Farmington, Connecticut 06030-6403, USA
| | - Richard A Gibbs
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| | - Peter J Bickel
- Department of Statistics, University of California, Berkeley, California 94720, USA
| | - Artyom Kopp
- Department of Evolution and Ecology, University of California, Davis, California 95616, USA
| | - Piero Carninci
- Technology Development Group, RIKEN Omics Science Center and RIKEN Center for Life Science Technologies, Division of Genomic Technologies, Yokohama City, Kanagawa, Japan 230-0045
| | - Susan E Celniker
- Department of Genome Dynamics, Life Sciences Division, Lawrence Berkeley National Laboratory, Berkeley, California 94720, USA
| | - Brian Oliver
- National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
| | - Stephen Richards
- Human Genome Sequencing Center, Baylor College of Medicine, Houston, Texas 77030, USA
| |
|
32
|
Fifteen novel immunoreactive proteins of Chinese virulent Haemophilus parasuis serotype 5 verified by an immunoproteomic assay. Folia Microbiol (Praha) 2014; 60:81-7. [PMID: 25200063 DOI: 10.1007/s12223-014-0343-1] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.7] [Received: 11/19/2013] [Accepted: 08/25/2014] [Indexed: 11/27/2022]
Abstract
Haemophilus parasuis (H. parasuis) is associated with meningitis, polyserositis, polyarthritis and bacterial pneumonia. At present, its prevention and control are difficult because of the lack of suitable subunit vaccines. High-throughput immunoproteomic methods are now available to screen for more vaccine candidates. A protein extraction method for H. parasuis and two-dimensional electrophoresis (2-DE) were optimized to provide high-resolution profiles covering pH 3 to 10. Twenty immunoreactive spots were excised from gels after strict comparison between 2-DE Western blot membranes and the relevant gels. Matrix-assisted laser desorption/ionization-time of flight-mass spectrometry (MALDI-TOF-MS) and MALDI-TOF-TOF-MS successfully identified 16 different proteins. Fifteen of them were reported as immunoreactive proteins in H. parasuis for the first time. In addition, recombinant HP5-7 (ABC transporter, periplasmic-binding protein) showed immunoreactivity both with hyperimmune rabbit serum and convalescent swine serum. Four recombinants of the 14 successfully expressed genes showed immunoreactivity with hyperimmune rabbit serum.
|
33
|
Lee H, McManus CJ, Cho DY, Eaton M, Renda F, Somma MP, Cherbas L, May G, Powell S, Zhang D, Zhan L, Resch A, Andrews J, Celniker SE, Cherbas P, Przytycka TM, Gatti M, Oliver B, Graveley B, MacAlpine D. DNA copy number evolution in Drosophila cell lines. Genome Biol 2014; 15:R70. [PMID: 25262759 PMCID: PMC4289277 DOI: 10.1186/gb-2014-15-8-r70] [Citation(s) in RCA: 88] [Impact Index Per Article: 8.0] [Received: 03/28/2013] [Accepted: 07/01/2014] [Indexed: 12/19/2022] Open
Abstract
BACKGROUND Structural rearrangements of the genome resulting in genic imbalance due to copy number change are often deleterious at the organismal level, but are common in immortalized cell lines and tumors, where they may be an advantage to cells. In order to explore the biological consequences of copy number changes in the Drosophila genome, we resequenced the genomes of 19 tissue-culture cell lines and generated RNA-Seq profiles. RESULTS Our work revealed dramatic duplications and deletions in all cell lines. We found three lines of evidence indicating that copy number changes were due to selection during tissue culture. First, we found that copy numbers correlated to maintain stoichiometric balance in protein complexes and biochemical pathways, consistent with the gene balance hypothesis. Second, while most copy number changes were cell line-specific, we identified some copy number changes shared by many of the independent cell lines. These included dramatic recurrence of increased copy number of the PDGF/VEGF receptor, which is also over-expressed in many cancer cells, and of bantam, an anti-apoptosis miRNA. Third, even when copy number changes seemed distinct between lines, there was strong evidence that they supported a common phenotypic outcome. For example, we found that proto-oncogenes were over-represented in one cell line (S2-DRSC), whereas tumor suppressor genes were under-represented in another (Kc167). CONCLUSION Our study illustrates how genome structure changes may contribute to selection of cell lines in vitro. This has implications for other cell-level natural selection progressions, including tumorigenesis.
Affiliation(s)
- Hangnoh Lee
- National Institute of Diabetes, Digestive, and Kidney Diseases, National Institutes of Health, 50 South Drive, Bethesda, MD 20892 USA
| | - C Joel McManus
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030 USA
- Department of Biological Sciences, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213 USA
| | - Dong-Yeon Cho
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20892 USA
| | - Matthew Eaton
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Levine Science Research Center, 308 Research Drive, Durham, NC 27708 USA
| | - Fioranna Renda
- Istituto di Biologia e Patologia Molecolari (IBPM) del CNR and Dipartimento di Biologia e Biotecnologie, Sapienza, Università di Roma, 5 Aldo Moro Piazzale, Rome, 00185 Italy
| | - Maria Patrizia Somma
- Istituto di Biologia e Patologia Molecolari (IBPM) del CNR and Dipartimento di Biologia e Biotecnologie, Sapienza, Università di Roma, 5 Aldo Moro Piazzale, Rome, 00185 Italy
| | - Lucy Cherbas
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405 USA
| | - Gemma May
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030 USA
- Department of Biological Sciences, Carnegie Mellon University, 4400 Fifth Avenue, Pittsburgh, PA 15213 USA
| | - Sara Powell
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Levine Science Research Center, 308 Research Drive, Durham, NC 27708 USA
| | - Dayu Zhang
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405 USA
- School of Agricultural and Food Science, Zhejiang A&F University, 88 Huan Cheng Bei Road, Lin’an, Zhejiang 311300 China
| | - Lijun Zhan
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030 USA
| | - Alissa Resch
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030 USA
| | - Justen Andrews
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405 USA
| | - Susan E Celniker
- Department of Genome Dynamics, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720 USA
| | - Peter Cherbas
- Department of Biology, Indiana University, 1001 East 3rd Street, Bloomington, IN 47405 USA
| | - Teresa M Przytycka
- Computational Biology Branch, National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 8600 Rockville Pike, Bethesda, MD 20892 USA
| | - Maurizio Gatti
- Istituto di Biologia e Patologia Molecolari (IBPM) del CNR and Dipartimento di Biologia e Biotecnologie, Sapienza, Università di Roma, 5 Aldo Moro Piazzale, Rome, 00185 Italy
| | - Brian Oliver
- National Institute of Diabetes, Digestive, and Kidney Diseases, National Institutes of Health, 50 South Drive, Bethesda, MD 20892 USA
| | - Brenton Graveley
- Department of Genetics and Developmental Biology, Institute for Systems Genomics, University of Connecticut Health Center, 400 Farmington Avenue, Farmington, CT 06030 USA
| | - David MacAlpine
- Department of Pharmacology and Cancer Biology, Duke University Medical Center, Levine Science Research Center, 308 Research Drive, Durham, NC 27708 USA
| |
|
34
|
Abstract
Post-transcriptional pre-mRNA splicing has emerged as a critical step in the gene expression cascade greatly influencing diversification and spatiotemporal control of the proteome in many developmental processes. The percentage of genes targeted by alternative splicing (AS) is shown to be over 95% in humans and 60% in Drosophila. Therefore, it is evident that deregulation of this process underlies many genetic diseases. Among all tissues, the brain shows the highest transcriptome diversity, which is not surprising in view of the complex inter- and intracellular networks underlying the development of this organ. Reports of isoforms known to function at different steps during Drosophila nervous system development are rapidly increasing as well as knowledge on their regulation and function, highlighting the role of AS during neuronal development in Drosophila.
Affiliation(s)
- Carmen Mohr
- Institute of Human Genetics, University Medical Center Freiburg, Freiburg, Germany
| | | |
|