1
Young AM, Van Buren S, Rashid NU. Differential transcript usage analysis incorporating quantification uncertainty via compositional measurement error regression modeling. Biostatistics 2024; 25:559-576. [PMID: 37040757 PMCID: PMC11017126 DOI: 10.1093/biostatistics/kxad008] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/01/2021] [Revised: 12/22/2022] [Accepted: 02/06/2023] [Indexed: 04/13/2023] Open
Abstract
Differential transcript usage (DTU) occurs when the relative expression of multiple transcripts arising from the same gene changes between different conditions. Existing approaches to detect DTU often rely on computational procedures that can have speed and scalability issues as the number of samples increases. Here we propose a new method, CompDTU, that uses compositional regression to model the relative abundance proportions of each transcript that are of interest in DTU analyses. This procedure leverages fast matrix-based computations that make it ideally suited for DTU analysis with larger sample sizes. This method also allows for the testing of and adjustment for multiple categorical or continuous covariates. Additionally, many existing approaches for DTU ignore quantification uncertainty in the expression estimates for each transcript in RNA-seq data. We extend our CompDTU method to incorporate quantification uncertainty, leveraging common output from RNA-seq expression quantification tools, in a novel method, CompDTUme. Through several power analyses, we show that CompDTU has excellent sensitivity and reduces false positive results relative to existing methods. Additionally, CompDTUme results in further improvements in performance over CompDTU with sufficient sample size for genes with high levels of quantification uncertainty, while also maintaining favorable speed and scalability. We motivate our methods using data from The Cancer Genome Atlas Breast Invasive Carcinoma data set, specifically using RNA-seq data from primary tumors for 740 patients with breast cancer. We show greatly reduced computation time from our new methods as well as the ability to detect several novel genes with significant DTU across different breast cancer subtypes.
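As a rough illustration of the compositional idea in this abstract, the sketch below converts per-transcript counts for one gene into proportions and maps them to unconstrained log-ratio coordinates, on which ordinary regression-style comparisons become possible. The additive log-ratio transform, the pseudocount of 0.5, the function names, and the toy counts are all assumptions made for illustration; the abstract does not state which transform CompDTU uses, and this is not the package's code.

```python
import math

def transcript_proportions(counts, pseudocount=0.5):
    """Convert one gene's per-transcript counts into proportions.

    The pseudocount is an illustrative choice that avoids log(0) later.
    """
    shifted = [c + pseudocount for c in counts]
    total = sum(shifted)
    return [s / total for s in shifted]

def additive_log_ratio(proportions):
    """Map D proportions to D-1 log-ratios, using the last transcript as reference."""
    reference = proportions[-1]
    return [math.log(p / reference) for p in proportions[:-1]]

def column_means(rows):
    """Mean of each coordinate across samples."""
    return [sum(col) / len(rows) for col in zip(*rows)]

# Toy data (hypothetical): one gene, three transcripts, two samples per condition.
samples = {
    "A": [[90, 5, 5], [80, 10, 10]],
    "B": [[30, 60, 10], [40, 50, 10]],
}

# Transform every sample into log-ratio coordinates.
coords = {
    cond: [additive_log_ratio(transcript_proportions(c)) for c in reps]
    for cond, reps in samples.items()
}

# Difference in mean coordinates between conditions (B minus A).
# A nonzero shift suggests the transcript composition differs; a real
# analysis would test this shift formally rather than just inspect it.
shift = [b - a for a, b in zip(column_means(coords["A"]), column_means(coords["B"]))]
print(shift)
```

Working in log-ratio coordinates removes the sum-to-one constraint on proportions, which is what allows standard multivariate linear-model machinery to be applied.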
Affiliation(s)
- Amber M Young
  - Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, Chapel Hill, NC, 27599, USA
- Scott Van Buren
  - Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, Chapel Hill, NC, 27599, USA
- Naim U Rashid
  - Department of Biostatistics, University of North Carolina at Chapel Hill, 135 Dauer Drive, Chapel Hill, NC, 27599, USA and Lineberger Comprehensive Cancer Center, University of North Carolina at Chapel Hill, 450 West Drive, Chapel Hill, NC, 27599, USA
2
Sananmuang T, Puthier D, Nguyen C, Chokeshaiusaha K. Differential transcript usage across mammalian oocytes at the germinal vesicle and metaphase II stages. Theriogenology 2024; 215:1-9. [PMID: 37995439 DOI: 10.1016/j.theriogenology.2023.11.016] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/22/2023] [Revised: 11/11/2023] [Accepted: 11/13/2023] [Indexed: 11/25/2023]
Abstract
Ongoing progress in mRNA-Sequencing technologies has significantly contributed to the refinement of assisted reproductive technologies. However, prior investigations have predominantly concentrated on alterations in overall gene expression levels, leaving a considerable gap in our understanding of the influence of transcript isoform expression on fundamental cellular mechanisms of oocytes. Given the efficacy of differential transcript usage (DTU) analysis in addressing this gap, we conducted a comprehensive DTU analysis utilizing mRNA-Seq datasets of germinal vesicle (GV) and metaphase II (MII) oocytes across six mammalian species from the SRA database: cow, donkey, horse, human, mouse, and pig. To further illuminate the roles of these genes, we also conducted a rigorous Gene Ontology (GO) term enrichment analysis. While the DTU analysis of each species revealed several genes with alterations in their transcript isoform usage, referred to as DTU genes, this study focused on only ten cross-species DTU genes shared among a minimum of five distinct species (FDR≤0.05). These cross-species DTU genes were as follows: ABCF1, CDC6, CFAP36, CNOT10, DNM3, IWS1, NBN, NDEL1, RAD50 and ZCCHC17. GO term enrichment analysis unveiled the alignment of these cross-species DTU gene functions with RNA and cell-cycle control mechanisms across diverse mammalian species, suggesting their vital roles during oocyte maturation. Further exploration of the transcript isoforms of these genes therefore has the potential to uncover novel transcript isoform markers for future reproductive technologies in both human and animal contexts.
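The cross-species filter described in this abstract (keep genes called DTU in at least five species) can be sketched as a simple counting step. The function name, the placeholder gene identifiers, and the toy per-species sets below are hypothetical; the sketch also assumes that orthologous genes have already been mapped to a shared identifier and that each species' set contains only genes passing the FDR threshold.

```python
from collections import Counter

def shared_dtu_genes(dtu_by_species, min_species=5):
    """Return genes that appear in the DTU set of at least `min_species` species."""
    # set(genes) guards against a gene being listed twice for one species.
    counts = Counter(g for genes in dtu_by_species.values() for g in set(genes))
    return sorted(g for g, n in counts.items() if n >= min_species)

# Toy input: placeholder gene IDs, not results from the study.
dtu_by_species = {
    "cow": {"geneA", "geneB", "geneC"},
    "donkey": {"geneA", "geneB"},
    "horse": {"geneA", "geneC"},
    "human": {"geneA", "geneB"},
    "mouse": {"geneA", "geneB", "geneD"},
    "pig": {"geneA", "geneB"},
}

shared = shared_dtu_genes(dtu_by_species, min_species=5)
print(shared)  # ['geneA', 'geneB']
```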
Affiliation(s)
- Thanida Sananmuang
  - Rajamangala University of Technology Tawan-OK, Faculty of Veterinary Medicine, Chonburi, Thailand
- Denis Puthier
  - Aix-Marseille Université, INSERM UMR 1090, TAGC, Marseille, France
- Catherine Nguyen
  - Aix-Marseille Université, INSERM UMR 1090, TAGC, Marseille, France
- Kaj Chokeshaiusaha
  - Rajamangala University of Technology Tawan-OK, Faculty of Veterinary Medicine, Chonburi, Thailand
3
Weller AE, Doyle GA, Reiner BC, Crist RC, Berrettini WH. Analysis of differential gene expression and transcript usage in hippocampus of Apoe null mutant mice: Implications for Alzheimer's disease. Neurosci Res 2022; 176:85-89. [PMID: 34757086 PMCID: PMC8960320 DOI: 10.1016/j.neures.2021.10.010] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 07/24/2021] [Revised: 09/28/2021] [Accepted: 10/27/2021] [Indexed: 11/17/2022]
Abstract
A dataset of single-nucleus RNA sequencing (snRNAseq) data was analyzed using Seurat, Sierra, and Ingenuity Pathway Analysis (IPA) programs to assess differentially expressed genes (DEGs) and differential transcript usage (DTU) in mouse hippocampal cell types. Seurat identified DEGs between the wild type (WT) and Apoe knockout (EKO) mice. IPA identified 11 statistically significant canonical pathways in >1 cell type. Sierra identified Sipa1l1 with DTU between WT and EKO samples. Analysis of the Sipa1l1 peak region identified an alternative non-canonical polyadenylation signal and a putative cytoplasmic polyadenylation element. APOE regulation of gene transcription and co-transcriptional RNA processing may underlie Alzheimer's disease.
Affiliation(s)
- Andrew E. Weller
  - Center for Neurobiology and Behavior, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, 19104. Corresponding author: Andrew E. Weller, MD, 125 S. 31 St., room 2208-2, Philadelphia, PA 19104, Office: (215) 898-6417, Fax: (215) 573-2041
- Glenn A. Doyle
  - Center for Neurobiology and Behavior, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
- Benjamin C. Reiner
  - Center for Neurobiology and Behavior, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
- Richard C. Crist
  - Center for Neurobiology and Behavior, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
- Wade H. Berrettini
  - Center for Neurobiology and Behavior, Department of Psychiatry, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania, 19104
4
Charton C, Youm DJ, Ko BJ, Seol D, Kim B, Chai HH, Lim D, Kim H. The transcriptomic blueprint of molt in rooster using various tissues from Ginkkoridak (Korean long-tailed chicken). BMC Genomics 2021; 22:594. [PMID: 34348642 PMCID: PMC8340483 DOI: 10.1186/s12864-021-07903-9] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2021] [Accepted: 07/13/2021] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND Annual molt is a critical stage in the life cycle of birds. Although the most extensively documented aspects of molt are the renewing of plumage and the remodeling of the reproductive tract in laying hens, molt in chicken deeply affects various tissues and physiological functions. However, with the exception of the reproductive tract, the effect of molt on gene expression across the tissues known to be affected by molt has to date never been investigated. The present study aimed to decipher the transcriptomic effects of molt in Ginkkoridak, a Korean long-tailed chicken. Messenger RNA data available across 24 types of tissue samples (9 males) and a combination of mRNA and miRNA data on blood from 10 males and 10 females were used. RESULTS The impact of molt on gene expression and gene transcript usage appeared to vary substantially across tissue types in terms of histological entities or physiological functions, particularly those related to the nervous system. Blood was the tissue most affected by molt in terms of differentially expressed genes in both sexes, closely followed by meninges, bone marrow and heart. The effect of molt in blood appeared to differ between males and females, with a more than fivefold difference in the number of down-regulated genes between the sexes. The blueprint of molt in roosters appeared to be specific to tissues or groups of tissues, with relatively few genes replicating extensively across tissues, except for the spliceosome genes (U1, U4) and the ribosomal proteins (RPL21, RPL23). By integrating miRNA and mRNA data, potential roles of miRNA during molt were discovered, such as regulation of neurogenesis, regulation of immunity and development of various organs. Furthermore, reliable candidate biomarkers of molt were found, which are related to cell dynamics, the nervous system or immunity, processes or functions that have been shown to be extensively modulated in response to molt.
CONCLUSIONS Our results provide a comprehensive, whole-organism description of the effects of molt on the transcriptome in chicken. The conclusions of this study can also serve as a valuable resource for future transcriptome analyses of chicken and provide new insights related to molt.
Affiliation(s)
- Clémentine Charton
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Dong-Jae Youm
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Byung June Ko
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
- Donghyeok Seol
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
  - eGnome, Inc, Seoul, Republic of Korea
- Bongsang Kim
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
  - eGnome, Inc, Seoul, Republic of Korea
- Han-Ha Chai
  - Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA, 1500, Wanju, Republic of Korea
- Dajeong Lim
  - Animal Genomics & Bioinformatics Division, National Institute of Animal Science, RDA, 1500, Wanju, Republic of Korea
- Heebal Kim
  - Department of Agricultural Biotechnology and Research Institute of Agriculture and Life Sciences, Seoul National University, Seoul, Republic of Korea
  - eGnome, Inc, Seoul, Republic of Korea
5
Tiberi S, Robinson MD. BANDITS: Bayesian differential splicing accounting for sample-to-sample variability and mapping uncertainty. Genome Biol 2020; 21:69. [PMID: 32178699 PMCID: PMC7075019 DOI: 10.1186/s13059-020-01967-8] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Received: 09/16/2019] [Accepted: 02/20/2020] [Indexed: 01/12/2023] Open
Abstract
Alternative splicing is a biological process during gene expression that allows a single gene to code for multiple proteins. However, splicing patterns can be altered in some conditions or diseases. Here, we present BANDITS, an R/Bioconductor package to perform differential splicing, at both gene and transcript level, based on RNA-seq data. BANDITS uses a Bayesian hierarchical structure to explicitly model the variability between samples and treats the transcript allocation of reads as latent variables. We perform an extensive benchmark across both simulated and experimental RNA-seq datasets, where BANDITS has extremely favourable performance with respect to the competitors considered.
Affiliation(s)
- Simone Tiberi
  - Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
- Mark D. Robinson
  - Institute of Molecular Life Sciences and SIB Swiss Institute of Bioinformatics, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
6
Van den Berge K, Soneson C, Robinson MD, Clement L. stageR: a general stage-wise method for controlling the gene-level false discovery rate in differential expression and differential transcript usage. Genome Biol 2017; 18:151. [PMID: 28784146 PMCID: PMC5547545 DOI: 10.1186/s13059-017-1277-0] [Citation(s) in RCA: 54] [Impact Index Per Article: 7.7] [Received: 01/10/2017] [Accepted: 07/30/2017] [Indexed: 12/11/2022] Open
Abstract
RNA sequencing studies with complex designs and transcript-resolution analyses involve multiple hypotheses per gene; however, conventional approaches fail to control the false discovery rate (FDR) at gene level. We propose stageR, a two-stage testing paradigm that leverages the increased power of aggregated gene-level tests and allows post hoc assessment for significant genes. This method provides gene-level FDR control and boosts power for testing interaction effects. In transcript-level analysis, it provides a framework that performs powerful gene-level tests while maintaining biological interpretation at transcript-level resolution. The procedure is applicable whenever individual hypotheses can be aggregated, providing a unified framework for complex high-throughput experiments.
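The two-stage idea in this abstract can be sketched as a gene-level screening step followed by transcript-level confirmation within the genes that pass. The version below is a simplified illustration under stated assumptions: Benjamini-Hochberg is used for screening, plain Holm correction is used within each gene, and the confirmation level is scaled by the fraction of genes that passed screening. The function names and toy p-values are hypothetical, and this is not the stageR package's API or its exact within-gene adjustment.

```python
def benjamini_hochberg(pvalues, alpha):
    """Indices rejected by the BH step-up procedure at FDR level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff_rank = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / m:
            cutoff_rank = rank  # keep the largest rank that passes
    return set(order[:cutoff_rank])

def holm(pvalues, level):
    """Indices rejected by Holm's step-down procedure at the given level."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    rejected = set()
    for step, i in enumerate(order):
        if pvalues[i] > level / (m - step):
            break  # step-down: stop at the first non-rejection
        rejected.add(i)
    return rejected

def stagewise(gene_p, transcript_p, alpha=0.05):
    """Stage 1: screen genes. Stage 2: confirm transcripts in screened genes."""
    genes = list(gene_p)
    screened = benjamini_hochberg([gene_p[g] for g in genes], alpha)
    passed = [genes[i] for i in sorted(screened)]
    # Assumed confirmation level: alpha scaled by the share of genes passing stage 1.
    level = alpha * len(passed) / len(genes)
    result = {}
    for g in passed:
        names = list(transcript_p[g])
        hits = holm([transcript_p[g][t] for t in names], level)
        result[g] = [names[i] for i in sorted(hits)]
    return result

# Toy p-values (hypothetical).
gene_p = {"g1": 0.001, "g2": 0.8, "g3": 0.02, "g4": 0.6}
transcript_p = {
    "g1": {"t1": 0.0005, "t2": 0.3, "t3": 0.004},
    "g2": {"t1": 0.9, "t2": 0.7},
    "g3": {"t1": 0.2, "t2": 0.04},
    "g4": {"t1": 0.5, "t2": 0.6},
}
significant = stagewise(gene_p, transcript_p)
print(significant)  # {'g1': ['t1', 't3'], 'g3': []}
```

In the toy output, g3 passes the gene-level screen but no individual transcript is confirmed, which shows how the aggregated test can be more powerful than the per-transcript tests.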
Affiliation(s)
- Koen Van den Berge
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan 281, S9, Ghent, 9000 Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, 9000 Belgium
- Charlotte Soneson
  - Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
  - SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057 Switzerland
- Mark D. Robinson
  - Institute of Molecular Life Sciences, University of Zurich, Winterthurerstrasse 190, Zurich, 8057 Switzerland
  - SIB Swiss Institute of Bioinformatics, University of Zurich, Zurich, 8057 Switzerland
- Lieven Clement
  - Department of Applied Mathematics, Computer Science and Statistics, Ghent University, Krijgslaan 281, S9, Ghent, 9000 Belgium
  - Bioinformatics Institute Ghent, Ghent University, Ghent, 9000 Belgium