1. Chen Y, Chen L, Lun AL, Baldoni P, Smyth G. edgeR v4: powerful differential analysis of sequencing data with expanded functionality and improved support for small counts and larger datasets. Nucleic Acids Res 2025; 53:gkaf018. [PMID: 39844453 PMCID: PMC11754124 DOI: 10.1093/nar/gkaf018] [Received: 01/17/2024] [Revised: 11/22/2024] [Accepted: 01/08/2025] [Indexed: 01/24/2025]
Abstract
edgeR is an R/Bioconductor software package for differential analyses of sequencing data in the form of read counts for genes or genomic features. Over the past 15 years, edgeR has been a popular choice for statistical analysis of data from sequencing technologies such as RNA-seq or ChIP-seq. edgeR pioneered the use of the negative binomial distribution to model read count data with replicates and the use of generalized linear models to analyze complex experimental designs. edgeR implements empirical Bayes moderation methods to allow reliable inference when the number of replicates is small. This article announces edgeR version 4, which includes new developments across a range of application areas. Infrastructure improvements include support for fractional counts, implementation of model fitting in C and a new statistical treatment of the quasi-likelihood pipeline that improves accuracy for small counts. The revised package has new functionality for differential methylation analysis, differential transcript expression, differential transcript and exon usage, testing relative to a fold-change threshold and pathway analysis. This article reviews the statistical framework and computational implementation of edgeR, briefly summarizing all the existing features and functionalities but with special attention to new features and those that have not been described previously.
Affiliation(s)
- Yunshun Chen
  - Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia
  - ACRF Cancer Biology and Stem Cells Division, WEHI, Parkville, VIC 3052, Australia
- Lizhong Chen
  - Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC 3010, Australia
- Aaron T L Lun
  - Computational Sciences, Genentech Inc, 1 DNA Way, South San Francisco, CA 94080, United States
- Pedro L Baldoni
  - Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia
  - Department of Medical Biology, The University of Melbourne, Parkville, VIC 3010, Australia
- Gordon K Smyth
  - Bioinformatics Division, WEHI, Parkville, VIC 3052, Australia
  - School of Mathematics and Statistics, The University of Melbourne, Parkville, VIC 3010, Australia
2. Hogrel G, Marino-Puertas L, Laurent S, Ibrahim Z, Covès J, Girard E, Gabel F, Fenel D, Daugeron MC, Clouet-d'Orval B, Basta T, Flament D, Franzetti B. Characterization of a small tRNA-binding protein that interacts with the archaeal proteasome complex. Mol Microbiol 2022; 118:16-29. [PMID: 35615908 PMCID: PMC9540759 DOI: 10.1111/mmi.14948] [Received: 11/30/2020] [Revised: 05/15/2022] [Accepted: 05/17/2022] [Indexed: 11/27/2022]
Abstract
The proteasome system allows the elimination of functional or structurally impaired proteins. This includes the degradation of nascent peptides. In Archaea, how the proteasome complex interacts with the translational machinery remains to be described. Here, we characterised a small orphan protein, Q9UZY3 (UniProt ID), conserved in Thermococcales. The protein was identified in native pull-down experiments using the proteasome regulatory complex (PAN) as bait. X-ray crystallography and SAXS experiments revealed that the protein is monomeric and adopts a β-barrel core structure with an Oligonucleotide/oligosaccharide-Binding (OB) fold, typically found in translation elongation factors. Mobility shift experiments showed that Q9UZY3 displays tRNA binding properties. Pull-downs, co-immunoprecipitation and ITC studies revealed that Q9UZY3 interacts in vitro with PAN. Native pull-downs and proteomic analysis using different versions of Q9UZY3 showed that the protein interacts with the assembled PAN-20S proteasome machinery in Pyrococcus abyssi cellular extracts. The protein was therefore named Pbp11, for Proteasome Binding Protein of 11 kDa. Interestingly, the interaction network of Pbp11 also includes ribosomal proteins, tRNA processing enzymes and exosome subunits, dependent on Pbp11's N-terminal domain, which was found to be essential for tRNA binding. Together, these data suggest that Pbp11 participates in an interface between the proteasome and the translational machinery.
Affiliation(s)
- Gaëlle Hogrel
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France; University of St Andrews, St Andrews, UK
- Sébastien Laurent
  - Univ Brest, Ifremer, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, Plouzané, France
- Ziad Ibrahim
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France; University of Leicester, Leicester, UK
- Jacques Covès
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France
- Eric Girard
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France
- Frank Gabel
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France
- Daphna Fenel
  - Univ Grenoble Alpes, CNRS, CEA, IBS, Grenoble, France
- Marie-Claire Daugeron
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
- Béatrice Clouet-d'Orval
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, Toulouse, France
- Tamara Basta
  - Université Paris-Saclay, CEA, CNRS, Institute for Integrative Biology of the Cell (I2BC), Gif-sur-Yvette, France
- Didier Flament
  - Univ Brest, Ifremer, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, Plouzané, France
3. Dow A, Burger A, Marcantonio E, Prisic S. Multi-Omics Profiling Specifies Involvement of Alternative Ribosomal Proteins in Response to Zinc Limitation in Mycobacterium smegmatis. Front Microbiol 2022; 13:811774. [PMID: 35222334 PMCID: PMC8866557 DOI: 10.3389/fmicb.2022.811774] [Received: 11/09/2021] [Accepted: 01/04/2022] [Indexed: 12/13/2022]
Abstract
Zinc ion (Zn²⁺) is an essential micronutrient and a potent antioxidant. However, Zn²⁺ is often limited in the environment. Upon Zn²⁺ limitation, Mycolicibacterium (basonym: Mycobacterium) smegmatis (Msm) undergoes a morphogenesis, which relies on alternative ribosomal proteins (AltRPs); i.e., Zn²⁺-independent paralogues of Zn²⁺-dependent ribosomal proteins. However, the underlying physiological changes triggered by Zn²⁺ limitation and how AltRPs contribute to these changes were not known. In this study, we expand the knowledge of mechanisms utilized by Msm to endure Zn²⁺ limitation, by comparing the transcriptomes and proteomes of Zn²⁺-limited and Zn²⁺-replete Msm. We further compare, corroborate and contrast our results to those reported for the pathogenic mycobacterium, M. tuberculosis, which highlighted conservation of the upregulated oxidative stress response when Zn²⁺ is limited in both mycobacteria. By comparing the multi-omics analysis of a knockout mutant lacking AltRPs (ΔaltRP) to the Msm wild type strain, we specify the involvement of AltRPs in the response to Zn²⁺ limitation. Our results show that AltRP expression in Msm does not affect the conserved oxidative stress response during Zn²⁺ limitation observed in mycobacteria, but AltRPs do significantly impact expression patterns of numerous genes that may be involved in morphogenesis or other adaptive responses. We conclude that AltRPs are not only important as functional replacements for their Zn²⁺-dependent paralogues; they are also involved in the transcriptomic response to the Zn²⁺-limited environment.
Affiliation(s)
- Allexa Dow
  - School of Life Sciences, University of Hawai‘i at Mānoa, Honolulu, HI, United States
- Andrew Burger
  - School of Ocean and Earth Science and Technology, University of Hawai‘i at Mānoa, Honolulu, HI, United States
- Endrei Marcantonio
  - School of Life Sciences, University of Hawai‘i at Mānoa, Honolulu, HI, United States
- Sladjana Prisic
  - School of Life Sciences, University of Hawai‘i at Mānoa, Honolulu, HI, United States
  - *Correspondence: Sladjana Prisic
4. Gardner ML, Freitas MA. Multiple Imputation Approaches Applied to the Missing Value Problem in Bottom-Up Proteomics. Int J Mol Sci 2021; 22:ijms22179650. [PMID: 34502557 PMCID: PMC8431783 DOI: 10.3390/ijms22179650] [Received: 07/20/2021] [Revised: 08/28/2021] [Accepted: 08/31/2021] [Indexed: 01/15/2023]
Abstract
Analysis of differential abundance in proteomics data sets requires careful application of missing value imputation. The rate of missing abundance values varies widely when comparing across different sample treatments. For example, one would expect a consistent rate of “missing at random” (MAR) values across batches of samples and varying rates of “missing not at random” (MNAR) values depending on the inherent differences between sample treatments within the study. The imputation strategy selected must therefore be the one that best accounts for both MAR and MNAR simultaneously. Several important issues must be considered when deciding on the appropriate missing value imputation strategy: (1) when it is appropriate to impute data; (2) how to choose a method that reflects the combination of MAR and MNAR that occurs in an experiment. This paper provides an evaluation of missing value imputation strategies used in proteomics and presents a case for the use of hybrid left-censored missing value imputation approaches that can handle the MNAR problem common to proteomics data.
Affiliation(s)
- Miranda L. Gardner
  - Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
- Michael A. Freitas
  - Ohio State Biochemistry Program, Chemistry and Biochemistry, The Ohio State University, Columbus, OH 43210, USA
  - Cancer Biology and Genetics, Wexner Medical Center, The Ohio State University, Columbus, OH 43210, USA
5. Phung DK, Etienne C, Batista M, Langendijk-Genevaux P, Moalic Y, Laurent S, Liuu S, Morales V, Jebbar M, Fichant G, Bouvier M, Flament D, Clouet-d’Orval B. RNA processing machineries in Archaea: the 5'-3' exoribonuclease aRNase J of the β-CASP family is engaged specifically with the helicase ASH-Ski2 and the 3'-5' exoribonucleolytic RNA exosome machinery. Nucleic Acids Res 2020; 48:3832-3847. [PMID: 32030412 PMCID: PMC7144898 DOI: 10.1093/nar/gkaa052] [Received: 10/24/2019] [Revised: 01/14/2020] [Accepted: 01/23/2020] [Indexed: 01/22/2023]
Abstract
A network of RNA helicases, endoribonucleases and exoribonucleases regulates the quantity and quality of cellular RNAs. To date, mechanistic studies have focussed on bacterial and eukaryal systems due to the challenge of identifying the main drivers of RNA decay and processing in Archaea. Here, our data support that aRNase J, a 5'-3' exoribonuclease of the β-CASP family conserved in Euryarchaeota, engages specifically with a Ski2-like helicase and the RNA exosome to potentially exert control over RNA surveillance, in the vicinity of the ribosome. Proteomic landscapes and direct protein-protein interaction analyses, strengthened by comprehensive phylogenomic studies, demonstrated that aRNase J interplays with ASH-Ski2 and a cap exosome subunit. Finally, Thermococcus barophilus whole-cell extract fractionation experiments provide evidence that an aRNase J/ASH-Ski2 complex might exist in vivo and hint at an association of aRNase J with the ribosome that is emphasised in the absence of ASH-Ski2. Whilst aRNase J homologues are found among bacteria, the RNA exosome and the Ski2-like RNA helicase have eukaryotic homologues, underlining the mosaic aspect of archaeal RNA machines. Altogether, these results suggest a fundamental role for the β-CASP RNase/helicase complex in archaeal RNA metabolism.
Affiliation(s)
- Duy Khanh Phung
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Clarisse Etienne
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Manon Batista
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Petra Langendijk-Genevaux
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Yann Moalic
  - Ifremer, Univ Brest, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, F-29280 Plouzané, France
- Sébastien Laurent
  - Ifremer, Univ Brest, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, F-29280 Plouzané, France
- Sophie Liuu
  - Micalis Institute, PAPPSO, INRA, AgroParisTech, Université Paris-Saclay, 78350, Jouy-en-Josas, France
- Violette Morales
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Mohamed Jebbar
  - Ifremer, Univ Brest, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, F-29280 Plouzané, France
- Gwennaele Fichant
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Marie Bouvier
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
- Didier Flament
  - Ifremer, Univ Brest, CNRS, Laboratoire de Microbiologie des Environnements Extrêmes, F-29280 Plouzané, France
- Béatrice Clouet-d’Orval
  - Laboratoire de Microbiologie et de Génétique Moléculaires, UMR5100, Centre de Biologie Intégrative (CBI), Université de Toulouse, CNRS, Université Paul Sabatier, F-31062 Toulouse, France
  - To whom correspondence should be addressed. Tel: +33 561 335 875; Fax: +33 561 335 886
6. Glazko G, Zybailov B, Emmert-Streib F, Baranova A, Rahmatallah Y. Proteome-transcriptome alignment of molecular portraits achieved by self-contained gene set analysis: Consensus colon cancer subtypes case study. PLoS One 2019; 14:e0221444. [PMID: 31437237 PMCID: PMC6705791 DOI: 10.1371/journal.pone.0221444] [Received: 05/01/2019] [Accepted: 08/06/2019] [Indexed: 01/10/2023]
Abstract
Gene set analysis (GSA) has become the common methodology for analyzing transcriptomics data. However, self-contained GSA techniques are rarely, if ever, used for proteomics data analysis. Here we present a self-contained proteome-level GSA of four consensus molecular subtypes (CMSs) previously established by transcriptome dissection of colon carcinoma specimens. Despite notable differences in the structure of proteomics and transcriptomics data, many pathway-wide characteristic features of CMSs found at the mRNA level were reproduced at the protein level. In particular, CMS1 features show heavy involvement of the immune system as well as the pathways related to mismatch repair, DNA replication and functioning of the proteasome, while CMS4 tumors upregulate the complement pathway and proteins participating in epithelial-to-mesenchymal transition (EMT). In addition, protein-level GSA yielded a set of novel observations visible at the proteome, but not at the transcriptome level, including possible involvement of major histocompatibility complex II (MHC-II) antigens in the known immunogenicity of CMS1 and a connection between cholesterol trafficking and the regulation of Integrin-linked kinase (ILK) in CMS3. Overall, this study demonstrates the utility of self-contained GSA approaches as a critical tool for analyzing proteomics data in general and dissecting protein-level molecular portraits of human tumors in particular.
Affiliation(s)
- Galina Glazko
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Boris Zybailov
  - Department of Biochemistry and Molecular Biology, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
- Frank Emmert-Streib
  - Computational Medicine and Statistical Learning Laboratory, Tampere University of Technology, Korkeakoulunkatu, Tampere, Finland
- Ancha Baranova
  - School of Systems Biology, George Mason University, Manassas, VA, United States of America
  - Research Center for Medical Genetics, Moscow, Russia
- Yasir Rahmatallah
  - Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, United States of America
7. Choi M, Eren-Dogu ZF, Colangelo C, Cottrell J, Hoopmann MR, Kapp EA, Kim S, Lam H, Neubert TA, Palmblad M, Phinney BS, Weintraub ST, MacLean B, Vitek O. ABRF Proteome Informatics Research Group (iPRG) 2015 Study: Detection of Differentially Abundant Proteins in Label-Free Quantitative LC-MS/MS Experiments. J Proteome Res 2017; 16:945-957. [PMID: 27990823 DOI: 10.1021/acs.jproteome.6b00881] [Indexed: 12/31/2022]
Abstract
Detection of differentially abundant proteins in label-free quantitative shotgun liquid chromatography-tandem mass spectrometry (LC-MS/MS) experiments requires a series of computational steps that identify and quantify LC-MS features. It also requires statistical analyses that distinguish systematic changes in abundance between conditions from artifacts of biological and technical variation. The 2015 study of the Proteome Informatics Research Group (iPRG) of the Association of Biomolecular Resource Facilities (ABRF) aimed to evaluate the effects of the statistical analysis on the accuracy of the results. The study used LC-tandem mass spectra acquired from a controlled mixture, and made the data available to anonymous volunteer participants. The participants used methods of their choice to detect differentially abundant proteins, estimate the associated fold changes, and characterize the uncertainty of the results. The study found that multiple strategies (including the use of spectral counts versus peak intensities, and various software tools) could lead to accurate results, and that the performance was primarily determined by the analysts' expertise. This manuscript summarizes the outcome of the study, and provides representative examples of good computational and statistical practice. The data set generated as part of this study is publicly available.
Affiliation(s)
- Meena Choi
  - Northeastern University, Boston, Massachusetts 02115, United States
- Michael R Hoopmann
  - Institute for Systems Biology, Seattle, Washington 98109, United States
- Eugene A Kapp
  - Walter and Eliza Hall Institute of Medical Research, Melbourne 3052, Australia
- Sangtae Kim
  - Pacific Northwest National Laboratory, Richland, Washington 99354, United States
- Henry Lam
  - Department of Chemical and Biomolecular Engineering and Division of Biomedical Engineering, The Hong Kong University of Science and Technology, Clear Water Bay, Hong Kong
- Thomas A Neubert
  - Skirball Institute and Department of Biochemistry and Molecular Pharmacology, New York University School of Medicine, New York, New York 10016, United States
- Magnus Palmblad
  - Center for Proteomics and Metabolomics, Leiden University Medical Center, 2333 ZA Leiden, The Netherlands
- Brett S Phinney
  - University of California at Davis, Davis, California 95616, United States
- Susan T Weintraub
  - University of Texas Health Science Center at San Antonio, San Antonio, Texas 78229, United States
- Brendan MacLean
  - University of Washington, Seattle, Washington 98105, United States
- Olga Vitek
  - Northeastern University, Boston, Massachusetts 02115, United States