1
Jiang M, Zhang S, Yin H, Zhuo Z, Meng G. A comprehensive benchmarking of differential splicing tools for RNA-seq analysis at the event level. Brief Bioinform 2023; 24:7108868. [PMID: 37020334 DOI: 10.1093/bib/bbad121] [Citation(s) in RCA: 5] [Impact Index Per Article: 5.0] [Received: 08/30/2022] [Revised: 02/10/2023] [Accepted: 03/10/2023] [Indexed: 04/07/2023]
Abstract
RNA alternative splicing, a post-transcriptional stage in eukaryotes, is crucial in cellular homeostasis and disease processes. Due to the rapid development of next-generation sequencing (NGS) technology and the flood of NGS data, the detection of differential splicing from RNA-seq data has become mainstream. A range of bioinformatic tools has been developed. However, an independent and comprehensive comparison of available algorithms/tools at the event level is still lacking. Here, 21 different tools are subjected to systematic evaluation, based on simulated RNA-seq data where exact differential splicing events are introduced. We observe immense discrepancies among these tools. SUPPA, DARTS, rMATS and LeafCutter outperform other event-based tools. We also examine the abilities of the tools to identify novel splicing events, which shows that most event-based tools are unsuitable for discovering novel splice sites. To improve the overall performance, we present two methodological approaches, i.e. low-expression transcript filtering and tool-pair combination. Finally, a new protocol for selecting tools to perform differential splicing analysis for different analytical tasks (e.g. precision and recall rate) is proposed. Under this protocol, we analyze the distinct splicing landscape in the DUX4/IGH subgroup of B-cell acute lymphoblastic leukemia and uncover the differential splicing of TCF12. All code needed to reproduce the results is available at https://github.com/mhjiang97/Benchmarking_DS.
Affiliation(s)
- Minghao Jiang
- Shiyan Zhang
- Hongxin Yin
- Zhiyi Zhuo
- Guoyu Meng
- All authors: Shanghai Institute of Hematology, State Key Laboratory of Medical Genomics, National Research Center for Translational Medicine, Rui-Jin Hospital, Shanghai Jiao Tong University School of Medicine and School of Life Sciences and Biotechnology, Shanghai Jiao Tong University, 197 Ruijin Er Road, Shanghai 200025, China
2
Beslic D, Tscheuschner G, Renard BY, Weller MG, Muth T. Comprehensive evaluation of peptide de novo sequencing tools for monoclonal antibody assembly. Brief Bioinform 2022; 24:6955273. [PMID: 36545804 PMCID: PMC9851299 DOI: 10.1093/bib/bbac542] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 09/09/2022] [Revised: 10/25/2022] [Accepted: 11/10/2022] [Indexed: 12/24/2022]
Abstract
Monoclonal antibodies are biotechnologically produced proteins with various applications in research, therapeutics and diagnostics. Their ability to recognize and bind to specific molecule structures makes them essential research tools and therapeutic agents. Sequence information of antibodies is helpful for understanding antibody-antigen interactions and ensuring their affinity and specificity. De novo protein sequencing based on mass spectrometry is a valuable method to obtain the amino acid sequence of peptides and proteins without a priori knowledge. In this study, we evaluated six recently developed de novo peptide sequencing algorithms (Novor, pNovo 3, DeepNovo, SMSNet, PointNovo and Casanovo), which were not specifically designed for antibody data. We validated their ability to identify and assemble antibody sequences on three multi-enzymatic data sets. The deep learning-based tools Casanovo and PointNovo showed an increased peptide recall across different enzymes and data sets compared with spectrum-graph-based approaches. We evaluated different error types of de novo peptide sequencing tools and their performance for different numbers of missing cleavage sites, noisy spectra and peptides of various lengths. We achieved a sequence coverage of 97.69-99.53% on the light chains of three different antibody data sets using the de Bruijn assembler ALPS and the predictions from Casanovo. However, low sequence coverage and accuracy on the heavy chains demonstrate that complete de novo protein sequencing remains a challenging issue in proteomics that requires improved de novo error correction, alternative digestion strategies and hybrid approaches such as homology search to achieve high accuracy on long protein sequences.
Affiliation(s)
- Denis Beslic
- Georg Tscheuschner
- Bernhard Y Renard
- Michael G Weller
- Thilo Muth
- Corresponding authors: D. Beslic, Robert Koch Institute, ZKI-PH 3, Nordufer 20, 13353 Berlin, Germany; G. Tscheuschner, Federal Institute for Materials Research and Testing (BAM), Richard-Willstätter-Straße 11, 12489 Berlin, Germany; B.Y. Renard, Hasso Plattner Institute, Digital Engineering Faculty, University of Potsdam, Prof.-Dr.-Helmert-Straße 2-3, 14482 Potsdam, Germany; M.G. Weller, Federal Institute for Materials Research and Testing (BAM), Richard-Willstätter-Straße 11, 12489 Berlin, Germany; T. Muth, Federal Institute for Materials Research and Testing (BAM), Unter den Eichen 87, 12205 Berlin, Germany
3
van der Putten BCL, Huijsmans NAH, Mende DR, Schultsz C. Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution. Microb Genom 2022; 8. [PMID: 35290758 PMCID: PMC9176278 DOI: 10.1099/mgen.0.000799] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/18/2022]
Abstract
Phylogenetic analyses are widely used in microbiological research, for example to trace the progression of bacterial outbreaks based on whole-genome sequencing data. In practice, multiple analysis steps such as de novo assembly, alignment and phylogenetic inference are combined to form phylogenetic workflows. Comprehensive benchmarking of the accuracy of complete phylogenetic workflows is lacking. To benchmark different phylogenetic workflows, we simulated bacterial evolution under a wide range of evolutionary models, varying the relative rates of substitution, insertion, deletion, gene duplication, gene loss and lateral gene transfer events. The generated datasets corresponded to the genetic diversity usually observed within bacterial species (≥95% average nucleotide identity). We replicated each simulation three times to assess replicability. In total, we benchmarked 19 distinct phylogenetic workflows using 8 different simulated datasets. We found that recently developed k-mer alignment methods such as kSNP and ska achieve accuracy similar to that of reference mapping. The high accuracy of k-mer alignment methods can be explained by the large fractions of genomes these methods can align, relative to other approaches. We also found that the choice of de novo assembly algorithm influences the accuracy of phylogenetic reconstruction, with workflows employing SPAdes or skesa outperforming those employing Velvet. Finally, we found that the results of phylogenetic benchmarking are highly variable between replicates. We conclude that for phylogenomic reconstruction, k-mer alignment methods are relevant alternatives to reference mapping at the species level, especially in the absence of suitable reference genomes. We show de novo genome assembly accuracy to be an underappreciated parameter required for accurate phylogenomic reconstruction.
Affiliation(s)
- Boas C L van der Putten
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Niek A H Huijsmans
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Daniel R Mende
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
- Constance Schultsz
- Department of Medical Microbiology, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands; Department of Global Health, Amsterdam Institute for Global Health and Development, Amsterdam UMC, University of Amsterdam, Amsterdam, The Netherlands
4
Griss J, Stanek F, Hudecz O, Dürnberger G, Perez-Riverol Y, Vizcaíno JA, Mechtler K. Spectral Clustering Improves Label-Free Quantification of Low-Abundant Proteins. J Proteome Res 2019; 18:1477-1485. [PMID: 30859831 PMCID: PMC6456873 DOI: 10.1021/acs.jproteome.8b00377] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.4] [Received: 05/28/2018] [Indexed: 11/29/2022]
Abstract
Label-free quantification has become common practice in many mass spectrometry-based proteomics experiments. In recent years, we and others have shown that spectral clustering can considerably improve the analysis of (primarily large-scale) proteomics data sets. Here we show that spectral clustering can be used to infer additional peptide-spectrum matches and improve the quality of label-free quantitative proteomics data even in data sets containing only tens of MS runs. We analyzed four well-known public benchmark data sets that represent different experimental settings using spectral counting and peak intensity-based label-free quantification. In both approaches, the peptide-spectrum matches additionally inferred through our spectra-cluster algorithm improved the detectability of low-abundance proteins while increasing the accuracy of the derived quantitative data, without increasing the data sets' noise. Additionally, we developed a Proteome Discoverer node for our spectra-cluster algorithm, which allows anyone to rebuild our proposed pipeline using the free version of Proteome Discoverer.
Affiliation(s)
- Johannes Griss
- Department of Dermatology, Medical University of Vienna, Währinger Gürtel 18-20, 1090 Vienna, Austria
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD Hinxton, Cambridge, United Kingdom
- Florian Stanek
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Otto Hudecz
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Gerhard Dürnberger
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Gregor Mendel Institute of Molecular Plant Biology (GMI), Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
- Yasset Perez-Riverol
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD Hinxton, Cambridge, United Kingdom
- Juan Antonio Vizcaíno
- European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Trust Genome Campus, CB10 1SD Hinxton, Cambridge, United Kingdom
- Karl Mechtler
- Research Institute of Molecular Pathology (IMP), Vienna Biocenter (VBC), Campus-Vienna-Biocenter 1, 1030 Vienna, Austria
- Institute of Molecular Biotechnology of the Austrian Academy of Sciences (IMBA), Vienna Biocenter (VBC), Dr. Bohr-Gasse 3, 1030 Vienna, Austria
5
Björgvinsson T, Kertz SJ, Bigda-Peyton JS, Rosmarin DH, Aderka IM, Neuhaus EC. Effectiveness of cognitive behavior therapy for severe mood disorders in an acute psychiatric naturalistic setting: a benchmarking study. Cogn Behav Ther 2014; 43:209-20. [PMID: 24679127 DOI: 10.1080/16506073.2014.901988] [Citation(s) in RCA: 35] [Impact Index Per Article: 3.5] [Indexed: 10/25/2022]
Abstract
The current study examined the effectiveness of brief cognitive behavior therapy (CBT) for severe mood disorders in an acute naturalistic setting. The sample included 951 individuals with either major depressive disorder (n = 857) or bipolar disorder with depressed mood (n = 94). Participants completed a battery of self-report measures assessing depression, overall well-being, and a range of secondary outcomes both before and after treatment. We found significant reductions in depressive symptoms, worry, self-harm, emotional lability, and substance abuse, as well as significant improvements in well-being and interpersonal relationships, post-treatment. Comparable to outpatient studies, 30% of the sample evidenced recovery from depression. Comparison of findings to benchmark studies indicated that, although the current sample started treatment with severe depressive symptoms and was in treatment for an average of only 10 days, the overall magnitude of symptom improvement was similar to that of randomized controlled trials. Limitations of the study include the lack of a control group, a limitation of most naturalistic studies. These findings indicate that interventions developed in controlled research settings on the efficacy of CBT can be transported to naturalistic, "real world" settings, and that brief CBT delivered in a partial hospital program is effective for many patients with severe depressive symptoms.