Woodcock-Girard MD, Bretz EC, Robertson HM, Ramanauskas K, Hampton-Marcell JT, Walker JF. Semblans: automated assembly and processing of RNA-seq data.
Bioinformatics 2024;41:btaf003. [PMID: 39786867 PMCID: PMC11748423 DOI: 10.1093/bioinformatics/btaf003]
[Received: 06/13/2024] [Revised: 11/10/2024] [Accepted: 01/06/2025] [Indexed: 01/12/2025]
Abstract
MOTIVATION
Recent advancements in parallel sequencing methods have precipitated a surge in publicly available short-read sequence data. This has encouraged the development of novel computational tools for the de novo assembly of transcriptomes from RNA-seq data. Despite the availability of these tools, performing an end-to-end transcriptome assembly remains a programmatically involved task that requires familiarity with best practices. Quality control steps, including error correction, adapter trimming, and chimera filtration, must be applied correctly, and moving data between programs often requires manual reformatting or restructuring, which can further impede throughput. Here, we introduce Semblans, a tool for streamlining the assembly process that efficiently and consistently produces high-quality transcriptome assemblies.
RESULTS
Semblans abstracts the key quality control, reconstitution, and postprocessing steps of transcriptome assembly, from raw short-read sequences to annotated coding sequences. Evaluating its performance against previously assembled transcriptomes on the basis of assembly quality, we find that Semblans produced higher-quality assemblies for 98 of the 101 short-read runs tested.
AVAILABILITY AND IMPLEMENTATION
Semblans is written in C++ and runs on Unix-compliant operating systems. Source code, documentation, and compiled binaries are hosted under the GNU General Public License at https://github.com/gladshire/Semblans.