1
Zampolli J, Orro A, Manconi A, Ami D, Natalello A, Di Gennaro P. Transcriptomic analysis of Rhodococcus opacus R7 grown on polyethylene by RNA-seq. Sci Rep 2021; 11:21311. [PMID: 34716360 PMCID: PMC8556283 DOI: 10.1038/s41598-021-00525-x] [Citation(s) in RCA: 35] [Impact Index Per Article: 8.8] [Received: 07/06/2021] [Accepted: 10/12/2021] [Indexed: 12/20/2022] Open
Abstract
Plastic waste management has become a global issue. Polyethylene (PE) is the most abundant synthetic plastic worldwide and one of the most resistant to biodegradation; indeed, few bacteria can degrade it. In this paper, transcriptomic analysis unveiled for the first time the complex genetic system of Rhodococcus opacus R7 for polyethylene biodegradation, which is based on diverse oxidoreductases. RNA-seq uncovered genes putatively involved in the first step of oxidation. In-depth investigations through preliminary bioinformatic analyses and enzymatic assays on the supernatant of R7 grown in the presence of PE confirmed the activation of genes encoding laccase-like enzymes. Moreover, the transcriptomic data allowed the identification of candidate genes for the further steps of short aliphatic chain oxidation, including the alkB gene encoding an alkane monooxygenase, the cyp450 gene encoding a cytochrome P450 hydroxylase, and genes encoding membrane transporters. The PE biodegradative system was also validated by FTIR analysis of R7 cells grown on polyethylene.
Affiliation(s)
- Jessica Zampolli: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126, Milan, Italy
- Alessandro Orro: Institute of Biomedical Technologies, National Research Council, CNR, via Fratelli Cervi 19, Segrate, 20133, Milan, Italy
- Andrea Manconi: Institute of Biomedical Technologies, National Research Council, CNR, via Fratelli Cervi 19, Segrate, 20133, Milan, Italy
- Diletta Ami: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126, Milan, Italy
- Antonino Natalello: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126, Milan, Italy
- Patrizia Di Gennaro: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Piazza della Scienza 2, 20126, Milan, Italy
2
Dai H, Guan Y. Nubeam-dedup: a fast and RAM-efficient tool to de-duplicate sequencing reads without mapping. Bioinformatics 2020; 36:3254-3256. [PMID: 32091581 DOI: 10.1093/bioinformatics/btaa112] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 10/02/2019] [Revised: 02/06/2020] [Accepted: 02/14/2020] [Indexed: 12/15/2022] Open
Abstract
SUMMARY We present Nubeam-dedup, a fast and RAM-efficient tool to de-duplicate sequencing reads without a reference genome. Nubeam-dedup represents nucleotides by matrices, transforms reads into products of matrices, and assigns a unique number to each read based on that product. Thus, duplicate reads can be efficiently removed by using a collisionless hash function. Compared with other state-of-the-art reference-free tools, Nubeam-dedup uses 50-70% of the CPU time and 10-15% of the RAM. AVAILABILITY AND IMPLEMENTATION Source code in C++ and a manual are available at https://github.com/daihang16/nubeamdedup and https://haplotype.org. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
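The idea of mapping a read to a matrix product can be illustrated with a small sketch. This is not the Nubeam-dedup implementation: the specific matrices below are an assumption chosen for illustration (two 2x2 integer matrices known to generate a free monoid, combined in pairs to give one matrix per nucleotide), and the names `read_key` and `dedup` are hypothetical. Because distinct words over these generators give distinct products, two reads share a key only if they are identical.

```python
# Illustrative sketch only; the matrices are an assumption, not those used by Nubeam-dedup.
# L and R generate a free monoid of 2x2 non-negative integer matrices,
# so distinct sequences of generators always yield distinct products.
L = ((1, 1), (0, 1))
R = ((1, 0), (1, 1))
IDENTITY = ((1, 0), (0, 1))


def matmul(a, b):
    """Multiply two 2x2 matrices stored as nested tuples."""
    return (
        (a[0][0] * b[0][0] + a[0][1] * b[1][0], a[0][0] * b[0][1] + a[0][1] * b[1][1]),
        (a[1][0] * b[0][0] + a[1][1] * b[1][0], a[1][0] * b[0][1] + a[1][1] * b[1][1]),
    )


# Each nucleotide is a fixed-length (two-generator) word, which keeps the encoding injective.
NUCLEOTIDE = {
    "A": matmul(L, L),
    "C": matmul(L, R),
    "G": matmul(R, L),
    "T": matmul(R, R),
}


def read_key(read):
    """Return the matrix product for a read; identical reads give identical keys."""
    product = IDENTITY
    for base in read:
        product = matmul(product, NUCLEOTIDE[base])  # raises KeyError on bases such as N
    return product


def dedup(reads):
    """Keep the first occurrence of each read, preserving input order."""
    seen = set()
    unique = []
    for read in reads:
        key = read_key(read)
        if key not in seen:
            seen.add(key)
            unique.append(read)
    return unique
```

Python integers have arbitrary precision, so the keys never overflow here; a memory-efficient tool would need a compact number per read, which this sketch does not attempt.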
Affiliation(s)
- Hang Dai: Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27705, USA
- Yongtao Guan: Department of Biostatistics and Bioinformatics, Duke University School of Medicine, Durham, NC 27705, USA
3
Zampolli J, Di Canito A, Manconi A, Milanesi L, Di Gennaro P, Orro A. Transcriptomic Analysis of Rhodococcus opacus R7 Grown on o-Xylene by RNA-Seq. Front Microbiol 2020; 11:1808. [PMID: 32903390 PMCID: PMC7434839 DOI: 10.3389/fmicb.2020.01808] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.2] [Received: 05/05/2020] [Accepted: 07/09/2020] [Indexed: 11/13/2022] Open
Abstract
Xylenes are considered one of the most common hazardous sources of environmental contamination. The biodegradation of these compounds has often been reported, but the ability to oxidize the ortho-isomer is rarer. Among the few o-xylene-degrading bacteria, Rhodococcus opacus R7 is well known for its capability to degrade diverse aromatic hydrocarbons and toxic compounds, including o-xylene as the sole carbon and energy source. This work uses the RNA-seq approach for the first time to elucidate the genetic determinants involved in the o-xylene degradation pathway in R. opacus R7. Transcriptomic data showed 542 differentially expressed genes that are associated with the oxidation of aromatic hydrocarbons and stress response, osmotic regulation and central metabolism. Gene ontology (GO) enrichment and KEGG pathway analysis confirmed significant changes in aromatic compound catabolic processes, fatty acid metabolism, beta-oxidation, TCA cycle enzymes, and biosynthesis of metabolites when cells are cultured in the presence of o-xylene. Interestingly, the most up-regulated genes belong to the akb gene cluster encoding the ethylbenzene (Akb) dioxygenase system. Moreover, the transcriptomic approach allowed the identification of candidate enzymes involved in R7 o-xylene degradation, based on their likely participation in the formation of previously identified metabolites. Overall, this approach supports the identification of several oxidative systems likely involved in o-xylene metabolism, confirming that R. opacus R7 possesses a redundancy of sequences that converge in o-xylene degradation through the peculiar degradation pathway of R7. This work advances our understanding of o-xylene metabolism in bacteria belonging to the Rhodococcus genus and provides a framework of useful enzymes (molecular tools) that can be fruitfully targeted for optimized o-xylene consumption.
Affiliation(s)
- Jessica Zampolli: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
- Alessandra Di Canito: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
- Andrea Manconi: Institute of Biomedical Technologies, National Research Council, CNR, Milan, Italy
- Luciano Milanesi: Institute of Biomedical Technologies, National Research Council, CNR, Milan, Italy
- Patrizia Di Gennaro: Department of Biotechnology and Biosciences, University of Milano-Bicocca, Milan, Italy
- Alessandro Orro: Institute of Biomedical Technologies, National Research Council, CNR, Milan, Italy
4
Na JC, Lee I, Rhee JK, Shin SY. Fast single individual haplotyping method using GPGPU. Comput Biol Med 2019; 113:103421. [PMID: 31499396 DOI: 10.1016/j.compbiomed.2019.103421] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 03/04/2019] [Revised: 08/28/2019] [Accepted: 08/28/2019] [Indexed: 11/27/2022]
Abstract
BACKGROUND Most bioinformatic tools for next generation sequencing (NGS) data are computationally intensive, requiring a large amount of computational power for processing and analysis. Here the utility of graphics processing units (GPUs) for NGS data computation is assessed. METHOD In a previous study, we developed a probabilistic evolutionary algorithm with toggling for haplotyping (PEATH) method based on the estimation of distribution algorithm and toggling heuristic. Here, we parallelized the PEATH method (PEATH/G) using general-purpose computing on GPU (GPGPU). RESULTS PEATH/G runs approximately 46.8 times and 25.4 times faster than PEATH on the NA12878 fosmid-sequencing dataset and the HuRef dataset, respectively, with an NVIDIA GeForce GTX 1660Ti. Moreover, PEATH/G is approximately 13.3 times faster on the fosmid-sequencing dataset, even with an inexpensive conventional GPGPU (NVIDIA GeForce GTX 950). CONCLUSIONS PEATH/G can be a practical single individual haplotyping tool in terms of both its accuracy and speed. GPGPU can help reduce the running time of NGS analysis tools.
Affiliation(s)
- Joong Chae Na: Department of Computer Science and Engineering, Sejong University, Seoul, 05006, South Korea
- Inbok Lee: Department of Software, Korea Aerospace University, Goyang, 10540, South Korea
- Je-Keun Rhee: School of Systems Biomedical Science, Soongsil University, Seoul, 06978, South Korea
- Soo-Yong Shin: Department of Digital Health, SAIHST, Sungkyunkwan University, Seoul, 06351, South Korea; Big Data Research Center, Samsung Medical Center, Seoul, 06351, South Korea
5
NGSReadsTreatment - A Cuckoo Filter-based Tool for Removing Duplicate Reads in NGS Data. Sci Rep 2019; 9:11681. [PMID: 31406180 PMCID: PMC6690869 DOI: 10.1038/s41598-019-48242-w] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 10/16/2018] [Accepted: 08/01/2019] [Indexed: 11/24/2022] Open
Abstract
Next-Generation Sequencing (NGS) platforms provide a major approach to obtaining millions of short reads from samples. NGS has been used in a wide range of analyses, such as determining genome sequences, analyzing evolutionary processes, identifying gene expression and resolving metagenomic analyses. Usually, the quality of NGS data impacts the final study conclusions. Moreover, quality assessment is generally considered the first step in data analyses to ensure the use of only reliable reads for further studies. In NGS platforms, the presence of duplicated reads (redundancy) that are usually introduced during library sequencing is a major issue. These might have a serious impact on research applications, as redundancies in reads can lead to difficulties in subsequent analysis (e.g., de novo genome assembly). Herein, we present NGSReadsTreatment, a computational tool for the removal of duplicated reads in paired-end or single-end datasets. NGSReadsTreatment can handle reads from any platform with the same or different sequence lengths. Using the probabilistic structure Cuckoo Filter, the redundant reads are identified and removed by comparing the reads with themselves. Thus, no prerequisite is required beyond the set of reads. NGSReadsTreatment was compared with other redundancy removal tools in analyzing different sets of reads. The results demonstrated that NGSReadsTreatment was better than the other tools in both the amount of redundancies removed and the use of computational memory for all analyses performed. Available at https://sourceforge.net/projects/ngsreadstreatment/.
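How a Cuckoo Filter can flag reads that have already been seen is sketched below. This is a generic, minimal cuckoo filter and not the NGSReadsTreatment code; the class name `CuckooFilter`, the function `dedup_reads`, the 16-bit fingerprint, the bucket parameters and the use of BLAKE2b are all assumptions made for illustration. Each read is stored only as a short fingerprint in one of two candidate buckets, so memory stays small at the cost of a low false-positive rate (a new read is occasionally mistaken for a duplicate).

```python
import hashlib
import random


class CuckooFilter:
    """Minimal cuckoo filter storing 16-bit fingerprints (illustrative sketch)."""

    def __init__(self, n_buckets=1024, bucket_size=4, max_kicks=500):
        # A power-of-two bucket count makes the XOR-based alternate index reversible.
        assert n_buckets > 0 and n_buckets & (n_buckets - 1) == 0
        self.mask = n_buckets - 1
        self.bucket_size = bucket_size
        self.max_kicks = max_kicks
        self.buckets = [[] for _ in range(n_buckets)]

    @staticmethod
    def _hash(data):
        return int.from_bytes(hashlib.blake2b(data, digest_size=8).digest(), "big")

    def _fingerprint(self, item):
        # Reserve 0 so every stored fingerprint is non-zero.
        return (self._hash(b"fp:" + item) & 0xFFFF) or 1

    def _alternate(self, index, fingerprint):
        # Partial-key cuckoo hashing: the other bucket depends only on index and fingerprint.
        return (index ^ self._hash(fingerprint.to_bytes(2, "big"))) & self.mask

    def _candidates(self, item):
        fingerprint = self._fingerprint(item)
        first = self._hash(item) & self.mask
        return fingerprint, first, self._alternate(first, fingerprint)

    def contains(self, item):
        fingerprint, first, second = self._candidates(item)
        return fingerprint in self.buckets[first] or fingerprint in self.buckets[second]

    def insert(self, item):
        fingerprint, first, second = self._candidates(item)
        for index in (first, second):
            if len(self.buckets[index]) < self.bucket_size:
                self.buckets[index].append(fingerprint)
                return True
        # Both buckets are full: evict a resident fingerprint and relocate it.
        index = random.choice((first, second))
        for _ in range(self.max_kicks):
            slot = random.randrange(len(self.buckets[index]))
            fingerprint, self.buckets[index][slot] = self.buckets[index][slot], fingerprint
            index = self._alternate(index, fingerprint)
            if len(self.buckets[index]) < self.bucket_size:
                self.buckets[index].append(fingerprint)
                return True
        return False  # filter is considered full; the last evicted fingerprint is lost


def dedup_reads(reads):
    """Keep reads not yet seen by the filter, preserving input order."""
    seen = CuckooFilter()
    kept = []
    for read in reads:
        key = read.encode("ascii")
        if seen.contains(key):
            continue
        if not seen.insert(key):
            raise RuntimeError("cuckoo filter is full; use more buckets")
        kept.append(read)
    return kept
```

For paired-end data, one option under the same assumptions would be to use the concatenation of both mates as the key, so that a pair is dropped only when both reads repeat.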
6
Identification of factors associated with duplicate rate in ChIP-seq data. PLoS One 2019; 14:e0214723. [PMID: 30943272 PMCID: PMC6447195 DOI: 10.1371/journal.pone.0214723] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 09/09/2018] [Accepted: 03/19/2019] [Indexed: 12/20/2022] Open
Abstract
Chromatin immunoprecipitation and sequencing (ChIP-seq) has been widely used to map DNA-binding proteins, histone proteins and their modifications. ChIP-seq data contain redundant reads termed duplicates, referring to those mapping to the same genomic location and strand. There are two main sources of duplicates: polymerase chain reaction (PCR) duplicates and natural duplicates. Unlike natural duplicates, which represent true signals from sequencing of independent DNA templates, PCR duplicates are artifacts originating from sequencing of identical copies amplified from the same DNA template. In analysis, duplicates are removed from peak calling and signal quantification. Nevertheless, a significant portion of the duplicates is believed to represent true signals. Removing all duplicates will therefore underestimate the signal level in peaks and impact the identification of signal changes across samples, so an in-depth evaluation of the impact of duplicate removal is needed. Using eight public ChIP-seq datasets from three narrow-peak and two broad-peak marks, we tried to understand the distribution of duplicates in the genome, the extent to which duplicate removal impacts peak calling and signal estimation, and the factors associated with duplicate level in peaks. The three PCR-free histone H3 lysine 4 trimethylation (H3K4me3) ChIP-seq datasets had about 40% duplicates, and 97% of them were within peaks. For the other datasets, generated with PCR amplification of ChIP DNA, the narrow-peak marks have, as expected, a much higher proportion of duplicates than the broad-peak marks. We found that duplicates are enriched in peaks and largely represent true signals, more conspicuously in peaks with high confidence. Furthermore, duplicate level in peaks is strongly correlated with the target enrichment level estimated using nonredundant reads, which provides the basis to properly allocate duplicates between noise and signal. Our analysis supports the feasibility of retaining the portion of signal duplicates in downstream analysis, thus alleviating the limitation of complete deduplication.
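The definition of a duplicate used in this abstract (reads mapping to the same genomic location and strand) and the two quantities it reports (overall duplicate rate and the share of duplicates that fall within peaks) can be made concrete with a short sketch. It is not the authors' pipeline: the function names, the `(chromosome, position, strand)` tuple layout and the half-open peak intervals are assumptions for illustration, and the first read at each position is counted as nonredundant.

```python
from collections import Counter


def duplicate_stats(alignments):
    """Summarize duplicates among (chromosome, position, strand) tuples.

    Reads sharing chromosome, position and strand are treated as duplicates;
    the first read at each such key counts as nonredundant.
    """
    counts = Counter(alignments)
    total = sum(counts.values())
    nonredundant = len(counts)
    duplicates = total - nonredundant
    return {
        "total": total,
        "nonredundant": nonredundant,
        "duplicates": duplicates,
        "duplicate_rate": duplicates / total if total else 0.0,
    }


def fraction_of_duplicates_in_peaks(alignments, peaks):
    """Return the share of duplicate reads whose position lies inside a peak.

    `peaks` maps a chromosome name to a list of half-open (start, end) intervals.
    """
    seen = set()
    duplicates = 0
    duplicates_in_peaks = 0
    for chrom, position, strand in alignments:
        key = (chrom, position, strand)
        if key not in seen:
            seen.add(key)  # first occurrence is the nonredundant read
            continue
        duplicates += 1
        if any(start <= position < end for start, end in peaks.get(chrom, [])):
            duplicates_in_peaks += 1
    return duplicates_in_peaks / duplicates if duplicates else 0.0


# Toy input: three reads at chr1:100(+), one at chr1:100(-), one at chr2:50(+).
example = [("chr1", 100, "+")] * 3 + [("chr1", 100, "-"), ("chr2", 50, "+")]
example_peaks = {"chr1": [(90, 110)]}
```

On the toy input, two of the five reads are duplicates, and both sit inside the single chr1 peak. The linear scan over peaks is fine for a handful of intervals; a real analysis would more likely use an interval index.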
7
Milanesi L, Guffanti A, Mauri G, Masseroli M. BITS 2015: the annual meeting of the Italian Society of Bioinformatics. BMC Bioinformatics 2016; 17:396. [PMID: 28185548 PMCID: PMC5123416 DOI: 10.1186/s12859-016-1187-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/10/2022] Open
Abstract
This preface introduces the content of the BioMed Central journal Supplements related to the BITS 2015 meeting, held in Milan, Italy, from the 3rd to the 5th of June, 2015.