Fóthi Á, Liu H, Susztak K, Aranyi T. Improve-RRBS: a novel tool to correct the 3' trimming of reduced representation sequencing reads.
BIOINFORMATICS ADVANCES 2024; 4:vbae076. [PMID: 38846137 PMCID: PMC11154647 DOI: 10.1093/bioadv/vbae076]
[Received: 12/14/2023] [Revised: 04/18/2024] [Accepted: 05/23/2024] [Indexed: 06/09/2024]
Abstract
Motivation
Reduced Representation Bisulfite Sequencing (RRBS) is a popular approach to determine DNA methylation of the CpG-rich regions of the genome. However, we observed that false positive differentially methylated sites (DMS) are also identified using the standard computational analysis.
Results
During RRBS library preparation, MspI-digested DNA undergoes end-repair, which adds a cytosine at the 3' end of the fragments. After sequencing, Trim Galore removes these end-repaired nucleotides. However, Trim Galore fails to detect end-repair when it overlaps with the 3' end of the sequencing reads. We found that these non-trimmed cytosines bias methylation calling and can therefore lead to erroneously identified DMS. To circumvent this problem, we developed improve-RRBS, which efficiently identifies these cytosines and hides them from methylation calling, with a false positive rate of at most 0.5%. To test improve-RRBS, we investigated four datasets from four laboratories and two different species. We found non-trimmed 3' cytosines in all datasets analyzed, and under certain conditions more than 50% of DMS were false positives. After applying improve-RRBS, these false positive DMS completely disappeared from all comparisons.
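The idea of locating an untrimmed end-repaired cytosine can be illustrated with a small sketch. This is not the improve-RRBS implementation; it is a simplified illustration that assumes forward-strand, ungapped reads, the MspI recognition site CCGG (cut as C^CGG, with the inner C and G filled in during end-repair), and a Bismark-style per-base call string in which `.` means "no call". The function names `filled_in_cytosine_offset` and `hide_call` are hypothetical.

```python
MSPI_SITE = "CCGG"  # MspI cuts C^CGG; end-repair fills in the inner C and G


def filled_in_cytosine_offset(reference, read_start, read_length):
    """Return the offset (within the read) of a possibly end-repaired cytosine,
    or None if the read's 3' end does not overlap an MspI site.

    Assumptions (illustrative only): forward-strand read, ungapped alignment,
    read_start is the 0-based leftmost aligned reference position.
    """
    read_end = read_start + read_length - 1  # reference index of the last read base
    # For a site starting at s, the filled-in C is at s + 1 and the filled-in G
    # at s + 2, so the last read base must sit on one of these two positions.
    for site_start in (read_end - 1, read_end - 2):
        if site_start < 0:
            continue
        if reference[site_start:site_start + 4].upper() == MSPI_SITE:
            cytosine_pos = site_start + 1
            if cytosine_pos >= read_start:
                return cytosine_pos - read_start
    return None


def hide_call(methylation_calls, offset):
    """Replace the call at `offset` with '.' so downstream calling ignores it."""
    return methylation_calls[:offset] + "." + methylation_calls[offset + 1:]


# Toy reference with a single CCGG site at positions 7-10.
reference = "AACGTTACCGGTT"
offset = filled_in_cytosine_offset(reference, read_start=2, read_length=8)
if offset is not None:
    masked = hide_call("..z..xZ.", offset)
```

In this toy case the read ends on the filled-in G of the site, so the cytosine one base upstream is flagged and its call is masked; a read ending before the site or running past it is left unchanged.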
Availability and implementation
Improve-RRBS is a freely available Python package (https://pypi.org/project/iRRBS/ or https://github.com/fothia/improve-RRBS) that can be integrated into RRBS pipelines.