Aiming off the target: recycling target capture sequencing reads for investigating repetitive DNA.
ANNALS OF BOTANY 2021;
128:835-848. [PMID:
34050647 PMCID:
PMC8577205 DOI:
10.1093/aob/mcab063]
[Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Subscribe] [Scholar Register] [Received: 03/02/2021] [Accepted: 05/26/2021] [Indexed: 05/28/2023]
Abstract
BACKGROUND AND AIMS
With the advance of high-throughput sequencing, reduced-representation methods such as target capture sequencing (TCS) emerged as cost-efficient ways of gathering genomic information, particularly from coding regions. As the off-target reads from such sequencing are expected to be similar to genome skimming (GS), we assessed the quality of repeat characterization in plant genomes using these data.
METHODS
Repeat composition obtained from TCS datasets of five Rhynchospora (Cyperaceae) species were compared with GS data from the same taxa. In addition, a FISH probe was designed based on the most abundant satellite found in the TCS dataset of Rhynchospora cephalotes. Finally, repeat-based phylogenies of the five Rhynchospora species were constructed based on the GS and TCS datasets and the topologies were compared with a gene-alignment-based phylogenetic tree.
KEY RESULTS
All the major repetitive DNA families were identified in TCS, including repeats that showed abundances as low as 0.01 % in the GS data. Rank correlations between GS and TCS repeat abundances were moderately high (r = 0.58-0.85), increasing after filtering out the targeted loci from the raw TCS reads (r = 0.66-0.92). Repeat data obtained by TCS were also reliable in developing a cytogenetic probe of a new variant of the holocentromeric satellite Tyba. Repeat-based phylogenies from TCS data were congruent with those obtained from GS data and the gene-alignment tree.
CONCLUSIONS
Our results show that off-target TCS reads can be recycled to identify repeats for cyto- and phylogenomic investigations. Given the growing availability of TCS reads, driven by global phylogenomic projects, our strategy represents a way to recycle genomic data and contribute to a better characterization of plant biodiversity.
Collapse