de Campos GM, Clemente LG, Lima ARJ, Cella E, Fonseca V, Ximenez JPB, Nishiyama MY, de Carvalho E, Sampaio SC, Giovanetti M, Elias MC, Slavov SN. Anellovirus abundance as an indicator for viral metagenomic classifier utility in plasma samples.
Virol J 2025;22:88. [PMID: 40148934; PMCID: PMC11951539; DOI: 10.1186/s12985-025-02708-8]
[Received: 01/31/2025] [Accepted: 03/13/2025] [Indexed: 03/29/2025]
Abstract
BACKGROUND
Viral metagenomics has expanded significantly in recent years due to advances in next-generation sequencing, establishing it as the leading method for identifying emerging viruses. A crucial step in metagenomics is taxonomic classification, in which sequence data are assigned to specific taxa, thereby enabling characterization of the species composition of a sample. Various taxonomic classifiers have been developed, each employing a distinct classification approach, so results and abundance profiles can vary even when the same sample is analyzed.
METHODS
In this study, we propose using the identification of Torque Teno Viruses (TTVs), from the Anelloviridae family, as indicators to evaluate the performance of four short-read-based metagenomic classifiers (Kraken2, Kaiju, CLARK and DIAMOND) when applied to human plasma samples.
RESULTS
Our results show that each classifier assigns TTV species at different abundance levels, potentially influencing the interpretation of diversity within samples. Specifically, nucleotide-based classifiers tend to detect a broader range of TTV species, indicating higher sensitivity, while DIAMOND (amino acid-based) and CLARK display lower abundance indices. Interestingly, despite employing different algorithms and data types (protein-based vs. nucleotide-based), Kaiju and Kraken2 performed similarly.
CONCLUSION
Our study underscores the critical impact of classifier selection on diversity indices in metagenomic analyses. Kaiju effectively assigned a wide variety of TTV species, demonstrating that it did not require a high volume of reads to capture diversity. Nucleotide-based classifiers like CLARK and Kraken2 showed superior sensitivity, which is valuable for detecting emerging or rare viruses. At the same time, protein-based approaches such as DIAMOND and Kaiju proved robust for identifying known species with low variability.
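The dependence of diversity indices on classifier output can be illustrated with a minimal sketch. It assumes the Shannon index as the diversity measure, which the abstract does not specify. The function name `shannon_index`, the species labels, and the read counts are hypothetical and are not data from the study.

```python
import math


def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln(p_i)) over taxa with nonzero counts."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    h = 0.0
    for n in counts.values():
        if n > 0:
            p = n / total  # relative abundance of this taxon
            h -= p * math.log(p)
    return h


# Hypothetical per-species read counts reported by two classifiers
# for the same sample (illustrative values only, not study data).
classifier_a = {"species_A": 50, "species_B": 50}
classifier_b = {
    "species_A": 25,
    "species_B": 25,
    "species_C": 25,
    "species_D": 25,
}

for name, counts in [("classifier_a", classifier_a), ("classifier_b", classifier_b)]:
    print(f"{name}: H' = {shannon_index(counts):.3f}")
```

With the same total number of reads, the classifier that spreads assignments over more species yields the higher index, which is why abundance profiles from different classifiers are not directly interchangeable when comparing diversity.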