1
Kaiser NL, Groschup MH, Sadeghi B. Identification of bioinformatic pipelines for virus monitoring using nanopore sequence data: A systematic assessment. J Virol Methods 2025; 336:115153. [PMID: 40194661 DOI: 10.1016/j.jviromet.2025.115153] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2025] [Revised: 03/21/2025] [Accepted: 03/31/2025] [Indexed: 04/09/2025]
Abstract
Nanopore sequencing has proven to be a promising technique in virus surveillance efforts, especially due to the portability of its sequencers. In order to process the long, error-prone reads generated, specialised bioinformatic programs are required. These can be run automatically within pipelines so as to effectively provide decision makers with all relevant information about the molecular characteristics of a virus. The purpose of this systematic assessment was to identify pipelines that are suitable for virus surveillance programs using nanopore sequencing. Promising candidates were then compared in terms of their functional scope. Of 239 initial papers, 22 pipelines were tested, of which six were included in the final assessment. The four pipelines that were exclusively available offline were each missing individual downstream analysis steps considered in our assessment. The other two executed all steps. One of these was only available online and subject to a charge, while the other was freely available both online and offline. While we were able to identify two pipelines that are broadly suitable for virus surveillance using nanopore sequencing, we discovered two major shortcomings in this domain. None of the pipelines integrated basecalling, the initial step of data processing. In addition, there was no pipeline that was easy to install and provided all relevant analysis results with a single program call. We therefore see a need for the development of a pipeline that incorporates both aspects.
Affiliation(s)
- Nick Laurenz Kaiser
  - Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Novel and Emerging Infectious Diseases, Greifswald, Insel Riems 17493, Germany.
- Martin H Groschup
  - Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Novel and Emerging Infectious Diseases, Greifswald, Insel Riems 17493, Germany.
- Balal Sadeghi
  - Friedrich-Loeffler-Institut, Federal Research Institute for Animal Health, Institute of Novel and Emerging Infectious Diseases, Greifswald, Insel Riems 17493, Germany.

2
Consul S, Robertson J, Vikalo H. XVir: A Transformer-Based Architecture for Identifying Viral Reads from Cancer Samples. J Comput Biol 2025. [PMID: 40392695 DOI: 10.1089/cmb.2025.0075] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/22/2025] Open
Abstract
It is estimated that approximately 15% of cancers worldwide can be linked to viral infections. The viruses that can cause or increase the risk of cancer include human papillomavirus, hepatitis B and C viruses, Epstein-Barr virus, and human immunodeficiency virus, to name a few. The computational analysis of the massive amounts of tumor DNA data, whose collection is enabled by the advancements in sequencing technologies, has allowed studies of the potential association between cancers and viral pathogens. However, the high diversity of oncoviral families makes reliable detection of viral DNA difficult, and the training of machine learning models that enable such analysis computationally challenging. We introduce XVir, a data pipeline that deploys a transformer-based deep learning architecture to reliably identify viral DNA present in human tumors. XVir is trained on a mix of sequencing reads coming from viral and human genomes, resulting in a model capable of robust detection of potentially mutated viral DNA across a range of experimental settings. Results on semi-experimental data demonstrate that XVir is able to achieve high classification accuracy, generally outperforming state-of-the-art competing methods. In particular, it retains high accuracy even when faced with diverse viral populations while being significantly faster to train than other large deep learning-based classifiers.
Affiliation(s)
- Shorya Consul
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
- John Robertson
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA
- Haris Vikalo
  - Chandra Family Department of Electrical and Computer Engineering, The University of Texas at Austin, Austin, Texas, USA

3
Yang J, Xia T, Zhou X, Xu S, Guo K, Zhang Q, Guo J, Hou S. Complete genome sequence of Algoriphagus halophilus strain SOCE 003, a marine bacterium isolated from the surface seawater of Dapeng Bay. Microbiol Resour Announc 2025; 14:e0130624. [PMID: 39812640 PMCID: PMC11812419 DOI: 10.1128/mra.01306-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/10/2024] [Accepted: 12/18/2024] [Indexed: 01/16/2025] Open
Abstract
Algoriphagus is a heterotrophic bacterium commonly found in diverse marine environments. Here, we report the complete genome sequence of Algoriphagus halophilus strain SOCE 003, which is 5,154,101 bp long, encoding 5,524 annotated protein-coding genes, 39 tRNAs, and 8 rRNAs. This genome information will help us understand the ecology of Algoriphagus.
Affiliation(s)
- Jiayi Yang
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Tian Xia
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Xunying Zhou
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Shuaishuai Xu
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Kangli Guo
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Qianqing Zhang
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - Department of Ocean Science, Hong Kong University of Science and Technology, Hong Kong, China
- Jing Guo
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Shengwei Hou
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - Shanghai Key Laboratory of Polar Life and Environment Sciences, Shanghai Jiao Tong University, Shanghai, China
  - Key Laboratory of Polar Ecosystem and Climate Change, Ministry of Education, Shanghai Jiao Tong University, Shanghai, China

4
Ruperao P, Rangan P, Shah T, Sharma V, Rathore A, Mayes S, Pandey MK. Developing pangenomes for large and complex plant genomes and their representation formats. J Adv Res 2025:S2090-1232(25)00071-2. [PMID: 39894347 DOI: 10.1016/j.jare.2025.01.052] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/25/2024] [Revised: 01/27/2025] [Accepted: 01/27/2025] [Indexed: 02/04/2025] Open
Abstract
BACKGROUND: The development of pangenomes has revolutionized genomic studies by capturing the complete genetic diversity within a species. Pangenome assembly integrates data from multiple individuals to construct a comprehensive genomic landscape, revealing both core and accessory genomic elements. This approach enables the identification of novel genes, structural variations, and gene presence-absence variations, providing insights into species evolution, adaptation, and trait variation. Representing pangenomes requires innovative visualization formats that effectively convey the complex genomic structures and variations.
AIM: This review delves into contemporary methodologies and recent advancements in constructing pangenomes, particularly in plant genomes. It examines the structure of pangenome representation, including format comparison, conversion, visualization techniques, and their implications for enhancing crop improvement strategies.
KEY SCIENTIFIC CONCEPTS OF REVIEW: Earlier comparative studies have illuminated novel gene sequences, copy number variations, and presence-absence variations across diverse crop species. The concept of a pan-genome, which captures multiple genetic variations from a broad spectrum of genotypes, offers a holistic perspective of a species' genetic makeup. However, constructing a pan-genome for plants with larger genomes poses challenges, including managing vast genome sequence data and comprehending the genetic variations within the germplasm. To address these challenges, researchers have explored cost-effective alternatives to encapsulate species diversity in a single assembly known as a pangenome. This involves reducing the volume of genome sequences while focusing on genetic variations. With the growing prominence of the pan-genome concept in plant genomics, several software tools have emerged to facilitate pangenome construction.
This review sheds light on developing and utilizing software tools tailored for constructing pan-genomes in plants. It also discusses representation formats suitable for downstream analyses, offering valuable insights into the genetic landscape and evolutionary dynamics of plant species. In summary, this review underscores the significance of pan-genome construction and representation formats in resolving the genetic architecture of plants, particularly those with complex genomes. It provides a comprehensive overview of recent advancements, aiding in exploring and understanding plant genetic diversity.
Affiliation(s)
- Pradeep Ruperao
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India.
- Parimalan Rangan
  - ICAR-National Bureau of Plant Genetic Resources (NBPGR), New Delhi, India; Queensland Alliance for Agriculture and Food Innovation, The University of Queensland, St Lucia, Australia
- Trushar Shah
  - International Institute of Tropical Agriculture (IITA), Nairobi, Kenya
- Vinay Sharma
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Abhishek Rathore
  - International Maize and Wheat Improvement Center (CIMMYT), Nairobi, Kenya
- Sean Mayes
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India
- Manish K Pandey
  - Center of Excellence in Genomics and Systems Biology (CEGSB) and Center for Pre-Breeding Research (CPBR), International Crops Research Institute for the Semi-Arid Tropics (ICRISAT), Hyderabad, India.

5
Hyman P. Are You My Host? An Overview of Methods Used to Link Bacteriophages with Hosts. Viruses 2025; 17:65. [PMID: 39861854 PMCID: PMC11769497 DOI: 10.3390/v17010065] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/14/2024] [Revised: 01/02/2025] [Accepted: 01/03/2025] [Indexed: 01/27/2025] Open
Abstract
Until recently, the only method for finding out if a particular strain or species of bacteria could be a host for a particular bacteriophage was to see if the bacteriophage could infect that bacterium and kill it, releasing progeny phages. Establishing the host range of a bacteriophage thus meant infecting many different bacteria and seeing if the phage could kill each one. Detection of bacterial killing can be achieved on solid media (plaques, spots) or in broth (culture clearing). More recently, additional methods to link phages and hosts have been developed. These include methods to show phage genome entry into host cells (e.g., PhageFISH); proximity of phage and host genomes (e.g., proximity ligation, polonies, viral tagging); and analysis of genomes and metagenomes (e.g., CRISPR spacer analysis, metagenomic co-occurrence). These methods have advantages and disadvantages. They also do not measure the same interactions. Host range can be divided into multiple host ranges, each defined by how far the phage can progress in the infection cycle. For example, the ability to effect genome entry (penetrative host range) is different from the ability to produce progeny (productive host range). These different host ranges reflect bacterial defense mechanisms that block phage growth and development at various stages in the infection cycle. Here, I present a comparison of the various methods used to identify bacteriophage-host relationships with a focus on what type of host range is being measured or predicted.
Affiliation(s)
- Paul Hyman
  - Department of Biology and Toxicology, Ashland University, Ashland, OH 44805, USA

6
Liu X, Liu Y, Liu J, Zhang H, Shan C, Guo Y, Gong X, Cui M, Li X, Tang M. Correlation between the gut microbiome and neurodegenerative diseases: a review of metagenomics evidence. Neural Regen Res 2024; 19:833-845. [PMID: 37843219 PMCID: PMC10664138 DOI: 10.4103/1673-5374.382223] [Citation(s) in RCA: 16] [Impact Index Per Article: 16.0] [Received: 02/23/2023] [Revised: 04/19/2023] [Accepted: 06/17/2023] [Indexed: 10/17/2023] Open
Abstract
A growing body of evidence suggests that the gut microbiota contributes to the development of neurodegenerative diseases via the microbiota-gut-brain axis. As a contributing factor, microbiota dysbiosis always occurs in pathological changes of neurodegenerative diseases, such as Alzheimer's disease, Parkinson's disease, and amyotrophic lateral sclerosis. High-throughput sequencing technology has helped to reveal that the bidirectional communication between the central nervous system and the enteric nervous system is facilitated by the microbiota's diverse microorganisms, and for both neuroimmune and neuroendocrine systems. Here, we summarize the bioinformatics analysis and wet-biology validation for the gut metagenomics in neurodegenerative diseases, with an emphasis on multi-omics studies and the gut virome. The pathogen-associated signaling biomarkers for identifying brain disorders and potential therapeutic targets are also elucidated. Finally, we discuss the role of diet, prebiotics, probiotics, postbiotics and exercise interventions in remodeling the microbiome and reducing the symptoms of neurodegenerative diseases.
Affiliation(s)
- Xiaoyan Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yi Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
  - Institute of Animal Husbandry, Jiangsu Academy of Agricultural Sciences, Nanjing, Jiangsu Province, China
- Junlin Liu
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Hantao Zhang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Chaofan Shan
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Yinglu Guo
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China
- Xun Gong
  - Department of Rheumatology & Immunology, Affiliated Hospital of Jiangsu University, Zhenjiang, Jiangsu Province, China
- Mengmeng Cui
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Xiubin Li
  - Department of Neurology, The Second Affiliated Hospital of Shandong First Medical University, Taian, Shandong Province, China
- Min Tang
  - School of Life Sciences, Jiangsu University, Zhenjiang, Jiangsu Province, China

7
Veldsman WP, Yang C, Zhang Z, Huang Y, Chowdhury D, Zhang L. Structural and Functional Disparities within the Human Gut Virome in Terms of Genome Topology and Representative Genome Selection. Viruses 2024; 16:134. [PMID: 38257834 PMCID: PMC10820185 DOI: 10.3390/v16010134] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 12/18/2023] [Revised: 01/12/2024] [Accepted: 01/16/2024] [Indexed: 01/24/2024] Open
Abstract
Circularity confers protection to viral genomes where linearity falls short, thereby fulfilling the form follows function aphorism. However, a shift away from morphology-based classification toward the molecular and ecological classification of viruses is currently underway within the field of virology. Recent years have seen drastic changes in the International Committee on Taxonomy of Viruses' operational definitions of viruses, particularly for the tailed phages that inhabit the human gut. After the abolition of the order Caudovirales, these tailed phages are best defined as members of the class Caudoviricetes. To determine the epistemological value of genome topology in the context of the human gut virome, we designed a set of seven experiments to assay the impact of genome topology and representative viral selection on biological interpretation. Using Oxford Nanopore long reads for viral genome assembly coupled with Illumina short-read polishing, we showed that circular and linear virus genomes differ remarkably in terms of genome quality, GC skew, transfer RNA gene frequency, structural variant frequency, cross-reference functional annotation (COG, KEGG, Pfam, and TIGRfam), state-of-the-art marker-based classification, and phage-host interaction. Furthermore, the disparity profile changes during dereplication. In particular, our phage-host interaction results demonstrated that proportional abundances cannot be meaningfully compared without due regard for genome topology and dereplication threshold, which necessitates the need for standardized reporting. As a best practice guideline, we recommend that comparative studies of the human gut virome always report the ratio of circular to linear viral genomes along with the dereplication threshold so that structural and functional metrics can be placed into context when assessing biologically relevant metagenomic properties such as proportional abundance.
Affiliation(s)
- Werner P. Veldsman
  - Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong SAR, China
- Chao Yang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong SAR, China
- Zhenmiao Zhang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong SAR, China
- Debajyoti Chowdhury
  - School of Chinese Medicine, Hong Kong Baptist University, Hong Kong SAR, China
  - Computational Medicine Laboratory, Hong Kong Baptist University, Hong Kong SAR, China
- Lu Zhang
  - Department of Computer Science, Hong Kong Baptist University, Kowloon, Hong Kong SAR, China
  - Institute for Research and Continuing Education, Hong Kong Baptist University, Shenzhen 518057, China

8
Grigson SR, Giles SK, Edwards RA, Papudeshi B. Knowing and Naming: Phage Annotation and Nomenclature for Phage Therapy. Clin Infect Dis 2023; 77:S352-S359. [PMID: 37932119 PMCID: PMC10627814 DOI: 10.1093/cid/ciad539] [Citation(s) in RCA: 11] [Impact Index Per Article: 5.5] [Indexed: 11/08/2023] Open
Abstract
Bacteriophages, or phages, are viruses that infect bacteria, shaping microbial communities and ecosystems. They have gained attention as potential agents against antibiotic resistance. In phage therapy, lytic phages are preferred for their bacteria-killing ability, while temperate phages, which can transfer antibiotic resistance or toxin genes, are avoided. Selection relies on plaque morphology and genome sequencing. This review outlines annotating genomes, identifying critical genomic features, and assigning functional labels to protein-coding sequences. These annotations prevent the transfer of unwanted genes, such as antimicrobial resistance or toxin genes, during phage therapy. Additionally, it covers the International Committee on Taxonomy of Viruses (ICTV), an established phage nomenclature system for simplified classification and communication. Accurate phage genome annotation and nomenclature provide insights into phage-host interactions, replication strategies, and evolution, accelerating our understanding of the diversity and evolution of phages and facilitating the development of phage-based therapies.
Affiliation(s)
- Susanna R Grigson
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, Australia
- Sarah K Giles
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, Australia
- Robert A Edwards
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, Australia
- Bhavya Papudeshi
  - Flinders Accelerator for Microbiome Exploration, College of Science and Engineering, Flinders University, Adelaide, Australia

9
Inwood SN, Skelly J, Guhlin JG, Harrop TWR, Goldson SL, Dearden PK. Chromosome-level genome assemblies of two parasitoid biocontrol wasps reveal the parthenogenesis mechanism and an associated novel virus. BMC Genomics 2023; 24:440. [PMID: 37543591 PMCID: PMC10403939 DOI: 10.1186/s12864-023-09538-4] [Citation(s) in RCA: 4] [Impact Index Per Article: 2.0] [Received: 05/22/2023] [Accepted: 07/27/2023] [Indexed: 08/07/2023] Open
Abstract
BACKGROUND: Biocontrol is a key technology for the control of pest species. Microctonus parasitoid wasps (Hymenoptera: Braconidae) have been released in Aotearoa New Zealand as biocontrol agents, targeting three different pest weevil species. Despite their value as biocontrol agents, no genome assemblies are currently available for these Microctonus wasps, limiting investigations into key biological differences between the different species and strains.
METHODS AND FINDINGS: Here we present high-quality genomes for Microctonus hyperodae and Microctonus aethiopoides, assembled with short read sequencing and Hi-C scaffolding. These assemblies have total lengths of 106.7 Mb for M. hyperodae and 129.2 Mb for M. aethiopoides, with scaffold N50 values of 9 Mb and 23 Mb respectively. With these assemblies we investigated differences in reproductive mechanisms, and association with viruses between Microctonus wasps. Meiosis-specific genes are conserved in asexual Microctonus, with in-situ hybridisation validating expression of one of these genes in the ovaries of asexual Microctonus aethiopoides. This implies asexual reproduction in these Microctonus wasps involves meiosis, with the potential for sexual reproduction maintained. Investigation of viral gene content revealed candidate genes that may be involved in virus-like particle production in M. aethiopoides, as well as a novel virus infecting M. hyperodae, for which a complete genome was assembled.
CONCLUSION AND SIGNIFICANCE: These are the first published genomes for Microctonus wasps that have been deployed as biocontrol agents in Aotearoa New Zealand. These assemblies will be valuable resources for continued investigation and monitoring of these biocontrol systems. Understanding the biology underpinning Microctonus biocontrol is crucial if we are to maintain its efficacy, or in the case of M. hyperodae to understand what may have influenced the significant decline of biocontrol efficacy.
The potential for sexual reproduction in asexual Microctonus is significant given that empirical modelling suggests this asexual reproduction is likely to have contributed to biocontrol decline. Furthermore, the identification of a novel virus in M. hyperodae highlights a previously unknown aspect of this biocontrol system, which may contribute to premature mortality of the host pest. These findings have the potential to be exploited in the future in an attempt to increase the effectiveness of M. hyperodae biocontrol.
Affiliation(s)
- Sarah N Inwood
  - Bioprotection Aotearoa and Biochemistry Department, University of Otago, Dunedin, Aotearoa, New Zealand
- John Skelly
  - Bioprotection Aotearoa and Biochemistry Department, University of Otago, Dunedin, Aotearoa, New Zealand
  - Humble Bee Bio, Wellington, Aotearoa, New Zealand
- Joseph G Guhlin
  - Genomics Aotearoa, University of Otago, Dunedin, Aotearoa, New Zealand
- Thomas W R Harrop
  - Melbourne Bioinformatics, The University of Melbourne, Parkville, VIC, 3010, Australia
- Stephen L Goldson
  - Biocontrol and Biosecurity Group, AgResearch Limited, Lincoln, Aotearoa, New Zealand
- Peter K Dearden
  - Bioprotection Aotearoa and Biochemistry Department, University of Otago, Dunedin, Aotearoa, New Zealand.
  - Genomics Aotearoa, University of Otago, Dunedin, Aotearoa, New Zealand.

10
Guo J, Xu S, Liu Y, Zhang C, Hou S. Complete Genome Sequence of Stutzerimonas stutzeri Strain SOCE 002, a Marine Bacterium Isolated from the Surface Seawater of Dapeng Bay. Microbiol Resour Announc 2023; 12:e0015023. [PMID: 37067410 DOI: 10.1128/mra.00150-23] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Indexed: 04/18/2023] Open
Abstract
We report the complete genome sequence of Stutzerimonas stutzeri strain SOCE 002, obtained from Illumina and Oxford Nanopore sequencing. The genome is 4.68 Mb long, with a GC content of 63.5%, and contains 4,334 protein-coding genes, 60 tRNAs, and 12 rRNAs. We expect that this complete genome sequence will provide a reference for both genomic and metabolic analyses of S. stutzeri.
Affiliation(s)
- Jing Guo
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, China
  - Shenzhen Key Laboratory of Marine Archaea Geo-Omics, Southern University of Science and Technology, Shenzhen, China
- Shuaishuai Xu
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - College of Life Science and Technology, Jinan University, Guangzhou, China
- Yanting Liu
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
- Chuanlun Zhang
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, China
  - Shenzhen Key Laboratory of Marine Archaea Geo-Omics, Southern University of Science and Technology, Shenzhen, China
- Shengwei Hou
  - Department of Ocean Science and Engineering, Southern University of Science and Technology, Shenzhen, China
  - Southern Marine Science and Engineering Guangdong Laboratory, Guangzhou, China
  - Shenzhen Key Laboratory of Marine Archaea Geo-Omics, Southern University of Science and Technology, Shenzhen, China

11
Yu R, Cai D, Sun Y. AccuVIR: an ACCUrate VIRal genome assembly tool for third-generation sequencing data. Bioinformatics 2023; 39:6969105. [PMID: 36610711 PMCID: PMC9825286 DOI: 10.1093/bioinformatics/btac827] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/23/2022] [Revised: 11/24/2022] [Accepted: 12/24/2022] [Indexed: 12/27/2022] Open
Abstract
MOTIVATION: RNA viruses tend to mutate constantly. While many of the variants are neutral, some can lead to higher transmissibility or virulence. Accurate assembly of complete viral genomes enables the identification of underlying variants, which are essential for studying virus evolution and elucidating the relationship between genotypes and virus properties. Recently, third-generation sequencing platforms such as Nanopore sequencers have been used for real-time virus sequencing for Ebola, Zika, coronavirus disease 2019, etc. However, their high per-base error rate prevents the accurate reconstruction of the viral genome.
RESULTS: In this work, we introduce a new tool, AccuVIR, for viral genome assembly and polishing using error-prone long reads. It can better distinguish sequencing errors from true variants based on the key observation that sequencing errors can disrupt the gene structures of viruses, which usually have a high density of coding regions. Our experimental results on both simulated and real third-generation sequencing data demonstrated its superior performance on generating more accurate viral genomes than generic assembly or polish tools.
AVAILABILITY AND IMPLEMENTATION: The source code and the documentation of AccuVIR are available at https://github.com/rainyrubyzhou/AccuVIR.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
Affiliation(s)
- Runzhou Yu
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR 000000, China
- Dehan Cai
  - Department of Electrical Engineering, City University of Hong Kong, Kowloon, Hong Kong SAR 000000, China
- Yanni Sun
  - To whom correspondence should be addressed.

12
Buttler J, Drown DM. Accuracy and Completeness of Long Read Metagenomic Assemblies. Microorganisms 2022; 11:96. [PMID: 36677391 PMCID: PMC9861289 DOI: 10.3390/microorganisms11010096] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 11/28/2022] [Revised: 12/22/2022] [Accepted: 12/28/2022] [Indexed: 01/03/2023] Open
Abstract
Microbes influence the surrounding environment and contribute to human health. Metagenomics can be used as a tool to explore the interactions between microbes. Metagenomic assemblies built using long read nanopore data depend on the read level accuracy. The read level accuracy of nanopore sequencing has made dramatic improvements over the past several years. However, we do not know if the increased read level accuracy allows for faster assemblers to make as accurate metagenomic assemblies as slower assemblers. Here, we present the results of a benchmarking study comparing three commonly used long read assemblers, Flye, Raven, and Redbean. We used a prepared DNA standard of seven bacteria as our input community. We prepared a sequencing library using a VolTRAX V2 and sequenced using a MinION mk1b. We basecalled with Guppy v5.0.7 using the super-accuracy model. We found that increasing read depth benefited each of the assemblers, and nearly complete community member chromosomes were assembled with as little as 10× read depth. Polishing assemblies using Medaka had a predictable improvement in quality. We found Flye to be the most robust across taxa and the most effective assembler for recovering plasmids. Based on Flye's consistency for chromosomes and increased effectiveness at assembling plasmids, we would recommend using Flye in future metagenomic studies.
Affiliation(s)
- Jeremy Buttler
  - Department of Biology and Wildlife, University of Alaska Fairbanks, Fairbanks, AK 99775, USA
- Devin M. Drown
  - Department of Biology and Wildlife, University of Alaska Fairbanks, Fairbanks, AK 99775, USA
  - Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, AK 99775, USA