1. Ciuchcinski K, Stokke R, Steen IH, Dziewit L. Landscape of the metaplasmidome of deep-sea hydrothermal vents located at Arctic Mid-Ocean Ridges in the Norwegian-Greenland Sea: ecological insights from comparative analysis of plasmid identification tools. FEMS Microbiol Ecol 2024; 100:fiae124. [PMID: 39271469 PMCID: PMC11451466 DOI: 10.1093/femsec/fiae124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/26/2024] [Revised: 09/04/2024] [Accepted: 09/12/2024] [Indexed: 09/15/2024] [Open Access]
Abstract
Plasmids are one of the key drivers of microbial adaptation and evolution. However, their diversity and role in adaptation, especially in extreme environments, remain largely unexplored. In this study, we aimed to identify, characterize, and compare plasmid sequences originating from samples collected from deep-sea hydrothermal vents located in Arctic Mid-Ocean Ridges. To achieve this, we employed and benchmarked three recently developed plasmid identification tools (PlasX, GeNomad, and PLASMe) on metagenomic data from this unique ecosystem. To date, this is the first direct comparison of these computational methods in the context of data from extreme environments. Upon recovery of plasmid contigs, we performed a multiapproach analysis, focusing on identifying taxonomic and functional biases within datasets originating from each tool. Next, we implemented a majority voting system to identify high-confidence plasmid contigs, enhancing the reliability of our findings. By analysing the consensus plasmid sequences, we gained insights into their diversity, ecological roles, and adaptive significance. Within the high-confidence sequences, we identified a high abundance of Pseudomonadota and Campylobacterota, as well as multiple toxin-antitoxin systems. Our findings provide a deeper understanding of how plasmids contribute to shaping microbial communities living under the extreme conditions of hydrothermal vents, potentially uncovering novel adaptive mechanisms.
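The majority voting step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not state the threshold or the data layout, so the 2-of-3 rule, the function name `majority_vote`, and the contig IDs below are all assumptions made for illustration, not the authors' implementation.

```python
from collections import Counter


def majority_vote(calls_by_tool, min_votes=2):
    """Return the contig IDs that at least `min_votes` tools call plasmid."""
    votes = Counter()
    for contigs in calls_by_tool.values():
        votes.update(set(contigs))  # one vote per tool per contig
    return {contig for contig, n in votes.items() if n >= min_votes}


# Hypothetical per-tool predictions; contig IDs are made up for illustration.
calls = {
    "PlasX": {"contig_1", "contig_2", "contig_3"},
    "GeNomad": {"contig_2", "contig_3", "contig_4"},
    "PLASMe": {"contig_3", "contig_5"},
}

consensus = majority_vote(calls)
print(sorted(consensus))  # ['contig_2', 'contig_3']
```

Raising `min_votes` to 3 would keep only contigs that all three tools agree on, trading recall for confidence.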
Affiliation(s)
- Karol Ciuchcinski
  - Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, 00-927, Warsaw, Poland
- Runar Stokke
  - Department of Biological Sciences, Center for Deep Sea Research, University of Bergen, N-5020, Bergen, Norway
- Ida Helene Steen
  - Department of Biological Sciences, Center for Deep Sea Research, University of Bergen, N-5020, Bergen, Norway
- Lukasz Dziewit
  - Department of Environmental Microbiology and Biotechnology, Institute of Microbiology, Faculty of Biology, University of Warsaw, 00-927, Warsaw, Poland
2. Yang Y, Shao Y, Pei C, Liu Y, Zhang M, Zhu X, Li J, Feng L, Li G, Li K, Liang Y, Li Y. Pangenome analyses of Clostridium butyricum provide insights into its genetic characteristics and industrial application. Genomics 2024; 116:110855. [PMID: 38703968 DOI: 10.1016/j.ygeno.2024.110855] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/22/2024] [Revised: 04/29/2024] [Accepted: 05/01/2024] [Indexed: 05/06/2024]
Abstract
Clostridium butyricum is a Gram-positive anaerobic bacterium known for its ability to produce butyrate. In this study, we conducted whole-genome sequencing and assembly of 14 C. butyricum industrial strains collected from various parts of China. We performed a pan-genome comparative analysis of the 14 assembled strains and 139 strains downloaded from NCBI. We found that the genes related to critical industrial production pathways were primarily present in the core and soft-core gene categories. The phylogenetic analysis revealed that strains from the same clade of the phylogenetic tree possessed similar antibiotic resistance and virulence factors, with most of these genes present in the shell and cloud gene categories. Finally, we predicted the genes producing bacteriocins and botulinum toxins as well as CRISPR systems responsible for host defense. In conclusion, our research provides a desirable pan-genome database for the industrial production, food application, and genetic research of C. butyricum.
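The core, soft-core, shell, and cloud categories named in the abstract are usually assigned from the fraction of genomes in which a gene is found. The abstract does not give the cut-offs used, so the thresholds below (99%, 95%, 15%) are an assumption borrowed from a common pangenome convention, and `classify_gene` is a hypothetical helper name.

```python
def classify_gene(n_present, n_genomes):
    """Assign a pangenome category from the number of genomes carrying a gene.

    Thresholds are assumed (a widely used convention), not taken from the paper.
    """
    frac = n_present / n_genomes
    if frac >= 0.99:
        return "core"
    if frac >= 0.95:
        return "soft-core"
    if frac >= 0.15:
        return "shell"
    return "cloud"


# 14 sequenced strains + 139 NCBI strains, as stated in the abstract.
N_GENOMES = 14 + 139

for n in (153, 150, 100, 5):
    print(n, classify_gene(n, N_GENOMES))
```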
Affiliation(s)
- Yicheng Yang
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yuan Shao
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Chenchen Pei
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yangyang Liu
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Min Zhang
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Xi Zhu
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Jinshan Li
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Lifei Feng
  - Henan Jinbaihe Biotechnology Co., Ltd., Tangyin, Anyang 455000, China
- Guanghua Li
  - Henan Jinbaihe Biotechnology Co., Ltd., Tangyin, Anyang 455000, China
- Keke Li
  - Henan Jinbaihe Biotechnology Co., Ltd., Tangyin, Anyang 455000, China
- Yunxiang Liang
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
- Yingjun Li
  - State Key Laboratory of Agricultural Microbiology, College of Life Science and Technology, Huazhong Agricultural University, Wuhan 430070, China
3. Rádai Z, Váradi A, Takács P, Nagy NA, Schmitt N, Prépost E, Kardos G, Laczkó L. An overlooked phenomenon: complex interactions of potential error sources on the quality of bacterial de novo genome assemblies. BMC Genomics 2024; 25:45. [PMID: 38195441 PMCID: PMC10777565 DOI: 10.1186/s12864-023-09910-4] [Citation(s) in RCA: 1] [Impact Index Per Article: 1.0] [Received: 11/21/2022] [Accepted: 12/15/2023] [Indexed: 01/11/2024] [Open Access]
Abstract
BACKGROUND Parameters adversely affecting the contiguity and accuracy of the assemblies from Illumina next-generation sequencing (NGS) are well described. However, past studies generally focused on their additive effects, overlooking their potential interactions possibly exacerbating one another's effects in a multiplicative manner. To investigate whether or not they act interactively on de novo genome assembly quality, we simulated sequencing data for 13 bacterial reference genomes, with varying levels of error rate, sequencing depth, PCR and optical duplicate ratios. RESULTS We assessed the quality of assemblies from the simulated sequencing data with a number of contiguity and accuracy metrics, which we used to quantify both additive and multiplicative effects of the four parameters. We found that the tested parameters are engaged in complex interactions, exerting multiplicative, rather than additive, effects on assembly quality. Also, the ratio of non-repeated regions and GC% of the original genomes can shape how the four parameters affect assembly quality. CONCLUSIONS We provide a framework for consideration in future studies using de novo genome assembly of bacterial genomes, e.g. in choosing the optimal sequencing depth, balancing between its positive effect on contiguity and negative effect on accuracy due to its interaction with error rate. Furthermore, the properties of the genomes to be sequenced also should be taken into account, as they might influence the effects of error sources themselves.
Affiliation(s)
- Zoltán Rádai
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany
- Alex Váradi
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Laboratory Medicine, Medical School, University of Pécs, Pécs, Hungary
- Péter Takács
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Health Informatics, Institute of Health Sciences, Faculty of Health, University of Debrecen, Debrecen, Hungary
- Nikoletta Andrea Nagy
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Evolutionary Zoology, ELKH-DE Behavioural Ecology Research Group, University of Debrecen, Debrecen, Hungary
  - Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Nicholas Schmitt
  - Department of Dermatology, University Hospital Düsseldorf, Heinrich-Heine-University, Düsseldorf, Germany
- Eszter Prépost
  - Department of Health Industry, University of Debrecen, Debrecen, Hungary
- Gábor Kardos
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - Department of Gerontology, Faculty of Health Sciences, University of Debrecen, Debrecen, Hungary
- Levente Laczkó
  - Institute of Metagenomics, University of Debrecen, Debrecen, Hungary
  - ELKH-DE Conservation Biology Research Group, Debrecen, Hungary
4. Kuśmirek W. Estimated Nucleotide Reconstruction Quality Symbols of Basecalling Tools for Oxford Nanopore Sequencing. Sensors (Basel, Switzerland) 2023; 23:6787. [PMID: 37571570 PMCID: PMC10422362 DOI: 10.3390/s23156787] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/29/2023] [Revised: 07/21/2023] [Accepted: 07/27/2023] [Indexed: 08/13/2023]
Abstract
Currently, one of the fastest-growing DNA sequencing technologies is nanopore sequencing. One of the key stages involved in processing sequencer data is the basecalling process, where the input sequence of currents measured on the nanopores of the sequencer is converted into DNA sequences, called DNA reads. Many of the applications dedicated to basecalling provide, together with the DNA sequence, the estimated quality of the reconstruction of a given nucleotide (quality symbols are contained on every fourth line of the FASTQ file; each nucleotide in the FASTQ file corresponds to exactly one estimated nucleotide reconstruction quality symbol). Herein, we compare the estimated nucleotide reconstruction quality symbols (signs from every fourth line of the FASTQ file) reported by different basecallers. The conducted experiments consisted of basecalling the same raw datasets from the nanopore device with different basecallers and comparing the provided quality symbols, denoting the estimated quality of the nucleotide reconstruction. The results show that the estimated quality reported by different basecallers may vary, depending on the tool used, particularly in terms of range and distribution. Moreover, we mapped basecalled DNA reads to reference genomes and calculated matched and mismatched rates for groups of nucleotides with the same quality symbol. Finally, the presented paper shows that the estimated nucleotide reconstruction quality reported in the basecalling process is not used in any investigated tool for processing nanopore DNA reads.
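The FASTQ layout described in the abstract (quality symbols on every fourth line, one symbol per nucleotide) can be read with a few lines of code. This is a minimal sketch, assuming unwrapped four-line records and the standard Phred+33 encoding; the function names and the two toy reads are hypothetical and not taken from the paper.

```python
from collections import Counter
import io


def quality_symbol_counts(handle):
    """Count quality symbols found on every fourth line of a FASTQ stream."""
    counts = Counter()
    for i, line in enumerate(handle):
        if i % 4 == 3:  # lines 4, 8, 12, ... hold the quality string
            counts.update(line.rstrip("\n"))
    return counts


def phred(symbol, offset=33):
    """Convert one quality symbol to a Phred score (Phred+33 assumed)."""
    return ord(symbol) - offset


# Toy FASTQ with two made-up reads.
fastq = io.StringIO("@read1\nACGT\n+\nII5!\n@read2\nGG\n+\n5I\n")
counts = quality_symbol_counts(fastq)

for symbol, n in sorted(counts.items()):
    print(symbol, phred(symbol), n)
```

Comparing such per-symbol histograms across basecallers is one simple way to see the differences in range and distribution that the abstract reports.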
Affiliation(s)
- Wiktor Kuśmirek
  - Institute of Computer Science, Warsaw University of Technology, 00-661 Warsaw, Poland
5. Huang F, Xiao L, Gao M, Vallely EJ, Dybvig K, Atkinson TP, Waites KB, Chong Z. B-assembler: a circular bacterial genome assembler. BMC Genomics 2022; 23:361. [PMID: 35546658 PMCID: PMC9092672 DOI: 10.1186/s12864-022-08577-7] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 04/21/2022] [Accepted: 04/21/2022] [Indexed: 11/10/2022] [Open Access]
Abstract
BACKGROUND Accurate de novo assembly of bacterial genomes is fundamental to understanding the evolution and pathogenesis of new bacterial species. The advent and popularity of Third-Generation Sequencing (TGS) enables assembly of bacterial genomes at an unprecedented speed. However, most current TGS assemblers were specifically designed for human or other species that do not have a circular genome. Moreover, the repetitive DNA fragments in many bacterial genomes, together with the high error rate of long-read sequencing data, make it very challenging to assemble these genomes accurately despite their relatively small size. Therefore, there is an urgent need for the development of an optimized method to address these issues. RESULTS We developed B-assembler, which is capable of assembling bacterial genomes when there are only long reads or a combination of short and long reads. B-assembler takes advantage of the structural resolving power of long reads and, where available, the accuracy of short reads. It first selects and corrects the ultra-long reads to get an initial contig. Then, it collects the reads overlapping with the ends of the initial contig. This two-round assembling procedure, along with optimized error correction, enables a high-confidence and circularized genome assembly. Benchmarks on both synthetic and real sequencing data from several bacterial species show that both the long-read-only and hybrid-read modes can accurately assemble circular bacterial genomes free of structural errors, with fewer small errors than other assemblers. CONCLUSIONS B-assembler provides a better solution to bacterial genome assembly, which will facilitate downstream bacterial genome analysis.
Affiliation(s)
- Fengyuan Huang
  - Informatics Institute, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
  - Department of Genetics, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
- Li Xiao
  - Department of Medicine, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
- Min Gao
  - Informatics Institute, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
  - Department of Medicine, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
- Ethan J Vallely
  - Informatics Institute, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
- Kevin Dybvig
  - Department of Genetics, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
  - Department of Pediatrics, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35233, Birmingham, USA
- T Prescott Atkinson
  - Department of Pediatrics, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35233, Birmingham, USA
- Ken B Waites
  - Department of Pathology, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35233, Birmingham, USA
- Zechen Chong
  - Informatics Institute, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
  - Department of Genetics, Heersink School of Medicine, the University of Alabama at Birmingham, AL, 35294, Birmingham, USA
6. Palma F, Mangone I, Janowicz A, Moura A, Chiaverini A, Torresi M, Garofolo G, Criscuolo A, Brisse S, Di Pasquale A, Cammà C, Radomski N. In vitro and in silico parameters for precise cgMLST typing of Listeria monocytogenes. BMC Genomics 2022; 23:235. [PMID: 35346021 PMCID: PMC8961897 DOI: 10.1186/s12864-022-08437-4] [Citation(s) in RCA: 7] [Impact Index Per Article: 2.3] [Received: 12/06/2021] [Accepted: 02/28/2022] [Indexed: 02/02/2023] [Open Access]
Abstract
Background Whole genome sequencing analyzed by core genome multi-locus sequence typing (cgMLST) is widely used in surveillance of the pathogenic bacterium Listeria monocytogenes. Given the heterogeneity of available bioinformatics tools to define cgMLST alleles, our aim was to identify parameters influencing the precision of cgMLST profiles. Methods We used three L. monocytogenes reference genomes from different phylogenetic lineages and assessed the impact of in vitro (i.e. tested genomes, successive platings, replicates of DNA extraction and sequencing) and in silico parameters (i.e. targeted depth of coverage, depth of coverage, breadth of coverage, assembly metrics, cgMLST workflows, cgMLST completeness) on the precision of cgMLST profiles built from 1748 core loci. Six cgMLST workflows were tested, comprising assembly-based (BIGSdb, INNUENDO, GENPAT, SeqSphere and BioNumerics) and assembly-free (i.e. kmer-based MentaLiST) allele callers. Principal component analyses and generalized linear models were used to identify the parameters with the greatest impact on cgMLST precision. Results The isolate's genetic background, cgMLST workflows, cgMLST completeness, as well as depth and breadth of coverage were the parameters with the greatest impact on cgMLST precision (i.e. identical alleles against reference circular genomes). All workflows performed well at ≥40X depth of coverage, with high loci detection (> 99.54% for all, except for BioNumerics with 97.78%), and showed consistent cluster definitions using the reference cut-off of ≤7 allele differences. Conclusions This highlights that bioinformatics workflows dedicated to cgMLST allele calling are largely robust when paired-end reads are of high quality and when the sequencing depth is ≥40X. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-022-08437-4.
7. Genome assembly and annotation. Bioinformatics 2022. [DOI: 10.1016/b978-0-323-89775-4.00013-4] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/15/2022] [Open Access]
8. DNA Viral Diversity, Abundance, and Functional Potential Vary across Grassland Soils with a Range of Historical Moisture Regimes. mBio 2021; 12:e0259521. [PMID: 34724822 PMCID: PMC8567247 DOI: 10.1128/mbio.02595-21] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Indexed: 12/03/2022] [Open Access]
Abstract
Soil viruses are abundant, but the influence of the environment and climate on soil viruses remains poorly understood. Here, we addressed this gap by comparing the diversity, abundance, lifestyle, and metabolic potential of DNA viruses in three grassland soils with historical differences in average annual precipitation, low in eastern Washington (WA), high in Iowa (IA), and intermediate in Kansas (KS). Bioinformatics analyses were applied to identify a total of 2,631 viral contigs, including 14 complete viral genomes from three deep metagenomes (1 terabase [Tb] each) that were sequenced from bulk soil DNA. An additional three replicate metagenomes (∼0.5 Tb each) were obtained from each location for statistical comparisons. Identified viruses were primarily bacteriophages targeting dominant bacterial taxa. Both viral and host diversity were higher in soil with lower precipitation. Viral abundance was also significantly higher in the arid WA location than in IA and KS. More lysogenic markers and fewer clustered regularly interspaced short palindromic repeats (CRISPR) spacer hits were found in WA, reflecting more lysogeny in historically drier soil. More putative auxiliary metabolic genes (AMGs) were also detected in WA than in the historically wetter locations. The AMGs occurring in 18 pathways could potentially contribute to carbon metabolism and energy acquisition in their hosts. Structural equation modeling (SEM) suggested that historical precipitation influenced viral life cycle and selection of AMGs. The observed and predicted relationships between soil viruses and various biotic and abiotic variables have value for predicting viral responses to environmental change.
9. Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, Li DZ. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol 2020; 21:241. [PMID: 32912315 PMCID: PMC7488116 DOI: 10.1186/s13059-020-02154-5] [Citation(s) in RCA: 1784] [Impact Index Per Article: 356.8] [Received: 01/16/2020] [Accepted: 08/24/2020] [Indexed: 12/13/2022] [Open Access]
Abstract
GetOrganelle is a state-of-the-art toolkit to accurately assemble organelle genomes from whole genome sequencing data. It recruits organelle-associated reads using a modified "baiting and iterative mapping" approach, conducts de novo assembly, filters and disentangles the assembly graph, and produces all possible configurations of circular organelle genomes. For 50 published plant datasets, we are able to reassemble the circular plastomes from 47 datasets using GetOrganelle. GetOrganelle assemblies are more accurate than published and/or NOVOPlasty-reassembled plastomes as assessed by mapping. We also assemble complete mitochondrial genomes using GetOrganelle. GetOrganelle is freely released under a GPL-3 license ( https://github.com/Kinggerm/GetOrganelle ).
Affiliation(s)
- Jian-Jun Jin
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Wen-Bin Yu
  - Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, Yunnan, 666303, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, Yunnan, 666303, China
  - Southeast Asia Biodiversity Research Institute, Chinese Academy of Sciences, Yezin, Nay Pyi Taw, 05282, Myanmar
- Jun-Bo Yang
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- Yu Song
  - Center for Integrative Conservation, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Mengla, Yunnan, 666303, China
  - Center of Conservation Biology, Core Botanical Gardens, Chinese Academy of Sciences, Mengla, Yunnan, 666303, China
  - Southeast Asia Biodiversity Research Institute, Chinese Academy of Sciences, Yezin, Nay Pyi Taw, 05282, Myanmar
- Claude W dePamphilis
  - Department of Biology, The Pennsylvania State University, University Park, PA, 16801, USA
- Ting-Shuang Yi
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
- De-Zhu Li
  - Germplasm Bank of Wild Species, Kunming Institute of Botany, Chinese Academy of Sciences, Kunming, Yunnan, 650201, China
10. Jin JJ, Yu WB, Yang JB, Song Y, dePamphilis CW, Yi TS, Li DZ. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol 2020. [PMID: 32912315 DOI: 10.1101/256479] [Citation(s) in RCA: 122] [Impact Index Per Article: 24.4] [Indexed: 05/17/2023] [Open Access]
11. Comparative genomics and pangenome-oriented studies reveal high homogeneity of the agronomically relevant enterobacterial plant pathogen Dickeya solani. BMC Genomics 2020; 21:449. [PMID: 32600255 PMCID: PMC7325237 DOI: 10.1186/s12864-020-06863-w] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Received: 03/27/2020] [Accepted: 06/22/2020] [Indexed: 11/11/2022] [Open Access]
Abstract
Background Dickeya solani is an important plant pathogenic bacterium causing severe losses in European potato production. This species draws a lot of attention due to its remarkable virulence, great devastating potential and easier spread in contrast to other Dickeya spp. In view of a high need for extensive studies on economically important soft rot Pectobacteriaceae, we performed a comparative genomics analysis on D. solani strains to search for genetic foundations that would explain the differences in the observed virulence levels within the D. solani population. Results High quality assemblies of 8 de novo sequenced D. solani genomes have been obtained. Whole-sequence comparison, ANIb, ANIm, Tetra and pangenome-oriented analyses performed on these genomes and the sequences of 14 additional strains revealed an exceptionally high level of homogeneity among the studied genetic material of D. solani strains. With the use of 22 genomes, the pangenome of D. solani, comprising 84.7% core, 7.2% accessory and 8.1% unique genes, has been almost completely determined, suggesting the presence of a nearly closed pangenome structure. Attribution of the genes included in the D. solani pangenome fractions to functional COG categories showed that higher percentages of accessory and unique pangenome parts in contrast to the core section are encountered in phage/mobile elements- and transcription-associated groups with the genome of RNS 05.1.2A strain having the most significant impact. Also, the first D. solani large-scale genome-wide phylogeny computed on concatenated core gene alignments is herein reported. Conclusions The almost closed status of D. solani pangenome achieved in this work points to the fact that the unique gene pool of this species should no longer expand. Such a feature is characteristic of taxa whose representatives either occupy isolated ecological niches or lack efficient mechanisms for gene exchange and recombination, which seems rational concerning a strictly pathogenic species with clonal population structure. Finally, no obvious correlations between the geographical origin of D. solani strains and their phylogeny were found, which might reflect the specificity of the international seed potato market.
12. From Nucleotides to Satellite Imagery: Approaches to Identify and Manage the Invasive Pathogen Xylella fastidiosa and Its Insect Vectors in Europe. Sustainability 2020. [DOI: 10.3390/su12114508] [Citation(s) in RCA: 46] [Impact Index Per Article: 9.2] [Indexed: 12/13/2022]
Abstract
Biological invasions represent some of the most severe threats to local communities and ecosystems. Among invasive species, the vector-borne pathogen Xylella fastidiosa is responsible for a wide variety of plant diseases and has profound environmental, social and economic impacts. Once restricted to the Americas, it has recently invaded Europe, where multiple dramatic outbreaks have highlighted critical challenges for its management. Here, we review the most recent advances in the identification, distribution and management of X. fastidiosa and its insect vectors in Europe through genetic and spatial ecology methodologies. We underline the most important theoretical and technological gaps that remain to be bridged. Challenges and future research directions are discussed in the light of improving our understanding of this invasive species, its vectors and host–pathogen interactions. We highlight the need to include different, complementary outlooks in integrated frameworks to substantially improve our knowledge of invasive processes and optimize resource allocation. We provide an overview of genetic, spatial ecology and integrated approaches that will aid successful and sustainable management of one of the most dangerous threats to European agriculture and ecosystems.
13. Linking De Novo Assembly Results with Long DNA Reads Using the dnaasm-link Application. BioMed Research International 2019; 2019:7847064. [PMID: 31111066 PMCID: PMC6487145 DOI: 10.1155/2019/7847064] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 02/01/2019] [Revised: 03/25/2019] [Accepted: 03/27/2019] [Indexed: 12/14/2022]
Abstract
Currently, third-generation sequencing techniques, which make it possible to obtain much longer DNA reads compared to the next-generation sequencing technologies, are becoming more and more popular. There are many possibilities for combining data from next-generation and third-generation sequencing. Herein, we present a new application called dnaasm-link for linking contigs, the result of de novo assembly of second-generation sequencing data, with long DNA reads. Our tool includes an integrated module to fill gaps with a suitable fragment of an appropriate long DNA read, which improves the consistency of the resulting DNA sequences. This feature is very important, in particular for complex DNA regions. Our implementation is found to outperform other state-of-the-art tools in terms of speed and memory requirements, which may enable its usage for organisms with a large genome, something which is not possible in existing applications. The presented application has many advantages: (i) it significantly optimizes memory and reduces computation time; (ii) it fills gaps with an appropriate fragment of a specified long DNA read; (iii) it reduces the number of spanned and unspanned gaps in existing genome drafts. The application is freely available to all users under GNU Library or Lesser General Public License version 3.0 (LGPLv3). The demo application, Docker image, and source code can be downloaded from project homepage.
14. Rowe WPM. When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data. Genome Biol 2019; 20:199. [PMID: 31519212 PMCID: PMC6744645 DOI: 10.1186/s13059-019-1809-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2019] [Accepted: 09/02/2019] [Indexed: 01/21/2023] [Open Access]
Abstract
Considerable advances in genomics over the past decade have resulted in vast amounts of data being generated and deposited in global archives. The growth of these archives exceeds our ability to process their content, leading to significant analysis bottlenecks. Sketching algorithms produce small, approximate summaries of data and have shown great utility in tackling this flood of genomic data, while using minimal compute resources. This article reviews the current state of the field, focusing on how the algorithms work and how genomicists can utilize them effectively. References to interactive workbooks for explaining concepts and demonstrating workflows are included at https://github.com/will-rowe/genome-sketching .
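To make the idea of a "small, approximate summary" concrete, here is a minimal bottom-k MinHash sketch, one well-known sketching technique. It is an illustration chosen for this listing, not code from the article or its workbooks; the k-mer length, sketch size, SHA-1 hashing, and function names are all assumptions.

```python
import hashlib


def kmers(seq, k):
    """Return the set of all k-mers in a sequence."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash_kmer(kmer):
    """Map a k-mer to a 64-bit integer using the first 8 bytes of SHA-1."""
    return int.from_bytes(hashlib.sha1(kmer.encode()).digest()[:8], "big")


def sketch(seq, k=4, size=8):
    """Bottom-k sketch: keep only the `size` smallest k-mer hashes."""
    return sorted(hash_kmer(km) for km in kmers(seq, k))[:size]


def jaccard_estimate(a, b, size=8):
    """Estimate Jaccard similarity of two k-mer sets from their sketches."""
    merged = sorted(set(a) | set(b))[:size]  # bottom-k of the union
    shared = set(a) & set(b)
    return sum(1 for h in merged if h in shared) / len(merged)


# Made-up sequences for illustration only.
s1 = sketch("ACGTACGTTGCAACGTAGCT")
s2 = sketch("ACGTACGTTGCAACGTTGCA")
print(jaccard_estimate(s1, s2))
```

Each sequence is reduced to at most `size` integers, so comparing two genomes costs a fixed amount of work regardless of their length, at the price of an approximate answer.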
Affiliation(s)
- Will P M Rowe
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
  - Scientific Computing Department, The Hartree Centre, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK