1
Lerminiaux N, Fakharuddin K, Mulvey MR, Mataseje L. Do we still need Illumina sequencing data? Evaluating Oxford Nanopore Technologies R10.4.1 flow cells and the Rapid v14 library prep kit for Gram negative bacteria whole genome assemblies. Can J Microbiol 2024; 70:178-189. [PMID: 38354391 DOI: 10.1139/cjm-2023-0175] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/16/2024]
Abstract
The best whole genome assemblies are currently built from a combination of highly accurate short-read sequencing data and long-read sequencing data that can bridge repetitive and problematic regions. Oxford Nanopore Technologies (ONT) produce long-read sequencing platforms and they are continually improving their technology to obtain higher quality read data that is approaching the quality obtained from short-read platforms such as Illumina. As these innovations continue, we evaluated how much ONT read coverage produced by the Rapid Barcoding Kit v14 (SQK-RBK114) is necessary to generate high-quality hybrid and long-read-only genome assemblies for a panel of carbapenemase-producing Enterobacterales bacterial isolates. We found that 30× long-read coverage is sufficient if Illumina data are available, and that more (at least 100×) long-read coverage is recommended for long-read-only assemblies. Illumina polishing is still improving single nucleotide variants (SNVs) and INDELs in long-read-only assemblies. We also examined if antimicrobial resistance genes could be accurately identified in long-read-only data, and found that Flye assemblies regardless of ONT coverage detected >96% of resistance genes at 100% identity and length. Overall, the Rapid Barcoding Kit v14 and long-read-only assemblies can be an optimal sequencing strategy (i.e., plasmid characterization and AMR detection) but finer-scale analyses (i.e., SNV) still benefit from short-read data.
Affiliation(s)
- Nicole Lerminiaux
  - National Microbiology Lab, Public Health Agency of Canada, Winnipeg, MB, Canada
- Ken Fakharuddin
  - National Microbiology Lab, Public Health Agency of Canada, Winnipeg, MB, Canada
- Michael R Mulvey
  - National Microbiology Lab, Public Health Agency of Canada, Winnipeg, MB, Canada
- Laura Mataseje
  - National Microbiology Lab, Public Health Agency of Canada, Winnipeg, MB, Canada
2
Makhetha LLN, Mokolopi BG, Oguttu JW, Mbajiorgu CA, Makete G, Mamphogoro TP. Genome assembly of antimicrobial-resistant Escherichia coli HMVC1 isolated from healthy Mogosane village cattle, South Africa. Microbiol Resour Announc 2024; 13:e0127323. [PMID: 38385670 DOI: 10.1128/mra.01273-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/22/2023] [Accepted: 01/31/2024] [Indexed: 02/23/2024] Open
Abstract
Here, we present the genome assembly of E. coli strain HMVC1 isolated from rectal fecal samples of healthy cattle in South Africa. The HMVC1 genome is 5,043,843 bp long, with a G + C content of 50.5%. The strain harbors marA, mdtM, acrF, acrD, and other antimicrobial resistance genes.
Affiliation(s)
- James Wabwire Oguttu
  - Department of Agriculture and Animal Health, University of South Africa, Science Campus, Florida, South Africa
- Goitsemang Makete
  - Gastro-Intestinal Microbiology and Biotechnology Unit, Agricultural Research Council-Animal Production, Irene, Pretoria, South Africa
- Tshifhiwa Paris Mamphogoro
  - Gastro-Intestinal Microbiology and Biotechnology Unit, Agricultural Research Council-Animal Production, Irene, Pretoria, South Africa
3
Dicks J, Fazal MA, Oliver K, Grayson NE, Turnbull JD, Bane E, Burnett E, Deheer-Graham A, Holroyd N, Kaushal D, Keane J, Langridge G, Lomax J, McGregor H, Picton S, Quail M, Singh D, Tracey A, Korlach J, Russell JE, Alexander S, Parkhill J. NCTC3000: a century of bacterial strain collecting leads to a rich genomic data resource. Microb Genom 2023; 9. [PMID: 37194944 DOI: 10.1099/mgen.0.000976] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/18/2023] Open
Abstract
The National Collection of Type Cultures (NCTC) was founded on 1 January 1920 in order to fulfil a recognized need for a centralized repository for bacterial and fungal strains within the UK. It is among the longest-established collections of its kind anywhere in the world and today holds approximately 6000 type and reference bacterial strains - many of medical, scientific and veterinary importance - available to academic, health, food and veterinary institutions worldwide. Recently, a collaboration between NCTC, Pacific Biosciences and the Wellcome Sanger Institute established the NCTC3000 project to long-read sequence and assemble the genomes of up to 3000 NCTC strains. Here, at the beginning of the collection's second century, we introduce the resulting NCTC3000 sequence read datasets, genome assemblies and annotations as a unique, historically and scientifically relevant resource for the benefit of the international bacterial research community.
Affiliation(s)
- Jo Dicks
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Mohammed-Abbas Fazal
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Karen Oliver
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Nicholas E Grayson
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
  - Present address: Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, UK, OX3 9DU, UK
- Jake D Turnbull
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Evangeline Bane
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Edward Burnett
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Ana Deheer-Graham
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Nancy Holroyd
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Dorota Kaushal
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Jacqueline Keane
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Gemma Langridge
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
  - Present address: Quadram Institute Bioscience, Norwich Research Park, Norwich, NR4 7UQ, UK
- Jane Lomax
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Hannah McGregor
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Steve Picton
  - Pacific Biosciences, 1305 O'Brien Drive, Menlo Park, CA, USA
- Michael Quail
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Deepak Singh
  - Pacific Biosciences, 1305 O'Brien Drive, Menlo Park, CA, USA
- Alan Tracey
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
- Jonas Korlach
  - Pacific Biosciences, 1305 O'Brien Drive, Menlo Park, CA, USA
- Julie E Russell
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Sarah Alexander
  - Culture Collections, UK Health Security Agency, 61 Colindale Avenue, London, NW9 5EQ, UK
- Julian Parkhill
  - Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK
  - Present address: Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, UK
4
Johnson LK, Sahasrabudhe R, Gill JA, Roach JL, Froenicke L, Brown CT, Whitehead A. Draft genome assemblies using sequencing reads from Oxford Nanopore Technology and Illumina platforms for four species of North American Fundulus killifish. Gigascience 2021; 9:5859380. [PMID: 32556169 PMCID: PMC7301629 DOI: 10.1093/gigascience/giaa067] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/04/2019] [Revised: 04/16/2020] [Accepted: 05/27/2020] [Indexed: 01/04/2023] Open
Abstract
BACKGROUND: Whole-genome sequencing data from wild-caught individuals of closely related North American killifish species (Fundulus xenicus, Fundulus catenatus, Fundulus nottii, and Fundulus olivaceus) were obtained using long-read Oxford Nanopore Technology (ONT) PromethION and short-read Illumina platforms. FINDINGS: Draft de novo reference genome assemblies were generated using a combination of long and short sequencing reads. For each species, the PromethION platform was used to generate 30-45× sequence coverage, and the Illumina platform was used to generate 50-160× sequence coverage. Illumina-only assemblies were fragmented with high numbers of contigs, while ONT-only assemblies were error prone with low BUSCO scores. The highest N50 values, ranging from 0.4 to 2.7 Mb, were from assemblies generated using a combination of short- and long-read data. BUSCO scores were consistently >90% complete using the Eukaryota database. CONCLUSIONS: High-quality genomes can be obtained from a combination of using short-read Illumina data to polish assemblies generated with long-read ONT data. Draft assemblies and raw sequencing data are available for public use. We encourage use and reuse of these data for assembly benchmarking and other analyses.
Affiliation(s)
- Lisa K Johnson
  - Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
  - Department of Population Health & Reproduction, School of Veterinary Medicine, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Ruta Sahasrabudhe
  - DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- James Anthony Gill
  - Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Jennifer L Roach
  - Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Lutz Froenicke
  - DNA Technologies Core, Genome Center, University of California, 1 Shields Avenue, Davis, CA 95616
- C Titus Brown
  - Department of Population Health & Reproduction, School of Veterinary Medicine, University of California. 1 Shields Avenue, Davis, CA 95616, Davis, CA, USA
- Andrew Whitehead
  - Correspondence address. Andrew Whitehead, Department of Environmental Toxicology, University of California. 1 Shields Avenue, Davis, CA 95616, USA, Davis, CA, USA. E-mail:
5
Totikov A, Tomarovsky A, Prokopov D, Yakupova A, Bulyonkova T, Derezanin L, Rasskazov D, Wolfsberger WW, Koepfli KP, Oleksyk TK, Kliver S. Chromosome-Level Genome Assemblies Expand Capabilities of Genomics for Conservation Biology. Genes (Basel) 2021; 12:1336. [PMID: 34573318 DOI: 10.3390/genes12091336] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 07/29/2021] [Revised: 08/20/2021] [Accepted: 08/25/2021] [Indexed: 11/26/2022] Open
Abstract
Genome assemblies are in the process of becoming an increasingly important tool for understanding genetic diversity in threatened species. Unfortunately, due to limited budgets typical for the area of conservation biology, genome assemblies of threatened species, when available, tend to be highly fragmented, represented by tens of thousands of scaffolds not assigned to chromosomal locations. The recent advent of high-throughput chromosome conformation capture (Hi-C) enables more contiguous assemblies containing scaffolds spanning the length of entire chromosomes for little additional cost. These inexpensive contiguous assemblies can be generated using Hi-C scaffolding of existing short-read draft assemblies, where N50 of the draft contigs is larger than 0.1% of the estimated genome size and can greatly improve analyses and facilitate visualization of genome-wide features including distribution of genetic diversity in markers along chromosomes or chromosome-length scaffolds. We compared distribution of genetic diversity along chromosomes of eight mammalian species, including six listed as threatened by IUCN, where both draft genome assemblies and newer chromosome-level assemblies were available. The chromosome-level assemblies showed marked improvement in localization and visualization of genetic diversity, especially where the distribution of low heterozygosity across the genomes of threatened species was not uniform.