1
Stritt C, Reitsma M, Marin AMG, Goig G, Dötsch A, Borrell S, Beisel C, Comas I, Brites D, Gagneux S. Gene conversion and duplication contribute to genetic variation in an outbreak of Mycobacterium tuberculosis. Microb Genom 2025; 11:001396. [PMID: 40310468 PMCID: PMC12046097 DOI: 10.1099/mgen.0.001396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Accepted: 03/17/2025] [Indexed: 05/02/2025]
Abstract
Repeats are the most diverse and dynamic but also the least well-understood component of microbial genomes. For all we know, repeat-associated mutations such as duplications, deletions, inversions and gene conversion might be as common as point mutations, but because of short-read myopia and methodological bias, they have received much less attention. Long-read DNA sequencing opens the perspective of resolving repeats and systematically investigating the mutations they induce. For this study, we assembled the genomes of 16 closely related strains of the bacterial pathogen Mycobacterium tuberculosis from Pacific Biosciences HiFi reads, with the aim of characterizing the full spectrum of DNA polymorphisms. We found that complete and accurate genomes can be assembled from HiFi reads, with read size being the main limitation in the presence of duplications. By combining a reference-free pangenome graph with extensive repeat annotation, we identified 110 variants, 58 of which could be assigned to repeat-associated mutational mechanisms such as strand slippage and homologous recombination. Whilst recombination events were less frequent than point mutations, they affected large regions and introduced multiple variants at once, as shown by three gene conversion events and a duplication of 7.3 kb that involved ppe18 and ppe57, two genes possibly involved in immune subversion. The vast majority of variants were present in single isolates, such that phylogenetic resolution was only marginally increased when estimating a tree from complete genomes. Our study shows that the contribution of repeat-associated mechanisms of mutation can be similar to that of point mutations at the microevolutionary scale of an outbreak. A large reservoir of unstudied genetic variation in this 'monomorphic' bacterial pathogen awaits investigation.
Affiliation(s)
- Christoph Stritt
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Michelle Reitsma
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Galo Goig
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Anna Dötsch
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Sonia Borrell
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Christian Beisel
  - Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Iñaki Comas
  - Biomedicine Institute of Valencia, Spanish Research Council (IBV-CSIC), Valencia, Spain
  - Spanish Network for Research on Epidemiology and Public Health (CIBERESP), Carlos III Health Institute, Madrid, Spain
- Daniela Brites
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
- Sebastien Gagneux
  - Swiss Tropical and Public Health Institute, Allschwil, Switzerland
  - University of Basel, Basel, Switzerland
2
Espinoza ME, Swing AM, Elghraoui A, Modlin SJ, Valafar F. Interred mechanisms of resistance and host immune evasion revealed through network-connectivity analysis of M. tuberculosis complex graph pangenome. mSystems 2025; 10:e0049924. [PMID: 40261029 PMCID: PMC12013269 DOI: 10.1128/msystems.00499-24] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 04/05/2024] [Accepted: 12/16/2024] [Indexed: 04/24/2025]
Abstract
Mycobacterium tuberculosis complex successfully adapts to environmental pressures through mechanisms of rapid adaptation which remain poorly understood despite knowledge gained through decades of research. In this study, we used 110 reference-quality, complete de novo assembled, long-read sequenced clinical genomes to study patterns of structural adaptation through a graph-based pangenome analysis, elucidating rarely studied mechanisms that enable enhanced clinical phenotypes offering a novel perspective to the species' adaptation. Across isolates, we identified a pangenome of 4,325 genes (3,767 core and 558 accessory), revealing 290 novel genes, and a substantially more complete account of difficult-to-sequence esx/pe/pgrs/ppe genes. Seventy-four percent of core genes were deemed non-essential in vitro, 38% of which support the pathogen's survival in vivo, suggesting a need to broaden current perspectives on essentiality. Through information-theoretic analysis, we reveal the ppe genes that contribute most to the species' diversity, several with known consequences for antigenic variation and immune evasion. Construction of a graph pangenome revealed topological variations that implicate genes known to modulate host immunity (Rv0071-73, Rv2817c, cas2), defense against phages/viruses (cas2, csm6, and Rv2817c-2821c), and others associated with host tissue colonization. Here, the prominent trehalose transport pathway stands out for its involvement in caseous granuloma catabolism and the development of post-primary disease. We show paralogous duplications of genes implicated in bedaquiline (mmpL5 in all L1 isolates) and ethambutol (embC-A) resistance, with a paralogous duplication of its regulator (embR) in 96 isolates. We provide hypotheses for novel mechanisms of immune evasion and antibiotic resistance through gene dosing that can escape detection by molecular diagnostics.
IMPORTANCE M. tuberculosis complex (MTBC) has killed over a billion people in the past 200 years alone and continues to kill nearly 1.5 million annually. The pathogen has a versatile ability to diversify under immune and drug pressure and survive, even becoming antibiotic persistent or resistant in the face of harsh chemotherapy. For proper diagnosis and design of an appropriate treatment regimen, a full understanding of this diversification and its clinical consequences is desperately needed. A mechanism of diversification that is rarely studied systematically is MTBC's ability to structurally change its genome. In this article, we have de novo assembled 110 clinical genomes (the largest de novo assembled set to date) and performed a pangenomic analysis. Our pangenome provides structural variation-based hypotheses for novel mechanisms of immune evasion and antibiotic resistance through gene dosing that can compromise molecular diagnostics and lead to further emergence of antibiotic resistance.
Affiliation(s)
- Monica E. Espinoza
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- Ashley M. Swing
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
  - San Diego State University/University of California, San Diego | Joint Doctoral Program in Public Health (Global Health), San Diego, California, USA
- Afif Elghraoui
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
  - Department of Electrical and Computer Engineering, San Diego State University, San Diego, California, USA
  - Department of Electrical and Computer Engineering, University of California San Diego, San Diego, California, USA
- Samuel J. Modlin
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
- Faramarz Valafar
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, California, USA
3
Conkle-Gutierrez D, Ramirez-Busby SM, Gorman BM, Elghraoui A, Hoffner S, Elmaraachli W, Valafar F. Novel and reported compensatory mutations in rpoABC genes found in drug resistant tuberculosis outbreaks. Front Microbiol 2024; 14:1265390. [PMID: 38260909 PMCID: PMC10800992 DOI: 10.3389/fmicb.2023.1265390] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/22/2023] [Accepted: 12/19/2023] [Indexed: 01/24/2024]
Abstract
Background Rifampicin (RIF) is a key first-line drug used to treat tuberculosis, a primarily pulmonary disease caused by Mycobacterium tuberculosis. RIF resistance is caused by mutations in rpoB, at the cost of slower growth and reduced transcription efficiency. Antibiotic resistance to RIF is prevalent despite this fitness cost. Compensatory mutations in rpoABC genes have been shown to alleviate the fitness cost of rpoB:S450L, explaining how RIF resistant strains harboring this mutation can spread so rapidly. Unfortunately, the full set of RIF compensatory mutations is still unknown, particularly those compensating for rarer RIF resistance mutations. Objectives We performed an association study on a globally representative set of 4,309 whole genome sequenced clinical M. tuberculosis isolates to identify novel putative compensatory mutations, determine the prevalence of known and previously reported putative compensatory mutations, and determine which RIF resistance markers associate with these compensatory mutations. Results and conclusions Of the 1,079 RIF resistant isolates, 638 carried previously reported putative and high-probability compensatory mutations. Our strict criteria identified 46 additional mutations in rpoABC for which no strong prior evidence of their compensatory role exists. Of these, 35 have previously been reported. As such, our independent corroboration adds to the mounting evidence that these 35 also carry a compensatory role. The remaining 11 are novel putative compensatory markers, reported here for the first time. Six of these 11 novel putative compensatory mutations had two or more mutation events. Most compensatory mutations appear to be specifically compensating for the fitness loss due to rpoB:S450L. However, an outbreak of 22 closely related isolates each carried three rpoB mutations, the rare RIF resistance markers D435G and L452P and the putative compensatory mutation I1106T. This suggests compensation may require specific combinations of rpoABC mutations. Here, we report only mutations that met our very strict criteria. It is highly likely that many additional rpoABC mutations compensate for rare resistance-causing mutations and therefore did not carry the statistical power to be reported here. These findings aid in the identification of RIF resistant M. tuberculosis strains with restored fitness, which pose a greater risk of causing resistant outbreaks.
Affiliation(s)
- Derek Conkle-Gutierrez
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
- Sarah M. Ramirez-Busby
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
- Bria M. Gorman
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
- Afif Elghraoui
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
- Sven Hoffner
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
  - Department of Global Public Health, Karolinska Institute, Stockholm, Sweden
- Wael Elmaraachli
  - Division of Pulmonary, Critical Care, and Sleep Medicine, University of California, San Diego, San Diego, CA, United States
- Faramarz Valafar
  - Laboratory for Pathogenesis of Clinical Drug Resistance and Persistence, San Diego State University, San Diego, CA, United States
4
Gómez-González PJ, Grabowska AD, Tientcheu LD, Tsolaki AG, Hibberd ML, Campino S, Phelan JE, Clark TG. Functional genetic variation in pe/ppe genes contributes to diversity in Mycobacterium tuberculosis lineages and potential interactions with the human host. Front Microbiol 2023; 14:1244319. [PMID: 37876785 PMCID: PMC10591178 DOI: 10.3389/fmicb.2023.1244319] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 06/22/2023] [Accepted: 09/21/2023] [Indexed: 10/26/2023]
Abstract
Introduction Around 10% of the coding potential of Mycobacterium tuberculosis is constituted by two poorly understood gene families, the pe and ppe loci, thought to be involved in host-pathogen interactions. Their repetitive nature and high GC content have hindered sequence analysis, leading to exclusion from whole-genome studies. Understanding the genetic diversity of pe/ppe families is essential to facilitate their potential translation into tools for tuberculosis prevention and treatment. Methods To investigate the genetic diversity of the 169 pe/ppe genes, we performed a sequence analysis across 73 long-read assemblies representing seven different lineages of M. tuberculosis and M. bovis BCG. Individual pe/ppe gene alignments were extracted, and diversity and conservation across the different lineages were studied. Results The pe/ppe genes were classified into three groups based on the level of protein sequence conservation relative to H37Rv, finding that >50% were conserved, with indels in pe_pgrs and ppe_mptr sub-families being major drivers of structural variation. Gene rearrangements, such as duplications and gene fusions, were observed between pe and pe_pgrs genes. Inter-lineage diversity revealed lineage-specific SNPs and indels. Discussion The high level of pe/ppe gene conservation, together with the lineage-specific findings, suggests their phylogenetic informativeness. However, structural variants and gene rearrangements differing from the reference were also identified, with potential implications for pathogenicity. Overall, improving our knowledge of these complex gene families may provide insights into pathogenicity and inform the development of much-needed tools for tuberculosis control.
Affiliation(s)
- Anna D. Grabowska
  - Department of Biophysics, Physiology and Pathophysiology, Medical University of Warsaw, Warsaw, Poland
- Leopold D. Tientcheu
  - MRC Unit, The Gambia at the London School of Hygiene and Tropical Medicine, Vaccines and Immunity Theme, Fajara, The Gambia
- Anthony G. Tsolaki
  - Department of Life Sciences, Brunel University London, Uxbridge, United Kingdom
- Martin L. Hibberd
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Susana Campino
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Jody E. Phelan
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
- Taane G. Clark
  - Faculty of Infectious and Tropical Diseases, London School of Hygiene and Tropical Medicine, London, United Kingdom
  - Faculty of Epidemiology and Population Health, London School of Hygiene and Tropical Medicine, London, United Kingdom
5
Green AG, Vargas R, Marin MG, Freschi L, Xie J, Farhat MR. Analysis of Genome-Wide Mutational Dependence in Naturally Evolving Mycobacterium tuberculosis Populations. Mol Biol Evol 2023; 40:msad131. [PMID: 37352142 PMCID: PMC10292908 DOI: 10.1093/molbev/msad131] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 07/29/2022] [Revised: 05/12/2023] [Accepted: 05/23/2023] [Indexed: 06/25/2023]
Abstract
Pathogenic microorganisms are in a perpetual struggle for survival in changing host environments, where host pressures necessitate changes in pathogen virulence, antibiotic resistance, or transmissibility. The genetic basis of phenotypic adaptation by pathogens is difficult to study in vivo. In this work, we develop a phylogenetic method to detect genetic dependencies that promote pathogen adaptation using 31,428 in vivo sampled Mycobacterium tuberculosis genomes, a globally prevalent bacterial pathogen with increasing levels of antibiotic resistance. We find that dependencies between mutations are enriched in antigenic and antibiotic resistance functions and discover 23 mutations that potentiate the development of antibiotic resistance. Between 11% and 92% of resistant strains harbor a dependent mutation acquired after a resistance-conferring variant. We demonstrate the pervasiveness of genetic dependency in adaptation of naturally evolving populations and the utility of the proposed computational approach.
Affiliation(s)
- Anna G Green
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Roger Vargas
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Center for Computational Biomedicine, Harvard Medical School, Boston, MA, USA
- Maximillian G Marin
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Luca Freschi
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
- Jiaqi Xie
  - Department of Genetics, Johns Hopkins School of Medicine, Baltimore, MD, USA
- Maha R Farhat
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, USA
  - Division of Pulmonary and Critical Care Medicine, Massachusetts General Hospital, Boston, MA, USA
6
Reynolds NK, Stajich JE, Benny GL, Barry K, Mondo S, LaButti K, Lipzen A, Daum C, Grigoriev IV, Ho HM, Crous PW, Spatafora JW, Smith ME. Mycoparasites, Gut Dwellers, and Saprotrophs: Phylogenomic Reconstructions and Comparative Analyses of Kickxellomycotina Fungi. Genome Biol Evol 2023; 15:evac185. [PMID: 36617272 PMCID: PMC9866270 DOI: 10.1093/gbe/evac185] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/06/2022] [Revised: 12/15/2022] [Accepted: 12/20/2022] [Indexed: 01/09/2023]
Abstract
Improved sequencing technologies have profoundly altered global views of fungal diversity and evolution. High-throughput sequencing methods are critical for studying fungi due to the cryptic, symbiotic nature of many species, particularly those that are difficult to culture. However, the low coverage genome sequencing (LCGS) approach to phylogenomic inference has not been widely applied to fungi. Here we analyzed 171 Kickxellomycotina fungi using LCGS methods to obtain hundreds of marker genes for robust phylogenomic reconstruction. Additionally, we mined our LCGS data for a set of nine rDNA and protein coding genes to enable analyses across species for which no LCGS data were obtained. The main goals of this study were to: 1) evaluate the quality and utility of LCGS data for both phylogenetic reconstruction and functional annotation, 2) test relationships among clades of Kickxellomycotina, and 3) perform comparative functional analyses between clades to gain insight into putative trophic modes. In opposition to previous studies, our nine-gene analyses support two clades of arthropod gut dwelling species and suggest a possible single evolutionary event leading to this symbiotic lifestyle. Furthermore, we resolve the mycoparasitic Dimargaritales as the earliest diverging clade in the subphylum and find four major clades of Coemansia species. Finally, functional analyses illustrate clear variation in predicted carbohydrate active enzymes and secondary metabolites (SM) based on ecology, that is biotroph versus saprotroph. Saprotrophic Kickxellales broadly lack many known pectinase families compared with saprotrophic Mucoromycota and are depauperate for SM but have similar numbers of predicted chitinases as mycoparasitic.
Affiliation(s)
- Jason E Stajich
  - Department of Microbiology & Plant Pathology and Institute for Integrative Genome Biology, University of California–Riverside
- Kerrie Barry
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Stephen Mondo
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Kurt LaButti
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Anna Lipzen
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Chris Daum
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
- Igor V Grigoriev
  - U.S. Department of Energy Joint Genome Institute, Lawrence Berkeley National Laboratory
  - Department of Plant and Microbial Biology, University of California Berkeley
- Hsiao-Man Ho
  - Department of Science Education, National Taipei University of Education, 134, Section 2, Heping E. Road, Taipei 106, Taiwan
- Pedro W Crous
  - Department of Evolutionary Phytopathology, Westerdijk Fungal Biodiversity Institute, Uppsalalaan 8, 3584 CT, Utrecht, The Netherlands
7
de Oliveira Martins L, Bloomfield S, Stoakes E, Grant AJ, Page AJ, Mather AE. Tatajuba: exploring the distribution of homopolymer tracts. NAR Genom Bioinform 2022; 4:lqac003. [PMID: 35118377 PMCID: PMC8808543 DOI: 10.1093/nargab/lqac003] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/02/2021] [Revised: 11/18/2021] [Accepted: 01/05/2022] [Indexed: 11/14/2022]
Abstract
Length variation of homopolymeric tracts, which induces phase variation, is known to regulate gene expression leading to phenotypic variation in a wide range of bacterial species. There is no specialized bioinformatics software which can, at scale, exhaustively explore and describe these features from sequencing data. Identifying these is non-trivial as sequencing and bioinformatics methods are prone to introducing artefacts when presented with homopolymeric tracts due to the decreased base diversity. We present tatajuba, which can automatically identify potential homopolymeric tracts and help predict their putative phenotypic impact, allowing for rapid investigation. We use it to detect all tracts in two separate datasets, one of Campylobacter jejuni and one of three Bordetella species, and to highlight those tracts that are polymorphic across samples. With this we confirm homopolymer tract variation with phenotypic impact found in previous studies and additionally find many more with potential variability. The software is written in C and is available under the open source licence GNU GPLv3.
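As a rough illustration of what "identifying homopolymeric tracts" involves, the following Python sketch scans an assembled sequence for runs of a single base. It is not tatajuba's algorithm (tatajuba is written in C and works from sequencing data); the function name `find_homopolymer_tracts` and the `min_len` threshold are assumptions made for this example only.

```python
import re
from typing import List, Tuple


def find_homopolymer_tracts(seq: str, min_len: int = 4) -> List[Tuple[int, str, int]]:
    """Return (start, base, length) for each run of one base with length >= min_len.

    Positions are 0-based. Hypothetical helper for illustration; not part of tatajuba.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # One captured base followed by at least (min_len - 1) repeats of the same base.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)))
        for m in pattern.finditer(seq.upper())
    ]


if __name__ == "__main__":
    for start, base, length in find_homopolymer_tracts("ACGGGGGTTAAAAC"):
        print(start, base, length)
```

Comparing the reported tract lengths at matching positions across samples would be one simple way to flag candidate polymorphic tracts, although a real tool also has to handle coordinate shifts and sequencing artefacts.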
Affiliation(s)
- Samuel Bloomfield
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
- Emily Stoakes
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Andrew J Grant
  - Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge CB3 0ES, UK
- Andrew J Page
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
- Alison E Mather
  - Quadram Institute Bioscience, Norwich Research Park, Norwich NR4 7UQ, UK
8
Lorente-Leal V, Farrell D, Romero B, Álvarez J, de Juan L, Gordon SV. Performance and Agreement Between WGS Variant Calling Pipelines Used for Bovine Tuberculosis Control: Toward International Standardization. Front Vet Sci 2022; 8:780018. [PMID: 34970617 PMCID: PMC8712436 DOI: 10.3389/fvets.2021.780018] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.3] [Received: 09/20/2021] [Accepted: 11/25/2021] [Indexed: 11/29/2022]
Abstract
Whole genome sequencing (WGS) and allied variant calling pipelines are a valuable tool for the control and eradication of infectious diseases, since they allow the assessment of the genetic relatedness of strains of animal pathogens. In the context of the control of tuberculosis (TB) in livestock, mainly caused by Mycobacterium bovis, these tools offer a high-resolution alternative to traditional molecular methods in the study of herd breakdown events. However, despite the increased use and efforts in the standardization of WGS methods in human tuberculosis around the world, the application of these WGS-enabled approaches to control TB in livestock is still in early development. Our study pursued an initial evaluation of the performance and agreement of four publicly available pipelines for the analysis of M. bovis WGS data (vSNP, SNiPgenie, BovTB, and MTBseq) on a set of simulated Illumina reads generated from a real-world setting with high TB prevalence in cattle and wildlife in the Republic of Ireland. The overall performance of the evaluated pipelines was high, with recall and precision rates above 99% once repeat-rich and problematic regions were removed from the analyses. In addition, when the same filters were applied, distances between inferred phylogenetic trees were similar and pairwise comparison revealed that most of the differences were due to the positioning of polytomies. Hence, under the studied conditions, all pipelines offer similar performance for variant calling to underpin real-world studies of M. bovis transmission dynamics.
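The recall and precision figures mentioned above can be understood through a small Python sketch that compares a pipeline's calls against a simulated truth set after masking problematic regions. The variant representation (position, reference base, alternate base), the function names, and the example coordinates are all assumptions for illustration; they are not taken from the study or from any of the four pipelines.

```python
from typing import Iterable, Set, Tuple

Variant = Tuple[int, str, str]   # (position, reference base, alternate base)
Region = Tuple[int, int]         # inclusive (start, end) of a masked interval


def mask_variants(variants: Iterable[Variant], masked: Iterable[Region]) -> Set[Variant]:
    """Drop variants whose position falls inside any masked (e.g. repeat-rich) region."""
    regions = list(masked)
    return {
        v for v in variants
        if not any(start <= v[0] <= end for start, end in regions)
    }


def precision_recall(called: Iterable[Variant], truth: Iterable[Variant]) -> Tuple[float, float]:
    """Precision = TP / calls made; recall = TP / true variants."""
    called_set, truth_set = set(called), set(truth)
    true_positives = len(called_set & truth_set)
    precision = true_positives / len(called_set) if called_set else 0.0
    recall = true_positives / len(truth_set) if truth_set else 0.0
    return precision, recall


if __name__ == "__main__":
    truth = [(100, "A", "G"), (200, "C", "T"), (300, "G", "A")]
    called = [(100, "A", "G"), (200, "C", "T"), (250, "T", "C")]
    repeats = [(240, 260)]  # hypothetical problematic region
    print(precision_recall(called, truth))
    print(precision_recall(mask_variants(called, repeats), mask_variants(truth, repeats)))
```

Applying the same mask to both the calls and the truth set keeps the comparison fair: variants in excluded regions count neither for nor against a pipeline.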
Affiliation(s)
- Víctor Lorente-Leal
  - VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain
  - Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
- Damien Farrell
  - UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
- Beatriz Romero
  - VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain
  - Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
- Julio Álvarez
  - VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain
  - Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
- Lucía de Juan
  - VISAVET Health Surveillance Center, Universidad Complutense de Madrid, Madrid, Spain
  - Animal Health Department, Faculty of Veterinary Medicine, Universidad Complutense de Madrid, Madrid, Spain
- Stephen V Gordon
  - UCD School of Veterinary Medicine, University College Dublin, Dublin, Ireland
9
Evaluating coverage bias in next-generation sequencing of Escherichia coli. PLoS One 2021; 16:e0253440. [PMID: 34166413 PMCID: PMC8224930 DOI: 10.1371/journal.pone.0253440] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 09/08/2020] [Accepted: 06/05/2021] [Indexed: 01/18/2023]
Abstract
Whole-genome sequencing is essential to many facets of infectious disease research. However, technical limitations such as bias in coverage and tagmentation, and difficulties characterising genomic regions with extreme GC content have created significant obstacles in its use. Illumina has claimed that the recently released DNA Prep library preparation kit, formerly known as Nextera Flex, overcomes some of these limitations. This study aimed to assess bias in coverage, tagmentation, GC content, average fragment size distribution, and de novo assembly quality using both the Nextera XT and DNA Prep kits from Illumina. When performing whole-genome sequencing on Escherichia coli and where coverage bias is the main concern, the DNA Prep kit may provide higher quality results; though de novo assembly quality, tagmentation bias and GC content related bias are unlikely to improve. Based on these results, laboratories with existing workflows based on Nextera XT would see minor benefits in transitioning to the DNA Prep kit if they were primarily studying organisms with neutral GC content.