Reference Citation Analysis: Find an Article, Find a Category, Find a Journal, Find a Scholar

For: Farrer RA, Henk DA, MacLean D, Studholme DJ, Fisher MC. Using false discovery rates to benchmark SNP-callers in next-generation sequencing projects. Sci Rep 2013;3:1512. [PMID: 23518929 PMCID: PMC3604800 DOI: 10.1038/srep01512] [Citation(s) in RCA: 34] [Impact Index Per Article: 3.1] [Received: 11/09/2012] [Accepted: 02/25/2013] [Indexed: 12/16/2022] Open

Cited by Other Article(s)

Burda K, Konczal M. Validation of machine learning approach for direct mutation rate estimation. Mol Ecol Resour 2023;23:1757-1771. [PMID: 37486035 DOI: 10.1111/1755-0998.13841] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/25/2023] [Revised: 06/16/2023] [Accepted: 07/05/2023] [Indexed: 07/25/2023]

Bayer PE. Skim-Based Genotyping by Sequencing Using a Double Haploid Population to Call SNPs, Infer Gene Conversions, and Improve Genome Assemblies. Methods Mol Biol 2022;2443:405-413. [PMID: 35037217 DOI: 10.1007/978-1-0716-2067-0_20] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/14/2023]

Farrer RA. HaplotypeTools: a toolkit for accurately identifying recombination and recombinant genotypes. BMC Bioinformatics 2021;22:560. [PMID: 34809571 PMCID: PMC8607637 DOI: 10.1186/s12859-021-04473-1] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 07/20/2021] [Accepted: 11/10/2021] [Indexed: 11/17/2022] Open

Paula DP. Next-Generation Sequencing and Its Impacts on Entomological Research in Ecology and Evolution. Neotrop Entomol 2021;50:679-696. [PMID: 34374956 DOI: 10.1007/s13744-021-00895-x] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 09/17/2020] [Accepted: 07/06/2021] [Indexed: 06/13/2023]

Qi H, Li L, Zhang G. Construction of a chromosome-level genome and variation map for the Pacific oyster Crassostrea gigas. Mol Ecol Resour 2021;21:1670-1685. [PMID: 33655634 DOI: 10.1111/1755-0998.13368] [Citation(s) in RCA: 28] [Impact Index Per Article: 9.3] [Received: 11/30/2020] [Revised: 02/17/2021] [Accepted: 02/23/2021] [Indexed: 12/11/2022]

Valiente-Mullor C, Beamud B, Ansari I, Francés-Cuesta C, García-González N, Mejía L, Ruiz-Hueso P, González-Candelas F. One is not enough: On the effects of reference genome for the mapping and subsequent analyses of short-reads. PLoS Comput Biol 2021;17:e1008678. [PMID: 33503026 PMCID: PMC7870062 DOI: 10.1371/journal.pcbi.1008678] [Citation(s) in RCA: 31] [Impact Index Per Article: 10.3] [Received: 04/27/2020] [Revised: 02/08/2021] [Accepted: 01/05/2021] [Indexed: 12/17/2022] Open

Abstract

Mapping of high-throughput sequencing (HTS) reads to a single arbitrary reference genome is a frequently used approach in microbial genomics. However, the choice of a reference may represent a source of errors that may affect subsequent analyses such as the detection of single nucleotide polymorphisms (SNPs) and phylogenetic inference. In this work, we evaluated the effect of reference choice on short-read sequence data from five clinically and epidemiologically relevant bacteria (Klebsiella pneumoniae, Legionella pneumophila, Neisseria gonorrhoeae, Pseudomonas aeruginosa and Serratia marcescens). Publicly available whole-genome assemblies encompassing the genomic diversity of these species were selected as reference sequences, and read alignment statistics, SNP calling, recombination rates, dN/dS ratios, and phylogenetic trees were evaluated depending on the mapping reference. The choice of different reference genomes proved to have an impact on almost all the parameters considered in the five species. In addition, these biases had potential epidemiological implications such as including/excluding isolates of particular clades and the estimation of genetic distances. These findings suggest that the single reference approach might introduce systematic errors during mapping that affect subsequent analyses, particularly for data sets with isolates from genetically diverse backgrounds. In any case, exploring the effects of different references on the final conclusions is highly recommended.

Mapping consists in the alignment of reads (i.e., DNA fragments) obtained through high-throughput genome sequencing to a previously assembled reference sequence. It is a common practice in genomic studies to use a single reference for mapping, usually the ‘reference genome’ of a species—a high-quality assembly. However, the selection of an optimal reference is hindered by intrinsic intra-species genetic variability, particularly in bacteria. It is known that genetic differences between the reference genome and the read sequences may produce incorrect alignments during mapping. Eventually, these errors could lead to misidentification of variants and biased reconstruction of phylogenetic trees (which reflect ancestry between different bacterial lineages). To our knowledge, this is the first work to systematically examine the effect of different references for mapping on the inference of tree topology as well as the impact on recombination and natural selection inferences. Furthermore, the novelty of this work relies on a procedure that guarantees that we are evaluating only the effect of the reference. This effect has proved to be pervasive in the five bacterial species that we have studied and, in some cases, alterations in phylogenetic trees could lead to incorrect epidemiological inferences. Hence, the use of different reference genomes may be prescriptive to assess the potential biases of mapping.

Bush SJ. Read trimming has minimal effect on bacterial SNP-calling accuracy. Microb Genom 2020;6. [PMID: 33332257 PMCID: PMC8116680 DOI: 10.1099/mgen.0.000434] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Indexed: 01/01/2023] Open

Abstract

Read alignment is the central step of many analytic pipelines that perform variant calling. To reduce error, it is common practice to pre-process raw sequencing reads to remove low-quality bases and residual adapter contamination, a procedure collectively known as ‘trimming’. Trimming is widely assumed to increase the accuracy of variant calling, although there are relatively few systematic evaluations of its effects and no clear consensus on its efficacy. As sequencing datasets increase both in number and size, it is worthwhile reappraising computational operations of ambiguous benefit, particularly when the scope of many analyses now routinely incorporates thousands of samples, increasing the time and cost required. Using a curated set of 17 Gram-negative bacterial genomes, this study initially evaluated the impact of four read-trimming utilities (Atropos, fastp, Trim Galore and Trimmomatic), each used with a range of stringencies, on the accuracy and completeness of three bacterial SNP-calling pipelines. It was found that read trimming made only small, and statistically insignificant, increases in SNP-calling accuracy even when using the highest-performing pre-processor in this study, fastp. To extend these findings, >6500 publicly archived sequencing datasets from Escherichia coli, Mycobacterium tuberculosis and Staphylococcus aureus were re-analysed using a common analytic pipeline. Of the approximately 125 million SNPs and 1.25 million indels called across all samples, the same bases were called in 98.8 and 91.9 % of cases, respectively, irrespective of whether raw reads or trimmed reads were used. Nevertheless, the proportion of mixed calls (i.e. calls where <100 % of the reads support the variant allele; considered a proxy of false positives) was significantly reduced after trimming, which suggests that while trimming rarely alters the set of variant bases, it can affect the proportion of reads supporting each call. 
It was concluded that read quality- and adapter-trimming add relatively little value to a SNP-calling pipeline and may only be necessary if small differences in the absolute number of SNP calls, or the false call rate, are critical. Broadly similar conclusions can be drawn about the utility of trimming to an indel-calling pipeline. Read trimming remains routinely performed prior to variant calling likely out of concern that doing otherwise would typically have negative consequences. While historically this may have been the case, the data in this study suggests that read trimming is not always a practical necessity.
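
The two summary quantities in this abstract, the share of positions at which raw and trimmed reads yield the same base call and the proportion of mixed calls (calls where fewer than 100 % of reads support the variant allele), can be sketched roughly as follows. This is a minimal illustration only: the `Call` record, its field names, the choice of denominator for concordance, and the toy data are all assumptions and are not taken from the study's pipeline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    """One variant call. All field names are hypothetical."""
    position: int
    alt_base: str
    alt_reads: int    # reads supporting the variant allele
    total_reads: int  # all reads covering the position


def is_mixed(call):
    """A call is 'mixed' when fewer than 100 % of reads support the variant."""
    return call.alt_reads < call.total_reads


def mixed_proportion(calls):
    """Fraction of calls that are mixed (0.0 for an empty call set)."""
    calls = list(calls)
    if not calls:
        return 0.0
    return sum(is_mixed(c) for c in calls) / len(calls)


def concordance(raw_calls, trimmed_calls):
    """Fraction of positions called in either run where both runs agree.

    Assumption: the denominator is the union of called positions; the
    study may define it differently.
    """
    raw = {c.position: c.alt_base for c in raw_calls}
    trimmed = {c.position: c.alt_base for c in trimmed_calls}
    positions = set(raw) | set(trimmed)
    if not positions:
        return 1.0
    agree = sum(
        1 for p in positions
        if p in raw and p in trimmed and raw[p] == trimmed[p]
    )
    return agree / len(positions)


# Toy data, invented purely for illustration.
raw = [Call(10, "A", 20, 20), Call(25, "T", 18, 20), Call(40, "G", 9, 10)]
trimmed = [Call(10, "A", 19, 19), Call(25, "T", 18, 18),
           Call(40, "C", 9, 9), Call(55, "A", 5, 5)]

print(mixed_proportion(raw), mixed_proportion(trimmed))
print(concordance(raw, trimmed))
```

Keeping the two measures separate mirrors the abstract's point that trimming can change read support for a call without changing which base is called.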

Bush SJ, Foster D, Eyre DW, Clark EL, De Maio N, Shaw LP, Stoesser N, Peto TEA, Crook DW, Walker AS. Genomic diversity affects the accuracy of bacterial single-nucleotide polymorphism-calling pipelines. Gigascience 2020;9:giaa007. [PMID: 32025702 PMCID: PMC7002876 DOI: 10.1093/gigascience/giaa007] [Citation(s) in RCA: 61] [Impact Index Per Article: 15.3] [Received: 05/28/2019] [Revised: 12/02/2019] [Accepted: 01/15/2020] [Indexed: 02/06/2023] Open

Abstract

BACKGROUND

Accurately identifying single-nucleotide polymorphisms (SNPs) from bacterial sequencing data is an essential requirement for using genomics to track transmission and predict important phenotypes such as antimicrobial resistance. However, most previous performance evaluations of SNP calling have been restricted to eukaryotic (human) data. Additionally, bacterial SNP calling requires choosing an appropriate reference genome to align reads to, which, together with the bioinformatic pipeline, affects the accuracy and completeness of a set of SNP calls obtained. This study evaluates the performance of 209 SNP-calling pipelines using a combination of simulated data from 254 strains of 10 clinically common bacteria and real data from environmentally sourced and genomically diverse isolates within the genera Citrobacter, Enterobacter, Escherichia, and Klebsiella.

RESULTS

We evaluated the performance of 209 SNP-calling pipelines, aligning reads to genomes of the same or a divergent strain. Irrespective of pipeline, a principal determinant of reliable SNP calling was reference genome selection. Across multiple taxa, there was a strong inverse relationship between pipeline sensitivity and precision, and the Mash distance (a proxy for average nucleotide divergence) between reads and reference genome. The effect was especially pronounced for diverse, recombinogenic bacteria such as Escherichia coli but less dominant for clonal species such as Mycobacterium tuberculosis.

CONCLUSIONS

The accuracy of SNP calling for a given species is compromised by increasing intra-species diversity. When reads were aligned to the same genome from which they were sequenced, among the highest-performing pipelines was Novoalign/GATK. By contrast, when reads were aligned to particularly divergent genomes, the highest-performing pipelines often used the aligners NextGenMap or SMALT, and/or the variant callers LoFreq, mpileup, or Strelka.
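
Sensitivity and precision, as used in this abstract, are conventionally computed by comparing a pipeline's calls with a truth set; the false discovery rate is the complement of precision. The sketch below shows those standard definitions under the assumption that each SNP is represented as a `(contig, position, alt_base)` tuple. The function name, the tuple layout, and the toy data are hypothetical, and the study's own scoring rules may differ in detail.

```python
def benchmark_calls(called, truth):
    """Score a set of SNP calls against a truth set.

    Each SNP is a (contig, position, alt_base) tuple (an assumed layout).
    Returns counts plus precision, sensitivity and false discovery rate.
    """
    called, truth = set(called), set(truth)
    tp = len(called & truth)   # called and true
    fp = len(called - truth)   # called but not true
    fn = len(truth - called)   # true but missed

    # Guard against empty denominators; 0.0 is a convention chosen here.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0

    return {
        "tp": tp, "fp": fp, "fn": fn,
        "precision": precision,
        "sensitivity": sensitivity,
        "fdr": fdr,
    }


# Toy data, invented purely for illustration.
truth = {("chr1", 100, "A"), ("chr1", 200, "T"),
         ("chr1", 300, "G"), ("chr1", 400, "C")}
called = {("chr1", 100, "A"), ("chr1", 200, "T"),
          ("chr1", 300, "C"),   # right position, wrong base
          ("chr1", 500, "G")}   # spurious call

print(benchmark_calls(called, truth))
```

Because a call at the right position with the wrong base counts as both a false positive and a false negative here, stricter or looser matching rules would shift all three rates.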

Affiliation(s)

Stephen J Bush: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Health Research Protection Unit in Healthcare Associated Infections and Antimicrobial Resistance at University of Oxford in partnership with Public Health England, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
Dona Foster: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Oxford Biomedical Research Centre, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
David W Eyre: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
Emily L Clark: The Roslin Institute and Royal (Dick) School of Veterinary Studies, University of Edinburgh, Easter Bush Campus, Midlothian, EH25 9RG, UK
Nicola De Maio: European Molecular Biology Laboratory, European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SH, UK
Liam P Shaw: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
Nicole Stoesser: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
Tim E A Peto: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Health Research Protection Unit in Healthcare Associated Infections and Antimicrobial Resistance at University of Oxford in partnership with Public Health England, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Oxford Biomedical Research Centre, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
Derrick W Crook: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Health Research Protection Unit in Healthcare Associated Infections and Antimicrobial Resistance at University of Oxford in partnership with Public Health England, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Oxford Biomedical Research Centre, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK
A Sarah Walker: Nuffield Department of Medicine, University of Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Health Research Protection Unit in Healthcare Associated Infections and Antimicrobial Resistance at University of Oxford in partnership with Public Health England, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK; National Institute for Health Research Oxford Biomedical Research Centre, Oxford, John Radcliffe Hospital, Headington, Oxford, OX3 9DU, UK

Chang LY, Toghiani S, Hay EH, Aggrey SE, Rekaya R. A Weighted Genomic Relationship Matrix Based on Fixation Index (F_ST) Prioritized SNPs for Genomic Selection. Genes (Basel) 2019;10:genes10110922. [PMID: 31726712 PMCID: PMC6895924 DOI: 10.3390/genes10110922] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 10/22/2019] [Revised: 11/06/2019] [Accepted: 11/08/2019] [Indexed: 12/30/2022] Open

Hui W, Yang Y, Wu G, Wang Y, Zaky Zayed M, Chen X. Differential gene expression analyses related to fruit yield of Jatropha curcas L. using RNA-seq. Biotechnol Biotechnol Equip 2018. [DOI: 10.1080/13102818.2018.1507757] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/28/2022] Open

Affiliation(s)

Wenkai Hui: State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, P.R. China; National Engineering Laboratory for Forest Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, P.R. China
Yuantong Yang: State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, P.R. China
Guojiang Wu: Key Laboratory of Plant Resources Conservation and Sustainable Utilization, South China Botanical Garden, Chinese Academy of Sciences, Guangzhou, P.R. China
Yi Wang: State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, P.R. China
Mohamed Zaky Zayed: State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, P.R. China; Forestry and Wood Technology Department, Faculty of Agriculture (EL-Shatby), Alexandria University, Alexandria, Egypt
Xiaoyang Chen: State Key Laboratory for Conservation and Utilization of Subtropical Agro-bioresources, Guangdong Key Laboratory for Innovative Development and Utilization of Forest Plant Germplasm, College of Forestry and Landscape Architecture, South China Agricultural University, Guangzhou, P.R. China; National Engineering Laboratory for Forest Tree Breeding, College of Biological Sciences and Technology, Beijing Forestry University, Beijing, P.R. China

Tiley GP, Kimball RT, Braun EL, Burleigh JG. Comparison of the Chinese bamboo partridge and red Junglefowl genome sequences highlights the importance of demography in genome evolution. BMC Genomics 2018;19:336. [PMID: 29739321 PMCID: PMC5941490 DOI: 10.1186/s12864-018-4711-0] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 08/15/2017] [Accepted: 04/23/2018] [Indexed: 12/31/2022] Open

Abstract

BACKGROUND

Recent large-scale whole genome sequencing efforts in birds have elucidated broad patterns of avian phylogeny and genome evolution. However, despite the great interest in economically important phasianids like Gallus gallus (Red Junglefowl, the progenitor of the chicken), we know little about the genomes of closely related species. Gallus gallus is highly sexually dichromatic and polygynous, but its sister genus, Bambusicola, is smaller, sexually monomorphic, and monogamous with biparental care. We sequenced the genome of Bambusicola thoracicus (Chinese Bamboo Partridge) using a single insert library to test hypotheses about genome evolution in galliforms. Selection acting at the phenotypic level could result in more evidence of positive selection in the Gallus genome than in Bambusicola. However, the historical range size of Bambusicola was likely smaller than Gallus, and demographic effects could lead to higher rates of nonsynonymous substitution in Bambusicola than in Gallus.

RESULTS

We generated a genome assembly suitable for evolutionary analyses. We examined the impact of selection on coding regions by examining shifts in the average nonsynonymous to synonymous rate ratio (dN/dS) and the proportion of sites subject to episodic positive selection. We observed elevated dN/dS in Bambusicola relative to Gallus, which is consistent with our hypothesis that demographic effects may be important drivers of genome evolution in Bambusicola. We also demonstrated that alignment error can greatly inflate estimates of the number of genes that experienced episodic positive selection and heterogeneity in dN/dS. However, overall patterns of molecular evolution were robust to alignment uncertainty. Bambusicola thoracicus has higher estimates of heterozygosity than Gallus gallus, possibly due to migration events over the past 100,000 years.

CONCLUSIONS

Our results emphasized the importance of demographic processes in generating the patterns of variation between Bambusicola and Gallus. We also demonstrated that genome assemblies generated using a single library can provide valuable insights into avian evolutionary history and found that it is important to account for alignment uncertainty in evolutionary inferences from draft genomes.

Bradic M, Warring SD, Tooley GE, Scheid P, Secor WE, Land KM, Huang PJ, Chen TW, Lee CC, Tang P, Sullivan SA, Carlton JM. Genetic Indicators of Drug Resistance in the Highly Repetitive Genome of Trichomonas vaginalis. Genome Biol Evol 2018. [PMID: 28633446 PMCID: PMC5522705 DOI: 10.1093/gbe/evx110] [Citation(s) in RCA: 38] [Impact Index Per Article: 6.3] [Indexed: 12/18/2022] Open

Abstract

Trichomonas vaginalis, the most common nonviral sexually transmitted parasite, causes ∼283 million trichomoniasis infections annually and is associated with pregnancy complications and increased risk of HIV-1 acquisition. The antimicrobial drug metronidazole is used for treatment, but in a fraction of clinical cases, the parasites can become resistant to this drug. We undertook sequencing of multiple clinical isolates and lab derived lines to identify genetic markers and mechanisms of metronidazole resistance. Reduced representation genome sequencing of ∼100 T. vaginalis clinical isolates identified 3,923 SNP markers and presence of a bipartite population structure. Linkage disequilibrium was found to decay rapidly, suggesting genome-wide recombination and the feasibility of genetic association studies in the parasite. We identified 72 SNPs associated with metronidazole resistance, and a comparison of SNPs within several lab-derived resistant lines revealed an overlap with the clinically resistant isolates. We identified SNPs in genes for which no function has yet been assigned, as well as in functionally-characterized genes relevant to drug resistance (e.g., pyruvate:ferredoxin oxidoreductase). Transcription profiles of resistant strains showed common changes in genes involved in drug activation (e.g., flavin reductase), accumulation (e.g., multidrug resistance pump), and detoxification (e.g., nitroreductase). Finally, we identified convergent genetic changes in lab-derived resistant lines of Tritrichomonas foetus, a distantly related species that causes venereal disease in cattle. Shared genetic changes within and between T. vaginalis and Tr. foetus parasites suggest conservation of the pathways through which adaptation has occurred. These findings extend our knowledge of drug resistance in the parasite, providing a panel of markers that can be used as a diagnostic tool.
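
Linkage disequilibrium between two biallelic SNPs is commonly summarised by the squared correlation r², which the abstract reports as decaying rapidly with distance. The sketch below shows the textbook calculation from phased 0/1 allele codes, one value per sampled haplotype. The function name, the input encoding, and the toy data are assumptions made for illustration; the abstract does not say which LD statistic or software the authors used.

```python
def ld_r_squared(locus_a, locus_b):
    """Squared correlation (r^2) between two biallelic loci.

    locus_a and locus_b are equal-length sequences of 0/1 allele codes,
    one entry per sampled haplotype (an assumed encoding).
    """
    if len(locus_a) != len(locus_b) or not locus_a:
        raise ValueError("loci must be non-empty and of equal length")

    n = len(locus_a)
    p_a = sum(locus_a) / n                                   # freq of allele 1 at A
    p_b = sum(locus_b) / n                                   # freq of allele 1 at B
    p_ab = sum(a * b for a, b in zip(locus_a, locus_b)) / n  # freq of the 1-1 haplotype

    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")

    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium, D
    return d * d / denom


# Toy haplotypes, invented purely for illustration.
print(ld_r_squared([0, 0, 1, 1], [0, 0, 1, 1]))  # perfectly linked loci
print(ld_r_squared([0, 0, 1, 1], [0, 1, 0, 1]))  # independent loci
```

Plotting r² for many SNP pairs against the physical distance between them gives the kind of decay curve the abstract refers to.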

Farrer RA, Fisher MC. Describing Genomic and Epigenomic Traits Underpinning Emerging Fungal Pathogens. Adv Genet 2017;100:73-140. [PMID: 29153405 DOI: 10.1016/bs.adgen.2017.09.009] [Citation(s) in RCA: 16] [Impact Index Per Article: 2.3] [Indexed: 12/20/2022]

Wu SH, Schwartz RS, Winter DJ, Conrad DF, Cartwright RA. Estimating error models for whole genome sequencing using mixtures of Dirichlet-multinomial distributions. Bioinformatics 2017;33:2322-2329. [PMID: 28334373 PMCID: PMC5860108 DOI: 10.1093/bioinformatics/btx133] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Received: 09/19/2016] [Revised: 01/22/2017] [Accepted: 03/07/2017] [Indexed: 12/30/2022] Open

Farrer RA, Martel A, Verbrugghe E, Abouelleil A, Ducatelle R, Longcore JE, James TY, Pasmans F, Fisher MC, Cuomo CA. Genomic innovations linked to infection strategies across emerging pathogenic chytrid fungi. Nat Commun 2017;8:14742. [PMID: 28322291 PMCID: PMC5364385 DOI: 10.1038/ncomms14742] [Citation(s) in RCA: 68] [Impact Index Per Article: 9.7] [Received: 07/13/2016] [Accepted: 01/26/2017] [Indexed: 11/09/2022] Open

Farrer RA, Voelz K, Henk DA, Johnston SA, Fisher MC, May RC, Cuomo CA. Microevolutionary traits and comparative population genomics of the emerging pathogenic fungus Cryptococcus gattii. Philos Trans R Soc Lond B Biol Sci 2016;371:20160021. [PMID: 28080992 PMCID: PMC5095545 DOI: 10.1098/rstb.2016.0021] [Citation(s) in RCA: 24] [Impact Index Per Article: 3.0] [Accepted: 05/04/2016] [Indexed: 01/15/2023] Open

Muñoz JF, Farrer RA, Desjardins CA, Gallo JE, Sykes S, Sakthikumar S, Misas E, Whiston EA, Bagagli E, Soares CMA, Teixeira MDM, Taylor JW, Clay OK, McEwen JG, Cuomo CA. Genome Diversity, Recombination, and Virulence across the Major Lineages of Paracoccidioides. mSphere 2016;1:e00213-16. [PMID: 27704050 PMCID: PMC5040785 DOI: 10.1128/msphere.00213-16] [Citation(s) in RCA: 74] [Impact Index Per Article: 9.3] [Received: 07/28/2016] [Accepted: 09/06/2016] [Indexed: 12/29/2022] Open

Abstract

The Paracoccidioides genus includes two species of thermally dimorphic fungi that cause paracoccidioidomycosis, a neglected health-threatening human systemic mycosis endemic to Latin America. To examine the genome evolution and the diversity of Paracoccidioides spp., we conducted whole-genome sequencing of 31 isolates representing the phylogenetic, geographic, and ecological breadth of the genus. These samples included clinical, environmental and laboratory reference strains of the S1, PS2, PS3, and PS4 lineages of P. brasiliensis and also isolates of Paracoccidioides lutzii species. We completed the first annotated genome assemblies for the PS3 and PS4 lineages and found that gene order was highly conserved across the major lineages, with only a few chromosomal rearrangements. Comparing whole-genome assemblies of the major lineages with single-nucleotide polymorphisms (SNPs) predicted from the remaining 26 isolates, we identified a deep split of the S1 lineage into two clades we named S1a and S1b. We found evidence for greater genetic exchange between the S1b lineage and all other lineages; this may reflect the broad geographic range of S1b, which is often sympatric with the remaining, largely geographically isolated lineages. In addition, we found evidence of positive selection for the GP43 and PGA1 antigen genes and genes coding for other secreted proteins and proteases and lineage-specific loss-of-function mutations in cell wall and protease genes; these together may contribute to virulence and host immune response variation among natural isolates of Paracoccidioides spp. These insights into the recent evolutionary events highlight important differences between the lineages that could impact the distribution, pathogenicity, and ecology of Paracoccidioides. 

IMPORTANCE

Characterization of genetic differences between lineages of the dimorphic human-pathogenic fungus Paracoccidioides can identify changes linked to important phenotypes and guide the development of new diagnostics and treatments. In this article, we compared genomes of 31 diverse isolates representing the major lineages of Paracoccidioides spp. and completed the first annotated genome sequences for the PS3 and PS4 lineages. We analyzed the population structure and characterized the genetic diversity among the lineages of Paracoccidioides, including a deep split of S1 into two lineages (S1a and S1b), and differentiated S1b, associated with most clinical cases, as the more highly recombining and diverse lineage. In addition, we found patterns of positive selection in surface proteins and secreted enzymes among the lineages, suggesting diversifying mechanisms of pathogenicity and adaptation across this species complex. These genetic differences suggest associations with the geographic range, pathogenicity, and ecological niches of Paracoccidioides lineages.

Affiliation(s)

José F. Muñoz: Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia; Institute of Biology, Universidad de Antioquia, Medellín, Colombia; Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Rhys A. Farrer: Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Christopher A. Desjardins: Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Juan E. Gallo: Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia; Doctoral Program in Biomedical Sciences, Universidad del Rosario, Bogotá, Colombia
Sean Sykes: Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Sharadha Sakthikumar: Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA
Elizabeth Misas: Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia; Institute of Biology, Universidad de Antioquia, Medellín, Colombia
Emily A. Whiston: Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, USA
Eduardo Bagagli: Instituto de Biociências, Universidade Estadual Paulista, Botucatu, São Paulo, Brazil
Celia M. A. Soares: Laboratório de Biología Molecular, Instituto de Ciências Biológicas, ICBII, Goiânia, Brazil
Marcus de M. Teixeira: Instituto de Ciências Biológicas, Universidade de Brasília, Brasília, Distrito Federal, Brazil; Division of Pathogen Genomics, Translational Genomics Research Institute North, Flagstaff, Arizona, USA
John W. Taylor: Department of Plant and Microbial Biology, University of California, Berkeley, Berkeley, California, USA
Oliver K. Clay: Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia; School of Medicine and Health Sciences, Universidad del Rosario, Bogotá, Colombia
Juan G. McEwen: Cellular and Molecular Biology Unit, Corporación para Investigaciones Biológicas, Medellín, Colombia; School of Medicine, Universidad de Antioquia, Medellín, Colombia
Christina A. Cuomo: Broad Institute of MIT and Harvard, Cambridge, Massachusetts, USA

Shifman AR, Johnson RM, Wilhelm BT. Cascade: an RNA-seq visualization tool for cancer genomics. BMC Genomics 2016;17:75. [PMID: 26810393 PMCID: PMC4727405 DOI: 10.1186/s12864-016-2389-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 09/22/2015] [Accepted: 01/11/2016] [Indexed: 12/20/2022] Open

Bayer PE. Skim-Based Genotyping by Sequencing Using a Double Haploid Population to Call SNPs, Infer Gene Conversions, and Improve Genome Assemblies. Methods Mol Biol 2016;1374:285-292. [PMID: 26519413 DOI: 10.1007/978-1-4939-3167-5_16] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Indexed: 06/05/2023]

Pightling AW, Petronella N, Pagotto F. Choice of reference-guided sequence assembler and SNP caller for analysis of Listeria monocytogenes short-read sequence data greatly influences rates of error. BMC Res Notes 2015;8:748. [PMID: 26643440 PMCID: PMC4672502 DOI: 10.1186/s13104-015-1689-4] [Citation(s) in RCA: 25] [Impact Index Per Article: 2.8] [Received: 11/06/2015] [Accepted: 11/11/2015] [Indexed: 02/07/2023] Open

Abstract

Background

The influences that different programs and conditions have on error rates of single-nucleotide polymorphism (SNP) analyses are poorly understood. Using Illumina short-read sequence data generated from Listeria monocytogenes strain HPB5622, we assessed the performance of four SNP callers (BCFtools, FreeBayes, UnifiedGenotyper, VarScan) under a variety of conditions, including: (1) a range of sequencing coverages; (2) use of four popular reference-guided assemblers (Burrows-Wheeler Aligner, Novoalign, MOSAIK, SMALT); (3) with and without read quality trimming and filtering; and (4) use of different reference sequences.

Results

At 8-fold coverage the proportions of true positive calls ranged from 0.22 to 25.00 % when reads were aligned to a nearly identical reference (0.000096 % distant). Calls made when reads were aligned to a non-identical reference (0.85 % distant) were from 92.54 to 98.88 % accurate. At 79-fold coverage accuracies ranged from 3.95 to 20.00 % with the nearly identical reference and 93.80–98.75 % with the non-identical reference. Read preprocessing significantly changed the numbers of false positive calls made, from a 65.24 % decrease to a 54.55 % increase.

Conclusions

The combinations of reference-guided sequence assemblers and SNP callers greatly influenced not only the numbers of true and false positive sites but also the proportions of true positive calls relative to the total numbers of calls made. Furthermore, the efficacy of different assembler and caller combinations changed dramatically with the different conditions tested. Researchers should consider whether identifying the greatest numbers of true positive sites, reducing the numbers of false positive calls, or achieving the highest accuracies are desired.

Electronic supplementary material

The online version of this article (doi:10.1186/s13104-015-1689-4) contains supplementary material, which is available to authorized users.
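
As a rough illustration of the accuracy measure reported in this abstract (the proportion of true positive calls relative to the total number of calls made), the following minimal Python sketch computes that percentage and its complement, the false discovery rate, from raw counts. The function names and the example counts are hypothetical and are not taken from the article.

```python
def call_accuracy(true_positives: int, false_positives: int) -> float:
    """Percentage of all SNP calls that are true positives."""
    total_calls = true_positives + false_positives
    if total_calls == 0:
        raise ValueError("no calls were made, so accuracy is undefined")
    return 100.0 * true_positives / total_calls


def false_discovery_rate(true_positives: int, false_positives: int) -> float:
    """Percentage of all SNP calls that are false positives."""
    return 100.0 - call_accuracy(true_positives, false_positives)


# Hypothetical counts: 1 true positive call and 3 false positive calls.
accuracy = call_accuracy(1, 3)
fdr = false_discovery_rate(1, 3)
print(f"accuracy = {accuracy:.2f} %, FDR = {fdr:.2f} %")
```

With few true differences between sample and reference, a handful of false positives is enough to push this percentage very low, which is one plausible reading of why the figures for the nearly identical reference are so much smaller than those for the more distant one.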

Ribeiro A, Golicz A, Hackett CA, Milne I, Stephen G, Marshall D, Flavell AJ, Bayer M. An investigation of causes of false positive single nucleotide polymorphisms using simulated reads from a small eukaryote genome. BMC Bioinformatics 2015;16:382. [PMID: 26558718 PMCID: PMC4642669 DOI: 10.1186/s12859-015-0801-z] [Citation(s) in RCA: 29] [Impact Index Per Article: 3.2] [Received: 07/08/2015] [Accepted: 10/29/2015] [Indexed: 12/30/2022] Open

Abstract

Background

Single Nucleotide Polymorphisms (SNPs) are widely used molecular markers, and their use has increased massively since the inception of Next Generation Sequencing (NGS) technologies, which allow detection of large numbers of SNPs at low cost. However, both NGS data and their analysis are error-prone, which can lead to the generation of false positive (FP) SNPs. We explored the relationship between FP SNPs and seven factors involved in mapping-based variant calling — quality of the reference sequence, read length, choice of mapper and variant caller, mapping stringency and filtering of SNPs by read mapping quality and read depth. This resulted in 576 possible factor level combinations. We used error- and variant-free simulated reads to ensure that every SNP found was indeed a false positive.

Results

The variation in the number of FP SNPs generated ranged from 0 to 36,621 for the 120 million base pairs (Mbp) genome. All of the experimental factors tested had statistically significant effects on the number of FP SNPs generated and there was a considerable amount of interaction between the different factors. Using a fragmented reference sequence led to a dramatic increase in the number of FP SNPs generated, as did relaxed read mapping and a lack of SNP filtering. The choice of reference assembler, mapper and variant caller also significantly affected the outcome. The effect of read length was more complex and suggests a possible interaction between mapping specificity and the potential for contributing more false positives as read length increases.

Conclusions

The choice of tools and parameters involved in variant calling can have a dramatic effect on the number of FP SNPs produced, with particularly poor combinations of software and/or parameter settings yielding tens of thousands in this experiment. Between-factor interactions make simple recommendations difficult for a SNP discovery pipeline but the quality of the reference sequence is clearly of paramount importance. Our findings are also a stark reminder that it can be unwise to use the relaxed mismatch settings provided as defaults by some read mappers when reads are being mapped to a relatively unfinished reference sequence from e.g. a non-model organism in its early stages of genomic exploration.

Electronic supplementary material

The online version of this article (doi:10.1186/s12859-015-0801-z) contains supplementary material, which is available to authorized users.

Beal MA, Gagné R, Williams A, Marchetti F, Yauk CL. Characterizing Benzo[a]pyrene-induced lacZ mutation spectrum in transgenic mice using next-generation sequencing. BMC Genomics 2015;16:812. [PMID: 26481219 PMCID: PMC4617527 DOI: 10.1186/s12864-015-2004-4] [Citation(s) in RCA: 27] [Impact Index Per Article: 3.0] [Received: 02/22/2015] [Accepted: 10/03/2015] [Indexed: 11/25/2022] Open

Arthur JW, Cheung FSG, Reichardt JKV. Single nucleotide differences (SNDs) continue to contaminate the dbSNP database with consequences for human genomics and health. Hum Mutat 2015;36:196-9. [PMID: 25421747 DOI: 10.1002/humu.22735] [Citation(s) in RCA: 8] [Impact Index Per Article: 0.9] [Received: 08/25/2014] [Accepted: 11/17/2014] [Indexed: 01/31/2023]

Genome Evolution and Innovation across the Four Major Lineages of Cryptococcus gattii. mBio 2015;6:e00868-15. [PMID: 26330512 PMCID: PMC4556806 DOI: 10.1128/mbio.00868-15] [Citation(s) in RCA: 70] [Impact Index Per Article: 7.8] [Indexed: 12/21/2022] Open

Abstract

Cryptococcus gattii is a fungal pathogen of humans, causing pulmonary infections in otherwise healthy hosts. To characterize genomic variation among the four major lineages of C. gattii (VGI, -II, -III, and -IV), we generated, annotated, and compared 16 de novo genome assemblies, including the first for the rarely isolated lineages VGIII and VGIV. By identifying syntenic regions across assemblies, we found 15 structural rearrangements, which were almost exclusive to the VGI-III-IV lineages. Using synteny to inform orthology prediction, we identified a core set of 87% of C. gattii genes present as single copies in all four lineages. Remarkably, 737 genes are variably inherited across lineages and are overrepresented for response to oxidative stress, mitochondrial import, and metal binding and transport. Specifically, VGI has an expanded set of iron-binding genes thought to be important to the virulence of Cryptococcus, while VGII has expansions in the stress-related heat shock proteins relative to the other lineages. We also characterized genes uniquely absent in each lineage, including a copper transporter absent from VGIV, which influences Cryptococcus survival during pulmonary infection and the onset of meningoencephalitis. Through inclusion of population-level data for an additional 37 isolates, we identified a new transcontinental clonal group that we name VGIIx, mitochondrial recombination between VGII and VGIII, and positive selection of multidrug transporters and the iron-sulfur protein aconitase along multiple branches of the phylogenetic tree. Our results suggest that gene expansion or contraction and positive selection have introduced substantial variation with links to mechanisms of pathogenicity across this species complex.

The genetic differences between phenotypically different pathogens provide clues to the underlying mechanisms of those traits and can lead to new drug targets and improved treatments for those diseases. In this paper, we compare 16 genomes belonging to four highly differentiated lineages of Cryptococcus gattii, which cause pulmonary infections in otherwise healthy humans and other animals. Half of these lineages have not had their genomes previously assembled and annotated. We identified 15 ancestral rearrangements in the genome and over 700 genes that are unique to one or more lineages, many of which are associated with virulence. In addition, we found evidence for recent transcontinental spread, mitochondrial genetic exchange, and positive selection in multidrug transporters. Our results suggest that gene expansion/contraction and positive selection are diversifying the mechanisms of pathogenicity across this species complex.

Dell'Acqua M, Zuccolo A, Tuna M, Gianfranceschi L, Pè ME. Targeting environmental adaptation in the monocot model Brachypodium distachyon: a multi-faceted approach. BMC Genomics 2014;15:801. [PMID: 25236859 PMCID: PMC4177692 DOI: 10.1186/1471-2164-15-801] [Citation(s) in RCA: 26] [Impact Index Per Article: 2.6] [Received: 07/29/2014] [Accepted: 09/04/2014] [Indexed: 12/22/2022] Open

Abstract

BACKGROUND

The local environment plays a major role in the spatial distribution of plant populations. Natural plant populations have an extremely poor displacing capacity, so their continued survival in a given environment depends on how well they adapt to local pedoclimatic conditions. Genomic tools can be used to identify adaptive traits at a DNA level and to further our understanding of evolutionary processes. Here we report the use of genotyping-by-sequencing on local groups of the sequenced monocot model species Brachypodium distachyon. Exploiting population genetics, landscape genomics and genome wide association studies, we evaluate B. distachyon's role as a natural probe for identifying genomic loci involved in environmental adaptation.

RESULTS

Brachypodium distachyon individuals were sampled in nine locations with different ecologies and characterized with 16,697 SNPs. Variations in sequencing depth showed consistent patterns at 8,072 genomic bins, which were significantly enriched in transposable elements. We investigated the structuration and diversity of this collection, and exploited climatic data to identify loci with adaptive significance through i) two different approaches for genome wide association analyses considering climatic variation, ii) an outlier loci approach, and iii) a canonical correlation analysis on differentially sequenced bins. A linkage disequilibrium-corrected Bonferroni method was applied to filter associations. The two association methods jointly identified a set of 15 genes significantly related to environmental adaptation. The outlier loci approach revealed that 5.7% of the loci analysed were under selection. The canonical correlation analysis showed that the distribution of some differentially sequenced regions was associated to environmental variation.

CONCLUSIONS

We show that the multi-faceted approach used here targeted different components of B. distachyon adaptive variation, and may lead to the discovery of genes related to environmental adaptation in natural populations. Its application to a model species with a fully sequenced genome is a modular strategy that enables the stratification of biological material and thus improves our knowledge of the functional loci determining adaptation in near-crop species. When coupled with population genetics and measures of genomic structuration, methods coming from genome wide association studies may lead to the exploitation of model species as natural probes to identify loci related to environmental adaptation.

Pightling AW, Petronella N, Pagotto F. Choice of reference sequence and assembler for alignment of Listeria monocytogenes short-read sequence data greatly influences rates of error in SNP analyses. PLoS One 2014;9:e104579. [PMID: 25144537 PMCID: PMC4140716 DOI: 10.1371/journal.pone.0104579] [Citation(s) in RCA: 57] [Impact Index Per Article: 5.7] [Received: 04/15/2014] [Accepted: 07/14/2014] [Indexed: 01/06/2023] Open

Abstract

The wide availability of whole-genome sequencing (WGS) and an abundance of open-source software have made detection of single-nucleotide polymorphisms (SNPs) in bacterial genomes an increasingly accessible and effective tool for comparative analyses. Thus, ensuring that real nucleotide differences between genomes (i.e., true SNPs) are detected at high rates and that the influences of errors (such as false positive SNPs, ambiguously called sites, and gaps) are mitigated is of utmost importance. The choices researchers make regarding the generation and analysis of WGS data can greatly influence the accuracy of short-read sequence alignments and, therefore, the efficacy of such experiments. We studied the effects of some of these choices, including: i) depth of sequencing coverage, ii) choice of reference-guided short-read sequence assembler, iii) choice of reference genome, and iv) whether to perform read-quality filtering and trimming, on our ability to detect true SNPs and on the frequencies of errors. We performed benchmarking experiments, during which we assembled simulated and real Listeria monocytogenes strain 08-5578 short-read sequence datasets of varying quality with four commonly used assemblers (BWA, MOSAIK, Novoalign, and SMALT), using reference genomes of varying genetic distances, and with or without read pre-processing (i.e., quality filtering and trimming). We found that assemblies of at least 50-fold coverage provided the most accurate results. In addition, MOSAIK yielded the fewest errors when reads were aligned to a nearly identical reference genome, while using SMALT to align reads against a reference sequence that is ∼0.82% distant from 08-5578 at the nucleotide level resulted in the detection of the greatest numbers of true SNPs and the fewest errors. Finally, we show that whether read pre-processing improves SNP detection depends upon the choice of reference sequence and assembler. In total, this study demonstrates that researchers should test a variety of conditions to achieve optimal results.

Voelz K, Ma H, Phadke S, Byrnes EJ, Zhu P, Mueller O, Farrer RA, Henk DA, Lewit Y, Hsueh YP, Fisher MC, Idnurm A, Heitman J, May RC. Transmission of Hypervirulence traits via sexual reproduction within and between lineages of the human fungal pathogen Cryptococcus gattii. PLoS Genet 2013;9:e1003771. [PMID: 24039607 PMCID: PMC3764205 DOI: 10.1371/journal.pgen.1003771] [Citation(s) in RCA: 41] [Impact Index Per Article: 3.7] [Received: 01/23/2013] [Accepted: 07/22/2013] [Indexed: 01/11/2023] Open

Abstract

Since 1999 a lineage of the pathogen Cryptococcus gattii has been infecting humans and other animals in Canada and the Pacific Northwest of the USA. It is now the largest outbreak of a life-threatening fungal infection in a healthy population in recorded history. The high virulence of outbreak strains is closely linked to the ability of the pathogen to undergo rapid mitochondrial tubularisation and proliferation following engulfment by host phagocytes. Most outbreaks spread by geographic expansion across suitable niches, but it is known that genetic re-assortment and hybridisation can also lead to rapid range and host expansion. In the context of C. gattii, however, the likelihood of virulence traits associated with the outbreak lineages spreading to other lineages via genetic exchange is currently unknown. Here we address this question by conducting outgroup crosses between distantly related C. gattii lineages (VGII and VGIII) and ingroup crosses between isolates from the same molecular type (VGII). Systematic phenotypic characterisation shows that virulence traits are transmitted to outgroups infrequently, but readily inherited during ingroup crosses. In addition, we observed higher levels of biparental (as opposed to uniparental) mitochondrial inheritance during VGII ingroup sexual mating in this species and provide evidence for mitochondrial recombination following mating. Taken together, our data suggest that hypervirulence can spread among the C. gattii lineages VGII and VGIII, potentially creating novel hypervirulent genotypes, and that current models of uniparental mitochondrial inheritance in the Cryptococcus genus may not be universal.

How infections spread within the human population is an important question in forecasting potential epidemics. One way to investigate potential mechanisms is to test experimentally whether combinations of genes that confer high virulence are able to spread to less-virulent lineages. Here, we address this question in a fungal pathogen that is causing an outbreak of meningitis in healthy humans in Canada and the Pacific Northwest. We demonstrate that virulence traits are easily transmitted between closely related pathogenic strains, but are more difficult to transmit to more distant lineages. In addition, we show that a paradigm of organelle inheritance, namely that mitochondria are inherited uniparentally from the a mating type, is altered in the R265α outbreak strain such that it transmits its mitochondrial genome to 25–30% of its progeny. This biparental inheritance likely contributes to increased mitochondrial recombination. Taken together, our data suggest that virulence traits may be relatively mobile within this species and that current models of mitochondrial inheritance may require revising.

Affiliation(s)

Kerstin Voelz Institute of Microbiology and Infection & School of Biosciences, University of Birmingham, Birmingham, United Kingdom The National Institute of Health Research Surgical Reconstruction and Microbiology Research Centre, Queen Elizabeth Hospital Birmingham, Birmingham, United Kingdom
Hansong Ma Institute of Microbiology and Infection & School of Biosciences, University of Birmingham, Birmingham, United Kingdom
Sujal Phadke Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America
Edmond J. Byrnes Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America
Pinkuan Zhu School of Biological Sciences, University of Missouri, Kansas City, Missouri, United States of America
Olaf Mueller Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America
Rhys A. Farrer Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom
Daniel A. Henk Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom
Yonathan Lewit Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America
Yen-Ping Hsueh Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America
Matthew C. Fisher Department of Infectious Disease Epidemiology, School of Public Health, Imperial College London, London, United Kingdom
Alexander Idnurm School of Biological Sciences, University of Missouri, Kansas City, Missouri, United States of America
Joseph Heitman Department of Molecular Genetics and Microbiology, Duke University, Durham, North Carolina, United States of America * E-mail: (JH); (RCM)
Robin C. May Institute of Microbiology and Infection & School of Biosciences, University of Birmingham, Birmingham, United Kingdom The National Institute of Health Research Surgical Reconstruction and Microbiology Research Centre, Queen Elizabeth Hospital Birmingham, Birmingham, United Kingdom * E-mail: (JH); (RCM)

Chromosomal copy number variation, selection and uneven rates of recombination reveal cryptic genome diversity linked to pathogenicity. PLoS Genet 2013;9:e1003703. [PMID: 23966879 PMCID: PMC3744429 DOI: 10.1371/journal.pgen.1003703] [Citation(s) in RCA: 97] [Impact Index Per Article: 8.8] [Received: 12/20/2012] [Accepted: 06/21/2013] [Indexed: 11/19/2022] Open

Abstract

Pathogenic fungi constitute a growing threat to both plant and animal species on a global scale. Despite a clonal mode of reproduction dominating the population genetic structure of many fungi, putatively asexual species are known to adapt rapidly when confronted by efforts to control their growth and transmission. However, the mechanisms by which adaptive diversity is generated across a clonal background are often poorly understood. We sequenced a global panel of the emergent amphibian pathogen, Batrachochytrium dendrobatidis (Bd), to high depth and characterized rapidly changing features of its genome that we believe hold the key to the worldwide success of this organism. Our analyses show three processes that contribute to the generation of de novo diversity. Firstly, we show that the majority of wild isolates manifest chromosomal copy number variation that changes over short timescales. Secondly, we show that cryptic recombination occurs within all lineages of Bd, leading to large regions of the genome being in linkage equilibrium, and is preferentially associated with classes of genes of known importance for virulence in other pathosystems. Finally, we show that these classes of genes are under directional selection, and that this has predominantly targeted the Global Panzootic Lineage (BdGPL). Our analyses show that Bd manifests an unusually dynamic genome that may have been shaped by its association with the amphibian host. The rates of variation that we document likely explain the high levels of phenotypic variability that have been reported for Bd, and suggest that the dynamic genome of this pathogen has contributed to its success across multiple biomes and host-species.

Pathogenic fungi constitute a growing threat to both plant and animal species on a global scale. However, many features of the fungal genome that enable them to successfully adapt to infect diverse hosts and ecological niches remain cryptic, especially for newly evolved emerging lineages. In this paper, we report three novel features of genome diversity linked to pathogenicity in the emerging amphibian pathogen, Batrachochytrium dendrobatidis (Bd). Firstly, we identified widespread chromosome copy number variation (CCNV) across our lineages, with individual isolates harboring between 2 and 5 copies of each chromosome and rapid rates of CCNV occurring in culture. In addition, by using in vitro divergence of replicate lines of Bd, we showed that changes in ploidy can occur within as few as 40 generations. Secondly, we identified uneven rates of recombination across the genomes and lineages, revealing hot spots in known classes of virulence factors. Finally, we identified significant evidence of diversifying selection across the secretome of Bd, and showed that selection also targets putative virulence factors. These findings add to our knowledge of genome-dynamicity and modes of evolution manifested by eukaryote microbial pathogens, and may explain the varied phenotypic responses observed in Bd.

Population-Sequencing as a Biomarker for Sample Characterization. J Biomark 2013;2013:861823. [PMID: 26317024 PMCID: PMC4437355 DOI: 10.1155/2013/861823] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.1] [Received: 03/31/2013] [Accepted: 10/10/2013] [Indexed: 11/27/2022] Open