26
Brown A, Mead ME, Steenwyk JL, Goldman GH, Rokas A. Extensive non-coding sequence divergence between the major human pathogen Aspergillus fumigatus and its relatives. Frontiers in Fungal Biology 2022; 3:802494. [PMID: 36866034; PMCID: PMC9977105; DOI: 10.3389/ffunb.2022.802494] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.5] [Received: 10/26/2021] [Accepted: 06/09/2022] [Indexed: 11/13/2022]
Abstract
Invasive aspergillosis is a deadly fungal disease; more than 400,000 patients are infected worldwide each year and the mortality rate can be as high as 50-95%. Of the ~450 species in the genus Aspergillus only a few are known to be clinically relevant, with the major pathogen Aspergillus fumigatus being responsible for ~50% of all invasive mold infections. Genomic comparisons between A. fumigatus and other Aspergillus species have historically focused on protein-coding regions. However, most A. fumigatus genes, including those that modulate its virulence, are also present in other pathogenic and non-pathogenic closely related species. Our hypothesis is that differential gene regulation - mediated through the non-coding regions upstream of genes' first codon - contributes to A. fumigatus pathogenicity. To begin testing this, we compared non-coding regions upstream of the first codon of single-copy orthologous genes from the two A. fumigatus reference strains Af293 and A1163 and eight closely related Aspergillus section Fumigati species. We found that these non-coding regions showed extensive sequence variation and lack of homology across species. By examining the evolutionary rates of both protein-coding and non-coding regions in a subset of orthologous genes with highly conserved non-coding regions across the phylogeny, we identified 418 genes, including 25 genes known to modulate A. fumigatus virulence, whose non-coding regions exhibit a different rate of evolution in A. fumigatus. Examination of sequence alignments of these non-coding regions revealed numerous instances of insertions, deletions, and other types of mutations of at least a few nucleotides in A. fumigatus compared to its close relatives. These results show that closely related Aspergillus species that vary greatly in their pathogenicity exhibit extensive non-coding sequence variation and identify numerous changes in non-coding regions of A. fumigatus genes known to contribute to virulence.
27
Steenwyk JL, Buida TJ III, Gonçalves C, Goltz DC, Morales G, Mead ME, LaBella AL, Chavez CM, Schmitz JE, Hadjifrangiskou M, Li Y, Rokas A. BioKIT: a versatile toolkit for processing and analyzing diverse types of sequence data. Genetics 2022; 221:6583183. [PMID: 35536198; PMCID: PMC9252278; DOI: 10.1093/genetics/iyac079] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.5] [Received: 02/21/2022] [Accepted: 05/03/2022] [Indexed: 11/14/2022]
Abstract
Bioinformatic analysis (such as genome assembly quality assessment, alignment summary statistics, relative synonymous codon usage, file format conversion, and processing and analysis) is integrated into diverse disciplines in the biological sciences. Several command-line tools have been developed to conduct some of these individual analyses, but unified toolkits that conduct all these analyses are lacking. To address this gap, we introduce BioKIT, a versatile command-line toolkit that has, upon publication, 42 functions, several of which were community-sourced, that conduct routine and novel processing and analysis of genome assemblies, multiple sequence alignments, coding sequences, sequencing data, and more. To demonstrate the utility of BioKIT, we conducted a comprehensive examination of relative synonymous codon usage across 171 fungal genomes that use alternative genetic codes, showed that the novel metric of gene-wise relative synonymous codon usage can accurately estimate gene-wise codon optimization, evaluated the quality and characteristics of 901 eukaryotic genome assemblies, and calculated alignment summary statistics for 10 phylogenomic data matrices. BioKIT will be helpful in facilitating and streamlining sequence analysis workflows. BioKIT is freely available under the MIT license from GitHub (https://github.com/JLSteenwyk/BioKIT), PyPI (https://pypi.org/project/jlsteenwyk-biokit/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/jlsteenwyk-biokit). Documentation, user tutorials, and instructions for requesting new features are available online (https://jlsteenwyk.com/BioKIT).
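For readers unfamiliar with relative synonymous codon usage (RSCU), the following is a minimal Python sketch of the textbook definition: a codon's observed count divided by the count expected if all synonymous codons for that amino acid were used equally. It is not BioKIT's implementation. It assumes the standard genetic code (the abstract's analysis covers alternative codes), and the function name `rscu` is a hypothetical choice made for this illustration.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code (NCBI translation table 1), an assumption of this sketch.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(cds):
    """Return {codon: RSCU} for an in-frame coding sequence (hypothetical helper).

    RSCU = observed codon count / (amino-acid total / number of synonymous codons).
    Stop codons and amino acids absent from the sequence are skipped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = (cds[i:i + 3] for i in range(0, usable, 3))
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one family per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        families[aa].append(codon)

    values = {}
    for aa, family in families.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or total == 0:
            continue
        expected = total / len(family)  # count under equal synonymous usage
        for codon in family:
            values[codon] = counts[codon] / expected
    return values


# Four alanine codons: GCT three times, GCC once.
example = rscu("GCTGCTGCTGCC")
print(example["GCT"], example["GCC"])  # 3.0 1.0
```

A value above 1 means the codon is used more often than expected under equal usage, and a value below 1 means it is used less often.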
28
Steenwyk JL, Phillips MA, Yang F, Date SS, Graham TR, Berman J, Hittinger CT, Rokas A. An orthologous gene coevolution network provides insight into eukaryotic cellular and genomic structure and function. Science Advances 2022; 8:eabn0105. [PMID: 35507651; PMCID: PMC9067921; DOI: 10.1126/sciadv.abn0105] [Citation(s) in RCA: 8] [Impact Index Per Article: 4.0] [Received: 10/27/2021] [Accepted: 03/16/2022] [Indexed: 06/14/2023]
Abstract
The evolutionary rates of functionally related genes often covary. We present a gene coevolution network inferred from examining nearly 3 million orthologous gene pairs from 332 budding yeast species spanning ~400 million years of evolution. Network modules provide insight into cellular and genomic structure and function. Examination of the phenotypic impact of network perturbation using deletion mutant data from the baker's yeast Saccharomyces cerevisiae, which were obtained from previously published studies, suggests that fitness in diverse environments is affected by orthologous gene neighborhood and connectivity. Mapping the network onto the chromosomes of S. cerevisiae and Candida albicans revealed that coevolving orthologous genes are not physically clustered in either species; rather, they are often located on different chromosomes or far apart on the same chromosome. The coevolution network captures the hierarchy of cellular structure and function, provides a roadmap for genotype-to-phenotype discovery, and portrays the genome as a linked ensemble of genes.
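The idea that evolutionary rates of two genes "covary" can be illustrated with a small Python sketch. It is a generic illustration and not the authors' pipeline: it assumes each gene's branch lengths have already been measured on the same set of branches as a reference species tree, divides them by the species-tree branch lengths to get relative rates, and then computes a Pearson correlation between the two genes. The function names `relative_rates` and `pearson` and the toy numbers are hypothetical.

```python
from math import sqrt


def relative_rates(gene_branch_lengths, species_branch_lengths):
    """Divide each gene-tree branch length by the matching species-tree branch length."""
    return [g / s for g, s in zip(gene_branch_lengths, species_branch_lengths)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists of numbers."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two values")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list is constant")
    return cov / sqrt(var_x * var_y)


# Toy branch lengths on four shared branches (made-up numbers for illustration).
species = [0.10, 0.20, 0.05, 0.40]
gene_a = [0.12, 0.30, 0.04, 0.60]
gene_b = [0.11, 0.28, 0.05, 0.55]

r = pearson(relative_rates(gene_a, species), relative_rates(gene_b, species))
print(round(r, 3))
```

In a network built this way, gene pairs whose correlation exceeds a chosen threshold would be connected by an edge; the threshold itself is a modelling choice not specified here.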
29
Bradley NP, Wahl KL, Steenwyk JL, Rokas A, Eichman BF. Resistance-Guided Mining of Bacterial Genotoxins Defines a Family of DNA Glycosylases. mBio 2022; 13:e0329721. [PMID: 35311535; PMCID: PMC9040887; DOI: 10.1128/mbio.03297-21] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 11/03/2021] [Accepted: 02/22/2022] [Indexed: 11/20/2022]
Abstract
Unique DNA repair enzymes that provide self-resistance against therapeutically important, genotoxic natural products have been discovered in bacterial biosynthetic gene clusters (BGCs). Among these, the DNA glycosylase AlkZ is essential for azinomycin B production and belongs to the HTH_42 superfamily of uncharacterized proteins. Despite their widespread existence in antibiotic producers and pathogens, the roles of these proteins in production of other natural products are unknown. Here, we determine the evolutionary relationship and genomic distribution of all HTH_42 proteins from Streptomyces and use a resistance-based genome mining approach to identify homologs associated with known and uncharacterized BGCs. We find that AlkZ-like (AZL) proteins constitute one distinct HTH_42 subfamily and are highly enriched in BGCs and variable in sequence, suggesting each has evolved to protect against a specific secondary metabolite. As a validation of the approach, we show that the AZL protein, HedH4, associated with biosynthesis of the alkylating agent hedamycin, excises hedamycin-DNA adducts with exquisite specificity and provides resistance to the natural product in cells. We also identify a second, phylogenetically and functionally distinct subfamily whose proteins are never associated with BGCs, are highly conserved with respect to sequence and genomic neighborhood, and repair DNA lesions not associated with a particular natural product. This work delineates two related families of DNA repair enzymes, one specific for complex alkyl-DNA lesions and involved in self-resistance to antimicrobials and the other likely involved in protection against an array of genotoxins, and provides a framework for targeted discovery of new genotoxic compounds with therapeutic potential.
IMPORTANCE Bacteria are rich sources of secondary metabolites that include DNA-damaging genotoxins with antitumor/antibiotic properties. Although Streptomyces produce a diverse array of therapeutic genotoxins, efforts toward targeted discovery of biosynthetic gene clusters (BGCs) producing DNA-damaging agents are lacking. Moreover, work on toxin-resistance genes has lagged behind our understanding of those involved in natural product synthesis. Here, we identified over 70 uncharacterized BGCs producing potentially novel genotoxins through resistance-based genome mining using the azinomycin B-resistance DNA glycosylase AlkZ. We validate our analysis by characterizing the enzymatic activity and cellular resistance of one AlkZ ortholog in the BGC of hedamycin, a potent DNA alkylating agent. Moreover, we uncover a second, phylogenetically distinct family of proteins related to Escherichia coli YcaQ, a DNA glycosylase capable of unhooking interstrand DNA cross-links, which differs from the AlkZ-like family in sequence, genomic location, proximity to BGCs, and substrate specificity. This work defines two families of DNA glycosylase for specialized repair of complex genotoxic natural products and generalized repair of a broad range of alkyl-DNA adducts and provides a framework for targeted discovery of new compounds with therapeutic potential.
30
de Castro PA, Colabardini AC, Moraes M, Horta MAC, Knowles SL, Raja HA, Oberlies NH, Koyama Y, Ogawa M, Gomi K, Steenwyk JL, Rokas A, Gonçales RA, Duarte-Oliveira C, Carvalho A, Ries LNA, Goldman GH. Regulation of gliotoxin biosynthesis and protection in Aspergillus species. PLoS Genet 2022; 18:e1009965. [PMID: 35041649; PMCID: PMC8797188; DOI: 10.1371/journal.pgen.1009965] [Citation(s) in RCA: 13] [Impact Index Per Article: 6.5] [Received: 11/30/2021] [Revised: 01/28/2022] [Accepted: 01/04/2022] [Indexed: 02/07/2023]
Abstract
Aspergillus fumigatus causes a range of human and animal diseases collectively known as aspergillosis. A. fumigatus possesses and expresses a range of genetic determinants of virulence, which facilitate colonisation and disease progression, including the secretion of mycotoxins. Gliotoxin (GT) is the best studied A. fumigatus mycotoxin with a wide range of known toxic effects that impair human immune cell function. GT is also highly toxic to A. fumigatus and this fungus has evolved self-protection mechanisms that include (i) the GT efflux pump GliA, (ii) the GT neutralising enzyme GliT, and (iii) the negative regulation of GT biosynthesis by the bis-thiomethyltransferase GtmA. The transcription factor (TF) RglT is the main regulator of GliT and this GT protection mechanism also occurs in the non-GT producing fungus A. nidulans. However, the A. nidulans genome does not encode GtmA and GliA. This work aimed to analyse the transcriptional response to exogenous GT in A. fumigatus and A. nidulans, two distantly related Aspergillus species, and to identify additional components required for GT protection. RNA-sequencing shows a highly different transcriptional response to exogenous GT with the RglT-dependent regulon also significantly differing between A. fumigatus and A. nidulans. However, we were able to observe homologs whose expression pattern was similar in both species (43 RglT-independent and 11 RglT-dependent). Based on this approach, we identified a novel RglT-dependent methyltransferase, MtrA, involved in GT protection. Taking into consideration the occurrence of RglT-independent modulated genes, we screened an A. fumigatus deletion library of 484 transcription factors (TFs) for sensitivity to GT and identified 15 TFs important for GT self-protection. Of these, the TF KojR, which is essential for kojic acid biosynthesis in Aspergillus oryzae, was also essential for virulence and GT biosynthesis in A. fumigatus, and for GT protection in A. fumigatus, A. nidulans, and A. oryzae. KojR regulates rglT, gliT, and gliJ expression and sulfur metabolism in Aspergillus species. Together, this study identified conserved components required for GT protection in Aspergillus species.
A. fumigatus secretes mycotoxins that are essential for its virulence and pathogenicity. Gliotoxin (GT) is a sulfur-containing mycotoxin, which is known to impair several aspects of the human immune response. GT is also toxic to different fungal species, which have evolved several GT protection strategies. To further decipher these responses, we used transcriptional profiling to compare the response to GT in the GT producer A. fumigatus and the GT non-producer A. nidulans. This analysis allowed us to identify additional genes with a potential role in GT protection. We also identified 15 transcription factors (TFs) encoded in the A. fumigatus genome that are important for conferring resistance to exogenous gliotoxin. One of these TFs, KojR, which is essential for A. oryzae kojic acid production, is also important for virulence in A. fumigatus and GT protection in A. fumigatus, A. nidulans, and A. oryzae. KojR regulates the expression of genes important for gliotoxin biosynthesis and protection, and sulfur metabolism. Together, this work identified conserved components required for gliotoxin protection in Aspergillus species.
31
Phillips MA, Steenwyk JL, Shen XX, Rokas A. Examination of Gene Loss in the DNA Mismatch Repair Pathway and Its Mutational Consequences in a Fungal Phylum. Genome Biol Evol 2021; 13:evab219. [PMID: 34554246; PMCID: PMC8597960; DOI: 10.1093/gbe/evab219] [Citation(s) in RCA: 8] [Impact Index Per Article: 2.7] [Accepted: 09/17/2021] [Indexed: 12/12/2022]
Abstract
The DNA mismatch repair (MMR) pathway corrects mismatched bases produced during DNA replication and is highly conserved across the tree of life, reflecting its fundamental importance for genome integrity. Loss of function in one or a few MMR genes can lead to increased mutation rates and microsatellite instability, as seen in some human cancers. Although loss of MMR genes has been documented in the context of human disease and in hypermutant strains of pathogens, examples of entire species and species lineages that have experienced substantial MMR gene loss are lacking. We examined the genomes of 1,107 species in the fungal phylum Ascomycota for the presence of 52 genes known to be involved in the MMR pathway of fungi. We found that the median ascomycete genome contained 49/52 MMR genes. In contrast, four closely related species of obligate plant parasites from the powdery mildew genera Erysiphe and Blumeria have lost between five and 21 MMR genes, including MLH3, EXO1, and DPB11. The lost genes span MMR functions and include genes that are conserved in all other ascomycetes; loss of function of any of these genes alone has been previously linked to increased mutation rate. Consistent with the hypothesis that loss of these genes impairs MMR pathway function, we found that powdery mildew genomes with higher levels of MMR gene loss exhibit increased numbers of mononucleotide runs, longer microsatellites, accelerated sequence evolution, elevated mutational bias in the A|T direction, and decreased GC content. These results identify a striking example of macroevolutionary loss of multiple MMR pathway genes in a eukaryotic lineage, even though the mutational outcomes of these losses appear to resemble those associated with detrimental MMR dysfunction in other organisms.
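Two of the genome-level signals mentioned above, GC content and mononucleotide runs, are simple to compute. The Python sketch below is a generic illustration rather than the authors' analysis code. The function names and the default minimum run length of 5 are hypothetical choices; the abstract does not state the threshold the study used.

```python
import re


def gc_content(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T); 0.0 if none."""
    seq = seq.upper()
    called = sum(seq.count(base) for base in "ACGT")
    if called == 0:
        return 0.0
    return (seq.count("G") + seq.count("C")) / called


def mononucleotide_runs(seq, min_len=5):
    """Return (start, base, length) for each run of one base at least min_len long.

    min_len=5 is an assumed default for illustration only.
    """
    if min_len < 1:
        raise ValueError("min_len must be at least 1")
    # A base followed by at least (min_len - 1) further copies of itself.
    pattern = re.compile(r"([ACGT])\1{%d,}" % (min_len - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)))
        for match in pattern.finditer(seq.upper())
    ]


sequence = "ACGTAAAAAAGGCTTTTTC"
print(gc_content(sequence))
print(mononucleotide_runs(sequence))  # [(4, 'A', 6), (13, 'T', 5)]
```

Comparing these two statistics between genomes with and without MMR gene loss is the kind of contrast the abstract describes, although the study itself reports additional measures such as microsatellite length and mutational bias.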
32
LaBella AL, Opulente DA, Steenwyk JL, Hittinger CT, Rokas A. Correction: Variation and selection on codon usage bias across an entire subphylum. PLoS Genet 2021; 17:e1009824. [PMID: 34570754; PMCID: PMC8476021; DOI: 10.1371/journal.pgen.1009824] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/30/2022]
33
dos Santos RAC, Mead ME, Steenwyk JL, Rivero-Menéndez O, Alastruey-Izquierdo A, Goldman GH, Rokas A. Examining Signatures of Natural Selection in Antifungal Resistance Genes Across Aspergillus Fungi. Frontiers in Fungal Biology 2021; 2:723051. [PMID: 37744093; PMCID: PMC10512362; DOI: 10.3389/ffunb.2021.723051] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 06/09/2021] [Accepted: 08/16/2021] [Indexed: 09/26/2023]
Abstract
Certain Aspergillus fungi cause aspergillosis, a set of diseases that typically affect immunocompromised individuals. Most cases of aspergillosis are caused by Aspergillus fumigatus, which infects millions of people annually. Some closely related so-called cryptic species, such as Aspergillus lentulus, can also cause aspergillosis, albeit at lower frequencies, and they are also clinically relevant. Few antifungal drugs are currently available for treating aspergillosis and there is increasing worldwide concern about the presence of antifungal drug resistance in Aspergillus species. Furthermore, isolates from both A. fumigatus and other Aspergillus pathogens exhibit substantial heterogeneity in their antifungal drug resistance profiles. To gain insights into the evolution of antifungal drug resistance genes in Aspergillus, we investigated signatures of positive selection in 41 genes known to be involved in drug resistance across 42 susceptible and resistant isolates from 12 Aspergillus section Fumigati species. Using codon-based site models of sequence evolution, we identified ten genes that contain 43 sites with signatures of ancient positive selection across our set of species. None of the sites that have experienced positive selection overlap with sites previously reported to be involved in drug resistance. These results identify sites that likely experienced ancient positive selection in Aspergillus genes involved in resistance to antifungal drugs and suggest that historical selective pressures on these genes likely differ from any current selective pressures imposed by antifungal drugs.
34
Steenwyk JL, Rokas A. orthofisher: a broadly applicable tool for automated gene identification and retrieval. G3: Genes, Genomes, Genetics 2021; 11:6321954. [PMID: 34544141; PMCID: PMC8496211; DOI: 10.1093/g3journal/jkab250] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 06/07/2021] [Accepted: 07/06/2021] [Indexed: 11/15/2022]
Abstract
Identification and retrieval of genes of interest from genomic data is an essential step for many bioinformatic applications. We present orthofisher, a command-line tool for automated identification and retrieval of genes with high sequence similarity to a query profile Hidden Markov Model sequence alignment across a set of proteomes. Performance assessment of orthofisher revealed high accuracy and precision during single-copy orthologous gene identification. orthofisher may be useful for assessing gene annotation quality, identifying single-copy orthologous genes for phylogenomic analyses, estimating gene copy number, and other evolutionary analyses that rely on identification and retrieval of homologous genes from genomic data. orthofisher comes complete with comprehensive documentation (https://jlsteenwyk.com/orthofisher/), is freely available under the MIT license, and is available for download from GitHub (https://github.com/JLSteenwyk/orthofisher), PyPI (https://pypi.org/project/orthofisher/), and the Anaconda Cloud (https://anaconda.org/jlsteenwyk/orthofisher).
35
Steenwyk JL, Mead ME, de Castro PA, Valero C, Damasio A, dos Santos RAC, LaBella AL, Li Y, Knowles SL, Raja HA, Oberlies NH, Zhou X, Cornely OA, Fuchs F, Koehler P, Goldman GH, Rokas A. Genomic and Phenotypic Analysis of COVID-19-Associated Pulmonary Aspergillosis Isolates of Aspergillus fumigatus. Microbiol Spectr 2021; 9:e0001021. [PMID: 34106569; PMCID: PMC8552514; DOI: 10.1128/spectrum.00010-21] [Citation(s) in RCA: 27] [Impact Index Per Article: 9.0] [Received: 04/08/2021] [Accepted: 04/08/2021] [Indexed: 02/06/2023]
Abstract
The ongoing global pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for coronavirus disease 2019 (COVID-19), first described in Wuhan, China. A subset of COVID-19 patients has been reported to have acquired secondary infections by microbial pathogens, such as opportunistic fungal pathogens from the genus Aspergillus. To gain insight into COVID-19-associated pulmonary aspergillosis (CAPA), we analyzed the genomes and characterized the phenotypic profiles of four CAPA isolates of Aspergillus fumigatus obtained from patients treated in the area of North Rhine-Westphalia, Germany. By examining the mutational spectrum of single nucleotide polymorphisms, insertion-deletion polymorphisms, and copy number variants among 206 genes known to modulate A. fumigatus virulence, we found that CAPA isolate genomes do not exhibit significant differences from the genome of the Af293 reference strain. By examining a number of factors, including virulence in an invertebrate moth model, growth in the presence of osmotic, cell wall, and oxidative stressors, secondary metabolite biosynthesis, and the MIC of antifungal drugs, we found that CAPA isolates were generally, but not always, similar to A. fumigatus reference strains Af293 and CEA17. Notably, CAPA isolate D had more putative loss-of-function mutations in genes known to increase virulence when deleted. Moreover, CAPA isolate D was significantly more virulent than the other three CAPA isolates and the A. fumigatus reference strains Af293 and CEA17, but similarly virulent to two other clinical strains of A. fumigatus. These findings expand our understanding of the genomic and phenotypic characteristics of isolates that cause CAPA.
IMPORTANCE The global pandemic caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), the etiological agent of coronavirus disease 2019 (COVID-19), has already killed millions of people. COVID-19 patient outcome can be further complicated by secondary infections, such as COVID-19-associated pulmonary aspergillosis (CAPA). CAPA is caused by Aspergillus fungal pathogens, but there is little information about the genomic and phenotypic characteristics of CAPA isolates. We conducted genome sequencing and extensive phenotyping of four CAPA isolates of Aspergillus fumigatus from Germany. We found that CAPA isolates were often, but not always, similar to other reference strains of A. fumigatus across 206 genetic determinants of infection-relevant phenotypes, including virulence. For example, CAPA isolate D was more virulent than other CAPA isolates and reference strains in an invertebrate model of fungal disease, but similarly virulent to two other clinical strains. These results expand our understanding of COVID-19-associated pulmonary aspergillosis.
36
Mead ME, Borowsky AT, Joehnk B, Steenwyk JL, Shen XX, Sil A, Rokas A. Recurrent Loss of abaA, a Master Regulator of Asexual Development in Filamentous Fungi, Correlates with Changes in Genomic and Morphological Traits. Genome Biol Evol 2021; 12:1119-1130. [PMID: 32442273; PMCID: PMC7531577; DOI: 10.1093/gbe/evaa107] [Citation(s) in RCA: 13] [Impact Index Per Article: 4.3] [Accepted: 05/19/2020] [Indexed: 12/11/2022]
Abstract
Gene regulatory networks (GRNs) drive developmental and cellular differentiation, and variation in their architectures gives rise to morphological diversity. Pioneering studies in Aspergillus fungi, coupled with subsequent work in other filamentous fungi, have shown that the GRN governed by the BrlA, AbaA, and WetA proteins controls the development of the asexual fruiting body or conidiophore. A specific aspect of conidiophore development is the production of phialides, conidiophore structures that are under the developmental control of AbaA and function to repetitively generate spores. Fungal genome sequencing has revealed that some filamentous fungi lack abaA, and also produce asexual structures that lack phialides, raising the hypothesis that abaA loss is functionally linked to diversity in asexual fruiting body morphology. To examine this hypothesis, we carried out an extensive search for the abaA gene across 241 genomes of species from the fungal subphylum Pezizomycotina. We found that abaA was independently lost in four lineages of Eurotiomycetes, including from all sequenced species within the order Onygenales, and that all four lineages that have lost abaA also lack the ability to form phialides. Genetic restoration of abaA from Aspergillus nidulans into Histoplasma capsulatum, a pathogenic species from the order Onygenales that lacks an endogenous copy of abaA, did not alter Histoplasma conidiation morphology but resulted in a marked increase in spore viability. We also discovered that species lacking abaA contain fewer AbaA binding motifs in the regulatory regions of orthologs of some AbaA target genes, suggesting that the asexual fruiting body GRN of organisms that have lost abaA has likely been rewired. Our results provide an illustration of how repeated losses of a key regulatory transcription factor have contributed to the diversity of an iconic fungal morphological trait.
37
Mead ME, Steenwyk JL, Silva LP, de Castro PA, Saeed N, Hillmann F, Goldman GH, Rokas A. An evolutionary genomic approach reveals both conserved and species-specific genetic elements related to human disease in closely related Aspergillus fungi. Genetics 2021; 218:6263860. [PMID: 33944921; DOI: 10.1093/genetics/iyab066] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 03/11/2021] [Accepted: 04/20/2021] [Indexed: 11/12/2022]
Abstract
Aspergillosis is an important opportunistic human disease caused by filamentous fungi in the genus Aspergillus. Roughly 70% of infections are caused by Aspergillus fumigatus, with the rest stemming from approximately a dozen other Aspergillus species. Several of these pathogens are closely related to A. fumigatus and belong in the same taxonomic section, section Fumigati. Pathogenic species are frequently most closely related to nonpathogenic ones, suggesting Aspergillus pathogenicity evolved multiple times independently. To understand the repeated evolution of Aspergillus pathogenicity, we performed comparative genomic analyses on 18 strains from 13 species, including 8 species in section Fumigati, which aimed to identify genes, both ones previously connected to virulence as well as ones never before implicated, whose evolution differs between pathogens and nonpathogens. We found that most genes were present in all species, including approximately half of those previously connected to virulence, but a few genes were section- or species-specific. Evolutionary rate analyses identified over 1700 genes whose evolutionary rate differed between pathogens and nonpathogens and dozens of genes whose rates differed between specific pathogens and the rest of the taxa. Functional testing of deletion mutants of 17 transcription factor-encoding genes whose evolution differed between pathogens and nonpathogens identified eight genes that affect either fungal survival in a model of phagocytic killing, host survival in an animal model of fungal disease, or both. These results suggest that the evolution of pathogenicity in Aspergillus involved both conserved and species-specific genetic elements, illustrating how an evolutionary genomic approach informs the study of fungal disease.
38
Li Y, Steenwyk JL, Chang Y, Wang Y, James TY, Stajich JE, Spatafora JW, Groenewald M, Dunn CW, Hittinger CT, Shen XX, Rokas A. A genome-scale phylogeny of the kingdom Fungi. Curr Biol 2021; 31:1653-1665.e5. [PMID: 33607033; PMCID: PMC8347878; DOI: 10.1016/j.cub.2021.01.074] [Citation(s) in RCA: 112] [Impact Index Per Article: 37.3] [Received: 10/11/2020] [Revised: 12/10/2020] [Accepted: 01/21/2021] [Indexed: 12/22/2022]
Abstract
Phylogenomic studies using genome-scale amounts of data have greatly improved understanding of the tree of life. Despite the diversity, ecological significance, and biomedical and industrial importance of fungi, evolutionary relationships among several major lineages remain poorly resolved, especially those near the base of the fungal phylogeny. To examine poorly resolved relationships and assess progress toward a genome-scale phylogeny of the fungal kingdom, we compiled a phylogenomic data matrix of 290 genes from the genomes of 1,644 species that includes representatives from most major fungal lineages. We also compiled 11 data matrices by subsampling genes or taxa from the full data matrix based on filtering criteria previously shown to improve phylogenomic inference. Analyses of these 12 data matrices using concatenation- and coalescent-based approaches yielded a robust phylogeny of the fungal kingdom, in which ∼85% of internal branches were congruent across data matrices and approaches used. We found support for several historically poorly resolved relationships as well as evidence for polytomies likely stemming from episodes of ancient diversification. By examining the relative evolutionary divergence of taxonomic groups of equivalent rank, we found that fungal taxonomy is broadly aligned with both genome sequence divergence and divergence time but also identified lineages where current taxonomic circumscription does not reflect their levels of evolutionary divergence. Our results provide a robust phylogenomic framework to explore the tempo and mode of fungal evolution and offer directions for future fungal phylogenetic and taxonomic studies.
39
LaBella AL, Opulente DA, Steenwyk JL, Hittinger CT, Rokas A. Signatures of optimal codon usage in metabolic genes inform budding yeast ecology. PLoS Biol 2021; 19:e3001185. [PMID: 33872297; PMCID: PMC8084343; DOI: 10.1371/journal.pbio.3001185] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 07/22/2020] [Revised: 04/29/2021] [Accepted: 03/15/2021] [Indexed: 02/06/2023]
Abstract
Reverse ecology is the inference of ecological information from patterns of genomic variation. One rich, heretofore underutilized, source of ecologically relevant genomic information is codon optimality or adaptation. Bias toward codons that match the tRNA pool is robustly associated with high gene expression in diverse organisms, suggesting that codon optimization could be used in a reverse ecology framework to identify highly expressed, ecologically relevant genes. To test this hypothesis, we examined the relationship between optimal codon usage in the classic galactose metabolism (GAL) pathway and known ecological niches for 329 species of budding yeasts, a diverse subphylum of fungi. We find that optimal codon usage in the GAL pathway is positively correlated with quantitative growth on galactose, suggesting that GAL codon optimization reflects increased capacity to grow on galactose. Optimal codon usage in the GAL pathway is also positively correlated with human-associated ecological niches in yeasts of the CUG-Ser1 clade and with dairy-associated ecological niches in the family Saccharomycetaceae. For example, optimal codon usage of GAL genes is greater than 85% of all genes in the genome of the major human pathogen Candida albicans (CUG-Ser1 clade) and greater than 75% of genes in the genome of the dairy yeast Kluyveromyces lactis (family Saccharomycetaceae). We further find a correlation between optimization in the GALactose pathway genes and several genes associated with nutrient sensing and metabolism. This work suggests that codon optimization harbors information about the metabolic ecology of microbial eukaryotes. This information may be particularly useful for studying fungal dark matter, that is, species that have yet to be cultured in the lab or have only been identified by genomic material.
|
40
|
Shen XX, Steenwyk JL, Rokas A. Dissecting incongruence between concatenation- and quartet-based approaches in phylogenomic data. Syst Biol 2021; 70:997-1014. [PMID: 33616672 DOI: 10.1093/sysbio/syab011] [Received: 01/08/2020] [Revised: 02/10/2021] [Accepted: 02/17/2021]
Abstract
Topological conflict or incongruence is widespread in phylogenomic data. Concatenation- and coalescent-based approaches often result in incongruent topologies, but the causes of this conflict can be difficult to characterize. We examined incongruence stemming from conflict between likelihood-based signal (quantified by the difference in gene-wise log likelihood score or ΔGLS) and quartet-based topological signal (quantified by the difference in gene-wise quartet score or ΔGQS) for every gene in three phylogenomic studies in animals, fungi, and plants, which were chosen because their concatenation-based IQ-TREE (T1) and quartet-based ASTRAL (T2) phylogenies are known to produce eight conflicting internal branches (bipartitions). By comparing the types of phylogenetic signal for all genes in these three data matrices, we found that 30%-36% of genes in each data matrix are inconsistent, that is, each of these genes has higher log likelihood score for T1 versus T2 (i.e., ΔGLS > 0) whereas its T1 topology has lower quartet score than its T2 topology (i.e., ΔGQS < 0) or vice versa. Comparison of inconsistent and consistent genes using a variety of metrics (e.g., evolutionary rate, gene tree topology, distribution of branch lengths, hidden paralogy, and gene tree discordance) showed that inconsistent genes are more likely to recover neither T1 nor T2 and have higher levels of gene tree discordance than consistent genes. Simulation analyses demonstrate that removal of inconsistent genes from datasets with low levels of incomplete lineage sorting (ILS) and low and medium levels of gene tree estimation error (GTEE) reduced incongruence and increased accuracy. In contrast, removal of inconsistent genes from datasets with medium and high ILS levels and high GTEE levels eliminated or extensively reduced incongruence, but the resulting congruent species phylogenies were not always topologically identical to the true species trees.
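The definition of an inconsistent gene given in this abstract can be illustrated with a minimal Python sketch. It assumes that ΔGLS and ΔGQS have already been computed per gene as T1 minus T2; the `GeneSignal` type, the gene names, and the numeric values are hypothetical and are not taken from the study, and treating a value of exactly zero as "not inconsistent" is an assumption of this sketch.

```python
from typing import List, NamedTuple


class GeneSignal(NamedTuple):
    """Per-gene signal; both deltas are assumed to be T1 minus T2."""
    name: str
    delta_gls: float  # difference in gene-wise log likelihood score
    delta_gqs: float  # difference in gene-wise quartet score


def is_inconsistent(gene: GeneSignal) -> bool:
    """Return True when the two signals favor opposite topologies."""
    favors_t1_by_likelihood = gene.delta_gls > 0
    favors_t2_by_likelihood = gene.delta_gls < 0
    favors_t1_by_quartets = gene.delta_gqs > 0
    favors_t2_by_quartets = gene.delta_gqs < 0
    return (favors_t1_by_likelihood and favors_t2_by_quartets) or (
        favors_t2_by_likelihood and favors_t1_by_quartets
    )


# Hypothetical values, for illustration only.
genes: List[GeneSignal] = [
    GeneSignal("geneA", delta_gls=3.2, delta_gqs=12.0),   # both favor T1
    GeneSignal("geneB", delta_gls=1.7, delta_gqs=-8.0),   # signals disagree
    GeneSignal("geneC", delta_gls=-0.4, delta_gqs=-5.0),  # both favor T2
]

inconsistent = [g.name for g in genes if is_inconsistent(g)]
print(inconsistent)
```

Filtering a data matrix would then amount to dropping the genes named in `inconsistent` before re-inferring the species tree.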
|
41
|
Steenwyk JL. Cover Image: A portrait of budding yeasts: A symbol of the arts, sciences and a whole greater than the sum of its parts. Yeast 2021. [DOI: 10.1002/yea.3551]
|
42
|
Steenwyk JL, Buida TJ, Labella AL, Li Y, Shen XX, Rokas A. PhyKIT: a broadly applicable UNIX shell toolkit for processing and analyzing phylogenomic data. Bioinformatics 2021; 37:2325-2331. [PMID: 33560364 PMCID: PMC8388027 DOI: 10.1093/bioinformatics/btab096] [Received: 12/06/2020] [Revised: 01/13/2021] [Accepted: 02/05/2021]
Abstract
MOTIVATION: Diverse disciplines in biology process and analyze multiple sequence alignments (MSAs) and phylogenetic trees to evaluate their information content, infer evolutionary events and processes, and predict gene function. However, automated processing of MSAs and trees remains a challenge due to the lack of a unified toolkit. To fill this gap, we introduce PhyKIT, a toolkit for the UNIX shell environment with 30 functions that process MSAs and trees, including but not limited to estimation of mutation rate, evaluation of sequence composition biases, calculation of the degree of violation of a molecular clock, and collapsing bipartitions (internal branches) with low support. RESULTS: To demonstrate the utility of PhyKIT, we detail three use cases: (1) summarizing information content in MSAs and phylogenetic trees for diagnosing potential biases in sequence or tree data; (2) evaluating gene-gene covariation of evolutionary rates to identify functional relationships, including novel ones, among genes; and (3) identifying lack of resolution events or polytomies in phylogenetic trees, which are suggestive of rapid radiation events or lack of data. We anticipate PhyKIT will be useful for processing, examining, and deriving biological meaning from increasingly large phylogenomic datasets. AVAILABILITY: PhyKIT is freely available on GitHub (https://github.com/JLSteenwyk/PhyKIT), PyPi (https://pypi.org/project/phykit/), and the Anaconda Cloud (https://anaconda.org/JLSteenwyk/phykit) under the MIT license with extensive documentation and user tutorials (https://jlsteenwyk.com/PhyKIT). SUPPLEMENTARY INFORMATION: Supplementary data are available on figshare (doi: 10.6084/m9.figshare.13118600) and are available at Bioinformatics online.
|
43
|
Li Y, David KT, Shen XX, Steenwyk JL, Halanych KM, Rokas A. Feature frequency profile-based phylogenies are inaccurate. Proc Natl Acad Sci U S A 2020; 117:31580-31581. [PMID: 33234569 PMCID: PMC7749326 DOI: 10.1073/pnas.2013143117]
|
44
|
Steenwyk JL, Buida TJ, Li Y, Shen XX, Rokas A. ClipKIT: A multiple sequence alignment trimming software for accurate phylogenomic inference. PLoS Biol 2020; 18:e3001007. [PMID: 33264284 PMCID: PMC7735675 DOI: 10.1371/journal.pbio.3001007] [Received: 06/24/2020] [Revised: 12/14/2020] [Accepted: 11/10/2020]
Abstract
Highly divergent sites in multiple sequence alignments (MSAs), which can stem from erroneous inference of homology and saturation of substitutions, are thought to negatively impact phylogenetic inference. Thus, several different trimming strategies have been developed for identifying and removing these sites prior to phylogenetic inference. However, a recent study reported that doing so can worsen inference, underscoring the need for alternative alignment trimming strategies. Here, we introduce ClipKIT, an alignment trimming software that, rather than identifying and removing putatively phylogenetically uninformative sites, instead aims to identify and retain parsimony-informative sites, which are known to be phylogenetically informative. To test the efficacy of ClipKIT, we examined the accuracy and support of phylogenies inferred from 14 different alignment trimming strategies, including those implemented in ClipKIT, across nearly 140,000 alignments from a broad sampling of evolutionary histories. Phylogenies inferred from ClipKIT-trimmed alignments are accurate, robust, and time saving. Furthermore, ClipKIT consistently outperformed other trimming methods across diverse datasets, suggesting that strategies based on identifying and retaining parsimony-informative sites provide a robust framework for alignment trimming.
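The idea of retaining parsimony-informative sites can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not ClipKIT's actual code or any of its trimming modes: it uses the standard definition of a parsimony-informative site (at least two character states, each present in at least two sequences), ignores only `-` and `?` as missing data, and the function names and toy alignment are hypothetical.

```python
from collections import Counter
from typing import Dict

MISSING = {"-", "?"}  # assumption: only these symbols count as missing data


def is_parsimony_informative(column: str) -> bool:
    """A site is informative if >= 2 states each occur in >= 2 sequences."""
    counts = Counter(c for c in column.upper() if c not in MISSING)
    states_seen_twice = sum(1 for n in counts.values() if n >= 2)
    return states_seen_twice >= 2


def keep_informative_sites(alignment: Dict[str, str]) -> Dict[str, str]:
    """Return a copy of the alignment restricted to informative columns."""
    sequences = list(alignment.values())
    length = len(sequences[0])
    if any(len(seq) != length for seq in sequences):
        raise ValueError("all sequences must have the same aligned length")
    keep = [
        i for i in range(length)
        if is_parsimony_informative("".join(seq[i] for seq in sequences))
    ]
    return {name: "".join(seq[i] for i in keep) for name, seq in alignment.items()}


# Toy alignment (hypothetical): columns 2 and 3 are informative, 1 and 4 are not.
example = {"a": "AAGT", "b": "AAGT", "c": "ATCT", "d": "ATC-"}
trimmed = keep_informative_sites(example)
print(trimmed)
```

In the toy alignment, the first column is constant and the last contains a single state plus a gap, so only the two middle columns are retained.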
|
45
|
Steenwyk JL, Mead ME, de Castro PA, Valero C, Damasio A, dos Santos RAC, Labella AL, Li Y, Knowles SL, Raja HA, Oberlies NH, Zhou X, Cornely OA, Fuchs F, Koehler P, Goldman GH, Rokas A. Genomic and phenotypic analysis of COVID-19-associated pulmonary aspergillosis isolates of Aspergillus fumigatus. bioRxiv 2020:2020.11.06.371971. [PMID: 33173866 PMCID: PMC7654854 DOI: 10.1101/2020.11.06.371971]
Abstract
The ongoing global pandemic caused by the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the coronavirus disease 2019 (COVID-19) first described from Wuhan, China. A subset of COVID-19 patients has been reported to have acquired secondary infections by microbial pathogens, such as fungal opportunistic pathogens from the genus Aspergillus. To gain insight into COVID-19 associated pulmonary aspergillosis (CAPA), we analyzed the genomes and characterized the phenotypic profiles of four CAPA isolates of Aspergillus fumigatus obtained from patients treated in the area of North Rhine-Westphalia, Germany. By examining the mutational spectrum of single nucleotide polymorphisms, insertion-deletion polymorphisms, and copy number variants among 206 genes known to modulate A. fumigatus virulence, we found that CAPA isolate genomes do not exhibit major differences from the genome of the Af293 reference strain. By examining virulence in an invertebrate moth model, growth in the presence of osmotic, cell wall, and oxidative stressors, and the minimum inhibitory concentration of antifungal drugs, we found that CAPA isolates were generally, but not always, similar to A. fumigatus reference strains Af293 and CEA17. Notably, CAPA isolate D had more putative loss of function mutations in genes known to increase virulence when deleted (e.g., in the FLEA gene, which encodes a lectin recognized by macrophages). Moreover, CAPA isolate D was significantly more virulent than the other three CAPA isolates and the A. fumigatus reference strains tested. These findings expand our understanding of the genomic and phenotypic characteristics of isolates that cause CAPA.
|
46
|
Shen XX, Steenwyk JL, LaBella AL, Opulente DA, Zhou X, Kominek J, Li Y, Groenewald M, Hittinger CT, Rokas A. Genome-scale phylogeny and contrasting modes of genome evolution in the fungal phylum Ascomycota. Science Advances 2020; 6:eabd0079. [PMID: 33148650 PMCID: PMC7673691 DOI: 10.1126/sciadv.abd0079] [Received: 05/26/2020] [Accepted: 09/21/2020]
Abstract
Ascomycota, the largest and most well-studied phylum of fungi, contains three subphyla: Saccharomycotina (budding yeasts), Pezizomycotina (filamentous fungi), and Taphrinomycotina (fission yeasts). Despite its importance, we lack a comprehensive genome-scale phylogeny or understanding of the similarities and differences in the mode of genome evolution within this phylum. By examining 1107 genomes from Saccharomycotina (332), Pezizomycotina (761), and Taphrinomycotina (14) species, we inferred a robust genome-wide phylogeny that resolves several contentious relationships and estimated that the Ascomycota last common ancestor likely originated in the Ediacaran period. Comparisons of genomic properties revealed that Saccharomycotina and Pezizomycotina differ greatly in their genome properties and enabled inference of the direction of evolutionary change. The Saccharomycotina typically have smaller genomes, lower guanine-cytosine contents, lower numbers of genes, and higher rates of molecular sequence evolution compared with Pezizomycotina. These results provide a robust evolutionary framework for understanding the diversity and ecological lifestyles of the largest fungal phylum.
|
47
|
Steenwyk JL, Mead ME, Knowles SL, Raja HA, Roberts CD, Bader O, Houbraken J, Goldman GH, Oberlies NH, Rokas A. Variation Among Biosynthetic Gene Clusters, Secondary Metabolite Profiles, and Cards of Virulence Across Aspergillus Species. Genetics 2020; 216:481-497. [PMID: 32817009 PMCID: PMC7536862 DOI: 10.1534/genetics.120.303549] [Received: 06/15/2020] [Accepted: 08/01/2020]
Abstract
Aspergillus fumigatus is a major human pathogen. In contrast, Aspergillus fischeri and the recently described Aspergillus oerlinghausenensis, the two species most closely related to A. fumigatus, are not known to be pathogenic. Some of the genetic determinants of virulence (or "cards of virulence") that A. fumigatus possesses are secondary metabolites that impair the host immune system, protect from host immune cell attacks, or acquire key nutrients. To examine whether secondary metabolism-associated cards of virulence vary between these species, we conducted extensive genomic and secondary metabolite profiling analyses of multiple A. fumigatus, one A. oerlinghausenensis, and multiple A. fischeri strains. We identified two cards of virulence (gliotoxin and fumitremorgin) shared by all three species and three cards of virulence (trypacidin, pseurotin, and fumagillin) that are variable. For example, we found that all species and strains examined biosynthesized gliotoxin, which is known to contribute to virulence, consistent with the conservation of the gliotoxin biosynthetic gene cluster (BGC) across genomes. For other secondary metabolites, such as fumitremorgin, a modulator of host biology, we found that all species produced the metabolite but that there was strain heterogeneity in its production within species. Finally, species differed in their biosynthesis of fumagillin and pseurotin, both contributors to host tissue damage during invasive aspergillosis. A. fumigatus biosynthesized fumagillin and pseurotin, while A. oerlinghausenensis biosynthesized fumagillin and A. fischeri biosynthesized neither. These biochemical differences were reflected in sequence divergence of the intertwined fumagillin/pseurotin BGCs across genomes. These results delineate the similarities and differences in secondary metabolism-associated cards of virulence between a major fungal pathogen and its nonpathogenic closest relatives, shedding light onto the genetic and phenotypic changes associated with the evolution of fungal pathogenicity.
|
48
|
Steenwyk JL. A portrait of budding yeasts: A symbol of the arts, sciences and a whole greater than the sum of its parts. Yeast 2020; 38:54-56. [PMID: 32869892 DOI: 10.1002/yea.3518] [Received: 07/29/2020] [Accepted: 08/20/2020]
|
49
|
Steenwyk JL, Lind AL, Ries LNA, Dos Reis TF, Silva LP, Almeida F, Bastos RW, Fraga da Silva TFDC, Bonato VLD, Pessoni AM, Rodrigues F, Raja HA, Knowles SL, Oberlies NH, Lagrou K, Goldman GH, Rokas A. Pathogenic Allodiploid Hybrids of Aspergillus Fungi. Curr Biol 2020; 30:2495-2507.e7. [PMID: 32502407 PMCID: PMC7343619 DOI: 10.1016/j.cub.2020.04.071] [Received: 07/16/2019] [Revised: 02/25/2020] [Accepted: 04/24/2020]
Abstract
Interspecific hybridization substantially alters genotypes and phenotypes and can give rise to new lineages. Hybrid isolates that differ from their parental species in infection-relevant traits have been observed in several human-pathogenic yeasts and plant-pathogenic filamentous fungi but have yet to be found in human-pathogenic filamentous fungi. We discovered 6 clinical isolates from patients with aspergillosis originally identified as Aspergillus nidulans (section Nidulantes) that are actually allodiploid hybrids formed by the fusion of Aspergillus spinulosporus with an unknown close relative of Aspergillus quadrilineatus, both in section Nidulantes. Evolutionary genomic analyses revealed that these isolates belong to Aspergillus latus, an allodiploid hybrid species. Characterization of diverse infection-relevant traits further showed that A. latus hybrid isolates are genomically and phenotypically heterogeneous but also differ from A. nidulans, A. spinulosporus, and A. quadrilineatus. These results suggest that allodiploid hybridization contributes to the genomic and phenotypic diversity of filamentous fungal pathogens of humans.
|
50
|
Rokas A, Mead ME, Steenwyk JL, Raja HA, Oberlies NH. Biosynthetic gene clusters and the evolution of fungal chemodiversity. Nat Prod Rep 2020; 37:868-878. [PMID: 31898704 PMCID: PMC7332410 DOI: 10.1039/c9np00045c]
Abstract
Covering: up to 2019. Fungi produce a remarkable diversity of secondary metabolites: small, bioactive molecules not required for growth but which are essential to their ecological interactions with other organisms. Genes that participate in the same secondary metabolic pathway typically reside next to each other in fungal genomes and form biosynthetic gene clusters (BGCs). By synthesizing state-of-the-art knowledge on the evolution of BGCs in fungi, we propose that fungal chemodiversity stems from three molecular evolutionary processes involving BGCs: functional divergence, horizontal transfer, and de novo assembly. We provide examples of how these processes have contributed to the generation of fungal chemodiversity, discuss their relative importance, and outline major, outstanding questions in the field.
|