Reference Citation Analysis

For: Turner I, Garimella KV, Iqbal Z, McVean G. Integrating long-range connectivity information into de Bruijn graphs. Bioinformatics 2018;34:2556-2565. [PMID: 29554215 PMCID: PMC6061703 DOI: 10.1093/bioinformatics/bty157] [Citation(s) in RCA: 36] [Impact Index Per Article: 5.1] [Received: 06/08/2017] [Revised: 11/25/2017] [Accepted: 03/14/2018] [Indexed: 12/27/2022]

Cited by Other Article(s)

Břinda K, Lima L, Pignotti S, Quinones-Olvera N, Salikhov K, Chikhi R, Kucherov G, Iqbal Z, Baym M. Efficient and robust search of microbial genomes via phylogenetic compression. Nat Methods 2025;22:692-697. [PMID: 40205174 DOI: 10.1038/s41592-025-02625-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/08/2023] [Accepted: 02/12/2025] [Indexed: 04/11/2025]

Matthews CA, Watson-Haigh NS, Burton RA, Sheppard AE. A gentle introduction to pangenomics. Brief Bioinform 2024;25:bbae588. [PMID: 39552065 PMCID: PMC11570541 DOI: 10.1093/bib/bbae588] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/25/2024] [Revised: 09/12/2024] [Accepted: 11/01/2024] [Indexed: 11/19/2024]

Roberts M, Josephs EB. Previously unmeasured genetic diversity explains part of Lewontin's paradox in a k-mer-based meta-analysis of 112 plant species. bioRxiv 2024:2024.05.17.594778. [PMID: 38798362 PMCID: PMC11118579 DOI: 10.1101/2024.05.17.594778] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/29/2024]

Garg V, Bohra A, Mascher M, Spannagl M, Xu X, Bevan MW, Bennetzen JL, Varshney RK. Unlocking plant genetics with telomere-to-telomere genome assemblies. Nat Genet 2024;56:1788-1799. [PMID: 39048791 DOI: 10.1038/s41588-024-01830-7] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/18/2024] [Accepted: 06/12/2024] [Indexed: 07/27/2024]

Mustafa H, Karasikov M, Mansouri Ghiasi N, Rätsch G, Kahles A. Label-guided seed-chain-extend alignment on annotated De Bruijn graphs. Bioinformatics 2024;40:i337-i346. [PMID: 38940164 PMCID: PMC11211850 DOI: 10.1093/bioinformatics/btae226] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 06/29/2024]

Abstract

MOTIVATION

Exponential growth in sequencing databases has motivated scalable De Bruijn graph-based (DBG) indexing for searching these data, using annotations to label nodes with sample IDs. Low-depth sequencing samples correspond to fragmented subgraphs, complicating finding the long contiguous walks required for alignment queries. Aligners that target single-labelled subgraphs reduce alignment lengths due to fragmentation, leading to low recall for long reads. While some (e.g. label-free) aligners partially overcome fragmentation by combining information from multiple samples, biologically irrelevant combinations in such approaches can inflate the search space or reduce accuracy.

RESULTS

We introduce a new scoring model, 'multi-label alignment' (MLA), for annotated DBGs. MLA leverages two new operations. To promote biologically relevant sample combinations, 'Label Change' incorporates more informative global sample similarity into local scores. To improve connectivity, 'Node Length Change' dynamically adjusts the DBG node length during traversal. Our fast, approximate, yet accurate MLA implementation has two key steps: a single-label seed-chain-extend aligner (SCA) and a multi-label chainer (MLC). SCA uses a traditional scoring model adapting recent chaining improvements to assembly graphs and provides a curated pool of alignments. MLC extracts seed anchors from SCA's alignments, produces multi-label chains using MLA scoring, then finally forms multi-label alignments. We show via substantial improvements in taxonomic classification accuracy that MLA produces biologically relevant alignments, decreasing average weighted UniFrac errors by 63.1%-66.8% and covering 45.5%-47.4% (median) more long-read query characters than state-of-the-art aligners. MLA's runtimes are competitive with label-combining alignment and substantially faster than single-label alignment.

AVAILABILITY AND IMPLEMENTATION

The data, scripts, and instructions for generating our results are available at https://github.com/ratschlab/mla.

Břinda K, Lima L, Pignotti S, Quinones-Olvera N, Salikhov K, Chikhi R, Kucherov G, Iqbal Z, Baym M. Efficient and Robust Search of Microbial Genomes via Phylogenetic Compression. bioRxiv 2024:2023.04.15.536996. [PMID: 37131636 PMCID: PMC10153118 DOI: 10.1101/2023.04.15.536996] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 05/04/2023]

Yu C, Zhao Y, Zhao C, Jin J, Mao K, Wang G. MiniDBG: A Novel and Minimal De Bruijn Graph for Read Mapping. IEEE/ACM Trans Comput Biol Bioinform 2024;21:129-142. [PMID: 38060353 DOI: 10.1109/tcbb.2023.3340251] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 02/07/2024]

Schamp CN, Dhowlaghar N, Hudson LK, Bryan DW, Zhong Q, Fozo EM, Gaballa A, Wiedmann M, Denes TG. Selection of mutant Listeria phages under food-relevant conditions can enhance application potential. Appl Environ Microbiol 2023;89:e0100723. [PMID: 37800961 PMCID: PMC10617581 DOI: 10.1128/aem.01007-23] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/15/2023] [Accepted: 08/04/2023] [Indexed: 10/07/2023]

Abstract

Bacteriophages are viruses that infect and kill bacteria. Currently, phage products are available for the control of the pathogen Listeria monocytogenes in food products in the United States. In this study, we explore whether experimental evolution can be used to generate phages with improved abilities to function under specific food-relevant conditions. Ultra-pasteurized oat and whole milk were chosen as test matrices as they represent different food groups, yet have similar physical traits and macronutrient composition. We showed that (i) wild-type phage LP-125 infection kinetics are different in the two matrices and (ii) LP-125 has a significantly higher burst size in oat milk. From this, we attempted to evolve LP-125 to have improved infection kinetics in whole milk. Ancestral LP-125 was passaged through 10 rounds of amplification in milk conditions. Plaque-purified DNA samples from milk-selected phages were isolated and sequenced, and mutations present in the isolated phages were identified. We found two nonsynonymous substitutions in LP125_108 and LP125_112 genes, which encode putative baseplate-associated glycerophosphoryl diester phosphodiesterase and baseplate protein, respectively. Protein structural modeling showed that the substituted amino acids in the mutant phages are predicted to localize to surface-exposed helices on the corresponding structures, which might affect the surface charge of proteins and their interaction with the bacterial cell. The phage containing the LP125_112 mutation adsorbed significantly faster than the ancestral phage in both oat and whole milk. Follow-up experiments suggest that fat content may be a key factor for the expression of the phenotype of this mutation. IMPORTANCE Bacteriophages are one of the tools available to control the foodborne pathogen, Listeria monocytogenes. Phage products must work under a broad range of food conditions to be an effective control for L. monocytogenes. 
Here, we show that the experimental evolution of phages can be used to generate new phages with phenotypes useful under specific conditions. We used this approach to select for a mutant phage that more efficiently binds to L. monocytogenes that is grown in whole milk and oat milk. We show that the fat content of these milks is necessary for the expression of this phenotype. Our findings show that experimental evolution can be used to select for improved phages with better performance under specific conditions. This approach has the potential to support the development of condition-specific phage-based biocontrols in the food industry.

Shi ZJ, Nayfach S, Pollard KS. Maast: genotyping thousands of microbial strains efficiently. Genome Biol 2023;24:186. [PMID: 37563669 PMCID: PMC10416524 DOI: 10.1186/s13059-023-03030-8] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 07/07/2022] [Accepted: 07/31/2023] [Indexed: 08/12/2023]

Shipilina D, Pal A, Stankowski S, Chan YF, Barton NH. On the origin and structure of haplotype blocks. Mol Ecol 2023;32:1441-1457. [PMID: 36433653 PMCID: PMC10946714 DOI: 10.1111/mec.16793] [Citation(s) in RCA: 5] [Impact Index Per Article: 2.5] [Received: 01/13/2022] [Revised: 11/16/2022] [Accepted: 11/18/2022] [Indexed: 11/27/2022]

Akhter S, Westrin KJ, Zivi N, Nordal V, Kretzschmar WW, Delhomme N, Street NR, Nilsson O, Emanuelsson O, Sundström JF. Cone-setting in spruce is regulated by conserved elements of the age-dependent flowering pathway. New Phytol 2022;236:1951-1963. [PMID: 36076311 PMCID: PMC9825996 DOI: 10.1111/nph.18449] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 05/28/2022] [Accepted: 08/23/2022] [Indexed: 06/15/2023]

Hunt M, Letcher B, Malone KM, Nguyen G, Hall MB, Colquhoun RM, Lima L, Schatz MC, Ramakrishnan S, Iqbal Z. Minos: variant adjudication and joint genotyping of cohorts of bacterial genomes. Genome Biol 2022;23:147. [PMID: 35791022 PMCID: PMC9254434 DOI: 10.1186/s13059-022-02714-x] [Citation(s) in RCA: 10] [Impact Index Per Article: 3.3] [Received: 10/06/2021] [Accepted: 06/20/2022] [Indexed: 12/30/2022]

Yu C, Mao K, Zhao Y, Chang C, Wang G. StLiter: A Novel Algorithm to Iteratively Build the Compacted de Bruijn Graph From Many Complete Genomes. IEEE/ACM Trans Comput Biol Bioinform 2022;19:2471-2483. [PMID: 33630738 DOI: 10.1109/tcbb.2021.3062068] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Indexed: 06/12/2023]

Ebler J, Ebert P, Clarke WE, Rausch T, Audano PA, Houwaart T, Mao Y, Korbel JO, Eichler EE, Zody MC, Dilthey AT, Marschall T. Pangenome-based genome inference allows efficient and accurate genotyping across a wide spectrum of variant classes. Nat Genet 2022;54:518-525. [PMID: 35410384 PMCID: PMC9005351 DOI: 10.1038/s41588-022-01043-w] [Citation(s) in RCA: 121] [Impact Index Per Article: 40.3] [Received: 02/15/2021] [Accepted: 03/03/2022] [Indexed: 12/30/2022]

Rivera D, Moreno-Switt AI, Denes TG, Hudson LK, Peters TL, Samir R, Aziz RK, Noben JP, Wagemans J, Dueñas F. Novel Salmonella Phage, vB_Sen_STGO-35-1, Characterization and Evaluation in Chicken Meat. Microorganisms 2022;10:606. [PMID: 35336181 PMCID: PMC8954984 DOI: 10.3390/microorganisms10030606] [Citation(s) in RCA: 15] [Impact Index Per Article: 5.0] [Received: 01/27/2022] [Revised: 03/03/2022] [Accepted: 03/10/2022] [Indexed: 02/05/2023]

Balaji A, Sapoval N, Seto C, Leo Elworth R, Fu Y, Nute MG, Savidge T, Segarra S, Treangen TJ. KOMB: K-core based de novo characterization of copy number variation in microbiomes. Comput Struct Biotechnol J 2022;20:3208-3222. [PMID: 35832621 PMCID: PMC9249589 DOI: 10.1016/j.csbj.2022.06.019] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.7] [Received: 03/31/2022] [Revised: 06/08/2022] [Accepted: 06/09/2022] [Indexed: 11/29/2022]

CStone: A de novo transcriptome assembler for short-read data that identifies non-chimeric contigs based on underlying graph structure. PLoS Comput Biol 2021;17:e1009631. [PMID: 34813594 PMCID: PMC8651127 DOI: 10.1371/journal.pcbi.1009631] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 06/23/2021] [Revised: 12/07/2021] [Accepted: 11/11/2021] [Indexed: 11/19/2022]

Abstract

With the exponential growth of sequence information stored over the last decade, including that of de novo assembled contigs from RNA-Seq experiments, quantification of chimeric sequences has become essential when assembling read data. In transcriptomics, de novo assembled chimeras can closely resemble underlying transcripts, but patterns such as those seen between co-evolving sites, or mapped read counts, become obscured. We have created a de Bruijn based de novo assembler for RNA-Seq data that utilizes a classification system to describe the complexity of underlying graphs from which contigs are created. Each contig is labelled with one of three levels, indicating whether or not ambiguous paths exist. A by-product of this is information on the range of complexity of the underlying gene families present. As a demonstration of CStone's ability to assemble high-quality contigs, and to label them in this manner, both simulated and real data were used. For simulated data, ten million read pairs were generated from cDNA libraries representing four species, Drosophila melanogaster, Panthera pardus, Rattus norvegicus and Serinus canaria. These were assembled using CStone, Trinity and rnaSPAdes; the latter two being high-quality, well established, de novo assemblers. For real data, two RNA-Seq datasets, each consisting of ≈30 million read pairs, representing two adult D. melanogaster whole-body samples were used. The contigs that CStone produced were comparable in quality to those of Trinity and rnaSPAdes in terms of length, sequence identity of aligned regions and the range of cDNA transcripts represented, whilst providing additional information on chimerism. Here we describe the details of CStone's assembly and classification process, and propose that similar classification systems can be incorporated into other de novo assembly tools. Within a related side study, we explore the effects that chimeras within reference sets have on the identification of differentially expressed genes. CStone is available at: https://sourceforge.net/projects/cstone/.

Within transcriptome reference sets, non-chimeric sequences are representations of transcribed genes, while artificially generated chimeric ones are mosaics of two or more pieces of DNA incorrectly pieced together. One area where such sets are utilized is in the quantification of gene expression patterns; where RNA-Seq reads are mapped to the sequences within, and subsequent count values reflect expression levels. Artificial chimeras can have a negative impact on count values by erroneously increasing variation in relation to the reads being mapped. Reference sets can be created from de novo assembled contigs, but chimeras can be introduced during the assembly process via the required traversal of graphs, representing gene families, constructed from the RNA-Seq data. Graph complexity determines how likely chimeras will arise. We have created CStone, a de novo assembler that utilizes a classification system to describe such complexity. Contigs created by CStone are labelled in a manner that indicates whether or not they are non-chimeric. This encourages contig dependent results to be presented with increased objectivity by maintaining the context of ambiguity associated with the assembly process. CStone has been tested extensively. Additionally, we have quantified the relationship between chimeras within reference sets and the identification of differentially expressed genes.

Krannich T, White WTJ, Niehus S, Holley G, Halldórsson BV, Kehr B. Population-scale detection of non-reference sequence variants using colored de Bruijn graphs. Bioinformatics 2021;38:604-611. [PMID: 34726732 PMCID: PMC8756200 DOI: 10.1093/bioinformatics/btab749] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 04/12/2021] [Revised: 09/27/2021] [Accepted: 10/28/2021] [Indexed: 02/03/2023]

Danciu D, Karasikov M, Mustafa H, Kahles A, Rätsch G. Topology-based sparsification of graph annotations. Bioinformatics 2021;37:i169-i176. [PMID: 34252940 PMCID: PMC8346655 DOI: 10.1093/bioinformatics/btab330] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Accepted: 05/03/2021] [Indexed: 01/03/2023]

Abstract

Motivation

Since the amount of published biological sequencing data is growing exponentially, efficient methods for storing and indexing this data are more needed than ever to truly benefit from this invaluable resource for biomedical research. Labeled de Bruijn graphs are a frequently-used approach for representing large sets of sequencing data. While significant progress has been made to succinctly represent the graph itself, efficient methods for storing labels on such graphs are still rapidly evolving.

Results

In this article, we present RowDiff, a new technique for compacting graph labels by leveraging expected similarities in annotations of vertices adjacent in the graph. RowDiff can be constructed in linear time relative to the number of vertices and labels in the graph, and in space proportional to the graph size. In addition, construction can be efficiently parallelized and distributed, making the technique applicable to graphs with trillions of nodes. RowDiff can be viewed as an intermediary sparsification step of the original annotation matrix and can thus naturally be combined with existing generic schemes for compressed binary matrices. Experiments on 10 000 RNA-seq datasets show that RowDiff combined with multi-BRWT results in a 30% reduction in annotation footprint over Mantis-MST, the previously known most compact annotation representation. Experiments on the sparser Fungi subset of the RefSeq collection show that applying RowDiff sparsification reduces the size of individual annotation columns stored as compressed bit vectors by an average factor of 42. When combining RowDiff with a multi-BRWT representation, the resulting annotation is 26 times smaller than Mantis-MST.

Availability and implementation

RowDiff is implemented in C++ within the MetaGraph framework. The source code and the data used in the experiments are publicly available at https://github.com/ratschlab/row_diff.
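
The sparsification idea summarized in this abstract can be illustrated with a toy sketch. The abstract says only that RowDiff exploits the similarity between annotations of adjacent vertices. The version below assumes each non-anchor node stores the symmetric difference between its own label set and that of one chosen successor, while anchor nodes keep their full label set. The function names, the use of Python sets in place of compressed bit vectors, and the choice of anchors and successors are all hypothetical and are not taken from MetaGraph or the paper.

```python
def row_diff_encode(rows, successor, anchors):
    """Replace each non-anchor row with its difference to the successor's row.

    rows:      dict mapping node -> set of labels (the full annotation row)
    successor: dict mapping each non-anchor node -> its chosen successor node
    anchors:   set of nodes whose rows are stored in full (assumed given)
    """
    encoded = {}
    for node, labels in rows.items():
        if node in anchors:
            encoded[node] = set(labels)
        else:
            # Symmetric difference: labels present in exactly one of the two rows.
            encoded[node] = set(labels) ^ set(rows[successor[node]])
    return encoded


def row_diff_decode(encoded, successor, anchors, node):
    """Rebuild a full row by XOR-ing diffs along the successor path to an anchor."""
    labels = set()
    while True:
        labels ^= encoded[node]
        if node in anchors:
            return labels
        node = successor[node]


# Toy path a -> b -> c -> d, with d as the only anchor (hypothetical data).
rows = {
    "a": {"s1", "s2"},
    "b": {"s1", "s2"},
    "c": {"s1", "s2", "s3"},
    "d": {"s1", "s2", "s3"},
}
successor = {"a": "b", "b": "c", "c": "d"}
anchors = {"d"}

encoded = row_diff_encode(rows, successor, anchors)
stored_before = sum(len(r) for r in rows.values())
stored_after = sum(len(r) for r in encoded.values())
print(stored_before, stored_after)  # 10 4
```

In this toy case the encoded rows hold 4 label entries instead of 10, at the cost of walking the successor path on every lookup. The sparsified rows could then be passed to a generic compressed binary matrix scheme, as the abstract describes.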

Alanko J, Alipanahi B, Settle J, Boucher C, Gagie T. Buffering updates enables efficient dynamic de Bruijn graphs. Comput Struct Biotechnol J 2021;19:4067-4078. [PMID: 34377371 PMCID: PMC8326735 DOI: 10.1016/j.csbj.2021.06.047] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/15/2021] [Revised: 06/29/2021] [Accepted: 06/29/2021] [Indexed: 12/24/2022]

Rahman A, Chikhi R, Medvedev P. Disk compression of k-mer sets. Algorithms Mol Biol 2021;16:10. [PMID: 34154632 PMCID: PMC8218509 DOI: 10.1186/s13015-021-00192-7] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 02/05/2021] [Accepted: 06/08/2021] [Indexed: 12/23/2022]

High-Resolution Genomic Comparisons within Salmonella enterica Serotypes Derived from Beef Feedlot Cattle: Parsing the Roles of Cattle Source, Pen, Animal, Sample Type, and Production Period. Appl Environ Microbiol 2021;87:e0048521. [PMID: 33863705 DOI: 10.1128/aem.00485-21] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Indexed: 11/20/2022]

Abstract

Salmonella enterica is a major foodborne pathogen, and contaminated beef products have been identified as one of the primary sources of Salmonella-related outbreaks. Pathogenicity and antibiotic resistance of Salmonella are highly serotype and subpopulation specific, which makes it essential to understand high-resolution Salmonella population dynamics in cattle. Time of year, source of cattle, pen, and sample type (i.e., feces, hide, or lymph nodes) have previously been identified as important factors influencing the serotype distribution of Salmonella (e.g., Anatum, Lubbock, Cerro, Montevideo, Kentucky, Newport, and Norwich) that were isolated from a longitudinal sampling design in a research feedlot. In this study, we performed high-resolution genomic comparisons of Salmonella isolates within each serotype using both single-nucleotide polymorphism-based maximum-likelihood phylogeny and hierarchical clustering of core-genome multilocus sequence typing. The importance of the aforementioned features in clonal Salmonella expansion was further explored using a supervised machine learning algorithm. In addition, we identified and compared the resistance genes, plasmids, and pathogenicity island profiles of the isolates within each subpopulation. Our findings indicate that clonal expansion of Salmonella strains in cattle was mainly influenced by the randomization of block and pen, as well as the origin/source of the cattle, i.e., regardless of sampling time and sample type (i.e., feces, lymph node, or hide). Further research is needed concerning the role of the feedlot pen environment prior to cattle placement to better understand carryover contributions of existing strains of Salmonella and their bacteriophages. IMPORTANCE Salmonella serotypes isolated from outbreaks in humans can also be found in beef cattle and feedlots. Virulence factors and antibiotic resistance are among the primary defense mechanisms of Salmonella, and are often associated with clonal expansion. 
This makes understanding the subpopulation dynamics of Salmonella in cattle critical for effective mitigation. There remains a gap in the literature concerning subpopulation dynamics within Salmonella serotypes in feedlot cattle from the beginning of feeding up until slaughter. Here, we explore Salmonella population dynamics within each serotype using core-genome phylogeny and hierarchical classifications. We used machine learning to quantitatively parse the relative importance of both hierarchical and longitudinal clustering among cattle host samples. Our results reveal that Salmonella populations in cattle are highly clonal over a 6-month study period and that clonal dissemination of Salmonella in cattle is mainly influenced spatially by experimental block and pen, as well as by the geographical origin of the cattle.

Alipanahi B, Muggli MD, Jundi M, Noyes NR, Boucher C. Metagenome SNP calling via read-colored de Bruijn graphs. Bioinformatics 2021;36:5275-5281. [PMID: 32049324 DOI: 10.1093/bioinformatics/btaa081] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 05/16/2018] [Revised: 01/08/2020] [Accepted: 02/03/2020] [Indexed: 11/13/2022]

Abstract

MOTIVATION

Metagenomics refers to the study of complex samples containing the genetic content of multiple individual organisms and, thus, has been used to elucidate the microbiome and resistome of a complex sample. The microbiome refers to all microbial organisms in a sample, and the resistome refers to all of the antimicrobial resistance (AMR) genes in pathogenic and non-pathogenic bacteria. Single-nucleotide polymorphisms (SNPs) can be effectively used to 'fingerprint' specific organisms and genes within the microbiome and resistome and trace their movement across various samples. However, to effectively use these SNPs for this traceability, a scalable and accurate metagenomics SNP caller is needed. Moreover, such an SNP caller should not be reliant on reference genomes since 95% of microbial species are unculturable, making the determination of a reference genome extremely challenging. In this article, we address this need.

RESULTS

We present LueVari, a reference-free SNP caller based on the read-colored de Bruijn graph, an extension of the traditional de Bruijn graph that allows repeated regions longer than the k-mer length and shorter than the read length to be identified unambiguously. LueVari is able to identify SNPs in both AMR genes and chromosomal DNA from shotgun metagenomics data with reliable sensitivity (between 91% and 99%) and precision (between 71% and 99%) as the performance of competing methods varies widely. Furthermore, we show that LueVari constructs sequences containing the variation, which span up to 97.8% of genes in datasets, which can be helpful in detecting distinct AMR genes in large metagenomic datasets.

AVAILABILITY AND IMPLEMENTATION

Code and datasets are publicly available at https://github.com/baharpan/cosmo/tree/LueVari.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Horesh G, Blackwell GA, Tonkin-Hill G, Corander J, Heinz E, Thomson NR. A comprehensive and high-quality collection of Escherichia coli genomes and their genes. Microb Genom 2021;7:000499. [PMID: 33417534 PMCID: PMC8208696 DOI: 10.1099/mgen.0.000499] [Citation(s) in RCA: 27] [Impact Index Per Article: 6.8] [Received: 09/21/2020] [Accepted: 12/07/2020] [Indexed: 01/25/2023]

Taiwo AO, Harper LA, Derbyshire MC. Impacts of fludioxonil resistance on global gene expression in the necrotrophic fungal plant pathogen Sclerotinia sclerotiorum. BMC Genomics 2021;22:91. [PMID: 33516198 PMCID: PMC7847169 DOI: 10.1186/s12864-021-07402-x] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.5] [Received: 07/27/2020] [Accepted: 01/21/2021] [Indexed: 01/23/2023]

Abstract

Background

The fungicide fludioxonil over-stimulates the fungal response to osmotic stress, leading to over-accumulation of glycerol and hyphal swelling and bursting. Fludioxonil-resistant fungal strains that are null-mutants for osmotic stress response genes are easily generated through continual sub-culturing on sub-lethal fungicide doses. Using this approach combined with RNA sequencing, we aimed to characterise the effects of mutations in osmotic stress response genes on the transcriptional profile of the important agricultural pathogen Sclerotinia sclerotiorum under standard laboratory conditions. Our objective was to understand the impact of disruption of the osmotic stress response on the global transcriptional regulatory network in an important agricultural pathogen.

Results

We generated two fludioxonil-resistant S. sclerotiorum strains, which exhibited growth defects and hypersensitivity to osmotic stressors. Both had missense mutations in the homologue of the Neurospora crassa osmosensing two component histidine kinase gene OS1, and one had a disruptive in-frame deletion in a non-associated gene. RNA sequencing showed that both strains together differentially expressed 269 genes relative to the parent during growth in liquid broth. Of these, 185 (69%) were differentially expressed in both strains in the same direction, indicating similar effects of the different point mutations in OS1 on the transcriptome. Among these genes were numerous transmembrane transporters and secondary metabolite biosynthetic genes.

Conclusions

Our study is an initial investigation into the kinds of processes regulated through the osmotic stress pathway in S. sclerotiorum. It highlights a possible link between secondary metabolism and osmotic stress signalling, which could be followed up in future studies.

Supplementary Information

The online version contains supplementary material available at 10.1186/s12864-021-07402-x.

Holley G, Beyter D, Ingimundardottir H, Møller PL, Kristmundsdottir S, Eggertsson HP, Halldorsson BV. Ratatosk: hybrid error correction of long reads enables accurate variant calling and assembly. Genome Biol 2021;22:28. [PMID: 33419473 PMCID: PMC7792008 DOI: 10.1186/s13059-020-02244-4] [Citation(s) in RCA: 37] [Impact Index Per Article: 9.3] [Received: 07/17/2020] [Accepted: 12/15/2020] [Indexed: 12/20/2022]

Mutant and Recombinant Phages Selected from In Vitro Coevolution Conditions Overcome Phage-Resistant Listeria monocytogenes. Appl Environ Microbiol 2020;86:AEM.02138-20. [PMID: 32887717 DOI: 10.1128/aem.02138-20] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/28/2020] [Accepted: 08/31/2020] [Indexed: 12/17/2022]

Abstract

Bacteriophages (phages) are currently available for use by the food industry to control the foodborne pathogen Listeria monocytogenes. Although phage biocontrols are effective under specific conditions, their use can select for phage-resistant bacteria that repopulate phage-treated environments. Here, we performed short-term coevolution experiments to investigate the impact of single phages and a two-phage cocktail on the regrowth of phage-resistant L. monocytogenes and the adaptation of the phages to overcome this resistance. We used whole-genome sequencing to identify mutations in the target host that confer phage resistance and in the phages that alter host range. We found that infections with Listeria phages LP-048, LP-125, or a combination of both select for different populations of phage-resistant L. monocytogenes bacteria with different regrowth times. Phages isolated from the end of the coevolution experiments were found to have gained the ability to infect phage-resistant mutants of L. monocytogenes and L. monocytogenes strains previously found to be broadly resistant to phage infection. Phages isolated from coinfected cultures were identified as recombinants of LP-048 and LP-125. Interestingly, recombination events occurred twice independently in a locus encoding two proteins putatively involved in DNA binding. We show that short-term coevolution of phages and their hosts can be utilized to obtain mutant and recombinant phages with adapted host ranges. These laboratory-evolved phages may be useful for limiting the emergence of phage resistance and for targeting strains that show general resistance to wild-type (WT) phages. IMPORTANCE Listeria monocytogenes is a life-threatening bacterial foodborne pathogen that can persist in food processing facilities for years. Phages can be used to control L. monocytogenes in food production, but phage-resistant bacterial subpopulations can regrow in phage-treated environments. Coevolution experiments were conducted on a Listeria phage-host system to provide insight into the genetic variation that emerges in both the phage and bacterial host under reciprocal selective pressure. As expected, mutations were identified in both phage and host, but additionally, recombination events were shown to have repeatedly occurred between closely related phages that coinfected L. monocytogenes. This study demonstrates that in vitro evolution of phages can be utilized to expand the host range and improve the long-term efficacy of phage-based control of L. monocytogenes. This approach may also be applied to other phage-host systems for applications in biocontrol, detection, and phage therapy.

The ecological and genomic basis of explosive adaptive radiation. Nature 2020;586:75-79. [DOI: 10.1038/s41586-020-2652-7] [Citation(s) in RCA: 87] [Impact Index Per Article: 17.4] [Received: 02/14/2020] [Accepted: 05/22/2020] [Indexed: 12/22/2022]

Garimella KV, Iqbal Z, Krause MA, Campino S, Kekre M, Drury E, Kwiatkowski D, Sá JM, Wellems TE, McVean G. Detection of simple and complex de novo mutations with multiple reference sequences. Genome Res 2020;30:1154-1169. [PMID: 32817236 PMCID: PMC7462078 DOI: 10.1101/gr.255505.119] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/02/2019] [Accepted: 07/17/2020] [Indexed: 12/25/2022]

Affiliation(s)

Kiran V Garimella: Data Sciences Platform, Broad Institute of MIT and Harvard, Cambridge, Massachusetts 02142, USA; Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom
Zamin Iqbal: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; European Bioinformatics Institute (EMBL-EBI), Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SD, United Kingdom
Michael A Krause: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom; Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Susana Campino: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Mihir Kekre: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Eleanor Drury: The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Dominic Kwiatkowski: Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom; The Wellcome Trust Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, United Kingdom
Juliana M Sá: Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Thomas E Wellems: Laboratory of Malaria and Vector Research, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Bethesda, Maryland 20892, USA
Gil McVean: Wellcome Trust Centre for Human Genetics, University of Oxford, Oxford, Oxfordshire, OX3 7BN, United Kingdom; Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, Oxford, Oxfordshire, OX3 7LF, United Kingdom

Petit RA, Read TD. Bactopia: a Flexible Pipeline for Complete Analysis of Bacterial Genomes. mSystems 2020;5:e00190-20. [PMID: 32753501 PMCID: PMC7406220 DOI: 10.1128/msystems.00190-20] [Citation(s) in RCA: 107] [Impact Index Per Article: 21.4] [Received: 04/23/2020] [Accepted: 07/15/2020] [Indexed: 12/19/2022] Open

Abstract

Sequencing of bacterial genomes using Illumina technology has become such a standard procedure that often data are generated faster than can be conveniently analyzed. We created a new series of pipelines called Bactopia, built using Nextflow workflow software, to provide efficient comparative genomic analyses for bacterial species or genera. Bactopia consists of a data set setup step (Bactopia Data Sets [BaDs]), which creates a series of customizable data sets for the species of interest, the Bactopia Analysis Pipeline (BaAP), which performs quality control, genome assembly, and several other functions based on the available data sets and outputs the processed data to a structured directory format, and a series of Bactopia Tools (BaTs) that perform specific postprocessing on some or all of the processed data. BaTs include pan-genome analysis, computing average nucleotide identity between samples, extracting and profiling the 16S genes, and taxonomic classification using highly conserved genes. It is expected that the number of BaTs will increase to fill specific applications in the future. As a demonstration, we performed an analysis of 1,664 public Lactobacillus genomes, focusing on Lactobacillus crispatus, a species that is a common part of the human vaginal microbiome. Bactopia is an open source system that can scale from projects as small as one bacterial genome to ones including thousands of genomes and that allows for great flexibility in choosing comparison data sets and options for downstream analysis. Bactopia code can be accessed at https://www.github.com/bactopia/bactopia.

IMPORTANCE: It is now relatively easy to obtain a high-quality draft genome sequence of a bacterium, but bioinformatic analysis requires organization and optimization of multiple open source software tools. We present Bactopia, a pipeline for bacterial genome analysis, as an option for processing bacterial genome data. Bactopia also automates downloading of data from multiple public sources and species-specific customization. Because the pipeline is written in the Nextflow language, analyses can be scaled from individual genomes on a local computer to thousands of genomes using cloud resources. As a usage example, we processed 1,664 Lactobacillus genomes from public sources and used comparative analysis workflows (Bactopia Tools) to identify and analyze members of the L. crispatus species.

A unified catalog of 204,938 reference genomes from the human gut microbiome. Nat Biotechnol 2020;39:105-114. [PMID: 32690973 PMCID: PMC7801254 DOI: 10.1038/s41587-020-0603-3] [Citation(s) in RCA: 688] [Impact Index Per Article: 137.6] [Received: 09/18/2019] [Accepted: 05/31/2020] [Indexed: 01/08/2023]

Listeria monocytogenes is prevalent in retail produce environments but Salmonella enterica is rare. Food Control 2020. [DOI: 10.1016/j.foodcont.2020.107173] [Citation(s) in RCA: 16] [Impact Index Per Article: 3.2] [Indexed: 12/16/2022]

Eizenga JM, Novak AM, Sibbesen JA, Heumos S, Ghaffaari A, Hickey G, Chang X, Seaman JD, Rounthwaite R, Ebler J, Rautiainen M, Garg S, Paten B, Marschall T, Sirén J, Garrison E. Pangenome Graphs. Annu Rev Genomics Hum Genet 2020;21:139-162. [PMID: 32453966 DOI: 10.1146/annurev-genom-120219-080406] [Citation(s) in RCA: 136] [Impact Index Per Article: 27.2] [Indexed: 12/12/2022]

Affiliation(s)

Jordan M Eizenga: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Adam M Novak: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Jonas A Sibbesen: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Simon Heumos: Quantitative Biology Center, University of Tübingen, 72076 Tübingen, Germany
Ali Ghaffaari: Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany; Max Planck Institute for Informatics, 66123 Saarbrücken, Germany; Saarbrücken Graduate School for Computer Science, Saarland University, 66123 Saarbrücken, Germany
Glenn Hickey: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Xian Chang: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Josiah D Seaman: Royal Botanic Gardens, Kew, Richmond TW9 3AB, United Kingdom; School of Biological and Chemical Sciences, Queen Mary University of London, London E1 4NS, United Kingdom
Robin Rounthwaite: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Jana Ebler: Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany; Max Planck Institute for Informatics, 66123 Saarbrücken, Germany; Saarbrücken Graduate School for Computer Science, Saarland University, 66123 Saarbrücken, Germany
Mikko Rautiainen: Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany; Max Planck Institute for Informatics, 66123 Saarbrücken, Germany; Saarbrücken Graduate School for Computer Science, Saarland University, 66123 Saarbrücken, Germany
Shilpa Garg: Departments of Genetics and Biomedical Informatics, Harvard Medical School, Boston, Massachusetts 02215, USA; Department of Data Sciences, Dana-Farber Cancer Institute, Boston, Massachusetts 02215, USA
Benedict Paten: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Tobias Marschall: Center for Bioinformatics, Saarland University, 66123 Saarbrücken, Germany; Max Planck Institute for Informatics, 66123 Saarbrücken, Germany
Jouni Sirén: Genomics Institute, University of California, Santa Cruz, California 95064, USA
Erik Garrison: Genomics Institute, University of California, Santa Cruz, California 95064, USA

Almodaresi F, Pandey P, Ferdman M, Johnson R, Patro R. An Efficient, Scalable, and Exact Representation of High-Dimensional Color Information Enabled Using de Bruijn Graph Search. J Comput Biol 2020;27:485-499. [PMID: 32176522 PMCID: PMC7185321 DOI: 10.1089/cmb.2019.0322] [Citation(s) in RCA: 13] [Impact Index Per Article: 2.6] [Indexed: 11/13/2022] Open

Homburgvirus LP-018 Has a Unique Ability to Infect Phage-Resistant Listeria monocytogenes. Viruses 2019;11:v11121166. [PMID: 31861087 PMCID: PMC6950383 DOI: 10.3390/v11121166] [Citation(s) in RCA: 12] [Impact Index Per Article: 2.0] [Received: 10/22/2019] [Revised: 12/11/2019] [Accepted: 12/15/2019] [Indexed: 12/17/2022] Open

Complete Genome Sequences of Two Listeria Phages of the Genus Pecentumvirus. Microbiol Resour Announc 2019;8:8/46/e01229-19. [PMID: 31727716 PMCID: PMC6856282 DOI: 10.1128/mra.01229-19] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Indexed: 11/20/2022] Open

Rivera D, Hudson LK, Denes TG, Hamilton-West C, Pezoa D, Moreno-Switt AI. Two Phages of the Genera Felixounavirus Subjected to 12 Hour Challenge on Salmonella Infantis Showed Distinct Genotypic and Phenotypic Changes. Viruses 2019;11:E586. [PMID: 31252667 PMCID: PMC6669636 DOI: 10.3390/v11070586] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.8] [Received: 05/15/2019] [Revised: 06/23/2019] [Accepted: 06/25/2019] [Indexed: 12/15/2022] Open

Abstract

Salmonella Infantis is considered in recent years an emerging Salmonella serovar, as it has been associated with several outbreaks and multidrug resistance phenotypes. Phages appear as a possible alternative strategy to control Salmonella Infantis (SI). The aims of this work were to characterize two phages of the Felixounavirus genus, isolated using the same strain of SI, and to expose them to interact in challenge assays to identify genetic and phenotypic changes generated from these interactions. These two phages have a shared nucleotide identity of 97% and are differentiated by their host range: one phage has a wide host range (lysing 14 serovars), and the other has a narrow host range (lysing 6 serovars). During the 12 h challenge we compared: (1) optical density of SI, (2) proportion of SI survivors from phage-infected cultures, and (3) phage titer. Isolates obtained through the assays were evaluated by efficiency of plating (EOP) and by host-range characterization. Genomic modifications were characterized by evaluation of single nucleotide polymorphisms (SNPs). The optical density (600 nm) of phage-infected SI decreased, as compared to the uninfected control, by an average of 0.7 for SI infected with the wide-host-range (WHR) phage and by 0.3 for SI infected with the narrow-host-range (NHR) phage. WHR phage reached higher phage titer (7 × 10^11 PFU/mL), and a lower proportion of SI survivor was obtained from the challenge assay. In SI that interacted with phages, we identified SNPs in two genes (rfaK and rfaB), which are both involved in lipopolysaccharide (LPS) polymerization. Therefore, mutations that could impact potential phage receptors on the host surface were selected by lytic phage exposure. This work demonstrates that the interaction of Salmonella phages (WHR and NHR) with SI for 12 h in vitro leads to emergence of new phenotypic and genotypic traits in both phage and host. This information is crucial for the rational design of phage-based control strategies.

Cross-resistance to phage infection in Listeria monocytogenes serotype 1/2a mutants. Food Microbiol 2019;84:103239. [PMID: 31421769 DOI: 10.1016/j.fm.2019.06.003] [Citation(s) in RCA: 24] [Impact Index Per Article: 4.0] [Received: 01/01/2019] [Revised: 05/31/2019] [Accepted: 06/03/2019] [Indexed: 01/22/2023]

Ultrafast search of all deposited bacterial and viral genomic data. Nat Biotechnol 2019;37:152-159. [PMID: 30718882 PMCID: PMC6420049 DOI: 10.1038/s41587-018-0010-1] [Citation(s) in RCA: 71] [Impact Index Per Article: 11.8] [Received: 12/26/2017] [Accepted: 12/20/2018] [Indexed: 02/07/2023]

Akhter S, Kretzschmar WW, Nordal V, Delhomme N, Street NR, Nilsson O, Emanuelsson O, Sundström JF. Integrative Analysis of Three RNA Sequencing Methods Identifies Mutually Exclusive Exons of MADS-Box Isoforms During Early Bud Development in Picea abies. FRONTIERS IN PLANT SCIENCE 2018;9:1625. [PMID: 30483285 PMCID: PMC6243048 DOI: 10.3389/fpls.2018.01625] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.0] [Received: 06/15/2018] [Accepted: 10/18/2018] [Indexed: 05/06/2023]

Abstract

Recent efforts to sequence the genomes and transcriptomes of several gymnosperm species have revealed an increased complexity in certain gene families in gymnosperms as compared to angiosperms. One example of this is the gymnosperm sister clade to angiosperm TM3-like MADS-box genes, which at least in the conifer lineage has expanded in number of genes. We have previously identified a member of this sub-clade, the conifer gene DEFICIENS AGAMOUS LIKE 19 (DAL19), as being specifically upregulated in cone-setting shoots. Here, we show through Sanger sequencing of mRNA-derived cDNA and mapping to assembled conifer genomic sequences that DAL19 produces six mature mRNA splice variants in Picea abies. These splice variants use alternate first and last exons, while their four central exons constitute a core region present in all six transcripts. Thus, they are likely to be transcript isoforms. Quantitative Real-Time PCR revealed that two mutually exclusive first DAL19 exons are differentially expressed across meristems that will form either male or female cones, or vegetative shoots. Furthermore, mRNA in situ hybridization revealed that two mutually exclusive last DAL19 exons were expressed in a cell-specific pattern within bud meristems. Based on these findings in DAL19, we developed a sensitive approach to transcript isoform assembly from short-read sequencing of mRNA. We applied this method to 42 putative MADS-box core regions in P. abies, from which we assembled 1084 putative transcripts. We manually curated these transcripts to arrive at 933 assembled transcript isoforms of 38 putative MADS-box genes. 152 of these isoforms, which we assign to 28 putative MADS-box genes, were differentially expressed across eight female, male, and vegetative buds. We further provide evidence of the expression of 16 out of the 38 putative MADS-box genes by mapping PacBio Iso-Seq circular consensus reads derived from pooled sample sequencing to assembled transcripts. In summary, our analyses reveal the use of mutually exclusive exons of MADS-box gene isoforms during early bud development in P. abies, and we find that the large number of identified MADS-box transcripts in P. abies results not only from expansion of the gene family through gene duplication events but also from the generation of numerous splice variants.
