51
Nagy NA, Rácz R, Rimington O, Póliska S, Orozco-terWengel P, Bruford MW, Barta Z. Draft genome of a biparental beetle species, Lethrus apterus. BMC Genomics 2021; 22:301. [PMID: 33902445 PMCID: PMC8074431 DOI: 10.1186/s12864-021-07627-w] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/28/2020] [Accepted: 04/13/2021] [Indexed: 11/23/2022] Open
Abstract
BACKGROUND: The genomic architecture underpinning parental behaviour in subsocial insects with simple parental care is poorly understood, which hinders a full understanding of the evolutionary origin of sociality. Lethrus apterus is one of the few insect species with biparental care. During the reproductive period, the parents divide labour to provide food and protection for their offspring.
RESULTS: Here, we report the draft genome of L. apterus, the first genome in the family Geotrupidae. The final assembly consisted of 286.93 Mbp in 66,933 scaffolds. Completeness analysis found that the assembly contained 93.5% of the Endopterygota core BUSCO gene set. Ab initio gene prediction resulted in 25,385 coding genes, whereas homology-based analyses predicted 22,551 protein-coding genes. After merging, 20,734 were found during functional annotation. In a comparison with other publicly available beetle genomes, 23,528 of the predicted genes were assigned to orthogroups, of which 1664 were in species-specific groups. Additionally, reproduction-related genes were found among the predicted genes, on the basis of which a reduction in the number of odorant- and pheromone-binding proteins was detected.
CONCLUSIONS: These genes can be used in further comparative and functional genomic research, which can advance our understanding of the genetic basis, and hence the evolution, of parental behaviour.
Affiliation(s)
- Nikoletta A Nagy
  - MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
  - Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Rita Rácz
  - MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
  - Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
- Szilárd Póliska
  - Genomic Medicine and Bioinformatic Core Facility, Department of Biochemistry and Molecular Biology, Faculty of Medicine, University of Debrecen, Debrecen, Hungary
- Zoltán Barta
  - MTA-DE Behavioural Ecology Research Group, Department of Evolutionary Zoology, University of Debrecen, Egyetem tér 1, Debrecen, H-4032, Hungary
  - Department of Evolutionary Zoology and Human Biology, University of Debrecen, Debrecen, Hungary
52
Kämpfer P, Glaeser SP, McInroy JA, Clermont D, Criscuolo A, Busse HJ. Pseudomonas carbonaria sp. nov., isolated from charcoal. Int J Syst Evol Microbiol 2021; 71. [PMID: 33835910 DOI: 10.1099/ijsem.0.004750] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 01/22/2023] Open
Abstract
A beige-pigmented, oxidase-positive bacterial isolate, Wesi-4T, isolated from charcoal in 2012, was examined in detail by applying a polyphasic taxonomic approach. Cells of the isolate were rod shaped and Gram-stain negative. Examination of the 16S rRNA gene sequence of the isolate revealed highest sequence similarities to the type strains of Pseudomonas matsuisoli and Pseudomonas nosocomialis (both 97.3 %). Phylogenetic analyses on the basis of the 16S rRNA gene sequences indicated a separate position of Wesi-4T, which was confirmed by multilocus sequence analyses (MLSA) based on the three loci gyrB, rpoB and rpoD and a core genome-based phylogenetic tree. Genome sequence-based comparison of Wesi-4T and the type strains of P. matsuisoli and P. nosocomialis yielded average nucleotide identity values <95 % and in silico DNA-DNA hybridization values <70 %, respectively. The polyamine pattern contains the major amines putrescine, cadaverine and spermidine. The quinone system contains predominantly ubiquinone Q-9, and in the polar lipid profile diphosphatidylglycerol, phosphatidylglycerol and phosphatidylethanolamine are the major lipids. The fatty acid profile contains predominantly C16 : 0, summed feature 3 (C16 : 1ω7c and/or C16 : 1ω6c) and summed feature 8 (C18 : 1ω7c and/or C18 : 1ω6c). In addition, physiological and biochemical tests revealed a clear phenotypic difference from P. matsuisoli. These cumulative data indicate that the isolate represents a novel species of the genus Pseudomonas, for which the name Pseudomonas carbonaria sp. nov. is proposed, with Wesi-4T (=DSM 110367T=CIP 111764T=CCM 9017T) as the type strain.
Affiliation(s)
- Peter Kämpfer
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- S P Glaeser
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- John A McInroy
  - Department of Entomology and Plant Pathology, Auburn University, Alabama, USA
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, Paris, France
- Hans-Jürgen Busse
  - Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
53
Heo Y, Manikandan G, Ramachandran A, Chen D. Comprehensive Evaluation of Error-Correction Methodologies for Genome Sequencing Data. Bioinformatics 2021. [DOI: 10.36255/exonpublications.bioinformatics.2021.ch6] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/20/2022] Open
54
Kämpfer P, Busse HJ, Glaeser SP, Clermont D, Criscuolo A, Mietke H. Jeotgalicoccus meleagridis sp. nov. isolated from bioaerosol from emissions of a turkey fattening plant and reclassification of Jeotgalicoccus halophilus Liu et al. 2011 as a later heterotypic synonym of Jeotgalicoccus aerolatus Martin et al. 2011. Int J Syst Evol Microbiol 2021; 71. [PMID: 33724175 DOI: 10.1099/ijsem.0.004745] [Citation(s) in RCA: 11] [Impact Index Per Article: 2.8] [Indexed: 11/18/2022] Open
Abstract
A Gram-stain-positive, non-motile, non-spore-forming coccus (strain Do 184T) was isolated from exhaust air of a turkey fattening plant on mannitol salt agar. The strain shared high 16S rRNA gene sequence similarity to the type strains of Jeotgalicoccus aerolatus (98.0%) followed by Jeotgalicoccus marinus (97.2%) and Jeotgalicoccus huakuii (97.1%). All other 16S rRNA gene sequence similarities to species of the genus Jeotgalicoccus were below 97%. The average nucleotide identities (ANI) between the Do 184T genome assembly and those of the type strains of species of the genus Jeotgalicoccus were far below the 95% species delineation cutoff value, ranging from 79.47% (J. marinus DSM 19772T) to 75.30% (J. pinnipedialis CIP 107946T). The quinone system of Do 184T, the polar lipid profile, the polyamine pattern and the fatty acid profile were in congruence with those reported for other species of the genus Jeotgalicoccus and thus supported the affiliation of Do 184T to this genus. Do 184T represents a novel species, for which the name Jeotgalicoccus meleagridis sp. nov. is proposed, with the type strain Do 184T (=LMG 31100T=CCM 8918T=CIP 111649T). In addition, data on genome sequences of Jeotgalicoccus halophilus C1-52T (=CGMCC 1.8911T=NBRC 105788T) and Jeotgalicoccus aerolatus MPA-33T (=CCM 7679T=CCUG 57953T=DSM 22420T=CIP 111750T) indicate that both isolates represent the same species. Pairwise ANI between the genomes of these two strains yielded similarities of 98.98-99.05%. These results indicate that these strains represent members of the same species. Due to priority of publication it is proposed that Jeotgalicoccus halophilus is reclassified as Jeotgalicoccus aerolatus.
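The species-delineation reasoning in this abstract can be illustrated with a minimal Python sketch. It only applies the 95% ANI cutoff quoted above to the ANI values reported in the abstract; the function name `same_species_by_ani` and the choice to treat values at or above the cutoff as "same species" are assumptions made for illustration, not the authors' pipeline, and computing ANI itself is out of scope here.

```python
# Assumption: values at or above the cutoff count as "same species".
ANI_SPECIES_CUTOFF = 95.0  # percent, the cutoff quoted in the abstract


def same_species_by_ani(ani_percent: float, cutoff: float = ANI_SPECIES_CUTOFF) -> bool:
    """Return True if a pairwise ANI value meets the species cutoff."""
    if not 0.0 <= ani_percent <= 100.0:
        raise ValueError("ANI must be a percentage between 0 and 100")
    return ani_percent >= cutoff


# ANI values reported in the abstract (percent).
reported = {
    "Do 184T vs J. marinus DSM 19772T": 79.47,
    "Do 184T vs J. pinnipedialis CIP 107946T": 75.30,
    "J. halophilus vs J. aerolatus (lower bound)": 98.98,
    "J. halophilus vs J. aerolatus (upper bound)": 99.05,
}

for pair, ani in reported.items():
    verdict = "same species" if same_species_by_ani(ani) else "different species"
    print(f"{pair}: {ani:.2f}% -> {verdict}")
```

Under this cutoff the Do 184T comparisons fall on the "different species" side and the J. halophilus / J. aerolatus comparison on the "same species" side, which matches the conclusions stated in the abstract.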
Affiliation(s)
- Peter Kämpfer
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- Hans-Jürgen Busse
  - Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
- Stefanie P Glaeser
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
- Henriette Mietke
  - Staatliche Betriebsgesellschaft für Umwelt und Landwirtschaft, D-01683 Nossen, Germany
55
Fischer C, Koblmüller S, Börger C, Michelitsch G, Trajanoski S, Schlötterer C, Guelly C, Thallinger GG, Sturmbauer C. Genome sequences of Tropheus moorii and Petrochromis trewavasae, two eco-morphologically divergent cichlid fishes endemic to Lake Tanganyika. Sci Rep 2021; 11:4309. [PMID: 33619328 PMCID: PMC7900123 DOI: 10.1038/s41598-021-81030-z] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.8] [Received: 05/05/2020] [Accepted: 12/28/2020] [Indexed: 01/01/2023] Open
Abstract
With more than 1000 species, East African cichlid fishes represent the fastest and most species-rich vertebrate radiation known, providing an ideal model to tackle molecular mechanisms underlying recurrent adaptive diversification. We add high-quality genome reconstructions for two phylogenetic key species of a lineage that diverged about 3-9 million years ago (mya), representing the earliest split of the so-called modern haplochromines that seeded additional radiations such as those in Lakes Malawi and Victoria. Along with the annotated genomes we analysed discriminating genomic features of the study species, each representing an extreme trophic morphology, one being an algae browser and the other an algae grazer. The genomes of Tropheus moorii (TM) and Petrochromis trewavasae (PT) comprise 911 and 918 Mbp with 40,300 and 39,600 predicted genes, respectively. Our DNA sequence data are based on 5 and 6 individuals of TM and PT, respectively, and the transcriptomic sequences on one individual per species and sex. Concerning variation, on average we observed 1 variant per 220 bp (interspecific), and 1 variant per 2540 bp (PT vs PT) or 1561 bp (TM vs TM) (intraspecific). GO enrichment analysis of gene regions affected by variants revealed several candidates which may influence phenotype modifications related to facial and jaw morphology, such as genes belonging to the Hedgehog pathway (SHH, SMO, WNT9A) and the BMP and GLI families.
Affiliation(s)
- C Fischer
  - Institute of Biology, University of Graz, Graz, Austria
  - Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria
- S Koblmüller
  - Institute of Biology, University of Graz, Graz, Austria
- C Börger
  - Institute of Biology, University of Graz, Graz, Austria
- G Michelitsch
  - Center for Medical Research, Medical University of Graz, Graz, Austria
- S Trajanoski
  - Center for Medical Research, Medical University of Graz, Graz, Austria
- C Schlötterer
  - Institut für Populationsgenetik, Vetmeduni Vienna, Vienna, Austria
- C Guelly
  - Center for Medical Research, Medical University of Graz, Graz, Austria
- G G Thallinger
  - Institute of Biomedical Informatics, Graz University of Technology, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
- C Sturmbauer
  - Institute of Biology, University of Graz, Graz, Austria
  - BioTechMed-Graz, Graz, Austria
56
Tahir M, Sardaraz M, Mehmood Z, Khan MS. ESREEM: Efficient Short Reads Error Estimation Computational Model for Next-generation Genome Sequencing. Curr Bioinform 2021. [DOI: 10.2174/1574893615999200614171832] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Indexed: 11/22/2022]
Abstract
Aims:
To assess the error profile of NGS data generated by high-throughput sequencing machines.
Background:
Short-read sequencing data from Next Generation Sequencing (NGS) are currently being generated by a number of research projects. Characterizing the errors produced by NGS platforms and inferring accurate genetic variation from reads are two interdependent phases. This is highly significant for various analyses, such as genome sequence assembly, SNP calling, evolutionary studies, and haplotype inference. Systematic and random errors show a characteristic incidence profile for each of the sequencing platforms, i.e. Illumina sequencing, Pacific Biosciences, 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Ion Torrent sequencing, and Oxford Nanopore sequencing. Advances in NGS deliver enormous volumes of data, along with errors. A fraction of these errors may mimic genuine biological signals, such as mutations, and may subsequently invalidate the results. Various independent applications have been proposed to correct sequencing errors. Systematic analysis of these algorithms shows that state-of-the-art models are missing.
Objective:
In this paper, an efficient error estimation computational model called ESREEM is proposed to assess the error rates in NGS data.
Methods:
The proposed model assumes a linear regression association between the number of reads containing errors and the number of reads sequenced. The model is based on a probabilistic error model integrated with a Hidden Markov Model (HMM).
Result:
The proposed model is evaluated on several benchmark datasets and the results obtained are compared with state-of-the-art algorithms.
Conclusions:
Analysis of the experimental results shows that the proposed model estimates errors efficiently and runs in less time than the others.
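The regression idea named in the Methods can be sketched with ordinary least squares. The snippet below is only an illustration of fitting a straight line between the number of reads sequenced and the number of reads containing errors; it is not the ESREEM model (which, per the abstract, also integrates a probabilistic error model and an HMM), the function name `fit_line` is hypothetical, and the data points are invented for the example.

```python
from typing import Sequence, Tuple


def fit_line(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Ordinary least-squares fit of y = intercept + slope * x.

    Returns (slope, intercept).
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up illustration data, not taken from the paper.
reads_sequenced = [1000, 2000, 3000, 4000]
reads_with_errors = [12, 22, 32, 42]

slope, intercept = fit_line(reads_sequenced, reads_with_errors)
# Predicted number of erroneous reads for a hypothetical run of 10,000 reads.
predicted = intercept + slope * 10_000
print(f"slope={slope:.4f}, intercept={intercept:.2f}, predicted={predicted:.1f}")
```

In this toy setting the slope can be read as the estimated fraction of additional reads that contain at least one error per additional read sequenced.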
Affiliation(s)
- Muhammad Tahir
  - Department of Computer Science, COMSATS University Islamabad, Attock Campus, Attock, Pakistan
- Muhammad Sardaraz
  - Department of Computer Science, COMSATS University Islamabad, Attock Campus, Attock, Pakistan
- Zahid Mehmood
  - Department of Software Engineering, University of Engineering and Technology, Taxila, Pakistan
- Muhammad Saud Khan
  - Department of Computer Science, COMSATS University Islamabad, Attock Campus, Attock, Pakistan
57
Shi W, Sun Q, Fan G, Hideaki S, Moriya O, Itoh T, Zhou Y, Cai M, Kim SG, Lee JS, Sedlacek I, Arahal DR, Lucena T, Kawasaki H, Evtushenko L, Weir BS, Alexander S, Dénes D, Tanasupawat S, Eurwilaichitr L, Ingsriswang S, Gomez-Gil B, Hazbón MH, Riojas MA, Suwannachart C, Yao S, Vandamme P, Peng F, Chen Z, Liu D, Sun X, Zhang X, Zhou Y, Meng Z, Wu L, Ma J. gcType: a high-quality type strain genome database for microbial phylogenetic and functional research. Nucleic Acids Res 2021; 49:D694-D705. [PMID: 33119759 PMCID: PMC7778895 DOI: 10.1093/nar/gkaa957] [Citation(s) in RCA: 65] [Impact Index Per Article: 16.3] [Received: 08/11/2020] [Revised: 10/06/2020] [Accepted: 10/28/2020] [Indexed: 02/07/2023] Open
Abstract
Taxonomic and functional research of microorganisms has increasingly relied upon genome-based data and methods. As the depository of the Global Catalogue of Microorganisms (GCM) 10K prokaryotic type strain sequencing project, Global Catalogue of Type Strain (gcType) has published 1049 type strain genomes sequenced by the GCM 10K project which are preserved in global culture collections with a valid published status. Additionally, the information provided through gcType includes >12 000 publicly available type strain genome sequences from GenBank incorporated using quality control criteria and standard data annotation pipelines to form a high-quality reference database. This database integrates type strain sequences with their phenotypic information to facilitate phenotypic and genotypic analyses. Multiple formats of cross-genome searches and interactive interfaces have allowed extensive exploration of the database's resources. In this study, we describe web-based data analysis pipelines for genomic analyses and genome-based taxonomy, which could serve as a one-stop platform for the identification of prokaryotic species. The number of type strain genomes that are published will continue to increase as the GCM 10K project increases its collaboration with culture collections worldwide. Data of this project is shared with the International Nucleotide Sequence Database Collaboration. Access to gcType is free at http://gctype.wdcm.org/.
Affiliation(s)
- Wenyu Shi
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Qinglan Sun
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
  - China-Thailand Joint Laboratory on Microbial Biotechnology, Beijing 100190, China
- Guomei Fan
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Ohkuma Moriya
  - Japan Collection of Microorganisms (JCM)/Microbe Division, RIKEN BioResource Center, Koyadai 3-1-1, Tsukuba, Ibaraki 305-0074, Japan
- Takashi Itoh
  - Japan Collection of Microorganisms (JCM)/Microbe Division, RIKEN BioResource Center, Koyadai 3-1-1, Tsukuba, Ibaraki 305-0074, Japan
- Yuguang Zhou
  - China General Microbiological Culture Collection Center (CGMCC), Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Man Cai
  - China General Microbiological Culture Collection Center (CGMCC), Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Song-Gun Kim
  - Korean Collection for Type Cultures (KCTC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), 181 Ipsin-gil, Jeongeup-si, Jeollabuk-do, 56212, Republic of Korea
- Jung-Sook Lee
  - Korean Collection for Type Cultures (KCTC), Korea Research Institute of Bioscience and Biotechnology (KRIBB), 181 Ipsin-gil, Jeongeup-si, Jeollabuk-do, 56212, Republic of Korea
- Ivo Sedlacek
  - Czech Collection of Microorganisms, Masaryk University, Kamenice 5, building A25, 625 00 Brno, Czech Republic
- David R Arahal
  - Colección Española de Cultivos Tipo (CECT), and Departamento de Microbiología y Ecología, University of Valencia, 46100 Burjassot (Valencia), Spain
- Teresa Lucena
  - Colección Española de Cultivos Tipo (CECT), and Departamento de Microbiología y Ecología, University of Valencia, 46100 Burjassot (Valencia), Spain
- Hiroko Kawasaki
  - NITE Biological Resource Center (NBRC), National Institute of Technology and Evaluation, 2-5-8 Kazusakamatari, Kisarazu, Chiba 292-0818, Japan
- Lyudmila Evtushenko
  - All-Russian Collection of Microorganisms (VKM), G.K. Skryabin Institute of Biochemistry and Physiology of Microorganisms RAS, Pushchino, Moscow region 142290, Russia
- Bevan S Weir
  - Mycology & Bacteriology Systematics, Manaaki Whenua - Landcare Research, Auckland, New Zealand
- Sarah Alexander
  - National Collection of Type Cultures (NCTC), Public Health England (PHE), UK
- Dlauchy Dénes
  - National Collection of Agricultural and Industrial Microorganisms, Faculty of Food Science, Szent István University, H-1118, Budapest, Somlói út 14-16, Hungary
- Somboon Tanasupawat
  - Faculty of Pharmaceutical Sciences, Chulalongkorn University (PCU), Bangkok 10330, Thailand
- Lily Eurwilaichitr
  - China-Thailand Joint Laboratory on Microbial Biotechnology, Beijing 100190, China
  - Thailand Bioresource Research Center (TBRC), National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Thailand
- Supawadee Ingsriswang
  - China-Thailand Joint Laboratory on Microbial Biotechnology, Beijing 100190, China
  - Thailand Bioresource Research Center (TBRC), National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Thailand
- Bruno Gomez-Gil
  - CIAD, A.C., Collection of Aquatic Important Microorganisms (CAIM). AP 711 Mazatlán, Sinaloa, Mexico
- Manzour H Hazbón
  - American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110, USA
- Marco A Riojas
  - American Type Culture Collection (ATCC), 10801 University Boulevard, Manassas, VA 20110, USA
- Chatrudee Suwannachart
  - Biodiversity Research Centre, Thailand Institute of Scientific and Technological Research (TISTR), 35 M 3 Technopolis Khlong 5 Khlong Luang Pathum Thani 12120, Thailand
- Su Yao
  - China Center of Industrial Culture Collection (CICC), Beijing, China
- Peter Vandamme
  - BCCM/LMG Bacteria Collection, Laboratory of Microbiology, Faculty of Sciences, Ghent University, K. L. Ledeganckstraat 35, 9000 Ghent, Belgium
- Fang Peng
  - China Center for Type Culture Collection (CCTCC), College of Life Sciences, Wuhan University, Wuhan 430072, China
- Zenghui Chen
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Dongmei Liu
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Xiuqiang Sun
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Xinjiao Zhang
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
- Yuanchun Zhou
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
- Zhen Meng
  - Computer Network Information Center, Chinese Academy of Sciences, Beijing 100190, China
- Linhuan Wu
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
- Juncai Ma
  - Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
  - World Data Center for Microorganisms, Beijing 100101, China
  - China-Thailand Joint Laboratory on Microbial Biotechnology, Beijing 100190, China
  - State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
58
Davis EM, Sun Y, Liu Y, Kolekar P, Shao Y, Szlachta K, Mulder HL, Ren D, Rice SV, Wang Z, Nakitandwe J, Gout AM, Shaner B, Hall S, Robison LL, Pounds S, Klco JM, Easton J, Ma X. SequencErr: measuring and suppressing sequencer errors in next-generation sequencing data. Genome Biol 2021; 22:37. [PMID: 33487172 PMCID: PMC7829059 DOI: 10.1186/s13059-020-02254-2] [Citation(s) in RCA: 26] [Impact Index Per Article: 6.5] [Received: 05/27/2020] [Accepted: 12/18/2020] [Indexed: 12/20/2022] Open
Abstract
Background: There is currently no method to precisely measure the errors that occur in the sequencing instrument/sequencer, which is critical for next-generation sequencing applications aimed at discovering the genetic makeup of heterogeneous cellular populations.
Results: We propose a novel computational method, SequencErr, to address this challenge by measuring the base correspondence between overlapping regions in forward and reverse reads. An analysis of 3777 public datasets from 75 research institutions in 18 countries revealed the sequencer error rate to be ~10 per million (pm) and 1.4% of sequencers and 2.7% of flow cells have error rates >100 pm. At the flow cell level, error rates are elevated in the bottom surfaces and >90% of HiSeq and NovaSeq flow cells have at least one outlier error-prone tile. By sequencing a common DNA library on different sequencers, we demonstrate that sequencers with high error rates have reduced overall sequencing accuracy, and removal of outlier error-prone tiles improves sequencing accuracy. We demonstrate that SequencErr can reveal novel insights relative to the popular quality control method FastQC and achieve a 10-fold lower error rate than popular error correction methods including Lighter and Musket.
Conclusions: Our study reveals novel insights into the nature of DNA sequencing errors incurred on DNA sequencers. Our method can be used to assess, calibrate, and monitor sequencer accuracy, and to computationally suppress sequencer errors in existing datasets.
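The core idea of comparing bases in the overlap between a forward and a reverse read can be sketched as follows. This is a simplified illustration, not the SequencErr implementation: it assumes ungapped reads, a known overlap length, and upper-case A/C/G/T/N input, and the names `overlap_mismatches` and `discordance_per_million` are hypothetical.

```python
from typing import Iterable, Tuple

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def overlap_mismatches(forward: str, reverse: str, overlap: int) -> Tuple[int, int]:
    """Count disagreeing bases in the region covered by both reads of a pair.

    Assumes the last `overlap` bases of the forward read and the first
    `overlap` bases of the reverse-complemented reverse read cover the same
    positions of the fragment. Returns (mismatched, compared); positions
    where either read has an N are skipped.
    """
    if not 0 < overlap <= min(len(forward), len(reverse)):
        raise ValueError("overlap must be between 1 and the shorter read length")
    fwd_tail = forward[-overlap:]
    rev_head = reverse_complement(reverse)[:overlap]
    compared = mismatched = 0
    for a, b in zip(fwd_tail, rev_head):
        if "N" in (a, b):
            continue
        compared += 1
        if a != b:
            mismatched += 1
    return mismatched, compared


def discordance_per_million(pairs: Iterable[Tuple[str, str, int]]) -> float:
    """Aggregate overlap discordance over many read pairs, per million bases."""
    total_mismatched = total_compared = 0
    for forward, reverse, overlap in pairs:
        mismatched, compared = overlap_mismatches(forward, reverse, overlap)
        total_mismatched += mismatched
        total_compared += compared
    if total_compared == 0:
        return 0.0
    return 1e6 * total_mismatched / total_compared


# Toy example: a 20 bp fragment sequenced with 14 bp reads from both ends,
# so the two reads overlap by 2 * 14 - 20 = 8 bases.
fragment = "ACGTACGTAGCTAGCTAGGA"
read_len = 14
forward = fragment[:read_len]
reverse = reverse_complement(fragment)[:read_len]
print(overlap_mismatches(forward, reverse, 2 * read_len - len(fragment)))
```

Because both reads derive from the same DNA molecule, a disagreement inside the overlap cannot reflect a true biological variant, which is why such discordance is a natural proxy for errors introduced during sequencing.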
Affiliation(s)
- Eric M Davis
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Yu Sun
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
  - Department of Computer Science, University of Memphis, Memphis, TN, USA
- Yanling Liu
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Pandurang Kolekar
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Ying Shao
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Karol Szlachta
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Heather L Mulder
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Stephen V Rice
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Zhaoming Wang
  - Department of Epidemiology & Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, USA
- Joy Nakitandwe
  - Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Alexander M Gout
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Bridget Shaner
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Salina Hall
  - Discovery Life Sciences, Huntsville, AL, USA
- Leslie L Robison
  - Department of Epidemiology & Cancer Control, St. Jude Children's Research Hospital, Memphis, TN, USA
- Stanley Pounds
  - Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, USA
- Jeffery M Klco
  - Department of Pathology, St. Jude Children's Research Hospital, Memphis, TN, USA
- John Easton
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
- Xiaotu Ma
  - Department of Computational Biology, St. Jude Children's Research Hospital, Memphis, TN, USA
59
Swat S, Laskowski A, Badura J, Frohmberg W, Wojciechowski P, Swiercz A, Kasprzak M, Blazewicz J. Genome-scale de novo assembly using ALGA. Bioinformatics 2021; 37:1644-1651. [PMID: 33471088 PMCID: PMC8289375 DOI: 10.1093/bioinformatics/btab005] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 07/10/2020] [Revised: 11/30/2020] [Accepted: 01/06/2021] [Indexed: 12/03/2022] Open
Abstract
Motivation: There are very few methods for de novo genome assembly based on the overlap graph approach. This approach is considered to give more exact results than the so-called de Bruijn graph approach, but at the cost of much longer running time and much higher memory usage. It is not uncommon for assembly methods based on the overlap graph model to fail on larger datasets, mainly because of the memory limitations of a computer. For this reason, mainly de Bruijn-based assembly methods, which are fast and fairly accurate, have been developed in recent decades. However, the latter methods can fail for longer or more repetitive genomes, as they decompose reads into shorter fragments and lose part of the information. An efficient assembler that processes big datasets using the overlap graph model is still sought.
Results: We propose a new genome-scale de novo assembler based on the overlap graph approach, designed for short-read sequencing data. The method, ALGA, incorporates several new ideas that result in more exact contigs produced in a short time. These ideas include the creation of a sparse but quite informative graph, a reduction of the graph that includes a procedure related to the minimum spanning tree problem on a local subgraph, and a graph traversal combined with simultaneous analysis of the contigs stored so far. Unusually for genome assembly, the algorithm is almost parameter-free, with only one optional parameter to be set by the user. ALGA was compared with nine state-of-the-art assemblers in tests on genome-scale sequencing data obtained from real experiments on six organisms differing in size, coverage, GC content and repetition rate. ALGA produced the best results in terms of the overall quality of genome reconstruction, understood as a good balance between genome coverage, accuracy and length of the resulting sequences. The algorithm is one of the tools used to process data in the ongoing national project Genomic Map of Poland.
Availability and implementation: ALGA is available at http://alga.put.poznan.pl.
Supplementary information: Supplementary data are available at Bioinformatics online.
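For readers unfamiliar with the overlap graph model mentioned above, the following minimal Python sketch shows the basic data structure: reads are nodes, and a directed edge a -> b records the longest suffix of a that equals a prefix of b. It is a naive all-pairs construction for illustration only and does not reproduce ALGA's sparse graph, its graph reduction, or its traversal; the function names and the `min_len` parameter are hypothetical.

```python
from typing import Dict, List, Sequence


def longest_overlap(a: str, b: str, min_len: int) -> int:
    """Length of the longest suffix of `a` that is a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def build_overlap_graph(reads: Sequence[str], min_len: int = 3) -> Dict[int, Dict[int, int]]:
    """Naive O(n^2) overlap graph: graph[i][j] = overlap length of read i -> read j."""
    graph: Dict[int, Dict[int, int]] = {i: {} for i in range(len(reads))}
    for i, a in enumerate(reads):
        for j, b in enumerate(reads):
            if i == j:
                continue
            k = longest_overlap(a, b, min_len)
            if k:
                graph[i][j] = k
    return graph


def merge_path(reads: Sequence[str], graph: Dict[int, Dict[int, int]], path: List[int]) -> str:
    """Spell the contig obtained by walking `path` through the overlap graph."""
    contig = reads[path[0]]
    for prev, nxt in zip(path, path[1:]):
        contig += reads[nxt][graph[prev][nxt]:]
    return contig


reads = ["ACGTAC", "GTACGG", "ACGGTT"]
graph = build_overlap_graph(reads, min_len=3)
print(graph)
print(merge_path(reads, graph, [0, 1, 2]))
```

The quadratic all-pairs comparison is exactly the cost that makes overlap-graph assemblers expensive in time and memory on large datasets, which is the limitation the abstract describes.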
Affiliation(s)
- Sylwester Swat
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
- Artur Laskowski
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
- Jan Badura
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
- Wojciech Frohmberg
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
- Pawel Wojciechowski
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
  - Poland, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan
- Aleksandra Swiercz
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
  - Poland, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan
- Marta Kasprzak
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
- Jacek Blazewicz
  - Poland, Poznan University of Technology, Institute of Computing Science, Piotrowo 2, 60-965 Poznan
  - Poland, Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan
60
Kämpfer P, Irgang R, Glaeser SP, Busse HJ, Criscuolo A, Clermont D, Avendaño-Herrera R. Flavobacterium salmonis sp. nov. isolated from Atlantic salmon (Salmo salar) and formal proposal to reclassify Flavobacterium spartansii as a later heterotypic synonym of Flavobacterium tructae. Int J Syst Evol Microbiol 2020; 70:6147-6154. [DOI: 10.1099/ijsem.0.004510] [Citation(s) in RCA: 10] [Impact Index Per Article: 2.0] [Indexed: 12/13/2022] Open
Abstract
A Gram-staining-negative non endospore-forming strain, T13(2019)T, was isolated from water samples from Atlantic salmon (Salmo salar) fry culture in Chile and studied in detail for its taxonomic position. The isolate shared highest 16S rRNA gene sequence similarities with the type strains of Flavobacterium chungangense (98.44 %) followed by Flavobacterium tructae and Flavobacterium spartansii (both 98.22 %). Menaquinone MK-6 was the predominant respiratory quinone in T13(2019)T. Major polar lipids were phosphatidylethanolamine, an ornithine lipid and the unidentified polar lipids L1, L3 and L4 lacking a functional group. The major polyamine was sym-homospermidine. The fatty acid profile contained major amounts of iso-C15 : 0, iso-C15 : 0 3-OH, iso-C17 : 0 3-OH, C15 : 0, summed feature 3 (C16 : 1 ω7c and/or iso-C15 : 0 2-OH) and various hydroxylated fatty acids in smaller amounts, among them iso-C16 : 0 3-OH and C15 : 0 3-OH, which supported the grouping of the isolate into the genus Flavobacterium. Physiological/biochemical characterisation and ANI calculations with the type strains of the most closely related species allowed a clear phenotypic and genotypic differentiation. In addition, it became obvious that the type strains of F. tructae and F. spartansii showed 100 % 16S rRNA gene sequence similarities and ANI values of 97.21 %/97.59 % and DDH values of 80.40 % [77.5 and 83 %]. These data indicate that F. tructae and F. spartansii belong to the same species and it is proposed that F. spartansii is a later heterotypic synonym of F. tructae. For strain T13(2019)T (=CIP 111411T=LMG 30298T=CCM 8798T) a new species with the name Flavobacterium salmonis sp. nov. is proposed.
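The synonymy argument in this abstract rests on genome-relatedness values exceeding the species boundaries commonly used in prokaryotic taxonomy (roughly 95 to 96 % ANI and 70 % digital DDH). A minimal sketch of that comparison follows; the function name `same_species` and the default cutoffs are illustrative assumptions, not values or code taken from the paper.

```python
def same_species(ani_values, ddh, ani_cutoff=95.0, ddh_cutoff=70.0):
    """Return True when every ANI value and the DDH estimate reach the cutoffs.

    ani_values: reciprocal ANI percentages for a genome pair.
    ddh: digital DNA-DNA hybridization estimate in percent.
    The cutoffs are the commonly cited species boundaries (assumed defaults).
    """
    return min(ani_values) >= ani_cutoff and ddh >= ddh_cutoff


# Values reported in the abstract for F. tructae versus F. spartansii.
print(same_species([97.21, 97.59], 80.40))
```

With the reported values both criteria are met, which is consistent with the proposal to treat the two names as synonyms.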
Affiliation(s)
- Peter Kämpfer
  - Institut für Angewandte Mikrobiologie, Universität Giessen, Giessen, Germany
- Rute Irgang
  - Centro FONDAP, Interdisciplinary Center for Aquaculture Research (INCAR), Viña del Mar, Chile
- Stefanie P. Glaeser
  - Institut für Angewandte Mikrobiologie, Universität Giessen, Giessen, Germany
- Hans-Jürgen Busse
  - Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS – Paris, France
- Ruben Avendaño-Herrera
  - Universidad Andrés Bello, Laboratorio de Patología de Organismos Acuáticos y Biotecnología Acuícola, Facultad de Ciencias de la Vida, Viña del Mar, Chile
  - Centro FONDAP, Interdisciplinary Center for Aquaculture Research (INCAR), Viña del Mar, Chile
  - Universidad Andrés Bello, Centro de Investigación Marina Quintay (CIMARQ), Quintay, Chile
|
61
|
Mühle E, Abry C, Leclerc P, Goly GM, Criscuolo A, Busse HJ, Kämpfer P, Bernardet JF, Clermont D, Chesneau O. Flavobacterium bizetiae sp. nov., isolated from diseased freshwater fish in Canada at the end of the 1970s. Int J Syst Evol Microbiol 2020; 71. [PMID: 33253083 DOI: 10.1099/ijsem.0.004576] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 11/18/2022] Open
Abstract
Genome sequence analysis of two strains collected in Canada at the end of the 1970s and deposited in 1998 at the Collection de l'Institut Pasteur has led to the taxonomic description of a novel fish-associated species in the genus Flavobacterium. Both strains, CIP 105534T and CIP 105535, were yellow-pigmented, Gram-stain-negative, non-spore-forming rod-shaped bacteria that exhibited gliding motility. They grew aerobically in a temperature range from 5 to 30 °C with optimal growth at 25 °C on trypticase soy or Reasoner's 2A agar but they did not grow on marine agar. Their major fatty acid profiles were similar, consisting of iso-C15 : 0, C16 : 1 ω7c and/or iso-C15 : 0 2-OH (shown as summed feature 3), C16 : 0 3-OH, iso-C17 : 0 3-OH and C16 : 0. The major polyamine was sym-homospermidine. Phosphatidylethanolamine and, most notably, ornithine-containing lipid OL2 and unidentified aminophospholipid APL1 were major polar lipids. A yellow pigment spot was visible after chromatographic analysis. The predominant respiratory quinone was MK-6. The G+C content of the two genomes was 34 mol% and their size was around 5.8 Mb. Comparison of the 16S rRNA gene sequences with those of the closely related type strains showed high levels of relatedness with Flavobacterium collinsii and Flavobacterium pectinovorum. All average nucleotide identity (ANI) and digital DNA-DNA hybridization values estimated against publicly available Flavobacterium genome assemblies were lower than 90 and 30 %, respectively. Phylogenetic, phenotypic and chemotaxonomic data indicated that the two strains represent a novel species of the genus Flavobacterium, for which the name Flavobacterium bizetiae sp. nov. is proposed. The type strain is CIP 105534T (=LMG 1342T). The unique ability of F. bizetiae to use melibiose as a sole source of carbon could provide a simple phenotypic test to discriminate F. bizetiae from its closest relatives.
Affiliation(s)
- Estelle Mühle
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
- Chloé Abry
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
- Priscilla Leclerc
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
- Gogoa-Marthe Goly
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique, Département de Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, 28 rue du docteur Roux, 75015 Paris, France
- Hans-Jürgen Busse
  - Institut für Mikrobiologie, Veterinärmedizinische Universität Wien, Veterinärplatz 1, 1210 Wien, Austria
- Peter Kämpfer
  - Institut für Angewandte Mikrobiologie, Justus-Liebig Universität Giessen, Heinrich-Buff-Ring 26 (IFZ), 35392 Giessen, Germany
- Jean-François Bernardet
  - Unité de Virologie et Immunologie Moléculaires, Institut National de Recherche en Agriculture, Alimentation et Environnement (INRAE), 4 avenue Jean Jaurès, 78350 Jouy-en-Josas, France
- Dominique Clermont
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
- Olivier Chesneau
  - Collection de l'Institut Pasteur (CIP), Département de Microbiologie, Institut Pasteur, 28 rue du docteur Roux, 75015 Paris, France
|
62
|
Hennart M, Panunzi LG, Rodrigues C, Gaday Q, Baines SL, Barros-Pinkelnig M, Carmi-Leroy A, Dazas M, Wehenkel AM, Didelot X, Toubiana J, Badell E, Brisse S. Population genomics and antimicrobial resistance in Corynebacterium diphtheriae. Genome Med 2020; 12:107. [PMID: 33246485 PMCID: PMC7694903 DOI: 10.1186/s13073-020-00805-7] [Citation(s) in RCA: 54] [Impact Index Per Article: 10.8] [Received: 05/29/2020] [Accepted: 11/11/2020] [Indexed: 12/21/2022] Open
Abstract
Background: Corynebacterium diphtheriae, the agent of diphtheria, is a genetically diverse bacterial species. Although antimicrobial resistance has emerged against several drugs including first-line penicillin, the genomic determinants and population dynamics of resistance are largely unknown for this neglected human pathogen. Methods: Here, we analyzed the associations of antimicrobial susceptibility phenotypes, diphtheria toxin production, and genomic features in C. diphtheriae. We used 247 strains collected over several decades in multiple world regions, including the 163 clinical isolates collected prospectively from 2008 to 2017 in mainland France and overseas territories. Results: Phylogenetic analysis revealed multiple deep-branching sublineages, grouped into a Mitis lineage strongly associated with diphtheria toxin production and a largely toxin gene-negative Gravis lineage with few toxin-producing isolates including the 1990s ex-Soviet Union outbreak strain. The distribution of susceptibility phenotypes allowed proposing ecological cutoffs for most of the 19 agents tested, thereby defining acquired antimicrobial resistance. Penicillin resistance was found in 17.2% of prospective isolates. Seventeen (10.4%) prospective isolates were multidrug-resistant (≥ 3 antimicrobial categories), including four isolates resistant to penicillin and macrolides. Homologous recombination was frequent (r/m = 5), and horizontal gene transfer contributed to the emergence of antimicrobial resistance in multiple sublineages. Genome-wide association mapping uncovered genetic factors of resistance, including an accessory penicillin-binding protein (PBP2m) located in diverse genomic contexts. Gene pbp2m is widespread in other Corynebacterium species, and its expression in C. glutamicum demonstrated its effect against several beta-lactams. A novel 73-kb C. diphtheriae multiresistance plasmid was discovered. Conclusions: This work uncovers the dynamics of antimicrobial resistance in C. diphtheriae in the context of phylogenetic structure, biovar, and diphtheria toxin production and provides a blueprint to analyze re-emerging diphtheria. Supplementary information: Supplementary information accompanies this paper at 10.1186/s13073-020-00805-7.
Affiliation(s)
- Melanie Hennart
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Collège doctoral, Sorbonne Université, F-75005, Paris, France
- Leonardo G Panunzi
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Français de Bioinformatique, CNRS UMS 3601, Evry, France
- Carla Rodrigues
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
- Quentin Gaday
  - Unité de Microbiologie Structurale, Institut Pasteur, CNRS UMR 3528, Université de Paris, F-75015, Paris, France
- Sarah L Baines
  - Doherty Applied Microbial Genomics, Department of Microbiology & Immunology, The University of Melbourne at The Peter Doherty Institute for Infection & Immunity, Melbourne, Victoria, Australia
- Annick Carmi-Leroy
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
- Melody Dazas
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
- Anne Marie Wehenkel
  - Unité de Microbiologie Structurale, Institut Pasteur, CNRS UMR 3528, Université de Paris, F-75015, Paris, France
- Xavier Didelot
  - School of Life Sciences and Department of Statistics, University of Warwick, Coventry, UK
- Julie Toubiana
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
  - Department of General Pediatrics and Pediatric Infectious Diseases, Hôpital Necker-Enfants Malades, APHP, Université de Paris, Paris, France
- Edgar Badell
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
- Sylvain Brisse
  - Institut Pasteur, Biodiversity and Epidemiology of Bacterial Pathogens, Paris, France
  - Institut Pasteur, National Reference Center for Corynebacteria of the Diphtheriae Complex, Paris, France
|
63
|
Genetic Basis Underlying the Hyperhemolytic Phenotype of Streptococcus agalactiae Strain CNCTC10/84. J Bacteriol 2020; 202:JB.00504-20. [PMID: 32958630 DOI: 10.1128/jb.00504-20] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.8] [Received: 09/11/2020] [Accepted: 09/11/2020] [Indexed: 01/30/2023] Open
Abstract
Streptococcus agalactiae (group B streptococcus [GBS]) is a major cause of infections in newborns, pregnant women, and immunocompromised patients. GBS strain CNCTC10/84 is a clinical isolate that has high virulence in animal models of infection and has been used extensively to study GBS pathogenesis. Two unusual features of this strain are hyperhemolytic activity and hypo-CAMP factor activity. These two phenotypes are typical of GBS strains that are functionally deficient in the CovR-CovS two-component regulatory system. A previous whole-genome sequencing study found that strain CNCTC10/84 has intact covR and covS regulatory genes. We investigated CovR-CovS regulation in CNCTC10/84 and discovered that a single-nucleotide insertion in a homopolymeric tract in the covR promoter region underlies the strong hemolytic activity and weak CAMP activity of this strain. Using isogenic mutant strains, we demonstrate that this single-nucleotide insertion confers significantly decreased expression of covR and covS and altered expression of CovR-CovS-regulated genes, including that of genes encoding β-hemolysin and CAMP factor. This single-nucleotide insertion also confers significantly increased GBS survival in human whole blood ex vivo. IMPORTANCE: Group B streptococcus (GBS) is the leading cause of neonatal sepsis, pneumonia, and meningitis. GBS strain CNCTC10/84 is a highly virulent blood isolate that has been used extensively to study GBS pathogenesis for over 20 years. Strain CNCTC10/84 has an unusually strong hemolytic activity, but the genetic basis is unknown. In this study, we discovered that a single-nucleotide insertion in an intergenic homopolymeric tract is responsible for the elevated hemolytic activity of CNCTC10/84.
|
64
|
Hays M, Young JM, Levan PF, Malik HS. A natural variant of the essential host gene MMS21 restricts the parasitic 2-micron plasmid in Saccharomyces cerevisiae. eLife 2020; 9:62337. [PMID: 33063663 PMCID: PMC7652418 DOI: 10.7554/elife.62337] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Received: 08/23/2020] [Accepted: 10/15/2020] [Indexed: 12/30/2022] Open
Abstract
Antagonistic coevolution with selfish genetic elements (SGEs) can drive evolution of host resistance. Here, we investigated host suppression of 2-micron (2μ) plasmids, multicopy nuclear parasites that have co-evolved with budding yeasts. We developed SCAMPR (Single-Cell Assay for Measuring Plasmid Retention) to measure copy number heterogeneity and 2μ plasmid loss in live cells. We identified three S. cerevisiae strains that lack endogenous 2μ plasmids and reproducibly inhibit mitotic plasmid stability. Focusing on the Y9 ragi strain, we determined that plasmid restriction is heritable and dominant. Using bulk segregant analysis, we identified a high-confidence Quantitative Trait Locus (QTL) with a single variant of MMS21 associated with increased 2μ instability. MMS21 encodes a SUMO E3 ligase and an essential component of the Smc5/6 complex, involved in sister chromatid cohesion, chromosome segregation, and DNA repair. Our analyses leverage natural variation to uncover a novel means by which budding yeasts can overcome highly successful genetic parasites.
Affiliation(s)
- Michelle Hays
  - Molecular and Cellular Biology program, University of Washington, Seattle, United States
  - Division of Basic Sciences & Fred Hutchinson Cancer Research Center, Seattle, United States
- Janet M Young
  - Division of Basic Sciences & Fred Hutchinson Cancer Research Center, Seattle, United States
- Paula F Levan
  - Division of Basic Sciences & Fred Hutchinson Cancer Research Center, Seattle, United States
- Harmit S Malik
  - Division of Basic Sciences & Fred Hutchinson Cancer Research Center, Seattle, United States
  - Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, United States
|
65
|
Kämpfer P, Glaeser SP, McInroy JA, Xu J, Busse HJ, Clermont D, Criscuolo A. Flavobacterium panici sp. nov. isolated from the rhizosphere of the switchgrass Panicum virgatum. Int J Syst Evol Microbiol 2020; 70:5824-5831. [PMID: 33034547 DOI: 10.1099/ijsem.0.004482] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Indexed: 12/28/2022] Open
Abstract
A Gram-staining-negative non endospore-forming strain, PXU-55T, was isolated from the rhizosphere of the switchgrass Panicum virgatum and studied in detail to determine its taxonomic position. The results of 16S rRNA gene sequence analysis indicated that the isolate represented a member of the genus Flavobacterium. The isolate shared highest 16S rRNA gene sequence similarities with the type strains of Flavobacterium chungangense (98.78 %) and Flavobacterium chilense (98.64 %). The average nucleotide identity (ANI) and in silico DNA-DNA hybridization (isDDH) values between the PXU-55T genome assembly and the ones of the most closely related type strains of species of the genus Flavobacterium were 87.3 and 31.9% (Flavobacterium defluvii), and 86.1 and 29.9% (Flavobacterium johnsoniae). Menaquinone MK-6 was the major respiratory quinone. As major polar lipids, phosphatidylethanolamine, an ornithine lipid and the unidentified polar lipids L2, L3 and L4 lacking a functional group were found. Moderate to minor amounts of another ornithine lipid, the unidentified lipid L1 and a glycolipid were present, as well. The major polyamine is sym-homospermidine. The fatty acid profiles contained major amounts of iso-C15:0, iso-C15:0 3-OH, iso-C17:0 3-OH, C15:0, summed feature 3 (C16:1ω7c and/or iso-C15:0 2-OH) and various hydroxylated fatty acids in smaller amounts, among them iso C16:0 3-OH, C16:0 3-OH and C15:0 3-OH, which supported the classification of the isolate as a member of the genus Flavobacterium. Physiological and biochemical characterisation and ANI calculations with the type strains of the most closely related species allowed a clear phenotypic and genotypic differentiation of the strain. For this reason, we propose that strain PXU-55T (=CIP 111646T=CCM 8914T) represents a novel species with the name Flavobacterium panici sp. nov.
Affiliation(s)
- Peter Kämpfer
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- S P Glaeser
  - Institut für Angewandte Mikrobiologie, Justus-Liebig-Universität Giessen, D-35392 Giessen, Germany
- John A McInroy
  - Department of Entomology and Plant Pathology, Auburn University, Alabama, USA
- Jia Xu
  - Department of Entomology and Plant Pathology, Auburn University, Alabama, USA
- Hans-Jürgen Busse
  - Institut für Mikrobiologie, Veterinärmedizinische Universität, A-1210 Wien, Austria
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique - Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
|
66
|
Phan NT, Orjuela J, Danchin EGJ, Klopp C, Perfus‐Barbeoch L, Kozlowski DK, Koutsovoulos GD, Lopez‐Roques C, Bouchez O, Zahm M, Besnard G, Bellafiore S. Genome structure and content of the rice root-knot nematode (Meloidogyne graminicola). Ecol Evol 2020; 10:11006-11021. [PMID: 33144944 PMCID: PMC7593179 DOI: 10.1002/ece3.6680] [Citation(s) in RCA: 25] [Impact Index Per Article: 5.0] [Received: 04/22/2020] [Revised: 07/04/2020] [Accepted: 07/17/2020] [Indexed: 12/15/2022] Open
Abstract
Discovered in the 1960s, Meloidogyne graminicola is a root-knot nematode species considered as a major threat to rice production. Yet, its origin, genomic structure, and intraspecific diversity are poorly understood. So far, such studies have been limited by the unavailability of a sufficiently complete and well-assembled genome. In this study, using a combination of Oxford Nanopore Technologies and Illumina sequencing data, we generated a highly contiguous reference genome (283 scaffolds with an N50 length of 294 kb, totaling 41.5 Mb). The completeness scores of our assembly are among the highest currently published for Meloidogyne genomes. We predicted 10,284 protein-coding genes spanning 75.5% of the genome. Among them, 67 are identified as possibly originating from horizontal gene transfers (mostly from bacteria), which supposedly contribute to nematode infection, nutrient processing, and plant defense manipulation. Besides, we detected 575 canonical transposable elements (TEs) belonging to seven orders and spanning 2.61% of the genome. These TEs might promote genomic plasticity putatively related to the evolution of M. graminicola parasitism. This high-quality genome assembly constitutes a major improvement regarding previously available versions and represents a valuable molecular resource for future phylogenomic studies of Meloidogyne species. In particular, this will foster comparative genomic studies to trace back the evolutionary history of M. graminicola and its closest relatives.
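The contiguity figure quoted in this abstract (an N50 of 294 kb) follows the standard definition of N50: the length of the shortest scaffold in the smallest set of longest scaffolds that together cover at least half of the assembly. A minimal sketch of that calculation is given here; the function name `n50` and the toy lengths are hypothetical and are not data from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths.

    Scaffolds are visited from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    summed assembly length.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Hypothetical toy assembly, not the M. graminicola scaffolds.
toy_scaffolds = [100, 80, 60, 40, 20]
print(n50(toy_scaffolds))  # 80
```

In the toy case the total is 300, and the two longest scaffolds (100 + 80) are the first to cover at least 150, so the N50 is 80.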
Affiliation(s)
- Ngan Thi Phan
  - IRD‐CIRAD‐University of Montpellier, UMR Interactions Plantes Microorganismes Environnement (IPME), Montpellier, France
- Julie Orjuela
  - IRD‐CIRAD‐University of Montpellier, UMR Interactions Plantes Microorganismes Environnement (IPME), Montpellier, France
- Christophe Klopp
  - Plateforme BioInfo Genotoul, UR875, INRAE, Castanet‐Tolosan cedex, France
- Djampa K. Kozlowski
  - Institut Sophia Agrobiotech, INRAE, CNRS, Université Côte d’Azur, Sophia Antipolis, France
- Margot Zahm
  - Plateforme BioInfo Genotoul, UR875, INRAE, Castanet‐Tolosan cedex, France
- Stéphane Bellafiore
  - IRD‐CIRAD‐University of Montpellier, UMR Interactions Plantes Microorganismes Environnement (IPME), Montpellier, France
|
67
|
Francoeur CB, Khadempour L, Moreira-Soto RD, Gotting K, Book AJ, Pinto-Tomás AA, Keefover-Ring K, Currie CR. Bacteria Contribute to Plant Secondary Compound Degradation in a Generalist Herbivore System. mBio 2020; 11:e02146-20. [PMID: 32934088 PMCID: PMC7492740 DOI: 10.1128/mbio.02146-20] [Citation(s) in RCA: 15] [Impact Index Per Article: 3.0] [Received: 08/03/2020] [Accepted: 08/11/2020] [Indexed: 02/07/2023] Open
Abstract
Herbivores must overcome a variety of plant defenses, including coping with plant secondary compounds (PSCs). To help detoxify these defensive chemicals, several insect herbivores are known to harbor gut microbiota with the metabolic capacity to degrade PSCs. Leaf-cutter ants are generalist herbivores, obtaining sustenance from specialized fungus gardens that act as external digestive systems and which degrade the diverse collection of plants foraged by the ants. There is in vitro evidence that certain PSCs harm Leucoagaricus gongylophorus, the fungal cultivar of leaf-cutter ants, suggesting a role for the Proteobacteria-dominant bacterial community present within fungus gardens. In this study, we investigated the ability of symbiotic bacteria present within fungus gardens of leaf-cutter ants to degrade PSCs. We cultured fungus garden bacteria, sequenced the genomes of 42 isolates, and identified genes involved in PSC degradation, including genes encoding cytochrome P450 enzymes and genes in geraniol, cumate, cinnamate, and α-pinene/limonene degradation pathways. Using metatranscriptomic analysis, we showed that some of these degradation genes are expressed in situ. Most of the bacterial isolates grew unhindered in the presence of PSCs and, using gas chromatography-mass spectrometry (GC-MS), we determined that isolates from the genera Bacillus, Burkholderia, Enterobacter, Klebsiella, and Pseudomonas degrade α-pinene, β-caryophyllene, or linalool. Using a headspace sampler, we show that subcolonies of fungus gardens reduced α-pinene and linalool over a 36-h period, while L. gongylophorus strains alone reduced only linalool. Overall, our results reveal that the bacterial communities in fungus gardens play a pivotal role in alleviating the effect of PSCs on the leaf-cutter ant system. IMPORTANCE: Leaf-cutter ants are dominant neotropical herbivores capable of deriving energy from a wide range of plant substrates. The success of leaf-cutter ants is largely due to their external gut, composed of key microbial symbionts, specifically, the fungal mutualist L. gongylophorus and a consistent bacterial community. Both symbionts are known to have critical roles in extracting energy from plant material, yet comparatively little is known about their roles in the detoxification of plant secondary compounds. In this study, we assessed if the bacterial communities associated with leaf-cutter ant fungus gardens can degrade harmful plant chemicals. We identify plant secondary compound detoxification in leaf-cutter ant gardens as a process that depends on the degradative potential of both the bacterial community and L. gongylophorus. Our findings suggest that the fungus garden and its associated microbial community influence the generalist foraging abilities of the ants, underscoring the importance of microbial symbionts in plant substrate suitability for herbivores.
Affiliation(s)
- Charlotte B Francoeur
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Lily Khadempour
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Rolando D Moreira-Soto
  - Sección de Entomología Medica, Departamento de Parasitología, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica
- Kirsten Gotting
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Laboratory of Genetics, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Adam J Book
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Adrián A Pinto-Tomás
  - Centro de Investigación en Estructuras Microscópicas, Universidad de Costa Rica, San José, Costa Rica
  - Departamento de Bioquímica, Facultad de Medicina, Universidad de Costa Rica, San José, Costa Rica
  - Centro de Investigación en Biología Celular y Molecular, Universidad de Costa Rica, San José, Costa Rica
- Ken Keefover-Ring
  - Departments of Botany and Geography, University of Wisconsin-Madison, Madison, Wisconsin, USA
- Cameron R Currie
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, Wisconsin, USA
  - Department of Energy Great Lakes Bioenergy Research Center, University of Wisconsin-Madison, Madison, Wisconsin, USA
|
68
|
Comparative genomics and gene-trait matching analysis of Bifidobacterium breve from Chinese children. Food Biosci 2020. [DOI: 10.1016/j.fbio.2020.100631] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.4] [Indexed: 12/18/2022]
|
69
|
Ochoa A, Broe M, Moriarty Lemmon E, Lemmon AR, Rokyta DR, Gibbs HL. Drift, selection and adaptive variation in small populations of a threatened rattlesnake. Mol Ecol 2020; 29:2612-2625. [PMID: 32557885 DOI: 10.1111/mec.15517] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.0] [Received: 11/26/2019] [Revised: 05/09/2020] [Accepted: 05/21/2020] [Indexed: 01/22/2023]
Abstract
An important goal of conservation genetics is to determine if the viability of small populations is reduced by a loss of adaptive variation due to genetic drift. Here, we assessed the impact of drift and selection on direct measures of adaptive variation (toxin loci encoding venom proteins) in the eastern massasauga rattlesnake (Sistrurus catenatus), a threatened reptile that exists in small isolated populations. We estimated levels of individual polymorphism in 46 toxin loci and 1,467 control loci across 12 populations of this species, and compared the results with patterns of selection on the same loci following speciation of S. catenatus and its closest relative, the western massasauga (S. tergeminus). Multiple lines of evidence suggest that both drift and selection have had observable impacts on standing adaptive variation. In support of drift effects, we found little evidence for selection on toxin variation within populations and a significant positive relationship between current levels of adaptive variation and long- and short-term estimates of effective population size. However, we also observed levels of directional selection on toxin loci among populations that are broadly similar to patterns predicted from interspecific selection analyses that pre-date the effects of recent drift, and that functional variation in these loci persists despite small short-term effective sizes. This suggests that much of the adaptive variation present in populations may represent an example of "drift debt," a nonequilibrium state where present-day levels of variation overestimate the amount of functional genetic diversity present in future populations.
Affiliation(s)
- Alexander Ochoa
  - Ohio Biodiversity Conservation Partnership and Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus, OH, USA
- Michael Broe
  - Ohio Biodiversity Conservation Partnership and Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus, OH, USA
- Alan R Lemmon
  - Department of Scientific Computing, Florida State University, Tallahassee, FL, USA
- Darin R Rokyta
  - Department of Biological Science, Florida State University, Tallahassee, FL, USA
- H Lisle Gibbs
  - Ohio Biodiversity Conservation Partnership and Department of Evolution, Ecology, and Organismal Biology, Ohio State University, Columbus, OH, USA
|
70
|
Human Infections Caused by Clonally Related African Clade (Clade III) Strains of Candida auris in the Greater Houston Region. J Clin Microbiol 2020; 58:JCM.02063-19. [PMID: 32295894 DOI: 10.1128/jcm.02063-19] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 12/13/2019] [Accepted: 04/09/2020] [Indexed: 11/20/2022] Open
Abstract
Candida auris is a pathogen of considerable public health importance. It was first reported in 2009. Five clades, determined by genomic analysis and named by the distinct regions where they were initially identified, have been defined. We previously completed a draft genome sequence of an African clade (clade III) strain cultured from the urine of a patient hospitalized in the greater Houston metropolitan region (strain LOM). Although initially uncommon, reports of the African clade in the United States have grown to include a recent cluster in California. Here, we describe a second human C. auris infection in the Houston area. Whole-genome sequence analysis demonstrated the Houston patient isolates to be clonally related to one another but distantly related to other African clade organisms recovered in the United States or elsewhere. Infections in these patients were present on admission to the hospital and occurred several months apart. Taken together, the data demonstrate the emergence and persistence of a clonal C. auris population and highlight the importance of routine high-resolution genomic surveillance of emerging human pathogens in the clinical laboratory.
|
71
|
Shi W, Qi H, Sun Q, Fan G, Liu S, Wang J, Zhu B, Liu H, Zhao F, Wang X, Hu X, Li W, Liu J, Tian Y, Wu L, Ma J. gcMeta: a Global Catalogue of Metagenomics platform to support the archiving, standardization and analysis of microbiome data. Nucleic Acids Res 2019; 47:D637-D648. [PMID: 30365027 PMCID: PMC6324004 DOI: 10.1093/nar/gky1008] [Citation(s) in RCA: 40] [Impact Index Per Article: 8.0] [Received: 08/14/2018] [Accepted: 10/13/2018] [Indexed: 11/26/2022] Open
Abstract
Meta-omics approaches have been increasingly used to study the structure and function of the microbial communities. A variety of large-scale collaborative projects are being conducted to encompass samples from diverse environments and habitats. This change has resulted in enormous demands for long-term data maintenance and capacity for data analysis. The Global Catalogue of Metagenomics (gcMeta) is a part of the ‘Chinese Academy of Sciences Initiative of Microbiome (CAS-CMI)’, which focuses on studying the human and environmental microbiome, establishing depositories of samples, strains and data, as well as promoting international collaboration. To accommodate and rationally organize massive datasets derived from several thousands of human and environmental microbiome samples, gcMeta features a database management system for archiving and publishing data in a standardized way. Another main feature is the integration of more than ninety web-based data analysis tools and workflows through a Docker platform which enables data analysis by using various operating systems. This platform has been rapidly expanding, and now hosts data from the CAS-CMI and a number of other ongoing research projects. In conclusion, this platform presents a powerful and user-friendly service to support worldwide collaborative efforts in the field of meta-omics research. This platform is freely accessible at https://gcmeta.wdcm.org/.
Affiliation(s)
- Wenyu Shi
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Heyuan Qi
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Qinglan Sun
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Guomei Fan
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Shuangjiang Liu
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China.,State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Jun Wang
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China
| | - Baoli Zhu
- CAS Key Laboratory of Pathogenic Microbiology and Immunology, Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China.,University of Chinese Academy of Sciences, Beijing 100049, China.,Collaborative Innovation Centre for Diagnosis and Treatment of Infectious Diseases First Affiliated Hospital, College of Medicine, Zhejiang University, Hangzhou 310058, China.,Beijing Key Laboratory of Antimicrobial Resistance and Pathogen Genomics, Beijing 100101, China
| | - Hongwei Liu
- State Key Laboratory of Mycology, Institute of Microbiology, Chinese Academy of Science, Beijing 100101, China
| | - Fangqing Zhao
- Computational Genomics Lab, Beijing Institutes of Life Science, Chinese Academy of Sciences, Beijing 100101, China
| | - Xiaochen Wang
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Xiaoxuan Hu
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Wei Li
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Jia Liu
- Internet of Things Information Technology and Application Laboratory, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100101, China
| | - Ye Tian
- Internet of Things Information Technology and Application Laboratory, Computer Network Information Center, Chinese Academy of Sciences, Beijing 100101, China
| | - Linhuan Wu
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China.,State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| | - Juncai Ma
- Microbial Resource and Big Data Center, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China.,State Key Laboratory of Microbial Resources, Institute of Microbiology, Chinese Academy of Sciences, Beijing 100101, China
| |
|
72
|
Mitchell K, Brito JJ, Mandric I, Wu Q, Knyazev S, Chang S, Martin LS, Karlsberg A, Gerasimov E, Littman R, Hill BL, Wu NC, Yang HT, Hsieh K, Chen L, Littman E, Shabani T, Enik G, Yao D, Sun R, Schroeder J, Eskin E, Zelikovsky A, Skums P, Pop M, Mangul S. Benchmarking of computational error-correction methods for next-generation sequencing data. Genome Biol 2020; 21:71. [PMID: 32183840 PMCID: PMC7079412 DOI: 10.1186/s13059-020-01988-3] [Citation(s) in RCA: 23] [Impact Index Per Article: 4.6] [Received: 05/18/2019] [Accepted: 03/06/2020] [Indexed: 12/16/2022] Open
Abstract
BACKGROUND Recent advancements in next-generation sequencing have rapidly improved our ability to study genomic material at an unprecedented scale. Despite substantial improvements in sequencing technologies, errors present in the data still risk confounding downstream analysis and limiting the applicability of sequencing technologies in clinical tools. Computational error correction promises to eliminate sequencing errors, but the relative accuracy of error correction algorithms remains unknown. RESULTS In this paper, we evaluate the ability of error correction algorithms to fix errors across different types of datasets that contain various levels of heterogeneity. We highlight the advantages and limitations of computational error correction techniques across different domains of biology, including immunogenomics and virology. To demonstrate the efficacy of our technique, we apply the UMI-based high-fidelity sequencing protocol to eliminate sequencing errors from both simulated data and the raw reads. We then perform a realistic evaluation of error-correction methods. CONCLUSIONS In terms of accuracy, we find that method performance varies substantially across different types of datasets with no single method performing best on all types of examined data. Finally, we also identify the techniques that offer a good balance between precision and sensitivity.
Affiliation(s)
- Keith Mitchell
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Jaqueline J Brito
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Igor Mandric
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Qiaozhen Wu
- Department of Mathematics, University of California Los Angeles, 520 Portola Plaza, Los Angeles, CA, 90095, USA
| | - Sergey Knyazev
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Sei Chang
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Lana S Martin
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Aaron Karlsberg
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA
| | - Ekaterina Gerasimov
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Russell Littman
- UCLA Bioinformatics, 621 Charles E Young Dr S, Los Angeles, CA, 90024, USA
| | - Brian L Hill
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Nicholas C Wu
- Department of Integrative Structural and Computational Biology, The Scripps Research Institute, La Jolla, CA, 92037, USA
| | - Harry Taegyun Yang
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Kevin Hsieh
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Linus Chen
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Eli Littman
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Taylor Shabani
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - German Enik
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Douglas Yao
- Department of Molecular, Cell, and Developmental Biology, University of California Los Angeles, 650 Charles E. Young Drive South, Los Angeles, CA, 90095, USA
| | - Ren Sun
- Department of Molecular and Medical Pharmacology, University of California Los Angeles, 650 Charles E. Young Drive South, Los Angeles, CA, 90095, USA
| | - Jan Schroeder
- Epigenetics & Reprogramming Laboratory, Monash University, 15 Innovation Walk, Melbourne, VIC, 3800, Australia
| | - Eleazar Eskin
- Department of Computer Science, University of California Los Angeles, 404 Westwood Plaza, Los Angeles, CA, 90095, USA
| | - Alex Zelikovsky
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
- The Laboratory of Bioinformatics, I.M. Sechenov First Moscow State Medical University, Moscow, Russia, 119991
| | - Pavel Skums
- Department of Computer Science, Georgia State University, 1 Park Place, Atlanta, GA, 30303, USA
| | - Mihai Pop
- Department of Computer Science and Center for Bioinformatics and Computational Biology, University of Maryland, College Park, MD, 20742, USA
| | - Serghei Mangul
- Department of Clinical Pharmacy, School of Pharmacy, University of Southern California, 1985 Zonal Avenue, Los Angeles, CA, 90089, USA.
| |
|
73
|
Jiménez-Ruiz J, Ramírez-Tejero JA, Fernández-Pozo N, Leyva-Pérez MDLO, Yan H, Rosa RDL, Belaj A, Montes E, Rodríguez-Ariza MO, Navarro F, Barroso JB, Beuzón CR, Valpuesta V, Bombarely A, Luque F. Transposon activation is a major driver in the genome evolution of cultivated olive trees (Olea europaea L.). Plant Genome 2020; 13:e20010. [PMID: 33016633 DOI: 10.1002/tpg2.20010] [Citation(s) in RCA: 47] [Impact Index Per Article: 9.4] [Received: 07/26/2019] [Accepted: 01/15/2020] [Indexed: 05/25/2023]
Abstract
The primary domestication of olive (Olea europaea L.) in the Levant dates back to the Neolithic period, around 6,000-5,500 BC, as some archeological remains attest. Cultivated olive trees are reproduced clonally, with sexual crosses being the sporadic events that drive the development of new varieties. To determine the genomic changes that have occurred in a modern olive cultivar, the genome of the 'Picual' cultivar, one of the most popular olive varieties, was sequenced. An additional 40 cultivated and 10 wild accessions were re-sequenced to elucidate the evolution of the olive genome during the domestication process. The genome of the 'Picual' cultivar was found to contain 79,667 gene models, of which 78,079 were protein-coding genes and 1,588 were tRNA. Population analyses support two independent events in olive domestication, including an early possible genetic bottleneck. Despite genetic bottlenecks, cultivated accessions showed a high genetic diversity driven by the activation of transposable elements (TEs). High TE gene expression was observed in presently cultivated olives, which suggests that TEs are currently active in domesticated olives. Several TE families expanded in the last 5,000 or 6,000 years and produced insertions near genes that may have been involved in traits selected during domestication, such as reproduction, photosynthesis, seed development, and oil production. Therefore, great genetic variability has been found in cultivated olive as a result of a significant activation of TEs during the domestication process.
Affiliation(s)
- Jaime Jiménez-Ruiz
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| | - Jorge A Ramírez-Tejero
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| | - Noé Fernández-Pozo
- Plant Cell Biology, Faculty of Biology, University of Marburg, Marburg, Germany
| | - María de la O Leyva-Pérez
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| | - Haidong Yan
- School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
| | - Raúl de la Rosa
- Centro de Investigación y Formación Agraria de Alameda del Obispo, Instituto de Investigación y Formación Agraria y Pesquera (IFAPA), Córdoba, Spain
| | - Angjelina Belaj
- Centro de Investigación y Formación Agraria de Alameda del Obispo, Instituto de Investigación y Formación Agraria y Pesquera (IFAPA), Córdoba, Spain
| | - Eva Montes
- Instituto Universitario de Investigación en Arqueología Ibérica, University of Jaén, Jaén, 23071, Spain
| | - Mª Oliva Rodríguez-Ariza
- Instituto Universitario de Investigación en Arqueología Ibérica, University of Jaén, Jaén, 23071, Spain
| | - Francisco Navarro
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| | - Juan Bautista Barroso
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| | - Carmen R Beuzón
- Departamento de Biología Celular, Genética y Fisiología, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea, Universidad de Málaga - Consejo Superior de Investigaciones Científicas, Málaga, Spain
| | - Victoriano Valpuesta
- Departamento de Biología Molecular y Bioquímica, Facultad de Ciencias, Instituto de Hortofruticultura Subtropical y Mediterránea, Universidad de Málaga - Consejo Superior de Investigaciones Científicas, Málaga, Spain
| | - Aureliano Bombarely
- School of Plants and Environmental Sciences, Virginia Tech, Blacksburg, VA, 24061, USA
- present address, Department of Bioscience, Universita degli Studi di Milano, Milan, 20133, Italy
| | - Francisco Luque
- Center for Advanced Studies in Olive Grove and Olive Oils, Department of Experimental Biology, University of Jaén, Jaén, 23071, Spain
| |
|
74
|
Panyukov VV, Kiselev SS, Ozoline ON. Unique k-mers as Strain-Specific Barcodes for Phylogenetic Analysis and Natural Microbiome Profiling. Int J Mol Sci 2020; 21:944. [PMID: 32023871 PMCID: PMC7037511 DOI: 10.3390/ijms21030944] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 11/29/2019] [Revised: 01/21/2020] [Accepted: 01/28/2020] [Indexed: 02/07/2023] Open
Abstract
The need for a comparative analysis of natural metagenomes stimulated the development of new methods for their taxonomic profiling. Alignment-free approaches based on the search for marker k-mers turned out to be capable of identifying not only species, but also strains of microorganisms with known genomes. Here, we evaluated the ability of genus-specific k-mers to distinguish eight phylogroups of Escherichia coli (A, B1, C, E, D, F, G, B2) and assessed the presence of their unique 22-mers in clinical samples from microbiomes of four healthy people and four patients with Crohn's disease. We found that a phylogenetic tree inferred from the pairwise distance matrix for unique 18-mers and 22-mers of 124 genomes was fully consistent with the topology of the tree, obtained with concatenated aligned sequences of orthologous genes. Therefore, we propose strain-specific "barcodes" for rapid phylotyping. Using unique 22-mers for taxonomic analysis, we detected microbes of all groups in human microbiomes; however, their presence in the five samples was significantly different. Pointing to the intraspecies heterogeneity of E. coli in the natural microflora, this also indicates the feasibility of further studies of the role of this heterogeneity in maintaining population homeostasis.
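The idea of a "unique k-mer" in the abstract above can be illustrated with a minimal sketch: a k-mer counts as a barcode for a genome if it occurs in that genome and in no other genome of the set. The function name `unique_kmers`, the toy sequences, and the small k are assumptions made for illustration only; the authors' actual pipeline (22-mers over 124 genomes) is not reproduced here, and reverse complements are ignored.

```python
from collections import Counter


def unique_kmers(genomes, k):
    """Map each genome name to the set of k-mers found in that genome only.

    genomes: dict of name -> sequence string (hypothetical toy input).
    """
    # Distinct k-mers present in each genome.
    kmer_sets = {
        name: {seq[i:i + k] for i in range(len(seq) - k + 1)}
        for name, seq in genomes.items()
    }
    # Number of genomes in which each k-mer appears.
    owners = Counter(kmer for kmers in kmer_sets.values() for kmer in kmers)
    # Keep only k-mers seen in exactly one genome.
    return {
        name: {kmer for kmer in kmers if owners[kmer] == 1}
        for name, kmers in kmer_sets.items()
    }


toy = {"strainA": "ACGTT", "strainB": "ACGTA"}
barcodes = unique_kmers(toy, 3)
```

With these toy inputs, `ACG` and `CGT` are shared and therefore discarded, leaving one distinguishing 3-mer per strain.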
Affiliation(s)
- Valery V. Panyukov
- Institute of Mathematical Problems of Biology RAS—the Branch of Keldysh Institute of Applied Mathematics of Russian Academy of Sciences, 142290 Pushchino, Russia
- Structural and Functional Genomics Group, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Russia
| | - Sergey S. Kiselev
- Structural and Functional Genomics Group, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Russia
- Institute of Cell Biophysics of the Russian Academy of Sciences, 142290 Pushchino, Russia
| | - Olga N. Ozoline
- Structural and Functional Genomics Group, Federal Research Center “Pushchino Scientific Center for Biological Research of the Russian Academy of Sciences”, 142290 Pushchino, Russia
- Institute of Cell Biophysics of the Russian Academy of Sciences, 142290 Pushchino, Russia
| |
|
75
|
Das AK, Goswami S, Lee K, Park SJ. A hybrid and scalable error correction algorithm for indel and substitution errors of long reads. BMC Genomics 2019; 20:948. [PMID: 31856721 PMCID: PMC6923905 DOI: 10.1186/s12864-019-6286-9] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Indexed: 01/23/2023] Open
Abstract
BACKGROUND Long-read sequencing has shown promise in overcoming the short-length limitations of second-generation sequencing by providing more complete assemblies. However, the computation of the long sequencing reads is challenged by their higher error rates (e.g., 13% vs. 1%) and higher cost ($0.3 vs. $0.03 per Mbp) compared to the short reads. METHODS In this paper, we present a new hybrid error correction tool, called ParLECH (Parallel Long-read Error Correction using Hybrid methodology). The error correction algorithm of ParLECH is distributed in nature and efficiently utilizes the k-mer coverage information of high-throughput Illumina short-read sequences to rectify the PacBio long-read sequences. ParLECH first constructs a de Bruijn graph from the short reads, and then replaces the indel error regions of the long reads with their corresponding widest path (or maximum min-coverage path) in the short read-based de Bruijn graph. ParLECH then utilizes the k-mer coverage information of the short reads to divide each long read into a sequence of low and high coverage regions, followed by a majority voting to rectify each substituted error base. RESULTS ParLECH outperforms the latest state-of-the-art hybrid error correction methods on real PacBio datasets. Our experimental evaluation results demonstrate that ParLECH can correct large-scale real-world datasets in an accurate and scalable manner. ParLECH can correct the indel errors of human genome PacBio long reads (312 GB) with Illumina short reads (452 GB) in less than 29 h using 128 compute nodes. ParLECH can align more than 92% of the bases of an E. coli PacBio dataset with the reference genome, proving its accuracy. CONCLUSION ParLECH can scale to over terabytes of sequencing data using hundreds of computing nodes. The proposed hybrid error correction methodology is novel and rectifies both indel and substitution errors present in the original long reads or newly introduced by the short reads.
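The "widest path (or maximum min-coverage path)" named in this abstract is the path whose weakest edge is as strong as possible. A minimal sketch follows, assuming a small directed graph with positive coverage values on its edges and using a Dijkstra-style search ordered by bottleneck width. The function name `widest_path`, the node labels, and the coverage numbers are hypothetical; this is not ParLECH's distributed implementation.

```python
import heapq


def widest_path(graph, source, target):
    """Return (bottleneck, path) maximizing the minimum edge coverage.

    graph: dict of node -> list of (neighbor, coverage) pairs, coverage > 0.
    Returns (0, []) when target cannot be reached from source.
    """
    best = {source: float("inf")}  # best known bottleneck per node
    prev = {}                      # predecessor on the best path
    heap = [(-float("inf"), source)]  # max-heap via negated widths
    while heap:
        neg_width, node = heapq.heappop(heap)
        width = -neg_width
        if node == target:
            break
        if width < best.get(node, 0):
            continue  # stale heap entry
        for nxt, coverage in graph.get(node, []):
            cand = min(width, coverage)  # bottleneck if we extend via this edge
            if cand > best.get(nxt, 0):
                best[nxt] = cand
                prev[nxt] = node
                heapq.heappush(heap, (-cand, nxt))
    if target not in best:
        return 0, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return best[target], path[::-1]


# Toy graph: two routes from "start" to "end" with different weakest edges.
toy_graph = {
    "start": [("u", 12), ("v", 3)],
    "u": [("end", 9)],
    "v": [("end", 30)],
}
result = widest_path(toy_graph, "start", "end")
```

In the toy graph, the route through `u` has a weakest edge of 9 while the route through `v` is limited to 3, so the search prefers the route through `u` even though the other route contains the single highest-coverage edge.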
Affiliation(s)
- Arghya Kusum Das
- Department of Computer Science and Software Engineering, University of Wisconsin at Platteville, Platteville, WI USA
| | - Sayan Goswami
- School of Electrical Engineering and Computer Science, Center for Computation and Technology, Louisiana State University, Baton Rouge, Baton Rouge, LA USA
| | - Kisung Lee
- School of Electrical Engineering and Computer Science, Center for Computation and Technology, Louisiana State University, Baton Rouge, Baton Rouge, LA USA
| | - Seung-Jong Park
- School of Electrical Engineering and Computer Science, Center for Computation and Technology, Louisiana State University, Baton Rouge, Baton Rouge, LA USA
| |
|
76
|
Lamelza P, Young JM, Noble LM, Caro L, Isakharov A, Palanisamy M, Rockman MV, Malik HS, Ailion M. Hybridization promotes asexual reproduction in Caenorhabditis nematodes. PLoS Genet 2019; 15:e1008520. [PMID: 31841515 PMCID: PMC6946170 DOI: 10.1371/journal.pgen.1008520] [Citation(s) in RCA: 6] [Impact Index Per Article: 1.0] [Received: 04/27/2019] [Revised: 01/07/2020] [Accepted: 11/15/2019] [Indexed: 02/04/2023] Open
Abstract
Although most unicellular organisms reproduce asexually, most multicellular eukaryotes are obligately sexual. This implies that there are strong barriers that prevent the origin or maintenance of asexuality arising from an obligately sexual ancestor. By studying rare asexual animal species we can gain a better understanding of the circumstances that facilitate their evolution from a sexual ancestor. Of the known asexual animal species, many originated by hybridization between two ancestral sexual species. The balance hypothesis predicts that genetic incompatibilities between the divergent genomes in hybrids can modify meiosis and facilitate asexual reproduction, but there are few instances where this has been shown. Here we report that hybridizing two sexual Caenorhabditis nematode species (C. nouraguensis females and C. becei males) alters the normal inheritance of the maternal and paternal genomes during the formation of hybrid zygotes. Most offspring of this interspecies cross die during embryogenesis, exhibiting inheritance of a diploid C. nouraguensis maternal genome and incomplete inheritance of C. becei paternal DNA. However, a small fraction of offspring develop into viable adults that can be either fertile or sterile. Fertile offspring are produced asexually by sperm-dependent parthenogenesis (also called gynogenesis or pseudogamy); these progeny inherit a diploid maternal genome but fail to inherit a paternal genome. Sterile offspring are hybrids that inherit both a diploid maternal genome and a haploid paternal genome. Whole-genome sequencing of individual viable worms shows that diploid maternal inheritance in both fertile and sterile offspring results from an altered meiosis in C. nouraguensis oocytes and the inheritance of two randomly selected homologous chromatids. We hypothesize that hybrid incompatibility between C. nouraguensis and C. becei modifies maternal and paternal genome inheritance and indirectly induces gynogenetic reproduction. This system can be used to dissect the molecular mechanisms by which hybrid incompatibilities can facilitate the emergence of asexual reproduction.
Affiliation(s)
- Piero Lamelza
- Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, Washington, United States of America
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| | - Janet M. Young
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Luke M. Noble
- Department of Biology and Center for Genomics & Systems Biology, New York University, New York, New York, United States of America
| | - Lews Caro
- Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, Washington, United States of America
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| | - Arielle Isakharov
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| | - Meenakshi Palanisamy
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| | - Matthew V. Rockman
- Department of Biology and Center for Genomics & Systems Biology, New York University, New York, New York, United States of America
| | - Harmit S. Malik
- Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, Washington, United States of America
- Division of Basic Sciences, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
- Howard Hughes Medical Institute, Fred Hutchinson Cancer Research Center, Seattle, Washington, United States of America
| | - Michael Ailion
- Molecular and Cellular Biology Graduate Program, University of Washington, Seattle, Washington, United States of America
- Department of Biochemistry, University of Washington, Seattle, Washington, United States of America
| |
|
77
|
Ge J, Meng J, Guo N, Wei Y, Balaji P, Feng S. Counting Kmers for Biological Sequences at Large Scale. Interdiscip Sci 2019; 12:99-108. [PMID: 31734873 DOI: 10.1007/s12539-019-00348-5] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 07/08/2019] [Revised: 08/19/2019] [Accepted: 10/25/2019] [Indexed: 11/25/2022]
Abstract
Counting the abundance of all the distinct kmers in biological sequence data is a fundamental step in many bioinformatics applications, including de novo genome assembly and error correction. With the development of sequencing technology, the sequence data in a single project can reach Petabyte-scale or Terabyte-scale nucleotides. Counting kmers in data sets of this size is beyond the memory and computing capacity of a single computing node, and processing them efficiently on a high-performance computing cluster is a challenge. As such, we propose SWAPCounter, a highly scalable distributed approach for kmer counting. This approach is embedded with an MPI streaming I/O module for loading huge data sets at high speed, and a counting bloom filter module for both memory and communication efficiency. By overlapping all the counting steps, SWAPCounter achieves high scalability with high parallel efficiency. The experimental results indicate that SWAPCounter performs competitively with two other tools, KMC2 and MSPKmerCounter, in a shared-memory environment. Moreover, SWAPCounter also shows the highest scalability under strong scaling experiments. In our experiment on the Cetus supercomputer, SWAPCounter scales to 32,768 cores with 79% parallel efficiency (using 2048 cores as baseline) when processing 4 TB sequence data of 1000 Genomes. The source code of SWAPCounter is publicly available at https://github.com/mengjintao/SWAPCounter.
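For readers unfamiliar with the task, kmer counting itself can be sketched in a few lines. The example below is an exact, single-process count using a hash table, which is precisely the approach that stops scaling at the data sizes discussed in the abstract; it does not reproduce SWAPCounter's MPI streaming I/O or its counting bloom filter. The function name `count_kmers` and the toy reads are assumptions for illustration.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every length-k substring across a collection of reads.

    Exact in-memory counting; reads shorter than k contribute nothing.
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k along the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


toy_reads = ["ACGTAC", "CGTA"]
kmer_counts = count_kmers(toy_reads, 3)
```

Because every distinct kmer needs its own table entry, memory grows with the number of distinct kmers, which is the motivation for distributed and approximate designs such as the one the abstract describes.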
Affiliation(s)
- Jianqiu Ge
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, 518055, China
| | - Jintao Meng
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, 518055, China
| | - Ning Guo
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, 518055, China
| | - Yanjie Wei
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, 518055, China.
| | - Pavan Balaji
- Mathematics and Computer Science Division, Argonne National Laboratory, Lemont, IL, 60439-4844, USA
| | - Shengzhong Feng
- Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Beijing, 518055, China
| |
|
78
|
De Novo Assembly and Annotation from Parental and F1 Puma Genomes of the Florida Panther Genetic Restoration Program. G3: Genes, Genomes, Genetics 2019; 9:3531-3536. [PMID: 31519748 PMCID: PMC6829145 DOI: 10.1534/g3.119.400629] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.3] [Indexed: 12/30/2022]
Abstract
In the mid-1990s, the population size of Florida panthers became so small that many individuals manifested traits associated with inbreeding depression (e.g., heart defects, cryptorchidism, high pathogen-parasite load). To mitigate these effects, pumas from Texas were introduced into South Florida to augment genetic variation in Florida panthers. In this study, we report a de novo puma genome assembly and annotation after resequencing 10 individual genomes from partial Florida-Texas-F1 trios. The final genome assembly consisted of ∼2.6 Gb and 20,561 functionally annotated protein-coding genes. Foremost, expanded gene families were associated with neuronal and embryological development, whereas contracted gene families were associated with olfactory receptors. Despite the latter, we characterized 17 positively selected genes related to the refinement of multiple sensory perceptions, most notably to visual capabilities. Furthermore, genes under positive selection were enriched for the targeting of proteins to the endoplasmic reticulum, degradation of mRNAs, and transcription of viral genomes. Nearly half (48.5%) of ∼6.2 million SNPs analyzed in the total sample set contained putative unique Texas alleles. Most of these alleles were likely passed on to subsequent F1 Florida panthers, as these individuals manifested a threefold increase in observed heterozygosity with respect to their immediate, canonical Florida panther predecessors. Demographic simulations were consistent with a recent colonization event in North America by a small number of founders from South America during the last glacial period. In conclusion, we provide an extensive set of genomic resources for pumas and elucidate the genomic effects of genetic rescue on this iconic conservation success story.
|
79
|
Fischer-Hwang I, Ochoa I, Weissman T, Hernaez M. Denoising of Aligned Genomic Data. Sci Rep 2019; 9:15067. [PMID: 31636330 PMCID: PMC6803637 DOI: 10.1038/s41598-019-51418-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.7] [Received: 01/07/2019] [Accepted: 09/25/2019] [Indexed: 12/30/2022] Open
Abstract
Noise in genomic sequencing data is known to affect various stages of genomic data analysis pipelines. Variant identification is an important step of many of these pipelines, and is increasingly being used in clinical settings to aid medical practices. We propose a denoising method, dubbed SAMDUDE, which operates on aligned genomic data in order to improve variant calling performance. Denoising human data with SAMDUDE resulted in improved variant identification in both individual chromosome and whole genome sequencing (WGS) data sets. In the WGS data set, denoising led to identification of almost 2,000 additional true variants, and elimination of over 1,500 erroneously identified variants. In contrast, we found that denoising with other state-of-the-art denoisers significantly worsens variant calling performance. SAMDUDE is written in Python and is freely available at https://github.com/ihwang/SAMDUDE.
Affiliation(s)
- Irena Fischer-Hwang
  - Stanford University, Department of Electrical Engineering, Stanford, 94305, USA
- Idoia Ochoa
  - University of Illinois Urbana-Champaign, Department of Electrical and Computer Engineering, Urbana, 61801, USA
- Tsachy Weissman
  - Stanford University, Department of Electrical Engineering, Stanford, 94305, USA
- Mikel Hernaez
  - University of Illinois Urbana-Champaign, Carl R. Woese Institute for Genomic Biology, Urbana, 61801, USA
|
80
|
Savin C, Criscuolo A, Guglielmini J, Le Guern AS, Carniel E, Pizarro-Cerdá J, Brisse S. Genus-wide Yersinia core-genome multilocus sequence typing for species identification and strain characterization. Microb Genom 2019; 5:e000301. [PMID: 31580794 PMCID: PMC6861861 DOI: 10.1099/mgen.0.000301] [Citation(s) in RCA: 31] [Impact Index Per Article: 5.2] [Received: 08/01/2019] [Accepted: 09/16/2019] [Indexed: 11/18/2022] Open
Abstract
The genus Yersinia comprises species that differ widely in their pathogenic potential and public-health significance. Yersinia pestis is responsible for plague, while Yersinia enterocolitica is a prominent enteropathogen. Strains within some species, including Y. enterocolitica, also vary in their pathogenic properties. Phenotypic identification of Yersinia species is time-consuming, labour-intensive and may lead to incorrect identifications. Here, we developed a method to automatically identify and subtype all Yersinia isolates from their genomic sequence. A phylogenetic analysis of Yersinia isolates based on a core subset of 500 shared genes clearly demarcated all existing Yersinia species and uncovered novel, yet undefined Yersinia taxa. An automated taxonomic assignment procedure was developed using species-specific thresholds based on core-genome multilocus sequence typing (cgMLST). The performance of this method was assessed on 1843 isolates prospectively collected by the French National Surveillance System and analysed in parallel using phenotypic reference methods, leading to nearly complete (1814; 98.4 %) agreement at species and infra-specific (biotype and serotype) levels. For 29 isolates, incorrect phenotypic assignments resulted from atypical biochemical characteristics or lack of phenotypic resolution. To provide an identification tool, a database of cgMLST profiles and reference taxonomic information has been made publicly accessible (https://bigsdb.pasteur.fr/yersinia). Genomic sequencing-based identification and subtyping of any Yersinia is a powerful and reliable novel approach to define the pathogenic potential of isolates of this medically important genus.
Affiliation(s)
- Cyril Savin
  - Yersinia Research Unit, Institut Pasteur, Paris, France
  - National Reference Laboratory for Plague and Other Yersinioses, Institut Pasteur, Paris, France
  - WHO Collaborating Centre for Yersinia, Institut Pasteur, Paris, France
- Alexis Criscuolo
  - Hub de Bioinformatique et Biostatistique – Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
- Julien Guglielmini
  - Hub de Bioinformatique et Biostatistique – Département Biologie Computationnelle, Institut Pasteur, USR 3756 CNRS, Paris, France
- Anne-Sophie Le Guern
  - Yersinia Research Unit, Institut Pasteur, Paris, France
  - National Reference Laboratory for Plague and Other Yersinioses, Institut Pasteur, Paris, France
  - WHO Collaborating Centre for Yersinia, Institut Pasteur, Paris, France
- Elisabeth Carniel
  - Yersinia Research Unit, Institut Pasteur, Paris, France
  - National Reference Laboratory for Plague and Other Yersinioses, Institut Pasteur, Paris, France
  - WHO Collaborating Centre for Yersinia, Institut Pasteur, Paris, France
- Javier Pizarro-Cerdá
  - Yersinia Research Unit, Institut Pasteur, Paris, France
  - National Reference Laboratory for Plague and Other Yersinioses, Institut Pasteur, Paris, France
  - WHO Collaborating Centre for Yersinia, Institut Pasteur, Paris, France
- Sylvain Brisse
  - Biodiversity and Epidemiology of Bacterial Pathogens, Institut Pasteur, Paris, France
|
81
|
Chevrette MG, Carlos-Shanley C, Louie KB, Bowen BP, Northen TR, Currie CR. Taxonomic and Metabolic Incongruence in the Ancient Genus Streptomyces. Front Microbiol 2019; 10:2170. [PMID: 31616394 PMCID: PMC6763951 DOI: 10.3389/fmicb.2019.02170] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 07/08/2019] [Accepted: 09/04/2019] [Indexed: 12/15/2022] Open
Abstract
The advent of culture-independent approaches has greatly facilitated insights into the vast diversity of bacteria and the ecological importance they hold in nature and human health. Recently, metagenomic surveys and other culture-independent methods have begun to describe the distribution and diversity of microbial metabolism across environmental conditions, often using the 16S rRNA gene as a marker to group bacteria into taxonomic units. However, the extent to which similarity at the conserved ribosomal 16S gene correlates with different measures of phylogeny, metabolic diversity, and ecologically relevant gene content remains contentious. Here, we examine the relationship between 16S identity, core genome divergence, and metabolic gene content across the ancient and ecologically important genus Streptomyces. We assessed and quantified the high variability of average nucleotide identity (ANI) and ortholog presence/absence within Streptomyces, even in strains identical by 16S. Furthermore, we identified key differences in shared ecologically important characters, such as antibiotic resistance, carbohydrate metabolism, biosynthetic gene clusters (BGCs), and other metabolic hallmarks, within 16S identities commonly treated as the same operational taxonomic units (OTUs). Differences between common phylogenetic measures and metabolite-gene annotations confirmed this incongruence. Our results highlight the metabolic diversity and variability within OTUs and add to the growing body of work suggesting that 16S-based studies of Streptomyces fail to resolve important ecological and metabolic characteristics.
Affiliation(s)
- Marc G Chevrette
  - Department of Plant Pathology, Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, WI, United States
- Katherine B Louie
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Joint Genome Institute, Berkeley, CA, United States
- Benjamin P Bowen
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Joint Genome Institute, Berkeley, CA, United States
- Trent R Northen
  - Environmental Genomics and Systems Biology, Lawrence Berkeley National Laboratory, Joint Genome Institute, Berkeley, CA, United States
- Cameron R Currie
  - Department of Bacteriology, University of Wisconsin-Madison, Madison, WI, United States
|
82
|
Rowe WPM. When the levee breaks: a practical guide to sketching algorithms for processing the flood of genomic data. Genome Biol 2019; 20:199. [PMID: 31519212 PMCID: PMC6744645 DOI: 10.1186/s13059-019-1809-x] [Citation(s) in RCA: 18] [Impact Index Per Article: 3.0] [Received: 04/12/2019] [Accepted: 09/02/2019] [Indexed: 01/21/2023] Open
Abstract
Considerable advances in genomics over the past decade have resulted in vast amounts of data being generated and deposited in global archives. The growth of these archives exceeds our ability to process their content, leading to significant analysis bottlenecks. Sketching algorithms produce small, approximate summaries of data and have shown great utility in tackling this flood of genomic data, while using minimal compute resources. This article reviews the current state of the field, focusing on how the algorithms work and how genomicists can utilize them effectively. References to interactive workbooks for explaining concepts and demonstrating workflows are included at https://github.com/will-rowe/genome-sketching.
Affiliation(s)
- Will P M Rowe
  - Institute of Microbiology and Infection, School of Biosciences, University of Birmingham, Birmingham, B15 2TT, UK
  - Scientific Computing Department, The Hartree Centre, STFC Daresbury Laboratory, Warrington, WA4 4AD, UK
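As an illustration of what a "small, approximate summary" can look like, the sketch below implements a bottom-k MinHash over k-mers and uses it to estimate the Jaccard similarity of two sequences. It is a minimal example and is not taken from the reviewed article or its workbooks; the function names, the choice of BLAKE2b as the hash, and the default `k` and `size` values are all assumptions made for illustration.

```python
import hashlib


def kmer_set(seq: str, k: int) -> set:
    """Return the set of distinct length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def hash64(kmer: str) -> int:
    """Map a k-mer to a 64-bit integer (BLAKE2b is an arbitrary choice here)."""
    return int.from_bytes(hashlib.blake2b(kmer.encode(), digest_size=8).digest(), "big")


def minhash_sketch(seq: str, k: int = 4, size: int = 8) -> list:
    """Bottom-k sketch: keep only the `size` smallest k-mer hashes."""
    return sorted({hash64(km) for km in kmer_set(seq, k)})[:size]


def jaccard_estimate(sketch_a: list, sketch_b: list, size: int = 8) -> float:
    """Estimate Jaccard similarity from two bottom-k sketches."""
    merged = sorted(set(sketch_a) | set(sketch_b))[:size]
    if not merged:
        return 0.0  # both inputs were shorter than k
    shared = set(sketch_a) & set(sketch_b)
    return sum(1 for h in merged if h in shared) / len(merged)


same = jaccard_estimate(minhash_sketch("ACGTACGTTGCA"), minhash_sketch("ACGTACGTTGCA"))
different = jaccard_estimate(minhash_sketch("AAAAAAAA"), minhash_sketch("CCCCCCCC"))
```

The sketch size bounds memory regardless of input length, which is the trade-off such methods make: a fixed-size summary in exchange for an approximate answer.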
|
83
|
Draft Genome Sequence of a Chryseobacterium indologenes Strain Isolated from a Blood Culture of a Hospitalized Child in Antananarivo, Madagascar. Microbiol Resour Announc 2019; 8:8/35/e00752-19. [PMID: 31467100 PMCID: PMC6715870 DOI: 10.1128/mra.00752-19] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Indexed: 11/20/2022] Open
Abstract
We report here the draft genome sequence of a Chryseobacterium indologenes strain, isolated from a blood culture of a 2.2-year-old child admitted to the hospital for vomiting and coughing. The genome was composed of 5,063,674 bp and had 37.04% GC content. We detected 4,796 genes with predicted protein-coding functions, including those associated with antibiotic resistance.
|
84
|
Nevers A, Doyen A, Malabat C, Néron B, Kergrohen T, Jacquier A, Badis G. Antisense transcriptional interference mediates condition-specific gene repression in budding yeast. Nucleic Acids Res 2019; 46:6009-6025. [PMID: 29788449 PMCID: PMC6158615 DOI: 10.1093/nar/gky342] [Citation(s) in RCA: 33] [Impact Index Per Article: 5.5] [Received: 12/12/2017] [Accepted: 04/23/2018] [Indexed: 12/20/2022] Open
Abstract
Pervasive transcription generates many unstable non-coding transcripts in budding yeast. The transcription of such non-coding RNAs, in particular antisense RNAs (asRNAs), has been shown in a few examples to repress the expression of the associated mRNAs. Yet such a mechanism is not known to commonly contribute to the regulation of a given class of genes. Using a mutant context that stabilized pervasive transcripts, we observed that the least expressed mRNAs during the exponential phase were associated with high levels of asRNAs. These asRNAs also overlapped their corresponding gene promoters with a much higher frequency than average. Interrupting antisense transcription of a subset of genes corresponding to quiescence-enriched mRNAs restored their expression. The underlying mechanism acts in cis and involves several chromatin modifiers. Our results indicate that transcription interference represses up to 30% of the 590 least expressed genes, which include 163 genes with quiescence-enriched mRNAs. We also found that pervasive transcripts constitute a higher fraction of the transcriptome in quiescence relative to the exponential phase, consistent with gene expression itself playing an important role in suppressing pervasive transcription. Accordingly, the HIS1 asRNA, normally only present in quiescence, is expressed in exponential phase upon HIS1 mRNA transcription interruption.
Affiliation(s)
- Alicia Nevers
  - Unité GIM, Institut Pasteur, Paris, France
  - Sorbonne Université Pierre et Marie Curie, Paris, France
- Christophe Malabat
  - Unité GIM, Institut Pasteur, Paris, France
  - Bioinformatics and Biostatistics Hub, C3BI, Institut Pasteur, USR 3756 IP CNRS, Paris, France
- Bertrand Néron
  - Bioinformatics and Biostatistics Hub, C3BI, Institut Pasteur, USR 3756 IP CNRS, Paris, France
- Alain Jacquier
  - Unité GIM, Institut Pasteur, Paris, France
  - CNRS UMR3525, Paris, France
- Gwenael Badis
  - Unité GIM, Institut Pasteur, Paris, France
  - CNRS UMR3525, Paris, France
|
85
|
Pan T, Flick P, Jain C, Liu Y, Aluru S. Kmerind: A Flexible Parallel Library for K-mer Indexing of Biological Sequences on Distributed Memory Systems. IEEE/ACM Transactions on Computational Biology and Bioinformatics 2019; 16:1117-1131. [PMID: 28991750 DOI: 10.1109/tcbb.2017.2760829] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Indexed: 06/07/2023]
Abstract
Counting and indexing fixed-length substrings, or k-mers, in biological sequences is a key step in many bioinformatics tasks including genome alignment and mapping, genome assembly, and error correction. While advances in next generation sequencing technologies have dramatically reduced the cost and improved latency and throughput, few bioinformatics tools can efficiently process the datasets at the current generation rate of 1.8 terabases per 3-day experiment from a single sequencer. We present Kmerind, a high performance parallel k-mer indexing library for distributed memory environments. The Kmerind library provides a set of simple and consistent APIs with sequential semantics and parallel implementations that are designed to be flexible and extensible. Kmerind's k-mer counter performs similarly to or better than the best existing k-mer counting tools even on shared memory systems. In a distributed memory environment, Kmerind counts k-mers in a 120 GB sequence read dataset in less than 13 seconds on 1024 Xeon CPU cores, and fully indexes their positions in approximately 17 seconds. Querying for 1 percent of the k-mers in these indices can be completed in 0.23 seconds and 28 seconds, respectively. Kmerind is the first k-mer indexing library for distributed memory environments, and the first extensible library for general k-mer indexing and counting. Kmerind is available at https://github.com/ParBLiSS/kmerind.
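To make the two operations named in the abstract concrete, the following single-process sketch counts k-mers and builds a position index for one sequence. It is only an illustration of the underlying idea and does not use or reproduce the Kmerind API (a distributed-memory library); the function names `count_kmers` and `index_kmers` are hypothetical.

```python
from collections import Counter, defaultdict


def count_kmers(sequence: str, k: int) -> Counter:
    """Count every length-k substring of the sequence."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def index_kmers(sequence: str, k: int) -> dict:
    """Map each k-mer to the list of 0-based positions where it starts."""
    if k <= 0:
        raise ValueError("k must be positive")
    seq = sequence.upper()
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)
    return dict(positions)


counts = count_kmers("ACGTACGT", 3)
index = index_kmers("ACGTACGT", 3)
```

A sequence of length n yields n - k + 1 k-mers, so both structures grow linearly with input size; this is why large read sets call for the parallel, distributed approach the abstract describes.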
|
86
|
Lin CYG, Chao JL, Tsai HK, Chalker D, Yao MC. Setting boundaries for genome-wide heterochromatic DNA deletions through flanking inverted repeats in Tetrahymena thermophila. Nucleic Acids Res 2019; 47:5181-5192. [PMID: 30918956 PMCID: PMC6547420 DOI: 10.1093/nar/gkz209] [Citation(s) in RCA: 7] [Impact Index Per Article: 1.2] [Received: 11/03/2018] [Revised: 03/03/2019] [Accepted: 03/26/2019] [Indexed: 12/13/2022] Open
Abstract
Eukaryotic cells pack their genomic DNA into euchromatin and heterochromatin. Boundaries between these domains have been shown to be set by boundary elements. In Tetrahymena, heterochromatin domains are targeted for deletion from the somatic nuclei through a sophisticated programmed DNA rearrangement mechanism, resulting in the elimination of 34% of the germline genome in ∼10,000 dispersed segments. Here we show that most of these deletions occur consistently, with very limited variation in their boundaries among inbred lines. We identified several potential flanking regulatory sequences, each associated with a subset of deletions, using a genome-wide motif finding approach. These flanking sequences are inverted repeats with the copies located at nearly identical distances from the opposite ends of the deleted regions, suggesting potential roles in boundary determination. By removing and testing two such inverted repeats in vivo, we found that boundary maintenance of the associated deletions was lost. Furthermore, we analyzed the deletion boundaries in mutants of a known boundary-determining protein, Lia3p, and found that the subset of deletions affected by LIA3 knockout contained common features of flanking regulatory sequences. This study suggests a common mechanism for setting deletion boundaries by flanking inverted repeats in Tetrahymena thermophila.
Affiliation(s)
- Chih-Yi Gabriela Lin
  - Institute of Molecular Biology, Academia Sinica, 11529 Taipei, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University, 10617 Taipei, Taiwan
- Ju-Lan Chao
  - Institute of Molecular Biology, Academia Sinica, 11529 Taipei, Taiwan
- Huai-Kuang Tsai
  - Genome and Systems Biology Degree Program, National Taiwan University, 10617 Taipei, Taiwan
  - Institute of Information Science, Academia Sinica, 11529 Taipei, Taiwan
- Douglas Chalker
  - Department of Biology, Washington University in St. Louis, St. Louis, MO 63130, USA
- Meng-Chao Yao
  - Institute of Molecular Biology, Academia Sinica, 11529 Taipei, Taiwan
  - Genome and Systems Biology Degree Program, National Taiwan University, 10617 Taipei, Taiwan
|
87
|
Heydari M, Miclotte G, Van de Peer Y, Fostier J. Illumina error correction near highly repetitive DNA regions improves de novo genome assembly. BMC Bioinformatics 2019; 20:298. [PMID: 31159722 PMCID: PMC6545690 DOI: 10.1186/s12859-019-2906-2] [Citation(s) in RCA: 21] [Impact Index Per Article: 3.5] [Received: 01/15/2019] [Accepted: 05/17/2019] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND Several standalone error correction tools have been proposed to correct sequencing errors in Illumina data in order to facilitate de novo genome assembly. However, in a recent survey, we showed that state-of-the-art assemblers often did not benefit from this pre-correction step. We found that many error correction tools introduce new errors in reads that overlap highly repetitive DNA regions such as low-complexity patterns or short homopolymers, ultimately leading to a more fragmented assembly. RESULTS We propose BrownieCorrector, an error correction tool for Illumina sequencing data that focuses on the correction of only those reads that overlap short DNA patterns that are highly repetitive in the genome. BrownieCorrector extracts all reads that contain such a pattern and clusters them into different groups using a community detection algorithm that takes into account both the sequence similarity between overlapping reads and their respective paired-end reads. Each cluster holds reads that originate from the same genomic region and hence each cluster can be corrected individually, thus providing a consistent correction for all reads within that cluster. CONCLUSIONS BrownieCorrector is benchmarked using six real Illumina datasets for different eukaryotic genomes. The prior use of BrownieCorrector improves assembly results over the use of uncorrected reads in all cases. In comparison with other error correction tools, BrownieCorrector leads to the best assembly results in most cases even though less than 2% of the reads within a dataset are corrected. Additionally, we investigate the impact of error correction on hybrid assembly where the corrected Illumina reads are supplemented with PacBio data. Our results confirm that BrownieCorrector improves the quality of hybrid genome assembly as well. BrownieCorrector is written in standard C++11 and released under GPL license. BrownieCorrector relies on multithreading to take advantage of multi-core/multi-CPU systems. The source code is available at https://github.com/biointec/browniecorrector.
Affiliation(s)
- Mahdi Heydari
  - Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium
  - Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
- Giles Miclotte
  - Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium
  - Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
- Yves Van de Peer
  - Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
  - Center for Plant Systems Biology, VIB, Ghent, B-9052, Belgium
  - Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, B-9052, Belgium
  - Department of Genetics, Genome Research Institute, University of Pretoria, Pretoria, South Africa
- Jan Fostier
  - Department of Information Technology, Ghent University-imec, IDLab, Ghent, B-9052, Belgium
  - Bioinformatics Institute Ghent, Ghent, B-9052, Belgium
|
88
|
Competition among Nasal Bacteria Suggests a Role for Siderophore-Mediated Interactions in Shaping the Human Nasal Microbiota. Appl Environ Microbiol 2019; 85:AEM.02406-18. [PMID: 30578265 DOI: 10.1128/aem.02406-18] [Citation(s) in RCA: 59] [Impact Index Per Article: 9.8] [Received: 10/01/2018] [Accepted: 12/14/2018] [Indexed: 12/26/2022] Open
Abstract
Resources available in the human nasal cavity are limited. Therefore, to successfully colonize the nasal cavity, bacteria must compete for scarce nutrients. Competition may occur directly through interference (e.g., antibiotics) or indirectly by nutrient sequestration. To investigate the nature of nasal bacterial competition, we performed coculture inhibition assays between nasal Actinobacteria and Staphylococcus spp. We found that isolates of coagulase-negative staphylococci (CoNS) were sensitive to growth inhibition by Actinobacteria but that Staphylococcus aureus isolates were resistant to inhibition. Among Actinobacteria, we observed that Corynebacterium spp. were variable in their ability to inhibit CoNS. We sequenced the genomes of 10 Corynebacterium species isolates, including 3 Corynebacterium propinquum isolates that strongly inhibited CoNS and 7 other Corynebacterium species isolates that only weakly inhibited CoNS. Using a comparative genomics approach, we found that the C. propinquum genomes were enriched in genes for iron acquisition and harbored a biosynthetic gene cluster (BGC) for siderophore production, absent in the noninhibitory Corynebacterium species genomes. Using a chrome azurol S assay, we confirmed that C. propinquum produced siderophores. We demonstrated that iron supplementation rescued CoNS from inhibition by C. propinquum, suggesting that inhibition was due to iron restriction through siderophore production. Through comparative metabolomics and molecular networking, we identified the siderophore produced by C. propinquum as dehydroxynocardamine. Finally, we confirmed that the dehydroxynocardamine BGC is expressed in vivo by analyzing human nasal metatranscriptomes from the NIH Human Microbiome Project. Together, our results suggest that bacteria produce siderophores to compete for limited available iron in the nasal cavity and improve their fitness.
IMPORTANCE Within the nasal cavity, interference competition through antimicrobial production is prevalent. For instance, nasal Staphylococcus species strains can inhibit the growth of other bacteria through the production of nonribosomal peptides and ribosomally synthesized and posttranslationally modified peptides. In contrast, bacteria engaging in exploitation competition modify the external environment to prevent competitors from growing, usually by hindering access to or depleting essential nutrients. As the nasal cavity is a nutrient-limited environment, we hypothesized that exploitation competition occurs in this system. We determined that Corynebacterium propinquum produces an iron-chelating siderophore, and this iron-sequestering molecule correlates with the ability to inhibit the growth of coagulase-negative staphylococci. Furthermore, we found that the genes required for siderophore production are expressed in vivo. Thus, although siderophore production by bacteria is often considered a virulence trait, our work indicates that bacteria may produce siderophores to compete for limited iron in the human nasal cavity.
|
89
|
Atypical Hemolytic Listeria innocua Isolates Are Virulent, albeit Less than Listeria monocytogenes. Infect Immun 2019; 87:IAI.00758-18. [PMID: 30670551 DOI: 10.1128/iai.00758-18] [Citation(s) in RCA: 48] [Impact Index Per Article: 8.0] [Received: 01/03/2019] [Accepted: 01/12/2019] [Indexed: 01/26/2023] Open
Abstract
Listeria innocua is considered a nonpathogenic Listeria species. Natural atypical hemolytic L. innocua isolates have been reported but have not been characterized in detail. Here, we report the genomic and functional characterization of representative isolates from the two known natural hemolytic L. innocua clades. Whole-genome sequencing confirmed the presence of Listeria pathogenicity islands (LIPI) characteristic of Listeria monocytogenes species. Functional assays showed that LIPI-1 and inlA genes are transcribed, and the corresponding gene products are expressed and functional. Using in vitro and in vivo assays, we show that atypical hemolytic L. innocua is virulent, can actively cross the intestinal epithelium, and spreads systemically to the liver and spleen, albeit to a lesser degree than the reference L. monocytogenes EGDe strain. Although human exposure to hemolytic L. innocua is likely rare, these findings are important for food safety and public health. The presence of virulence traits in some L. innocua clades supports the existence of a common virulent ancestor of L. monocytogenes and L. innocua.
|
90
|
Whole-Genome Sequences of a Cluster of 14 Unidentified Related Veillonella sp. Strains from Human Clinical Samples and Type Strains of 3 Veillonella Validated Species. Microbiol Resour Announc 2019; 8:8/12/e01743-18. [PMID: 30938711 PMCID: PMC6430328 DOI: 10.1128/mra.01743-18] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/28/2022] Open
Abstract
We report 17 draft genomes for 14 unidentified Veillonella sp. strains closely related in 16S rRNA gene-based phylogeny and type strains of 3 Veillonella species with the aims of deciphering relationships between related species, evaluating the accuracy of current thresholds for species delineation, and robustly describing new species in the genus.
|
91
|
Leclère L, Horin C, Chevalier S, Lapébie P, Dru P, Peron S, Jager M, Condamine T, Pottin K, Romano S, Steger J, Sinigaglia C, Barreau C, Quiroga Artigas G, Ruggiero A, Fourrage C, Kraus JEM, Poulain J, Aury JM, Wincker P, Quéinnec E, Technau U, Manuel M, Momose T, Houliston E, Copley RR. The genome of the jellyfish Clytia hemisphaerica and the evolution of the cnidarian life-cycle. Nat Ecol Evol 2019; 3:801-810. [PMID: 30858591 DOI: 10.1038/s41559-019-0833-2] [Citation(s) in RCA: 88] [Impact Index Per Article: 14.7] [Received: 07/17/2018] [Accepted: 01/30/2019] [Indexed: 12/14/2022]
Abstract
Jellyfish (medusae) are a distinctive life-cycle stage of medusozoan cnidarians. They are major marine predators, with integrated neurosensory, muscular and organ systems. The genetic foundations of this complex form are largely unknown. We report the draft genome of the hydrozoan jellyfish Clytia hemisphaerica and use multiple transcriptomes to determine gene use across life-cycle stages. Medusa, planula larva and polyp are each characterized by distinct transcriptome signatures reflecting abrupt life-cycle transitions and all deploy a mixture of phylogenetically old and new genes. Medusa-specific transcription factors, including many with bilaterian orthologues, associate with diverse neurosensory structures. Compared to Clytia, the polyp-only hydrozoan Hydra has lost many of the medusa-expressed transcription factors, despite similar overall rates of gene content evolution and sequence evolution. Absence of expression and gene loss among Clytia orthologues of genes patterning the anthozoan aboral pole, secondary axis and endomesoderm support simplification of planulae and polyps in Hydrozoa, including loss of bilateral symmetry. Consequently, although the polyp and planula are generally considered the ancestral cnidarian forms, in Clytia the medusa maximally deploys the ancestral cnidarian-bilaterian transcription factor gene complement.
Collapse
Affiliation(s)
- Lucas Leclère
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Coralie Horin
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Sandra Chevalier
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Pascal Lapébie
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,Architecture et Fonction des Macromolécules Biologiques, Aix-Marseille Université, Marseille, France
| | - Philippe Dru
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Sophie Peron
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Muriel Jager
- Evolution Paris-Seine, Institut de Biologie Paris-Seine, Sorbonne Université, CNRS, Paris, France.,Institut de Systématique, Evolution, Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France
| | - Thomas Condamine
- Evolution Paris-Seine, Institut de Biologie Paris-Seine, Sorbonne Université, CNRS, Paris, France
| | - Karen Pottin
- Evolution Paris-Seine, Institut de Biologie Paris-Seine, Sorbonne Université, CNRS, Paris, France.,Laboratoire de Biologie du Développement (IBPS-LBD, UMR7622), Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Paris, France
| | - Séverine Romano
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Julia Steger
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,Laboratoire de Biologie du Développement (IBPS-LBD, UMR7622), Sorbonne Université, CNRS, Institut de Biologie Paris Seine, Paris, France
| | - Chiara Sinigaglia
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,Institut de Génomique Fonctionnelle de Lyon, École Normale Supérieure de Lyon, CNRS UMR 5242-INRA USC 1370, Lyon cedex 07, France
| | - Carine Barreau
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Gonzalo Quiroga Artigas
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,The Whitney Laboratory for Marine Bioscience, University of Florida, St. Augustine, FL, USA
| | - Antonella Ruggiero
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,Centre de Recherche de Biologie cellulaire de Montpellier, CNRS UMR 5237, Université de Montpellier, Montpellier Cedex 5, France
| | - Cécile Fourrage
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.,Service de Génétique UMR 781, Hôpital Necker-APHP, Paris, France
| | - Johanna E M Kraus
- Department for Molecular Evolution and Development, Centre of Organismal Systems Biology, University of Vienna, Vienna, Austria.,Sars International Centre for Marine Molecular Biology, University of Bergen, Bergen, Norway
| | - Julie Poulain
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
| | - Jean-Marc Aury
- Genoscope, Institut de Biologie François-Jacob, Commissariat à l'Energie Atomique, Université Paris-Saclay, Evry, France
| | - Patrick Wincker
- Génomique Métabolique, Genoscope, Institut François Jacob, CEA, CNRS, Univ Evry, Université Paris-Saclay, Evry, France
| | - Eric Quéinnec
- Evolution Paris-Seine, Institut de Biologie Paris-Seine, Sorbonne Université, CNRS, Paris, France.,Institut de Systématique, Evolution, Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France
| | - Ulrich Technau
- Department for Molecular Evolution and Development, Centre of Organismal Systems Biology, University of Vienna, Vienna, Austria
| | - Michaël Manuel
- Evolution Paris-Seine, Institut de Biologie Paris-Seine, Sorbonne Université, CNRS, Paris, France.,Institut de Systématique, Evolution, Biodiversité (ISYEB UMR 7205), Sorbonne Université, MNHN, CNRS, EPHE, Paris, France
| | - Tsuyoshi Momose
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Evelyn Houliston
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France
| | - Richard R Copley
- Laboratoire de Biologie du Développement de Villefranche-sur-mer, Sorbonne Université, CNRS, Villefranche-sur-mer, France.
|
92
|
Hu Y, Resende MFR, Bombarely A, Brym M, Bassil E, Chambers AH. Genomics-based diversity analysis of Vanilla species using a Vanilla planifolia draft genome and Genotyping-By-Sequencing. Sci Rep 2019; 9:3416. [PMID: 30833623 PMCID: PMC6399343 DOI: 10.1038/s41598-019-40144-1] [Citation(s) in RCA: 25] [Impact Index Per Article: 4.2] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 11/16/2018] [Accepted: 02/11/2019] [Indexed: 11/09/2022] Open
Abstract
Demand for all-natural vanilla flavor is increasing, but its botanical source, Vanilla planifolia, faces critical challenges arising from a narrow germplasm base and supply limitations. Genomics tools are the key to overcoming these limitations by enabling advanced genetics and plant breeding for new cultivars with improved yield and quality. The objective of this work was to establish the genomic resources needed to facilitate analysis of diversity among Vanilla accessions and to provide a resource to analyze other Vanilla collections. A V. planifolia draft genome was assembled and used to identify 521,732 single nucleotide polymorphism (SNP) markers using Genotyping-By-Sequencing (GBS). The draft genome had a size of 2.20 Gb representing 97% of the estimated genome size. A filtered set of 5,082 SNPs was used to genotype a living collection of 112 Vanilla accessions from 23 species including native Florida species. Principal component analysis of the genetic distances, population structure, and the maternally inherited rbcL gene identified putative hybrids, misidentified accessions, significant diversity within V. planifolia, and evidence for 12 clusters that separate accessions by species. These results validate the efficiency of genomics-based tools to characterize and identify genetic diversity in Vanilla and provide a significant tool for genomics-assisted plant breeding.
Affiliation(s)
- Ying Hu
- Horticultural Sciences Department, University of Florida, Gainesville, FL, USA
| | - Marcio F R Resende
- Horticultural Sciences Department, University of Florida, Gainesville, FL, USA
| | - Aureliano Bombarely
- School of Plant and Environmental Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA.,Department of Biosciences, Università degli Studi di Milano, Milan, Italy
| | - Maria Brym
- Tropical Research and Education Center, Horticultural Sciences Department, Homestead, FL, USA
| | - Elias Bassil
- Tropical Research and Education Center, Horticultural Sciences Department, Homestead, FL, USA.
| | - Alan H Chambers
- Tropical Research and Education Center, Horticultural Sciences Department, Homestead, FL, USA.
|
93
|
Kachroo P, Eraso JM, Beres SB, Olsen RJ, Zhu L, Nasser W, Bernard PE, Cantu CC, Saavedra MO, Arredondo MJ, Strope B, Do H, Kumaraswami M, Vuopio J, Gröndahl-Yli-Hannuksela K, Kristinsson KG, Gottfredsson M, Pesonen M, Pensar J, Davenport ER, Clark AG, Corander J, Caugant DA, Gaini S, Magnussen MD, Kubiak SL, Nguyen HAT, Long SW, Porter AR, DeLeo FR, Musser JM. Integrated analysis of population genomics, transcriptomics and virulence provides novel insights into Streptococcus pyogenes pathogenesis. Nat Genet 2019; 51:548-559. [PMID: 30778225 PMCID: PMC8547240 DOI: 10.1038/s41588-018-0343-1] [Citation(s) in RCA: 50] [Impact Index Per Article: 8.3] [Reference Citation Analysis] [Abstract] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 04/12/2018] [Accepted: 12/21/2018] [Indexed: 12/22/2022]
Abstract
Streptococcus pyogenes causes 700 million human infections annually worldwide, yet, despite a century of intensive effort, there is no licensed vaccine against this bacterium. Although a number of large-scale genomic studies of bacterial pathogens have been published, the relationships among the genome, transcriptome, and virulence in large bacterial populations remain poorly understood. We sequenced the genomes of 2,101 emm28 S. pyogenes invasive strains, from which we selected 492 phylogenetically diverse strains for transcriptome analysis and 50 strains for virulence assessment. Data integration provided a novel understanding of the virulence mechanisms of this model organism. Genome-wide association study, expression quantitative trait loci analysis, machine learning, and isogenic mutant strains identified and confirmed a one-nucleotide indel in an intergenic region that significantly alters global transcript profiles and ultimately virulence. The integrative strategy that we used is generally applicable to any microbe and may lead to new therapeutics for many human pathogens.
Affiliation(s)
- Priyanka Kachroo
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Jesus M Eraso
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Stephen B Beres
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Randall J Olsen
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
- Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, NY, USA
- Department of Microbiology and Immunology, Weill Cornell Medical College, New York, NY, USA
| | - Luchang Zhu
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Waleed Nasser
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Paul E Bernard
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Concepcion C Cantu
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Matthew Ojeda Saavedra
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - María José Arredondo
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Benjamin Strope
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Hackwon Do
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Muthiah Kumaraswami
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Jaana Vuopio
- Institute of Biomedicine, Medical Microbiology and Immunology, University of Turku, Turku, Finland
- National Institute for Health and Welfare, Helsinki, Finland
| | | | - Karl G Kristinsson
- Department of Clinical Microbiology, Landspitali University Hospital, Reykjavik, Iceland
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
| | - Magnus Gottfredsson
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Department of Infectious Diseases, Landspitali University Hospital, Reykjavik, Iceland
| | - Maiju Pesonen
- Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Department of Computer Science, Aalto University, Espoo, Finland
| | - Johan Pensar
- Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
| | - Emily R Davenport
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
| | - Andrew G Clark
- Department of Molecular Biology and Genetics, Cornell University, Ithaca, NY, USA
| | - Jukka Corander
- Helsinki Institute of Information Technology, Department of Mathematics and Statistics, University of Helsinki, Helsinki, Finland
- Department of Biostatistics, University of Oslo, Oslo, Norway
| | - Dominique A Caugant
- Division for Infection Control and Environmental Health, Norwegian Institute of Public Health, Oslo, Norway
| | - Shahin Gaini
- Medical Department, Infectious Diseases Division, National Hospital of the Faroe Islands, Tórshavn, Denmark
- Department of Infectious Diseases, Odense University Hospital, Odense, Denmark
- Department of Clinical Research, University of Southern Denmark, Odense, Denmark
- Department of Science and Technology, Centre of Health Research, University of the Faroe Islands, Tórshavn, Denmark
| | - Marita Debess Magnussen
- Faculty of Medicine, School of Health Sciences, University of Iceland, Reykjavik, Iceland
- Thetis, Food and Environmental Laboratory, Torshavn, Denmark
| | - Samantha L Kubiak
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Hoang A T Nguyen
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - S Wesley Long
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA
| | - Adeline R Porter
- Laboratory of Bacteriology, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
| | - Frank R DeLeo
- Laboratory of Bacteriology, Rocky Mountain Laboratories, National Institute of Allergy and Infectious Diseases, National Institutes of Health, Hamilton, MT, USA
| | - James M Musser
- Center for Molecular and Translational Human Infectious Diseases Research, Department of Pathology and Genomic Medicine, Houston Methodist Research Institute and Houston Methodist Hospital, Houston, TX, USA.
- Department of Pathology and Laboratory Medicine, Weill Cornell Medical College, New York, NY, USA.
- Department of Microbiology and Immunology, Weill Cornell Medical College, New York, NY, USA.
|
94
|
Manekar SC, Sathe SR. Estimating the k-mer Coverage Frequencies in Genomic Datasets: A Comparative Assessment of the State-of-the-art. Curr Genomics 2019; 20:2-15. [PMID: 31015787 PMCID: PMC6446480 DOI: 10.2174/1389202919666181026101326] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.5] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Revised: 10/05/2018] [Accepted: 10/24/2018] [Indexed: 12/24/2022] Open
Abstract
BACKGROUND In bioinformatics, estimating k-mer abundance histograms, or simply enumerating the number of unique k-mers and the number of singletons, is desirable in many genome sequence analysis applications. These applications include predicting genome sizes, data pre-processing for de Bruijn graph assembly methods (tuning runtime parameters for analysis tools), repeat detection, sequencing coverage estimation, and measuring sequencing error rates. Different methods for cardinality estimation in sequencing data have been developed in recent years. OBJECTIVE In this article, we present a comparative assessment of the different k-mer frequency estimation programs (ntCard, KmerGenie, KmerStream and Khmer (abundance-dist-single.py and unique-kmers.py)) to assess their relative merits and demerits. METHODS Principally, the miscounts/error-rates of these tools are analyzed by rigorous experimental analysis for a varied range of k. We also present experimental results on runtime, scalability for larger datasets, memory, CPU utilization as well as parallelism of k-mer frequency estimation methods. RESULTS The results indicate that ntCard is more accurate in estimating F0, f1 and full k-mer abundance histograms compared with other methods. ntCard is the fastest, but it requires more memory than KmerGenie. CONCLUSION The results of this evaluation may serve as a roadmap to potential users and practitioners of streaming algorithms for estimating k-mer coverage frequencies, to assist them in identifying an appropriate method. Such analysis of the results also helps researchers to discover remaining open research questions, effective combinations of existing techniques and possible avenues for future research.
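To make the quantities in this abstract concrete, the following is a minimal sketch of an exact k-mer abundance histogram. It assumes that F0 denotes the number of distinct k-mers and f1 the number of singletons, as the abstract's wording suggests. The function name `kmer_histogram` and the toy reads are hypothetical. The tools compared in the article estimate these values with streaming or sampling techniques; the exact count below would only serve as a ground truth on small inputs.

```python
from collections import Counter


def kmer_histogram(reads, k):
    """Return (F0, f1, histogram) for all k-mers in `reads`.

    F0        -- number of distinct k-mers (assumed meaning, see lead-in)
    f1        -- number of k-mers seen exactly once (singletons)
    histogram -- maps an abundance c to the number of distinct k-mers seen c times
    """
    counts = Counter()
    for read in reads:
        # Slide a window of width k across the read.
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    histogram = Counter(counts.values())
    f0 = len(counts)
    f1 = histogram.get(1, 0)
    return f0, f1, dict(histogram)


# Hypothetical toy input: two overlapping reads.
f0, f1, hist = kmer_histogram(["ACGTAC", "CGTACT"], 3)
print(f0, f1, hist)
```

An exact counter like this holds every distinct k-mer in memory, which is the cost that estimation-based tools are designed to avoid.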
Affiliation(s)
- Swati C Manekar
- Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
| | - Shailesh R Sathe
- Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur, India
|
95
|
Limasset A, Flot JF, Peterlongo P. Toward perfect reads: self-correction of short reads via mapping on de Bruijn graphs. Bioinformatics 2019; 36:1374-1381. [DOI: 10.1093/bioinformatics/btz102] [Citation(s) in RCA: 20] [Impact Index Per Article: 3.3] [Reference Citation Analysis] [Abstract] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 07/23/2018] [Revised: 01/07/2019] [Accepted: 02/18/2019] [Indexed: 12/25/2022] Open
Abstract
Motivation
Short-read accuracy is important for downstream analyses such as genome assembly and hybrid long-read correction. Despite much work on short-read correction, present-day correctors either do not scale well on large datasets or consider reads as mere suites of k-mers, without taking into account their full-length sequence information.
Results
We propose a new method to correct short reads using de Bruijn graphs and implement it as a tool called Bcool. As a first step, Bcool constructs a compacted de Bruijn graph from the reads. This graph is filtered on the basis of k-mer abundance then of unitig abundance, thereby removing most sequencing errors. The cleaned graph is then used as a reference on which the reads are mapped to correct them. We show that this approach yields more accurate reads than k-mer-spectrum correctors while being scalable to human-size genomic datasets and beyond.
Availability and implementation
The implementation is open source, available at http://github.com/Malfoy/BCOOL under the Affero GPL license and as a Bioconda package.
Supplementary information
Supplementary data are available at Bioinformatics online.
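The first two steps described in the Results above (building a de Bruijn graph from the reads, then filtering it by k-mer abundance) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not Bcool's implementation: the graph is not compacted into unitigs, no unitig-abundance filter is applied, and the read-mapping step is omitted. The function names, the threshold `min_abundance` and the toy reads are hypothetical.

```python
from collections import Counter, defaultdict


def solid_kmers(reads, k, min_abundance):
    """Keep only k-mers seen at least `min_abundance` times across all reads."""
    counts = Counter(
        read[i:i + k] for read in reads for i in range(len(read) - k + 1)
    )
    return {kmer for kmer, c in counts.items() if c >= min_abundance}


def de_bruijn_edges(kmers):
    """Build a node-centric de Bruijn graph: each k-mer links its (k-1)-mer
    prefix to its (k-1)-mer suffix."""
    graph = defaultdict(set)
    for kmer in kmers:
        graph[kmer[:-1]].add(kmer[1:])
    return dict(graph)


# Hypothetical toy input: the third read ends in a likely sequencing error.
reads = ["ACGTA", "ACGTA", "ACGTT"]
kept = solid_kmers(reads, 3, 2)
graph = de_bruijn_edges(kept)
print(sorted(kept), graph)
```

In this toy case the low-abundance k-mer `GTT` is dropped, so the erroneous branch never enters the graph.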
Affiliation(s)
- Antoine Limasset
- Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium
| | - Jean-François Flot
- Evolutionary Biology & Ecology, Université Libre de Bruxelles (ULB), Bruxelles, Belgium
- Interuniversity Institute of Bioinformatics in Brussels – (IB) 2, Brussels, Belgium
|
96
|
Chevrette MG, Carlson CM, Ortega HE, Thomas C, Ananiev GE, Barns KJ, Book AJ, Cagnazzo J, Carlos C, Flanigan W, Grubbs KJ, Horn HA, Hoffmann FM, Klassen JL, Knack JJ, Lewin GR, McDonald BR, Muller L, Melo WGP, Pinto-Tomás AA, Schmitz A, Wendt-Pienkowski E, Wildman S, Zhao M, Zhang F, Bugni TS, Andes DR, Pupo MT, Currie CR. The antimicrobial potential of Streptomyces from insect microbiomes. Nat Commun 2019; 10:516. [PMID: 30705269 PMCID: PMC6355912 DOI: 10.1038/s41467-019-08438-0] [Citation(s) in RCA: 208] [Impact Index Per Article: 34.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 10/18/2018] [Accepted: 01/11/2019] [Indexed: 12/29/2022] Open
Abstract
Antimicrobial resistance is a global health crisis and few novel antimicrobials have been discovered in recent decades. Natural products, particularly from Streptomyces, are the source of most antimicrobials, yet discovery campaigns focusing on Streptomyces from the soil largely rediscover known compounds. Investigation of understudied and symbiotic sources has seen some success, yet no studies have systematically explored microbiomes for antimicrobials. Here we assess the distinct evolutionary lineages of Streptomyces from insect microbiomes as a source of new antimicrobials through large-scale isolations, bioactivity assays, genomics, metabolomics, and in vivo infection models. Insect-associated Streptomyces inhibit antimicrobial-resistant pathogens more than soil Streptomyces. Genomics and metabolomics reveal their diverse biosynthetic capabilities. Further, we describe cyphomycin, a new molecule active against multidrug resistant fungal pathogens. The evolutionary trajectories of Streptomyces from the insect microbiome influence their biosynthetic potential and ability to inhibit resistant pathogens, supporting the promise of this source in augmenting future antimicrobial discovery.
Affiliation(s)
- Marc G Chevrette
- Laboratory of Genetics, University of Wisconsin-Madison, Madison, 53706, WI, USA.,Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Caitlin M Carlson
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Humberto E Ortega
- School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, 14040-903, SP, Brazil
| | - Chris Thomas
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Gene E Ananiev
- McArdle Laboratory for Cancer Research, Wisconsin Institute for Medical Research, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Kenneth J Barns
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Adam J Book
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Julian Cagnazzo
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Camila Carlos
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Will Flanigan
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Kirk J Grubbs
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Heidi A Horn
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - F Michael Hoffmann
- McArdle Laboratory for Cancer Research, Wisconsin Institute for Medical Research, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Jonathan L Klassen
- Department of Molecular and Cell Biology, University of Connecticut, Storrs, 06269, CT, USA
| | - Jennifer J Knack
- Department of Biology, Large Lakes Observatory, University of Minnesota-Duluth, Duluth, 55812, MN, USA
| | - Gina R Lewin
- School of Biological Sciences, Georgia Institute of Technology, Atlanta, 30332, GA, USA
| | - Bradon R McDonald
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Laura Muller
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | - Weilan G P Melo
- School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, 14040-903, SP, Brazil
| | - Adrián A Pinto-Tomás
- Center for Research in Microscopic Structures and Department of Biochemistry, School of Medicine, University of Costa Rica, San José, 10102, Costa Rica
| | - Amber Schmitz
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA
| | | | - Scott Wildman
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Miao Zhao
- Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, 53705, WI, USA
| | - Fan Zhang
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - Tim S Bugni
- Pharmaceutical Sciences Division, School of Pharmacy, University of Wisconsin-Madison, Madison, 53705, WI, USA
| | - David R Andes
- Department of Medicine, University of Wisconsin School of Medicine and Public Health, Madison, 53705, WI, USA
| | - Monica T Pupo
- School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto, 14040-903, SP, Brazil
| | - Cameron R Currie
- Department of Bacteriology, University of Wisconsin-Madison, Madison, 53706, WI, USA.
|
97
|
Abstract
BACKGROUND Single-cell sequencing experiments use short DNA barcode 'tags' to identify reads that originate from the same cell. In order to recover single-cell information from such experiments, reads must be grouped based on their barcode tag, a crucial processing step that precedes other computations. However, this step can be difficult due to high rates of mismatch and deletion errors that can afflict barcodes. RESULTS Here we present an approach to identify and error-correct barcodes by traversing the de Bruijn graph of circularized barcode k-mers. Our approach is based on the observation that circularizing a barcode sequence can yield error-free k-mers even when the size of k is large relative to the length of the barcode sequence, a regime which is typical of single-cell barcoding applications. This allows for assignment of reads to consensus fingerprints constructed from k-mers. CONCLUSION We show that for single-cell RNA-Seq, circularization improves the recovery of accurate single-cell transcriptome estimates, especially when there are a high number of errors per read. This approach is robust to the type of error (mismatch, insertion, deletion), as well as to the relative abundances of the cells. Sircel, a software package that implements this approach, is described and publicly available.
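The observation about circularization can be illustrated with a small sketch. It is not taken from Sircel. The helper names and the 6-base barcode are hypothetical, and only a single mismatch is considered. With k = 4 and a barcode of length 6, a mismatch in the middle corrupts every linear k-mer, while the circularized barcode still yields error-free k-mers that wrap around the end.

```python
def linear_kmers(barcode, k):
    """All k-mers read left to right without wrapping."""
    return [barcode[i:i + k] for i in range(len(barcode) - k + 1)]


def circular_kmers(barcode, k):
    """All k-mers of the barcode treated as a circle: one k-mer per start
    position, wrapping from the last base back to the first."""
    if not 0 < k <= len(barcode):
        raise ValueError("k must be between 1 and the barcode length")
    extended = barcode + barcode[:k - 1]
    return [extended[i:i + k] for i in range(len(barcode))]


# Hypothetical barcode and an observed copy with one mismatch at index 2.
true_barcode = "ACGTAC"
observed = "ACCTAC"
k = 4

shared_linear = set(linear_kmers(true_barcode, k)) & set(linear_kmers(observed, k))
shared_circular = set(circular_kmers(true_barcode, k)) & set(circular_kmers(observed, k))
print(len(shared_linear), len(shared_circular))
```

Here the observed barcode shares no linear 4-mers with the true barcode but shares two circular ones, which is the kind of signal a graph traversal could use to link an erroneous read to its barcode.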
Affiliation(s)
- Akshay Tambe
- Division of Biology and Biological Engineering, California Institute of Technology, 116 Kerckhoff Laboratory, Pasadena, CA 91125 USA
| | - Lior Pachter
- Departments of Biology and Computing & Mathematical Sciences, California Institute of Technology, 116 Kerckhoff Laboratory, Pasadena, CA 91125 USA
|
98
|
Zhao L, Xie J, Bai L, Chen W, Wang M, Zhang Z, Wang Y, Zhao Z, Li J. Mining statistically-solid k-mers for accurate NGS error correction. BMC Genomics 2018; 19:912. [PMID: 30598110 PMCID: PMC6311904 DOI: 10.1186/s12864-018-5272-y] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.6] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Indexed: 11/10/2022] Open
Abstract
BACKGROUND NGS data contain many machine-induced errors. The most advanced methods for error correction depend heavily on the selection of solid k-mers. A solid k-mer is a k-mer that occurs frequently in NGS reads. The other k-mers are called weak k-mers. A solid k-mer is unlikely to contain errors, while a weak k-mer most likely contains errors. An intensively investigated problem is to find a good frequency cutoff f0 to balance the numbers of solid and weak k-mers. Once the cutoff is determined, a more challenging but less-studied problem is to: (i) remove a small subset of solid k-mers that are likely to contain errors, and (ii) add a small subset of weak k-mers that are likely to contain no errors to the remaining set of solid k-mers. Identification of these two subsets of k-mers can improve the correction performance. RESULTS We propose to use a Gamma distribution to model the frequencies of erroneous k-mers and a mixture of Gaussian distributions to model correct k-mers, and combine them to determine f0. To identify the two special subsets of k-mers, we use the z-score of k-mers, which measures the number of standard deviations a k-mer's frequency is from the mean. Then these statistically-solid k-mers are used to construct a Bloom filter for error correction. Our method is markedly superior to the state-of-the-art methods, tested on both real and synthetic NGS data sets. CONCLUSION The z-score is adequate to distinguish solid k-mers from weak k-mers, and is particularly useful for pinpointing solid k-mers that have very low frequency. Applying the z-score to k-mers can markedly improve the error correction accuracy.
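As a rough illustration of the z-score mentioned in the Results, the sketch below standardizes k-mer counts against the mean and standard deviation of all counts. The abstract does not say which reference population the authors use for the mean and deviation, so that choice is an assumption. The Gamma and Gaussian mixture fit and the Bloom filter are not reproduced. The function name and the toy counts are hypothetical.

```python
from statistics import mean, pstdev


def kmer_z_scores(counts):
    """Map each k-mer to (count - mean) / standard deviation, computed over
    all k-mer counts (population standard deviation)."""
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # All k-mers are equally frequent, so no k-mer deviates from the mean.
        return {kmer: 0.0 for kmer in counts}
    return {kmer: (c - mu) / sigma for kmer, c in counts.items()}


# Hypothetical counts with mean 5 and population standard deviation 2.
counts = {
    "AAA": 2, "AAC": 4, "ACG": 4, "CGT": 4,
    "GTA": 5, "TAC": 5, "ACC": 7, "CCG": 9,
}
z = kmer_z_scores(counts)
print(z["CCG"], z["AAA"])
```

A k-mer could then be promoted or demoted by comparing its z-score with a chosen threshold rather than its raw count alone.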
Affiliation(s)
- Liang Zhao
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
- School of Computing and Electronic Information, Guangxi University, Nanning, China
| | - Jin Xie
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
| | - Lin Bai
- School of Computing and Electronic Information, Guangxi University, Nanning, China
| | - Wen Chen
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
| | - Mingju Wang
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
| | - Zhonglei Zhang
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
| | - Yiqi Wang
- Precision Medicine Research Center, Taihe Hospital, Hubei University of Medicine, Shiyan, China
| | - Zhe Zhao
- School of Computing and Electronic Information, Guangxi University, Nanning, China
| | - Jinyan Li
- Advanced Analytics Institute, Faculty of Engineering & IT, University of Technology Sydney, NSW 2007, Australia
|
99
|
Manekar SC, Sathe SR. A benchmark study of k-mer counting methods for high-throughput sequencing. Gigascience 2018; 7:5140149. [PMID: 30346548 PMCID: PMC6280066 DOI: 10.1093/gigascience/giy125] [Citation(s) in RCA: 33] [Impact Index Per Article: 4.7] [Reference Citation Analysis] [Abstract] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 09/25/2017] [Accepted: 10/16/2018] [Indexed: 11/25/2022] Open
Abstract
The rapid development of high-throughput sequencing technologies means that hundreds of gigabytes of sequencing data can be produced in a single study. Many bioinformatics tools require counts of substrings of length k in DNA/RNA sequencing reads obtained for applications such as genome and transcriptome assembly, error correction, multiple sequence alignment, and repeat detection. Recently, several techniques have been developed to count k-mers in large sequencing datasets, with a trade-off between the time and memory required to perform this function. We assessed several k-mer counting programs and evaluated their relative performance, primarily on the basis of runtime and memory usage. We also considered additional parameters such as disk usage, accuracy, parallelism, the impact of compressed input, performance in terms of counting large k values and the scalability of the application to larger datasets. We make specific recommendations for the setup of a current state-of-the-art program and suggestions for further development.
Affiliation(s)
- Swati C Manekar
- Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur 440 010, India
| | - Shailesh R Sathe
- Department of Computer Science and Engineering, Visvesvaraya National Institute of Technology, Nagpur 440 010, India
|
100
|
Palma Esposito F, Ingham CJ, Hurtado-Ortiz R, Bizet C, Tasdemir D, de Pascale D. Isolation by Miniaturized Culture Chip of an Antarctic bacterium Aequorivita sp. with antimicrobial and anthelmintic activity. Biotechnol Rep (Amst) 2018; 20:e00281. [PMID: 30225207 PMCID: PMC6139392 DOI: 10.1016/j.btre.2018.e00281] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.7] [Reference Citation Analysis] [Abstract] [Key Words] [Track Full Text] [Download PDF] [Figures] [Journal Information] [Subscribe] [Scholar Register] [Received: 05/16/2018] [Revised: 09/04/2018] [Accepted: 09/04/2018] [Indexed: 01/28/2023]
Abstract
A novel microbial isolation approach allowed the identification of a Gram-negative Antarctic bacterium belonging to the genus Aequorivita. Aequorivita sp. showed antimicrobial and anthelmintic activity without toxic effects on eukaryotic cells. The whole genome of Aequorivita sp. was sequenced and compared with other strains to identify biosynthetic gene clusters. This novel approach represents a promising strategy to isolate rare or novel strains useful for biotechnological applications.
Microbes are prolific sources of bioactive molecules; however, the cultivability issue has severely hampered access to microbial diversity. Novel secondary metabolites from as-yet-unknown or atypical microorganisms from extreme environments have realistic potential to lead to new drugs with benefits for human health. Here, we used a novel approach that mimics the natural environment by using a Miniaturized Culture Chip, allowing the isolation of several bacterial strains from Antarctic shallow water sediments under near-natural conditions. A Gram-negative Antarctic bacterium belonging to the genus Aequorivita was subjected to further analyses. The Aequorivita sp. genome was sequenced and a bioinformatic approach was applied to identify biosynthetic gene clusters. The extract of the Aequorivita sp. showed antimicrobial and anthelmintic activity towards multidrug-resistant bacteria and the nematode Caenorhabditis elegans. This is the first multi-approach study exploring the genomics and biotechnological potential of the genus Aequorivita, which is a promising candidate for pharmaceutical applications.
Affiliation(s)
- Fortunato Palma Esposito
- Institute of Protein Biochemistry, National Research Council, Naples, 80131, Italy.,Marine Biotechnology Department, Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, 80121, Italy
| | | | - Raquel Hurtado-Ortiz
- CIP-Collection of Institut Pasteur, Department of Microbiology, Institut Pasteur, Paris, 75015, France.,CRBIP-Biological Resource Centre, Department of Microbiology, Institut Pasteur, Paris, 75015, France
| | - Chantal Bizet
- CIP-Collection of Institut Pasteur, Department of Microbiology, Institut Pasteur, Paris, 75015, France.,CRBIP-Biological Resource Centre, Department of Microbiology, Institut Pasteur, Paris, 75015, France
| | - Deniz Tasdemir
- GEOMAR Centre for Marine Biotechnology (GEOMAR-Biotech), Research Unit Marine Natural Products Chemistry, GEOMAR Helmholtz Centre for Ocean Research Kiel, Kiel, 24106, Germany
| | - Donatella de Pascale
- Institute of Protein Biochemistry, National Research Council, Naples, 80131, Italy.,Marine Biotechnology Department, Stazione Zoologica Anton Dohrn, Villa Comunale, Naples, 80121, Italy
|