1. Stritt C, Reitsma M, Marin AMG, Goig G, Dötsch A, Borrell S, Beisel C, Comas I, Brites D, Gagneux S. Gene conversion and duplication contribute to genetic variation in an outbreak of Mycobacterium tuberculosis. Microb Genom 2025; 11:001396. [PMID: 40310468 PMCID: PMC12046097 DOI: 10.1099/mgen.0.001396] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/05/2024] [Accepted: 03/17/2025] [Indexed: 05/02/2025] Open
Abstract
Repeats are the most diverse and dynamic but also the least well-understood component of microbial genomes. For all we know, repeat-associated mutations such as duplications, deletions, inversions and gene conversion might be as common as point mutations, but because of short-read myopia and methodological bias, they have received much less attention. Long-read DNA sequencing opens the perspective of resolving repeats and systematically investigating the mutations they induce. For this study, we assembled the genomes of 16 closely related strains of the bacterial pathogen Mycobacterium tuberculosis from Pacific Biosciences HiFi reads, with the aim of characterizing the full spectrum of DNA polymorphisms. We found that complete and accurate genomes can be assembled from HiFi reads, with read size being the main limitation in the presence of duplications. By combining a reference-free pangenome graph with extensive repeat annotation, we identified 110 variants, 58 of which could be assigned to repeat-associated mutational mechanisms such as strand slippage and homologous recombination. Whilst recombination events were less frequent than point mutations, they affected large regions and introduced multiple variants at once, as shown by three gene conversion events and a duplication of 7.3 kb that involved ppe18 and ppe57, two genes possibly involved in immune subversion. The vast majority of variants were present in single isolates, such that phylogenetic resolution was only marginally increased when estimating a tree from complete genomes. Our study shows that the contribution of repeat-associated mechanisms of mutation can be similar to that of point mutations at the microevolutionary scale of an outbreak. A large reservoir of unstudied genetic variation in this 'monomorphic' bacterial pathogen awaits investigation.
Affiliation(s)
- Christoph Stritt
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Michelle Reitsma
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Galo Goig
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Anna Dötsch
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Sonia Borrell
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Christian Beisel
- Department of Biosystems Science and Engineering, ETH Zurich, Basel, Switzerland
- Iñaki Comas
- Biomedicine Institute of Valencia, Spanish Research Council (IBV-CSIC), Valencia, Spain
- Spanish Network for Research on Epidemiology and Public Health (CIBERESP), Carlos III Health Institute, Madrid, Spain
- Daniela Brites
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
- Sebastien Gagneux
- Swiss Tropical and Public Health Institute, Allschwil, Switzerland
- University of Basel, Basel, Switzerland
2. Chakravarty S, Logsdon G, Lonardi S. RAmbler resolves complex repeats in human Chromosomes 8, 19, and X. Genome Res 2025; 35:863-876. [PMID: 40037839 PMCID: PMC12047272 DOI: 10.1101/gr.279308.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 03/13/2024] [Accepted: 02/06/2025] [Indexed: 03/06/2025]
Abstract
Repetitive regions in eukaryotic genomes often contain important functional or regulatory elements. Despite significant algorithmic and technological advancements in genome sequencing and assembly over the past three decades, modern de novo assemblers still struggle to accurately reconstruct highly repetitive regions. In this work, we introduce RAmbler (Repeat Assembler), a reference-guided assembler specialized for the assembly of complex repetitive regions exclusively from Pacific Biosciences (PacBio) HiFi reads. RAmbler (1) identifies repetitive regions by detecting unusually high coverage regions after mapping HiFi reads to the draft genome assembly, (2) finds single-copy k-mers from the HiFi reads, (i.e., k-mers that are expected to occur only once in the genome), (3) uses the relative location of single-copy k-mers to barcode each HiFi read, (4) clusters HiFi reads based on their shared barcodes, (5) generates contigs by assembling the reads in each cluster, and (6) generates a consensus assembly from the overlap graph of the assembled contigs. Here, we show that RAmbler can reconstruct human centromeres and other complex repeats to a quality comparable to the manually curated Telomere-to-Telomere human genome assembly. Across more than 250 synthetic data sets, RAmbler outperforms hifiasm, LJA, HiCANU, and Verkko across various parameters such as repeat lengths, number of repeats, heterozygosity rates, and depth of sequencing.
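Steps (2) to (4) of the pipeline described in the abstract can be illustrated with a toy sketch. This is not RAmbler's implementation: the function names, the `lo`/`hi` coverage window, and the `min_shared` threshold are hypothetical, and the sketch barcodes reads by which single-copy k-mers they contain rather than by their relative locations.

```python
from collections import Counter


def kmers(seq, k):
    """Return all overlapping k-mers of seq, in order."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def single_copy_kmers(reads, k, lo, hi):
    """k-mers present in between lo and hi reads (assumed per-copy coverage window)."""
    counts = Counter(km for read in reads for km in set(kmers(read, k)))
    return {km for km, n in counts.items() if lo <= n <= hi}


def barcode(read, k, unique):
    """Barcode a read by the set of single-copy k-mers it contains."""
    return frozenset(km for km in kmers(read, k) if km in unique)


def cluster_reads(barcodes, min_shared=1):
    """Group reads that share at least min_shared barcode k-mers (union-find)."""
    parent = list(range(len(barcodes)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for i in range(len(barcodes)):
        for j in range(i + 1, len(barcodes)):
            if len(barcodes[i] & barcodes[j]) >= min_shared:
                parent[find(j)] = find(i)

    groups = {}
    for i in range(len(barcodes)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Two repeat copies that differ at a single base, each covered by two reads.
copy_a = "GATTACAGGCTA"
copy_b = "GATTACATGCTA"
reads = [copy_a, copy_a, copy_b, copy_b]

unique = single_copy_kmers(reads, k=4, lo=2, hi=2)
clusters = cluster_reads([barcode(r, 4, unique) for r in reads])
print(clusters)  # [[0, 1], [2, 3]]
```

In this toy case, k-mers shared by both copies are seen in four reads and are excluded, so only the copy-specific k-mers around the differing base separate the reads into one cluster per repeat copy.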
Affiliation(s)
- Sakshar Chakravarty
- Department of Computer Science and Engineering, University of California, Riverside, California 92521, USA
- Glennis Logsdon
- Department of Genetics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, Pennsylvania 19103, USA
- Stefano Lonardi
- Department of Computer Science and Engineering, University of California, Riverside, California 92521, USA;
3. Kruasuwan W, Sawatwong P, Jenjaroenpun P, Wankaew N, Arigul T, Yongkiettrakul S, Lunha K, Sudjai A, Siludjai D, Skaggs B, Wongsurawat T. Comparative evaluation of commercial DNA isolation approaches for nanopore-only bacterial genome assembly and plasmid recovery. Sci Rep 2024; 14:27672. [PMID: 39532954 PMCID: PMC11557978 DOI: 10.1038/s41598-024-78066-2] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/20/2024] [Accepted: 10/28/2024] [Indexed: 11/16/2024] Open
Abstract
Oxford Nanopore Technologies sequencing has improved significantly in cost, accuracy, and read length, making it a cost-effective and readily accessible approach for analyzing microbial genomes. A major challenge for bacterial whole genome sequencing by Nanopore technology is the requirement for a higher quality and quantity of high molecular weight DNA compared to short-read sequencing platforms. In this study, using eight pathogenic bacteria, we evaluated the quality, quantity, and fragment size distribution of extracted DNA obtained from three different commercial DNA extraction kits and one automated robotic platform. Our results demonstrated significant variation in DNA yield and purity among the extraction kits. The ZymoBIOMICS DNA Miniprep Kit (ZM) provided a higher purity of DNA compared to other kit-based extractions. All kit-based DNA extractions were successfully performed on all twenty-four samples using a single MinION flow cell, with the Nanobind CBB Big DNA kit (NB) yielding the longest raw reads. The Fire Monkey HMW-DNA Extraction Kit (FM) and the automated Roche MagNaPure 96 platform (RO) outperformed the other methods in genome assembly, particularly in gram-negative bacteria. Based on our findings, we recommend a minimum read coverage and raw read N50, obtained from the appropriate DNA extraction kit for each bacterial species, to optimize genome assembly and plasmid recovery. This approach will assist end-users in selecting the most effective kit-based extraction method for bacterial whole-genome assembly using only long-read nanopore sequences.
Affiliation(s)
- Worarat Kruasuwan
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Siriraj Long-read Lab (Si-LoL), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Pongpun Sawatwong
- Division of Global Health Protection, Ministry of Public Health-U.S. Center of Diseases Control and Prevention, Nonthaburi, Thailand
- Piroon Jenjaroenpun
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Siriraj Long-read Lab (Si-LoL), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA
- Natnicha Wankaew
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Siriraj Long-read Lab (Si-LoL), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Tantip Arigul
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Siriraj Long-read Lab (Si-LoL), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Suganya Yongkiettrakul
- National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Kamonwan Lunha
- National Center for Genetic Engineering and Biotechnology (BIOTEC), National Science and Technology Development Agency (NSTDA), Pathum Thani, Thailand
- Aunthikarn Sudjai
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand
- Duangkamon Siludjai
- Division of Global Health Protection, Ministry of Public Health-U.S. Center of Diseases Control and Prevention, Nonthaburi, Thailand
- Beth Skaggs
- Division of Global Health Protection, Ministry of Public Health-U.S. Center of Diseases Control and Prevention, Nonthaburi, Thailand
- Thidathip Wongsurawat
- Division of Medical Bioinformatics, Research Department, Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand.
- Siriraj Long-read Lab (Si-LoL), Faculty of Medicine Siriraj Hospital, Mahidol University, Bangkok, Thailand.
- Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, USA.
4. Fehrenbach A, Mitrofanov A, Alkhnbashi O, Backofen R, Baumdicker F. SpacerPlacer: ancestral reconstruction of CRISPR arrays reveals the evolutionary dynamics of spacer deletions. Nucleic Acids Res 2024; 52:10862-10878. [PMID: 39268572 PMCID: PMC11472070 DOI: 10.1093/nar/gkae772] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/27/2024] [Revised: 08/12/2024] [Accepted: 08/28/2024] [Indexed: 09/17/2024] Open
Abstract
Bacteria employ CRISPR-Cas systems for defense by integrating invader-derived sequences, termed spacers, into the CRISPR array, which constitutes an immunity memory. While spacer deletions occur randomly across the array, newly acquired spacers are predominantly integrated at the leader end. Consequently, spacer arrays can be used to derive the chronology of spacer insertions. Reconstruction of ancestral spacer acquisitions and deletions could help unravel the coevolution of phages and bacteria, the evolutionary dynamics in microbiomes, or track pathogens. However, standard reconstruction methods produce misleading results by overlooking insertion order and joint deletions of spacers. Here, we present SpacerPlacer, a maximum likelihood-based ancestral reconstruction approach for CRISPR array evolution. We used SpacerPlacer to reconstruct and investigate ancestral deletion events of 4565 CRISPR arrays, revealing that spacer deletions occur 374 times more frequently than mutations and are regularly deleted jointly, with an average of 2.7 spacers. Surprisingly, we observed a decrease in the spacer deletion frequency towards both ends of the reconstructed arrays. While the resulting trailer-end conservation is commonly observed, a reduced deletion frequency is now also detectable towards the variable leader end. Finally, our results point to the hypothesis that frequent loss of recently acquired spacers may provide a selective advantage.
Affiliation(s)
- Axel Fehrenbach
- Cluster of Excellence ‘Controlling Microbes to Fight Infections’, Mathematical and Computational Population Genetics, University of Tübingen, 72076 Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, 72076 Tübingen, Germany
- Alexander Mitrofanov
- Bioinformatics group, Department of Computer Science, University of Freiburg, 79085 Freiburg, Germany
- Omer S Alkhnbashi
- Center for Applied and Translational Genomics (CATG), Mohammed Bin Rashid University of Medicine and Health Sciences (MBRU), Dubai Healthcare City, 505055 Dubai, United Arab Emirates
- College of Medicine, Mohammed Bin Rashid University of Medicine and Health Sciences (MBRU), Dubai Healthcare City, 505055 Dubai, United Arab Emirates
- Rolf Backofen
- Bioinformatics group, Department of Computer Science, University of Freiburg, 79085 Freiburg, Germany
- Signalling Research Centres BIOSS and CIBSS, University of Freiburg, 79085 Freiburg, Germany
- Franz Baumdicker
- Cluster of Excellence ‘Controlling Microbes to Fight Infections’, Mathematical and Computational Population Genetics, University of Tübingen, 72076 Tübingen, Germany
- Institute for Bioinformatics and Medical Informatics (IBMI), University of Tübingen, 72076 Tübingen, Germany
5. Azizpour A, Balaji A, Treangen TJ, Segarra S. Graph-based self-supervised learning for repeat detection in metagenomic assembly. Genome Res 2024; 34:1468-1476. [PMID: 39029947 PMCID: PMC11529840 DOI: 10.1101/gr.279136.124] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 02/15/2024] [Accepted: 07/15/2024] [Indexed: 07/21/2024]
Abstract
Repetitive DNA (repeats) poses significant challenges for accurate and efficient genome assembly and sequence alignment. This is particularly true for metagenomic data, in which genome dynamics such as horizontal gene transfer, gene duplication, and gene loss/gain complicate accurate genome assembly from metagenomic communities. Detecting repeats is a crucial first step in overcoming these challenges. To address this issue, we propose GraSSRep, a novel approach that leverages the assembly graph's structure through graph neural networks (GNNs) within a self-supervised learning framework to classify DNA sequences into repetitive and nonrepetitive categories. Specifically, we frame this problem as a node classification task within a metagenomic assembly graph. In a self-supervised fashion, we rely on a high-precision (but low-recall) heuristic to generate pseudolabels for a small proportion of the nodes. We then use those pseudolabels to train a GNN embedding and a random forest classifier to propagate the labels to the remaining nodes. In this way, GraSSRep combines sequencing features with predefined and learned graph features to achieve state-of-the-art performance in repeat detection. We evaluate our method using simulated and synthetic metagenomic data sets. The results on the simulated data highlight GraSSRep's robustness to repeat attributes, demonstrating its effectiveness in handling the complexity of repeated sequences. Additionally, experiments with synthetic metagenomic data sets reveal that incorporating the graph structure and the GNN enhances the detection performance. Finally, in comparative analyses, GraSSRep outperforms existing repeat detection tools with respect to precision and recall.
Affiliation(s)
- Ali Azizpour
- Department of Electrical and Computer Engineering, Houston, Texas 77005, USA;
- Advait Balaji
- Department of Computer Science, Rice University, Houston, Texas 77005, USA;
- Todd J Treangen
- Department of Computer Science, Rice University, Houston, Texas 77005, USA;
- Ken Kennedy Institute, Rice University, Houston, Texas 77005, USA
- Santiago Segarra
- Department of Electrical and Computer Engineering, Houston, Texas 77005, USA;
- Ken Kennedy Institute, Rice University, Houston, Texas 77005, USA
6. Pham P, Wood EA, Dunbar EL, Cox M, Goodman M. Controlling genome topology with sequences that trigger post-replication gap formation during replisome passage: the E. coli RRS elements. Nucleic Acids Res 2024; 52:6392-6405. [PMID: 38676944 PMCID: PMC11194060 DOI: 10.1093/nar/gkae320] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/17/2023] [Revised: 04/08/2024] [Accepted: 04/11/2024] [Indexed: 04/29/2024] Open
Abstract
We report that the Escherichia coli chromosome includes novel GC-rich genomic structural elements that trigger formation of post-replication gaps upon replisome passage. The two nearly perfect 222 bp repeats, designated Replication Risk Sequences or RRS, are each 650 kb from the terminus sequence dif and flank the Ter macrodomain. RRS sequence and positioning is highly conserved in enterobacteria. At least one RRS appears to be essential unless a 200 kb region encompassing one of them is amplified. The RRS contain a G-quadruplex on the lagging strand which impedes DNA polymerase extension producing lagging strand ssDNA gaps, ≤2000 bp long, upon replisome passage. Deletion of both RRS elements has substantial effects on global genome structure and topology. We hypothesize that RRS elements serve as topological relief valves during chromosome replication and segregation. There have been no screens for genomic sequences that trigger transient gap formation. Functional analogs of RRS could be widespread, possibly including some enigmatic G-quadruplexes in eukaryotes.
Affiliation(s)
- Phuong Pham
- Departments of Biological Sciences and Chemistry, University of Southern California, Los Angeles, CA 90089-2910, USA
- Elizabeth A Wood
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544, USA
- Emma L Dunbar
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544, USA
- Michael M Cox
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544, USA
- Myron F Goodman
- Departments of Biological Sciences and Chemistry, University of Southern California, Los Angeles, CA 90089-2910, USA
7. Atre M, Joshi B, Babu J, Sawant S, Sharma S, Sankar TS. Origin, evolution, and maintenance of gene-strand bias in bacteria. Nucleic Acids Res 2024; 52:3493-3509. [PMID: 38442257 DOI: 10.1093/nar/gkae155] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 08/11/2023] [Revised: 02/06/2024] [Accepted: 02/19/2024] [Indexed: 03/07/2024] Open
Abstract
Gene-strand bias is a characteristic feature of bacterial genome organization wherein genes are preferentially encoded on the leading strand of replication, promoting co-orientation of replication and transcription. This co-orientation bias has evolved to protect gene essentiality, expression, and genomic stability from the harmful effects of head-on replication-transcription collisions. However, the origin, variation, and maintenance of gene-strand bias remain elusive. Here, we reveal that the frequency of inversions that alter gene orientation exhibits large variation across bacterial populations and negatively correlates with gene-strand bias. The density, distance, and distribution of inverted repeats show a similar negative relationship with gene-strand bias explaining the heterogeneity in inversions. Importantly, these observations are broadly evident across the entire bacterial kingdom uncovering inversions and inverted repeats as primary factors underlying the variation in gene-strand bias and its maintenance. The distinct catalytic subunits of replicative DNA polymerase have co-evolved with gene-strand bias, suggesting a close link between replication and the origin of gene-strand bias. Congruently, inversion frequencies and inverted repeats vary among bacteria with different DNA polymerases. In summary, we propose that the nature of replication determines the fitness cost of replication-transcription collisions, establishing a selection gradient on gene-strand bias by fine-tuning DNA sequence repeats and, thereby, gene inversions.
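The definition of gene-strand bias in the first sentence of the abstract can be made concrete with a minimal sketch. It assumes a circular chromosome with a single origin (`ori`) and terminus (`ter`), and it represents each gene by a hypothetical `(start, strand)` pair; none of these names come from the paper, and real analyses would take coordinates from an annotation file.

```python
def in_right_replichore(pos, ori, ter, length):
    """True if pos lies on the arc running from ori to ter in increasing coordinates."""
    return (pos - ori) % length < (ter - ori) % length


def gene_strand_bias(genes, ori, ter, length):
    """Fraction of genes transcribed in the same direction as the replication fork.

    On the right replichore the fork moves toward higher coordinates, so '+' genes
    are co-oriented; on the left replichore the fork moves toward lower coordinates,
    so '-' genes are co-oriented.
    """
    if not genes:
        raise ValueError("at least one gene is required")
    co_oriented = 0
    for start, strand in genes:
        right = in_right_replichore(start, ori, ter, length)
        if (right and strand == "+") or (not right and strand == "-"):
            co_oriented += 1
    return co_oriented / len(genes)


# Toy chromosome of length 100 with ori at 0 and ter at 50.
toy_genes = [(10, "+"), (20, "+"), (30, "-"), (60, "-"), (70, "-"), (80, "+")]
bias = gene_strand_bias(toy_genes, ori=0, ter=50, length=100)
```

Under these assumptions, an inversion that flips a gene within one replichore changes its strand but not its replichore, which is how inversions can lower the co-oriented fraction.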
Affiliation(s)
- Malhar Atre
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Bharat Joshi
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Jebin Babu
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shabduli Sawant
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- Shreya Sharma
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
- T Sabari Sankar
- School of Biology, Indian Institute of Science Education and Research, Thiruvananthapuram, Kerala 695551, India
8. Pham P, Wood EA, Dunbar EL, Cox MM, Goodman MF. Controlling Genome Topology with Sequences that Trigger Post-replication Gap Formation During Replisome Passage: The E. coli RRS Elements. bioRxiv 2024:2023.10.01.560376. [PMID: 37873128 PMCID: PMC10592627 DOI: 10.1101/2023.10.01.560376] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 10/25/2023]
Abstract
We report that the Escherichia coli chromosome includes novel GC-rich genomic structural elements that trigger formation of post-replication gaps upon replisome passage. The two nearly perfect 222 bp repeats, designated Replication Risk Sequences or RRS, are each 650 kb from the terminus sequence dif and flank the Ter macrodomain. RRS sequence and positioning is highly conserved in enterobacteria. At least one RRS appears to be essential unless a 200 kb region encompassing one of them is amplified. The RRS contain a G-quadruplex on the lagging strand which impedes DNA polymerase extension producing lagging strand ssDNA gaps, ≤2000 bp long, upon replisome passage. Deletion of both RRS elements has substantial effects on global genome structure and topology. We hypothesize that RRS elements serve as topological relief valves during chromosome replication and segregation. There have been no screens for genomic sequences that trigger transient gap formation. Functional analogs of RRS could be widespread, possibly including some enigmatic G-quadruplexes in eukaryotes.
Affiliation(s)
- Phuong Pham
- Departments of Biological Sciences and Chemistry, University of Southern California, Los Angeles, CA 90089-2910
- Elizabeth A. Wood
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544
- Emma L. Dunbar
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544
- Michael M. Cox
- Department of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706-1544
- Myron F. Goodman
- Departments of Biological Sciences and Chemistry, University of Southern California, Los Angeles, CA 90089-2910
9. Li Z, Liu X, Ning N, Li T, Wang H. Diversity, Distribution, and Chromosomal Rearrangements of TRIP1 Repeat Sequences in Escherichia coli. Genes (Basel) 2024; 15:236. [PMID: 38397225 PMCID: PMC10888264 DOI: 10.3390/genes15020236] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/17/2024] [Revised: 02/07/2024] [Accepted: 02/10/2024] [Indexed: 02/25/2024] Open
Abstract
The bacterial genome contains numerous repeated sequences that greatly affect its genomic plasticity. The Escherichia coli K-12 genome contains three copies of the TRIP1 repeat sequence (TRIP1a, TRIP1b, and TRIP1c). However, the diversity, distribution, and role of the TRIP1 repeat sequence in the E. coli genome are still unclear. In this study, after screening 6725 E. coli genomes, the TRIP1 repeat was found in the majority of E. coli strains (96%: 6454/6725). The copy number and direction of the TRIP1 repeat sequence varied in each genome. Overall, 2449 genomes (36%: 2449/6725) had three copies of TRIP1 (TRIP1a, TRIP1b, and TRIP1c), which is the same as E. coli K-12. Five types of TRIP1 repeats, including two new types (TRIP1d and TRIP1e), are identified in E. coli genomes, located in 4703, 3529, 5741, 1565, and 232 genomes, respectively. Each type of TRIP1 repeat is localized to a specific locus on the chromosome. TRIP1 repeats can cause intra-chromosomal rearrangements. A total of 156 rearrangement events were identified, of which 88% (137/156) were between TRIP1a and TRIP1c. These findings have important implications for future research on TRIP1 repeats.
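The percentages quoted in the abstract can be recomputed from the counts it reports. The short check below uses only those counts; the helper name `pct` is illustrative.

```python
def pct(part, whole):
    """Percentage of whole represented by part, rounded to the nearest integer."""
    return round(100 * part / whole)


total_genomes = 6725
with_trip1 = pct(6454, total_genomes)      # genomes carrying any TRIP1 repeat
three_copies = pct(2449, total_genomes)    # genomes with TRIP1a, TRIP1b, and TRIP1c
a_c_rearrangements = pct(137, 156)         # rearrangements between TRIP1a and TRIP1c

print(with_trip1, three_copies, a_c_rearrangements)  # 96 36 88
```

All three values match the rounded figures given in the text (96%, 36%, and 88%).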
Affiliation(s)
- Zhan Li
- State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China; (Z.L.); (N.N.); (T.L.)
- Xiong Liu
- Chinese PLA Center for Disease Control and Prevention, Dongda Street 20#, Fengtai District, Beijing 100071, China;
- Nianzhi Ning
- State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China; (Z.L.); (N.N.); (T.L.)
- Tao Li
- State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China; (Z.L.); (N.N.); (T.L.)
- Hui Wang
- State Key Laboratory of Pathogens and Biosecurity, Beijing Institute of Microbiology and Epidemiology, No. 20 Dongda Street, Fengtai District, Beijing 100071, China; (Z.L.); (N.N.); (T.L.)
10. Chen Q, Wang B, Pan L. Efficient expression of γ-glutamyl transpeptidase in Bacillus subtilis via CRISPR/Cas9n and its immobilization. Appl Microbiol Biotechnol 2024; 108:149. [PMID: 38240797 DOI: 10.1007/s00253-023-12889-3] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 06/27/2023] [Revised: 10/17/2023] [Accepted: 10/24/2023] [Indexed: 01/23/2024]
Abstract
In this study, we successfully applied the strategy of combining tandem promoters and tandem signal peptides with overexpressing signal peptidase to efficiently express and produce γ-glutamyl transpeptidase (GGT) enzymes (BsGGT, BaGGT, and BlGGT) from Bacillus subtilis, Bacillus amyloliquefaciens, and Bacillus licheniformis in Bacillus subtilis ATCC6051Δ5. To avoid the instability caused by duplicated strong promoters, we assembled tandem promoters of different homologous genes from different species. To obtain resistance marker-free enzyme preparations for the food industry, we first removed the replication origin and corresponding resistance marker of Escherichia coli from the expression vector. The plasmid was then transformed into the B. subtilis host, and the Kan resistance gene in the expression plasmid was directly edited and silenced using the CRISPR/Cas9n-AID base editing system. As a result, a recombinant protein expression carrier without resistance markers was constructed, and the enzyme activity of the BlGGT strain during shake flask fermentation reached 53.65 U/mL. The recombinant BlGGT was immobilized with epoxy resin and maintained 82.8% enzyme activity after 10 rounds of repeated use and 87.36% enzyme activity after storage at 4 °C for 2 months. The immobilized BlGGT enzyme was used for the continuous synthesis of theanine with a conversion rate of 65.38%. These results indicated that our approach was a promising solution for improving enzyme production efficiency and achieving safe production of enzyme preparations in the food industry. KEY POINTS: • Efficient expression of recombinant proteins by a combination of dual promoter and dual signal peptide. • Construction of small vectors without resistance markers in B. subtilis using CRISPR/Cas9n-AID editing system. • The process of immobilizing BlGGT with epoxy resin was optimized.
Affiliation(s)
- Qianlin Chen
- School of Biology and Biological Engineering, Guangzhou Higher Education Mega Centre, South China University of Technology, Panyu District, Guangzhou, 510006, Guangdong, People's Republic of China
- Bin Wang
- School of Biology and Biological Engineering, Guangzhou Higher Education Mega Centre, South China University of Technology, Panyu District, Guangzhou, 510006, Guangdong, People's Republic of China
- Li Pan
- School of Biology and Biological Engineering, Guangzhou Higher Education Mega Centre, South China University of Technology, Panyu District, Guangzhou, 510006, Guangdong, People's Republic of China.
11. Hou Z, Xu Z, Wu M, Ma L, Sui L, Bian P, Wang T. Enhancement of Repeat-Mediated Deletion Rearrangement Induced by Particle Irradiation in a RecA-Dependent Manner in Escherichia coli. Biology 2023; 12:1406. [PMID: 37998005 PMCID: PMC10669199 DOI: 10.3390/biology12111406] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 10/08/2023] [Revised: 10/30/2023] [Accepted: 11/06/2023] [Indexed: 11/25/2023]
Abstract
Repeat-mediated deletion (RMD) rearrangement is a major source of genome instability and can be deleterious to the organism, whereby the intervening sequence between two repeats is deleted along with one of the repeats. RMD rearrangement is likely induced by DNA double-strand breaks (DSBs); however, it is unclear how the complexity of DSBs influences RMD rearrangement. Here, a transgenic Escherichia coli strain K12 MG1655 with a lacI repeat-controlled amp activation was used while taking advantage of particle irradiation, such as proton and carbon irradiation, to generate different complexities of DSBs. Our research confirmed the enhancement of RMD under proton and carbon irradiation and revealed a positive correlation between RMD enhancement and LET. In addition, RMD enhancement could be suppressed by an intermolecular homologous sequence, which was regulated by its composition and length. Meanwhile, RMD enhancement was significantly stimulated by exogenous λ-Red recombinase. Further results investigating its mechanisms showed that the enhancement of RMD, induced by particle irradiation, occurred in a RecA-dependent manner. Our finding has a significant impact on the understanding of RMD rearrangement and provides some clues for elucidating the repair process and possible outcomes of complex DNA damage.
Affiliation(s)
- Zhiyang Hou
- Teaching and Research Section of Nuclear Medicine, School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China; (Z.H.); (Z.X.); (M.W.); (P.B.)
- Key Laboratory of High Magnetic Field and Ion Beam Physical Biology, Hefei Institutes of Physical Science, Chinese Academy of Sciences, Hefei 230031, China
- Science Island Branch, Graduate School of USTC, Hefei 230026, China
| | - Zelin Xu
- Teaching and Research Section of Nuclear Medicine, School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China; (Z.H.); (Z.X.); (M.W.); (P.B.)
| | - Mengying Wu
- Teaching and Research Section of Nuclear Medicine, School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China; (Z.H.); (Z.X.); (M.W.); (P.B.)
| | - Liqiu Ma
- Department of Nuclear Physics, China Institute of Atomic Energy, Beijing 102413, China;
- National Innovation Center of Radiation Application, Beijing 102413, China
| | - Li Sui
- Department of Nuclear Physics, China Institute of Atomic Energy, Beijing 102413, China;
- National Innovation Center of Radiation Application, Beijing 102413, China
| | - Po Bian
- Teaching and Research Section of Nuclear Medicine, School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China; (Z.H.); (Z.X.); (M.W.); (P.B.)
| | - Ting Wang
- Teaching and Research Section of Nuclear Medicine, School of Basic Medical Sciences, Anhui Medical University, Hefei 230032, China; (Z.H.); (Z.X.); (M.W.); (P.B.)
| |
|
12
|
Liao X, Zhu W, Zhou J, Li H, Xu X, Zhang B, Gao X. Repetitive DNA sequence detection and its role in the human genome. Commun Biol 2023; 6:954. [PMID: 37726397 PMCID: PMC10509279 DOI: 10.1038/s42003-023-05322-y] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 03/29/2023] [Accepted: 09/04/2023] [Indexed: 09/21/2023] Open
Abstract
Repetitive DNA sequences play critical roles in driving evolution, inducing variation, and regulating gene expression. In this review, we summarize the definition, arrangement, and structural characteristics of repeats. We also introduce the diverse biological functions of repeats and review existing methods for automatic repeat detection, classification, and masking. Finally, we analyze the type, structure, and regulation of repeats in the human genome and their role in the induction of complex diseases. We believe that this review will facilitate a comprehensive understanding of repeats and provide guidance for repeat annotation and in-depth exploration of their association with human diseases.
Affiliation(s)
- Xingyu Liao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Wufei Zhu
- Department of Endocrinology, Yichang Central People's Hospital, The First College of Clinical Medical Science, China Three Gorges University, 443000, Yichang, P.R. China
| | - Juexiao Zhou
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Haoyang Li
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Xiaopeng Xu
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Bin Zhang
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia
| | - Xin Gao
- Computational Bioscience Research Center (CBRC), Computer, Electrical and Mathematical Sciences and Engineering Division, King Abdullah University of Science and Technology (KAUST), Thuwal, 23955, Saudi Arabia.
| |
|
13
|
Calder A, Snyder LAS. Diversity of the type VI secretion systems in the Neisseria spp. Microb Genom 2023; 9. [PMID: 37052605 DOI: 10.1099/mgen.0.000986] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 04/14/2023] Open
Abstract
Complete Type VI Secretion Systems were identified in the genome sequence data of Neisseria subflava isolates sourced from throat swabs of human volunteers. The previous report was the first to describe two complete Type VI Secretion Systems in these isolates, both of which were distinct in terms of their gene organization and sequence homology. Since publication of the first report, Type VI Secretion System subtypes have been identified in Neisseria spp. The characteristics of each type in N. subflava are further investigated here and in the context of the other Neisseria spp., including identification of the lineages containing the different types and subtypes. Type VI Secretion Systems use VgrG for delivery of toxin effector proteins; several copies of vgrG and associated effector / immunity pairs are present in Neisseria spp. Based on sequence similarity between strains and species, these core Type VI Secretion System genes, vgrG, and effector / immunity genes may diversify via horizontal gene transfer, an instrument for gene acquisition and repair in Neisseria spp.
Affiliation(s)
- Alan Calder
- School of Life Sciences, Pharmacy, and Chemistry, Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE, UK
| | - Lori A S Snyder
- School of Life Sciences, Pharmacy, and Chemistry, Kingston University, Penrhyn Road, Kingston upon Thames, KT1 2EE, UK
| |
|
14
|
Bertels F, Rainey PB. Ancient Darwinian replicators nested within eubacterial genomes. Bioessays 2023; 45:e2200085. [PMID: 36456469 DOI: 10.1002/bies.202200085] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 04/29/2022] [Revised: 11/17/2022] [Accepted: 11/17/2022] [Indexed: 12/03/2022]
Abstract
Integrative mobile genetic elements (MGEs), such as transposons and insertion sequences, propagate within bacterial genomes, but persistence times in individual lineages are short. For long-term survival, MGEs must continuously invade new hosts by horizontal transfer. Theoretically, MGEs that persist for millions of years in single lineages, and are thus subject to vertical inheritance, should not exist. Here we draw attention to an exception - a class of MGE termed REPIN. REPINs are non-autonomous MGEs whose duplication depends on non-jumping RAYT transposases. Comparisons of REPINs and typical MGEs show that replication rates of REPINs are orders of magnitude lower, REPIN population size fluctuations correlate with changes in available genome space, REPIN conservation depends on RAYT function, and REPIN diversity accumulates within host lineages. These data lead to the hypothesis that REPINs form enduring, beneficial associations with eubacterial chromosomes. Given replicative nesting, our hypothesis predicts conflicts arising from the diverging effects of selection acting simultaneously on REPINs and host genomes. Evidence in support comes from patterns of REPIN abundance and diversity in two distantly related bacterial species. Together this bolsters the conclusion that REPINs are the genetic counterpart of mutualistic endosymbiotic bacteria.
Affiliation(s)
- Frederic Bertels
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Paul B Rainey
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany.,Laboratory of Biophysics and Evolution, CBI, ESPCI Paris, Université PSL, CNRS, Paris, France
| |
|
15
|
Zhang Y, Zhang C, Huo W, Wang X, Zhang M, Palmer K, Chen M. An expectation-maximization algorithm for estimating proportions of deletions among bacterial populations with application to study antibiotic resistance gene transfer in Enterococcus faecalis. Marine Life Science & Technology 2023; 5:28-43. [PMID: 36744155 PMCID: PMC9888353 DOI: 10.1007/s42995-022-00144-z] [Citation(s) in RCA: 2] [Impact Index Per Article: 1.0] [Received: 01/22/2022] [Accepted: 08/25/2022] [Indexed: 06/18/2023]
Abstract
The emergence of antibiotic resistance in bacteria limits the availability of antibiotic choices for treatment and infection control, thereby representing a major threat to human health. The de novo mutation of bacterial genomes is an essential mechanism by which bacteria acquire antibiotic resistance. Previously, deletion mutations within bacterial immune systems, ranging from dozens to thousands of base pairs (bp) in length, have been associated with the spread of antibiotic resistance. Most current methods for evaluating genomic structural variations (SVs) have concentrated on detecting them, rather than estimating the proportions of populations that carry distinct SVs. A better understanding of the distribution of mutations and of subpopulation dynamics in bacterial populations is needed to appreciate antibiotic resistance evolution and the movement of resistance genes through populations. Here, we propose a statistical model to estimate the proportions of genomic deletions in a mixed population based on Expectation-Maximization (EM) algorithms and next-generation sequencing (NGS) data. The method integrates both insert size and split-read mapping information to iteratively update estimated distributions. The proposed method was evaluated with three simulations, which showed that it produces accurate estimates. The proposed method was then applied to investigate the horizontal transfers of antibiotic resistance genes in concert with changes in the CRISPR-Cas system of E. faecalis. Supplementary information: the online version contains supplementary material available at 10.1007/s42995-022-00144-z.
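The EM idea summarized in this abstract can be illustrated with a deliberately reduced toy model. The sketch below is not the authors' method: it uses insert sizes only (no split-read information) and assumes a two-component Gaussian mixture in which read pairs spanning a deletion map with an apparent insert size enlarged by the deletion length. All function names and parameters (`estimate_deletion_proportion`, `mean_wt`, `deletion_len`, `sd`) are hypothetical, and the component means and standard deviation are assumed to be known.

```python
import math
import random


def normal_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def estimate_deletion_proportion(insert_sizes, mean_wt, deletion_len, sd,
                                 n_iter=200, tol=1e-9):
    """EM estimate of the fraction of read pairs that carry the deletion.

    Toy assumption: wild-type pairs have insert size ~ N(mean_wt, sd) and
    deletion-carrying pairs ~ N(mean_wt + deletion_len, sd).
    """
    pi = 0.5  # initial guess for the mixing proportion
    for _ in range(n_iter):
        # E-step: posterior probability that each pair carries the deletion.
        resp = []
        for x in insert_sizes:
            p_del = pi * normal_pdf(x, mean_wt + deletion_len, sd)
            p_wt = (1.0 - pi) * normal_pdf(x, mean_wt, sd)
            total = p_del + p_wt
            resp.append(p_del / total if total > 0 else 0.5)
        # M-step: the new proportion is the mean responsibility.
        new_pi = sum(resp) / len(resp)
        converged = abs(new_pi - pi) < tol
        pi = new_pi
        if converged:
            break
    return pi


def simulate_insert_sizes(n, true_pi, mean_wt, deletion_len, sd, seed=0):
    """Draw n insert sizes from the two-component toy mixture."""
    rng = random.Random(seed)
    sizes = []
    for _ in range(n):
        mean = mean_wt + deletion_len if rng.random() < true_pi else mean_wt
        sizes.append(rng.gauss(mean, sd))
    return sizes
```

On simulated data, for example `simulate_insert_sizes(5000, 0.3, 300, 500, 30)`, the estimate should land close to the simulated proportion. A realistic model would also have to estimate the insert-size distribution itself and combine it with split-read evidence, as the abstract describes.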
Affiliation(s)
- Yu Zhang
- School of Mathematical Sciences, Ocean University of China, Qingdao, 266000 China
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
| | - Cong Zhang
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
| | - Wenwen Huo
- Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
| | - Xinlei Wang
- Department of Statistical Science, Southern Methodist University, Dallas, TX 75205 USA
| | - Michael Zhang
- Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
- MOE Key Laboratory of Bioinformatics, Tsinghua University, Beijing, 100084 China
| | - Kelli Palmer
- Department of Biological Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
| | - Min Chen
- Department of Mathematical Sciences, University of Texas at Dallas, Richardson, TX 75080 USA
- Department of Population and Data Sciences, UT Southwestern Medical Center, Dallas, TX 75390 USA
| |
|
16
|
Malhotra N, Seshasayee ASN. Replication-Dependent Organization Constrains Positioning of Long DNA Repeats in Bacterial Genomes. Genome Biol Evol 2022; 14:6625829. [PMID: 35776426 PMCID: PMC9297083 DOI: 10.1093/gbe/evac102] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Accepted: 06/27/2022] [Indexed: 01/29/2023] Open
Abstract
Bacterial genome organization is primarily driven by chromosomal replication from a single origin of replication. However, chromosomal rearrangements, which can disrupt such organization, are inevitable in nature. Long DNA repeats are major players mediating rearrangements, large and small, via homologous recombination. Since changes to genome organization affect bacterial fitness (more so in fast-growing than in slow-growing bacteria) and are under selection, it is reasonable to expect that the genomic positioning of long DNA repeats is also under selection. To test this, we identified identical DNA repeats of at least 100 base pairs across ∼6,000 bacterial genomes and compared their distribution in fast- and slow-growing bacteria. We found that long identical DNA repeats are distributed in a non-random manner across bacterial genomes. Their distribution differs in overall number, orientation, and proximity to the origin of replication between fast- and slow-growing bacteria. We show that their positioning, which might arise from a combination of the processes that produce repeats and selection on the rearrangements that recombination between repeat elements might cause, permits less disruption to the replication-dependent genome organization of bacteria than random positioning would, suggesting that this organization is a major constraint on the positioning of long DNA repeats.
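To make "identical DNA repeats of at least 100 base pairs" concrete, the following is a minimal sketch of one way such repeats could be seeded, assuming a simple k-mer index. It is not the pipeline used in the paper; the function name `find_repeat_seeds` is hypothetical, and holding every k-mer in memory is only practical for small inputs. Direct repeats share an identical k-mer on the same strand, whereas inverted repeats are found through the reverse complement.

```python
from collections import defaultdict

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an upper-case ACGT string."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_repeat_seeds(genome, k=100):
    """Return (direct, inverted) lists of start-position pairs (i, j), i < j.

    direct:   genome[i:i+k] == genome[j:j+k]
    inverted: genome[i:i+k] == reverse_complement(genome[j:j+k])
    """
    # Index every k-mer by its start positions.
    positions = defaultdict(list)
    for i in range(len(genome) - k + 1):
        positions[genome[i:i + k]].append(i)

    direct, inverted = [], []
    for kmer, starts in positions.items():
        # Every pair of occurrences of the same k-mer is a direct repeat seed.
        for a in range(len(starts)):
            for b in range(a + 1, len(starts)):
                direct.append((starts[a], starts[b]))
        # Occurrences of the reverse complement are inverted repeat seeds.
        # Requiring i < j reports each pair once; .get avoids mutating the
        # dictionary while it is being iterated.
        for j in positions.get(reverse_complement(kmer), ()):
            for i in starts:
                if i < j:
                    inverted.append((i, j))
    return direct, inverted
```

Overlapping seeds from adjacent positions would still need to be merged into maximal repeats, and the orientation and distance to the origin of replication computed, before the kind of comparison described in the abstract could be made.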
|
17
|
Hartmann S, Ling M, Dreyer LSA, Zipori A, Finster K, Grawe S, Jensen LZ, Borck S, Reicher N, Drace T, Niedermeier D, Jones NC, Hoffmann SV, Wex H, Rudich Y, Boesen T, Šantl-Temkiv T. Structure and Protein-Protein Interactions of Ice Nucleation Proteins Drive Their Activity. Front Microbiol 2022; 13:872306. [PMID: 35783412 PMCID: PMC9247515 DOI: 10.3389/fmicb.2022.872306] [Citation(s) in RCA: 11] [Impact Index Per Article: 3.7] [Received: 02/09/2022] [Accepted: 05/16/2022] [Indexed: 11/13/2022] Open
Abstract
Microbially produced ice nucleating proteins (INpro) are unique molecular structures with the highest known catalytic efficiency for ice formation. Airborne microorganisms utilize these proteins to enhance their survival by reducing their atmospheric residence times. INpro also have critical environmental effects, including impacts on the atmospheric water cycle, through their role in cloud and precipitation formation, as well as frost damage on crops. INpro are ubiquitously present in the atmosphere, where they are emitted from diverse terrestrial and marine environments. Even though bacterial genes encoding INpro were discovered and sequenced decades ago, the details of how the INpro molecular structure and oligomerization foster their unique ice-nucleation activity remain elusive. Using the machine-learning-based software AlphaFold 2 and trRosetta, we obtained and analysed the first ab initio structural models of full-length and truncated versions of bacterial INpro. The modeling revealed a novel beta-helix structure of the INpro central repeat domain responsible for ice nucleation activity. This domain consists of repeated stacks of two beta strands connected by two sharp turns. One beta-strand is decorated with a TxT amino acid sequence motif and the other strand has an SxL[T/I] motif. The core formed between the stacked beta helix-pairs is unusually polar and very distinct from previous INpro models. Using synchrotron radiation circular dichroism, we validated the β-strand content of the central repeat domain in the model. Combining the structural model with functional studies of purified recombinant INpro, electron microscopy and modeling, we further demonstrate that the formation of dimers and higher-order oligomers is key to INpro activity.
Using computational docking of the new INpro model based on rigid-body algorithms we could reproduce a previously proposed homodimer structure of the INpro CRD with an interface along a highly conserved tyrosine ladder and show that the dimer model agrees with our functional data. The parallel dimer structure creates a surface where the TxT motif of one monomer aligns with the SxL[T/I] motif of the other monomer widening the surface that interacts with water molecules and therefore enhancing the ice nucleation activity. This work presents a major advance in understanding the molecular foundation for bacterial ice-nucleation activity.
Affiliation(s)
| | - Meilee Ling
- Department of Biology, Microbiology Section, Aarhus University, Aarhus, Denmark
- Department of Physics and Astronomy, Stellar Astrophysics Centre, Aarhus University, Aarhus, Denmark
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
| | - Lasse S. A. Dreyer
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
| | - Assaf Zipori
- Department of Earth and Planetary Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Kai Finster
- Department of Biology, Microbiology Section, Aarhus University, Aarhus, Denmark
- Department of Physics and Astronomy, Stellar Astrophysics Centre, Aarhus University, Aarhus, Denmark
| | - Sarah Grawe
- Institute for Tropospheric Research, Leipzig, Germany
| | - Lasse Z. Jensen
- Department of Biology, Microbiology Section, Aarhus University, Aarhus, Denmark
- Department of Physics and Astronomy, Stellar Astrophysics Centre, Aarhus University, Aarhus, Denmark
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
| | - Stella Borck
- Department of Biology, Microbiology Section, Aarhus University, Aarhus, Denmark
- Department of Physics and Astronomy, Stellar Astrophysics Centre, Aarhus University, Aarhus, Denmark
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
| | - Naama Reicher
- Department of Earth and Planetary Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Taner Drace
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
| | | | - Nykola C. Jones
- Department of Physics and Astronomy, The Institute for Storage Ring Facilities, Aarhus University, Aarhus, Denmark
| | - Søren V. Hoffmann
- Department of Physics and Astronomy, The Institute for Storage Ring Facilities, Aarhus University, Aarhus, Denmark
| | - Heike Wex
- Institute for Tropospheric Research, Leipzig, Germany
| | - Yinon Rudich
- Department of Earth and Planetary Sciences, Weizmann Institute of Science, Rehovot, Israel
| | - Thomas Boesen
- Department of Molecular Biology and Genetics, Section for Protein Science, Aarhus University, Aarhus, Denmark
- Interdisciplinary Nanoscience Center and Center for Electromicrobiology, Aarhus University, Aarhus, Denmark
- Thomas Boesen,
| | - Tina Šantl-Temkiv
- Department of Biology, Microbiology Section, Aarhus University, Aarhus, Denmark
- Department of Physics and Astronomy, Stellar Astrophysics Centre, Aarhus University, Aarhus, Denmark
- *Correspondence: Tina Šantl-Temkiv,
| |
|
18
|
van Dijk B, Bertels F, Stolk L, Takeuchi N, Rainey PB. Transposable elements promote the evolution of genome streamlining. Philos Trans R Soc Lond B Biol Sci 2022; 377:20200477. [PMID: 34839699 PMCID: PMC8628081 DOI: 10.1098/rstb.2020.0477] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.7] [Received: 05/29/2021] [Accepted: 08/30/2021] [Indexed: 12/25/2022] Open
Abstract
Eukaryotes and prokaryotes have distinct genome architectures, with marked differences in genome size, the ratio of coding/non-coding DNA, and the abundance of transposable elements (TEs). As TEs replicate independently of their hosts, the proliferation of TEs is thought to have driven genome expansion in eukaryotes. However, prokaryotes also have TEs in intergenic spaces, so why do prokaryotes have small, streamlined genomes? Using an in silico model describing the genomes of single-celled asexual organisms that coevolve with TEs, we show that TEs acquired from the environment by horizontal gene transfer can promote the evolution of genome streamlining. The process depends on local interactions and is underpinned by rock-paper-scissors dynamics in which populations of cells with streamlined genomes beat TEs, which beat non-streamlined genomes, which beat streamlined genomes, in continuous and repeating cycles. Streamlining is maladaptive to individual cells, but improves lineage viability by hindering the proliferation of TEs. Streamlining does not evolve in sexually reproducing populations because recombination partially frees TEs from the deleterious effects they cause. This article is part of the theme issue 'The secret lives of microbial mobile genetic elements'.
Affiliation(s)
- Bram van Dijk
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Frederic Bertels
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
| | - Lianne Stolk
- Theoretical Biology, Department of Biology, Utrecht University, The Netherlands
| | - Nobuto Takeuchi
- School of Biological Sciences, University of Auckland, Auckland, New Zealand
| | - Paul B. Rainey
- Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Laboratory of Biophysics and Evolution, CBI, ESPCI Paris, Université PSL, CNRS, Paris, France
| |
|
19
|
Corneloup A, Caumont-Sarcos A, Kamgoue A, Marty B, Le PTN, Siguier P, Guynet C, Ton-Hoang B. TnpAREP and REP sequences dissemination in bacterial genomes: REP recognition determinants. Nucleic Acids Res 2021; 49:6982-6995. [PMID: 34161591 PMCID: PMC8266576 DOI: 10.1093/nar/gkab524] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.5] [Received: 09/25/2020] [Revised: 05/27/2021] [Accepted: 06/17/2021] [Indexed: 11/12/2022] Open
Abstract
REP, diverse palindromic DNA sequences found at high copy number in many bacterial genomes, have been attributed important roles in cell physiology but their dissemination mechanisms are poorly understood. They might represent non-autonomous transposable elements mobilizable by TnpAREP, the first prokaryotic domesticated transposase associated with REP. TnpAREP, fundamentally different from classical transposases, are members of the HuH superfamily and closely related to the transposases of the IS200/IS605 family. We previously showed that Escherichia coli TnpAREP processes cognate single stranded REP in vitro and that this activity requires the integrity of the REP structure, in particular imperfect palindromes interrupted by a bulge and preceded by a conserved DNA motif. A second group of REPs rather carry perfect palindromes, raising questions about how the latter are recognized by their cognate TnpAREP. To get insight into the importance of REP structural and sequence determinants in these two groups, we developed an in vitro activity assay coupled to a mutational analysis for three different TnpAREP/REP duos via a SELEX approach. We also tackled the question of how the cleavage site is selected. This study revealed that two TnpAREP groups have co-evolved with their cognate REPs and use different strategies to recognize their REP substrates.
Affiliation(s)
- Alix Corneloup
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | - Anne Caumont-Sarcos
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | | | - Brigitte Marty
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | - Phan Thai Nguyen Le
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | - Patricia Siguier
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | - Catherine Guynet
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| | - Bao Ton-Hoang
- Laboratoire de Microbiologie et de Génétique Moléculaires (LMGM), CBI, CNRS, Université Toulouse UPS, Toulouse, France
| |
|
20
|
Influences of ssDNA-RecA Filament Length on the Fidelity of Homologous Recombination. J Mol Biol 2021; 433:167143. [PMID: 34242669 DOI: 10.1016/j.jmb.2021.167143] [Citation(s) in RCA: 4] [Impact Index Per Article: 1.0] [Received: 03/02/2021] [Revised: 06/08/2021] [Accepted: 06/30/2021] [Indexed: 11/22/2022]
Abstract
Chromosomal double-strand breaks can be accurately repaired by homologous recombination, but genomic rearrangement can result if the repair joins different copies of a repeated sequence. Rearrangement can be advantageous or fatal. During repair, a broken double-stranded DNA (dsDNA) is digested by the RecBCD complex from the 5' end, leaving a sequence gap that separates two 3' single-stranded DNA (ssDNA) tails. RecA binds to the 3' tails, forming helical nucleoprotein filaments. A three-strand intermediate is formed when a RecA-bound ssDNA with L nucleotides invades a homologous region of dsDNA and forms a heteroduplex product with a length ≤ L bp. The homology-dependent stability of the heteroduplex determines how rapidly and accurately homologous recombination repairs double-strand breaks. If the heteroduplex is sufficiently sequence matched, repair progresses to irreversible DNA synthesis. Otherwise, the heteroduplex should rapidly reverse. In this work, we present in vitro measurements of the L-dependent stability of heteroduplex products formed by filaments with 90 ≤ L ≤ 420 nt, which is within the range observed in vivo. We find that without ATP hydrolysis, products are irreversible when L > 50 nt. In contrast, with ATP hydrolysis when L < 160 nt, products reverse in < 30 seconds; however, with ATP hydrolysis when L ≥ 320 nt, some products reverse in < 30 seconds, while others last thousands of seconds. We consider why these two different filament length regimes show such distinct behaviors. We propose that the experimental results combined with theoretical insights suggest that filaments with 250 ≲ L ≲ 8500 nt optimize DSB repair.
|
21
|
Tandem Repeats in Bacillus: Unique Features and Taxonomic Distribution. Int J Mol Sci 2021; 22:ijms22105373. [PMID: 34065296 PMCID: PMC8161180 DOI: 10.3390/ijms22105373] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 04/20/2021] [Revised: 05/14/2021] [Accepted: 05/18/2021] [Indexed: 11/16/2022] Open
Abstract
Little is known about DNA tandem repeats across prokaryotes. We have recently described an enigmatic group of tandem repeats in bacterial genomes with a constant repeat size but variable sequence. These findings strongly suggest that tandem repeat size in some bacteria is under strong selective constraints. Here, we extend these studies and describe tandem repeats in a large set of Bacillus. Some species have very few repeats, while other species have a large number. Most tandem repeats have repeats with a constant size (either 52 or 20-21 nt), but a variable sequence. We characterize in detail these intriguing tandem repeats. Individual species have several families of tandem repeats with the same repeat length and different sequence. This result is in strong contrast with eukaryotes, where tandem repeats of many sizes are found in any species. We discuss the possibility that they are transcribed as small RNA molecules. They may also be involved in the stabilization of the nucleoid through interaction with proteins. We also show that the distribution of tandem repeats in different species has a taxonomic significance. The data we present for all tandem repeats and their families in these bacterial species will be useful for further genomic studies.
|
22
|
Garrett SC. Pruning and Tending Immune Memories: Spacer Dynamics in the CRISPR Array. Front Microbiol 2021; 12:664299. [PMID: 33868219 PMCID: PMC8047081 DOI: 10.3389/fmicb.2021.664299] [Citation(s) in RCA: 31] [Impact Index Per Article: 7.8] [Received: 02/04/2021] [Accepted: 03/12/2021] [Indexed: 01/22/2023] Open
Abstract
CRISPR-Cas (Clustered Regularly Interspaced Short Palindromic Repeats and CRISPR-associated genes) is a type of prokaryotic immune system that is unique in its ability to provide sequence-specific adaptive protection, which can be updated in response to new threats. CRISPR-Cas does this by storing fragments of DNA from invading genetic elements in an array interspersed with short repeats. The CRISPR array can be continuously updated through integration of new DNA fragments (termed spacers) at one end, but over time existing spacers become obsolete. To optimize immunity, spacer uptake, residency, and loss must be regulated. This mini-review summarizes what is known about how spacers are organized, maintained, and lost from CRISPR arrays.
Affiliation(s)
- Sandra C Garrett
- Department of Genetics and Genome Sciences, Institute for Systems Genomics, UConn Health, Farmington, CT, United States
| |
|
23
|
Noureen M, Kawashima T, Arita M. Genetic Markers of Genome Rearrangements in Helicobacter pylori. Microorganisms 2021; 9:621. [PMID: 33802974 PMCID: PMC8002640 DOI: 10.3390/microorganisms9030621] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.3] [Received: 03/01/2021] [Revised: 03/11/2021] [Accepted: 03/12/2021] [Indexed: 11/16/2022] Open
Abstract
Helicobacter pylori exhibits a diverse genomic structure with high mutation and recombination rates. Various genetic elements function as drivers of this genomic diversity, including genome rearrangements. Identifying the association of these elements with rearrangements can pave the way to understanding the evolution of its genome. We analyzed the order of orthologous genes among 72 publicly available complete genomes to identify large genome rearrangements, and rearrangement breakpoints were compared with the positions of insertion sequences, genomic islands, and restriction modification genes. Comparison of the shared inversions revealed the conserved genomic elements across strains from different geographical locations. Some were region-specific and others were global, indicating that highly shared rearrangements and their markers were more ancestral than strain- or region-specific ones. The locations of genomic islands were an important factor for the occurrence of the rearrangements. Comparative genomics helps to evaluate the conservation of various elements contributing to the diversity across genomes.
Affiliation(s)
- Mehwish Noureen
  - Department of Genetics, SOKENDAI University, Yata 1111, Mishima 411-8540, Shizuoka, Japan
- Takeshi Kawashima
  - Bioinformation and DDBJ Center, National Institute of Genetics, Yata 1111, Mishima 411-8540, Shizuoka, Japan
- Masanori Arita
  - Bioinformation and DDBJ Center, National Institute of Genetics, Yata 1111, Mishima 411-8540, Shizuoka, Japan
  - RIKEN Center for Sustainable Resource Science, 1-7-22 Suehiro, Tsurumi, Yokohama 230-0045, Kanagawa, Japan
24
Park HJ, Gokhale CS, Bertels F. How sequence populations persist inside bacterial genomes. Genetics 2021; 217:6151697. [PMID: 33724360 PMCID: PMC8049555 DOI: 10.1093/genetics/iyab027] [Citation(s) in RCA: 5] [Impact Index Per Article: 1.3] [Received: 12/21/2020] [Accepted: 02/04/2021] [Indexed: 01/04/2023]
Abstract
Compared to their eukaryotic counterparts, bacterial genomes are small and contain extremely tightly packed genes. Repetitive sequences are rare but not completely absent. One of the most common repeat families is REPINs. REPINs can replicate in the host genome and form populations that persist for millions of years. Here, we model the interactions of these intragenomic sequence populations with the bacterial host. We first confirm well-established results: in the presence and absence of horizontal gene transfer (hgt), sequence populations either expand until they drive the host to extinction or are purged from the genome. We then show that a sequence population can be stably maintained when each individual sequence provides a benefit that decreases with increasing sequence population size. Maintaining a sequence population of stable size also requires the replication of the sequence population to be costly to the host; otherwise the sequence population size will increase indefinitely. Surprisingly, in regimes with high hgt rates, the benefit conferred by the sequence population does not have to exceed the damage it causes to its host. Our analyses provide a plausible scenario for the persistence of sequence populations in bacterial genomes. We also hypothesize a limited biologically relevant parameter range for the provided benefit, which can be tested in future experiments.
Affiliation(s)
- Hye Jin Park
  - Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, 24306, Germany
  - Asia Pacific Center for Theoretical Physics, Pohang, 37673, Korea
  - Department of Physics, POSTECH, Pohang, 37673, Korea
- Chaitanya S Gokhale
  - Research Group for Theoretical Models of Eco-evolutionary Dynamics, Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, 24306, Germany
- Frederic Bertels
  - Research Group for Microbial Molecular Evolution, Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, 24306, Germany
25
Zakaria NN, Convey P, Gomez-Fuentes C, Zulkharnain A, Sabri S, Shaharuddin NA, Ahmad SA. Oil Bioremediation in the Marine Environment of Antarctica: A Review and Bibliometric Keyword Cluster Analysis. Microorganisms 2021; 9:419. [PMID: 33671443 PMCID: PMC7922015 DOI: 10.3390/microorganisms9020419] [Citation(s) in RCA: 16] [Impact Index Per Article: 4.0] [Received: 12/31/2020] [Revised: 01/22/2021] [Accepted: 01/25/2021] [Indexed: 02/07/2023]
Abstract
Bioremediation of hydrocarbons has received much attention in recent decades, particularly relating to fuel and other oils. While of great relevance globally, there has recently been increasing interest in hydrocarbon bioremediation in the marine environments of Antarctica. To provide an objective assessment of the research interest in this field we used VOSviewer software to analyze publication data obtained from the ScienceDirect database covering the period 1970 to the present, but with a primary focus on the years 2000–2020. A bibliometric analysis of the database allowed identification of the co-occurrence of keywords. There was an increasing trend over time for publications relating to oil bioremediation in maritime Antarctica, including both studies on marine bioremediation and of the metabolic pathways of hydrocarbon degradation. Studies of marine anaerobic degradation remain under-represented compared to those of aerobic degradation. Emerging keywords in recent years included bioprospecting, metagenomic and bioindicator, giving insight into changing research foci, such as increasing attention to microbial diversity. The study of microbial genomes using metagenomic approaches or whole genome studies is increasing rapidly and is likely to drive emerging fields in future, including rapid expansion of bioprospecting in diverse fields of biotechnology.
Affiliation(s)
- Nur Nadhirah Zakaria
  - Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
- Peter Convey
  - British Antarctic Survey, NERC, High Cross, Madingley Road, Cambridge CB3 0ET, UK
- Claudio Gomez-Fuentes
  - Department of Chemical Engineering, Universidad de Magallanes, Avda, Bulnes 01855, Chile
  - Center for Research and Antarctic Environmental Monitoring (CIMAA), Universidad de Magallanes, Avda, Bulnes 01855, Chile
- Azham Zulkharnain
  - Department of Bioscience and Engineering, College of Systems Engineering and Science, Shibaura Institute of Technology, 307 Fukasaku, Minuma-ku, Saitama 337-8570, Japan
- Suriana Sabri
  - Department of Microbiology, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
- Noor Azmi Shaharuddin
  - Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
- Siti Aqlima Ahmad
  - Department of Biochemistry, Faculty of Biotechnology and Biomolecular Sciences, Universiti Putra Malaysia, Serdang 43400, Selangor, Malaysia
  - Center for Research and Antarctic Environmental Monitoring (CIMAA), Universidad de Magallanes, Avda, Bulnes 01855, Chile
  - National Antarctic Research Centre, B303 Level 3, Block B, IPS Building, Universiti Malaya, Kuala Lumpur 50603, Malaysia
26
Unique Features of Tandem Repeats in Bacteria. J Bacteriol 2020; 202:JB.00229-20. [PMID: 32839174 DOI: 10.1128/jb.00229-20] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.4] [Received: 04/20/2020] [Accepted: 08/17/2020] [Indexed: 02/06/2023]
Abstract
DNA tandem repeats, or satellites, are well described in eukaryotic species, but little is known about their prevalence across prokaryotes. Here, we performed the most complete characterization to date of satellites in bacteria. We identified 121,638 satellites from 12,233 fully sequenced and assembled bacterial genomes with a very uneven distribution. We also determined the families of satellites which have a related sequence. There are 85 genomes that are particularly satellite rich and contain several families of satellites of yet unknown function. Interestingly, we only found two main types of noncoding satellites, depending on their repeat sizes, 22/44 or 52 nucleotides (nt). An intriguing feature is the constant size of the repeats in the genomes of different species, whereas their sequences show no conservation. Individual species also have several families of satellites with the same repeat length and different sequences. This result is in marked contrast with previous findings in eukaryotes, where noncoding satellites of many sizes are found in any species investigated. We describe in greater detail these noncoding satellites in the spirochete Leptospira interrogans and in several bacilli. These satellites undoubtedly play a specific role in the species which have acquired them. We discuss the possibility that they represent binding sites for transcription factors not previously described or that they are involved in the stabilization of the nucleoid through interaction with proteins. IMPORTANCE We found an enigmatic group of noncoding satellites in 85 bacterial genomes with a constant repeat size but variable sequence. This pattern of DNA organization is unique and had not been previously described in bacteria. These findings strongly suggest that satellite size in some bacteria is under strong selective constraints and thus that satellites are very likely to play a fundamental role. We also provide a list and properties of all satellites in 12,233 genomes, which may be used for further genomic analysis.
27
Ejigu GF, Jung J. Review on the Computational Genome Annotation of Sequences Obtained by Next-Generation Sequencing. BIOLOGY 2020; 9:E295. [PMID: 32962098 PMCID: PMC7565776 DOI: 10.3390/biology9090295] [Citation(s) in RCA: 31] [Impact Index Per Article: 6.2] [Received: 08/21/2020] [Revised: 09/13/2020] [Accepted: 09/16/2020] [Indexed: 12/16/2022]
Abstract
Next-Generation Sequencing (NGS) has made it easier to obtain genome-wide sequence data and it has shifted the research focus into genome annotation. The challenging tasks involved in annotation rely on the currently available tools and techniques to decode the information contained in nucleotide sequences. This information will improve our understanding of general aspects of life and evolution and improve our ability to diagnose genetic disorders. Here, we present a summary of both structural and functional annotations, as well as the associated comparative annotation tools and pipelines. We highlight visualization tools that immensely aid the annotation process and the contributions of the scientific community to the annotation. Further, we discuss quality-control practices and the need for re-annotation, and highlight the future of annotation.
Affiliation(s)
- Jaehee Jung
  - Department of Information and Communication Engineering, Myongji University, Yongin-si 17058, Gyeonggi-do, Korea
28
Insights into the molecular diversity of Plasmodium vivax merozoite surface protein-3γ (pvmsp3γ), a polymorphic member in the msp3 multi-gene family. Sci Rep 2020; 10:10977. [PMID: 32620822 PMCID: PMC7335089 DOI: 10.1038/s41598-020-67222-z] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 02/19/2020] [Accepted: 06/02/2020] [Indexed: 12/17/2022]
Abstract
Plasmodium vivax merozoite surface protein 3 (PvMSP3) is encoded by a multi-gene family. Of these, PvMSP3α, PvMSP3β and PvMSP3γ, are considered to be vaccine targets. Despite comprehensive analyses of PvMSP3α and PvMSP3β, little is known about structural and sequence diversity in PvMSP3γ. Analysis of 118 complete pvmsp3γ sequences from diverse endemic areas of Thailand and 9 reported sequences has shown 86 distinct haplotypes. Based on variation in insert domains, pvmsp3γ can be classified into 3 types, i.e. Belem, Salvador I and NR520. Imperfect nucleotide repeats were found in six regions of the gene; none encoded tandem amino acid repeats. Predicted coiled-coil heptad repeats were abundant in the protein and displayed variation in length and location. Interspersed phase shifts occurred in the heptad arrays that may have an impact on protein structure. Polymorphism in pvmsp3γ seems to be generated by intragenic recombination and driven by natural selection. Most P. vivax isolates in Thailand exhibit population structure, suggesting limited gene flow across endemic areas. Phylogenetic analysis has suggested that insert domains could have been subsequently acquired during the evolution of pvmsp3γ. Sequence and structural diversity of PvMSP3γ may complicate vaccine design due to alteration in predicted immunogenic epitopes among variants.
29
Damnjanovic D, Vázquez-Campos X, Winter DL, Harvey M, Bridge WJ. Bacteriophage genotyping using BOXA repetitive-PCR. BMC Microbiol 2020; 20:154. [PMID: 32527227 PMCID: PMC7291552 DOI: 10.1186/s12866-020-01770-2] [Citation(s) in RCA: 1] [Impact Index Per Article: 0.2] [Received: 08/21/2019] [Accepted: 03/29/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND Repetitive-PCR (rep-PCR) using BOXA1R and BOXA2R as single primers was investigated for its potential to genotype bacteriophage. Previously, this technique has been primarily used for the discrimination of bacterial strains. Reproducible DNA fingerprint patterns for various phage types were generated using either of the two primers. RESULTS The similarity index of replicates ranged from 89.4 to 100% for BOXA2R-PCR, and from 90 to 100% for BOXA1R-PCR. The method of DNA isolation (p = 0.08) and the phage propagation conditions at two different temperatures (p = 0.527) had no significant influence on generated patterns. Rep-PCR amplification products were generated from different templates including purified phage DNA, phage lysates and phage plaques. The use of this method enabled comparisons of phage genetic profiles to establish their similarity to related or unrelated phages and their bacterial hosts. CONCLUSION The findings suggest that repetitive-PCR could be used as a rapid and inexpensive method for preliminary screening of phage isolates prior to their selection for more comprehensive studies. The adoption of this rapid, simple and reproducible technique could facilitate preliminary characterisation of a large number of phage isolates and the investigation of genetic relationships between phage genotypes.
Affiliation(s)
- Dragica Damnjanovic
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Kensington, Australia
- Xabier Vázquez-Campos
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Kensington, Australia
- Daniel L. Winter
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Kensington, Australia
- Melissa Harvey
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Kensington, Australia
- Wallace J. Bridge
  - School of Biotechnology and Biomolecular Sciences, Faculty of Science, UNSW Sydney, Kensington, Australia
30
Pearman WS, Freed NE, Silander OK. Testing the advantages and disadvantages of short- and long- read eukaryotic metagenomics using simulated reads. BMC Bioinformatics 2020; 21:220. [PMID: 32471343 PMCID: PMC7257156 DOI: 10.1186/s12859-020-3528-4] [Citation(s) in RCA: 51] [Impact Index Per Article: 10.2] [Received: 06/05/2019] [Accepted: 04/30/2020] [Indexed: 11/10/2022]
Abstract
BACKGROUND The first step in understanding ecological community diversity and dynamics is quantifying community membership. An increasingly common method for doing so is through metagenomics. Because of the rapidly increasing popularity of this approach, a large number of computational tools and pipelines are available for analysing metagenomic data. However, the majority of these tools have been designed and benchmarked using highly accurate short read data (i.e. Illumina), with few studies benchmarking classification accuracy for long error-prone reads (PacBio or Oxford Nanopore). In addition, few tools have been benchmarked for non-microbial communities. RESULTS Here we compare simulated long reads from Oxford Nanopore and Pacific Biosciences (PacBio) with high accuracy Illumina read sets to systematically investigate the effects of sequence length and taxon type on classification accuracy for metagenomic data from both microbial and non-microbial communities. We show that very generally, classification accuracy is far lower for non-microbial communities, even at low taxonomic resolution (e.g. family rather than genus). We then show that for two popular taxonomic classifiers, long reads can significantly increase classification accuracy, and this is most pronounced for non-microbial communities. CONCLUSIONS This work provides insight on the expected accuracy for metagenomic analyses for different taxonomic groups, and establishes the point at which read length becomes more important than error rate for assigning the correct taxon.
Affiliation(s)
- William S Pearman
  - School of Natural and Computational Sciences, Massey University, Private Bag 102904, North Shore, Auckland, 0745, New Zealand
- Nikki E Freed
  - School of Natural and Computational Sciences, Massey University, Private Bag 102904, North Shore, Auckland, 0745, New Zealand
- Olin K Silander
  - School of Natural and Computational Sciences, Massey University, Private Bag 102904, North Shore, Auckland, 0745, New Zealand
31
Olson ND, Treangen TJ, Hill CM, Cepeda-Espinoza V, Ghurye J, Koren S, Pop M. Metagenomic assembly through the lens of validation: recent advances in assessing and improving the quality of genomes assembled from metagenomes. Brief Bioinform 2020; 20:1140-1150. [PMID: 28968737 DOI: 10.1093/bib/bbx098] [Citation(s) in RCA: 82] [Impact Index Per Article: 16.4] [Received: 05/02/2017] [Revised: 07/13/2017] [Indexed: 01/09/2023]
Abstract
Metagenomic samples are snapshots of complex ecosystems at work. They comprise hundreds of known and unknown species, contain multiple strain variants and vary greatly within and across environments. Many microbes found in microbial communities are not easily grown in culture making their DNA sequence our only clue into their evolutionary history and biological function. Metagenomic assembly is a computational process aimed at reconstructing genes and genomes from metagenomic mixtures. Current methods have made significant strides in reconstructing DNA segments comprising operons, tandem gene arrays and syntenic blocks. Shorter, higher-throughput sequencing technologies have become the de facto standard in the field. Sequencers are now able to generate billions of short reads in only a few days. Multiple metagenomic assembly strategies, pipelines and assemblers have appeared in recent years. Owing to the inherent complexity of metagenome assembly, regardless of the assembly algorithm and sequencing method, metagenome assemblies contain errors. Recent developments in assembly validation tools have played a pivotal role in improving metagenomics assemblers. Here, we survey recent progress in the field of metagenomic assembly, provide an overview of key approaches for genomic and metagenomic assembly validation and demonstrate the insights that can be derived from assemblies through the use of assembly validation strategies. We also discuss the potential for impact of long-read technologies in metagenomics. We conclude with a discussion of future challenges and opportunities in the field of metagenomic assembly and validation.
32
Bachar A, Itzhaki E, Gleizer S, Shamshoom M, Milo R, Antonovsky N. Point mutations in topoisomerase I alter the mutation spectrum in E. coli and impact the emergence of drug resistance genotypes. Nucleic Acids Res 2020; 48:761-769. [PMID: 31777935 PMCID: PMC6954433 DOI: 10.1093/nar/gkz1100] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.8] [Received: 05/24/2019] [Revised: 09/27/2019] [Accepted: 11/21/2019] [Indexed: 11/14/2022]
Abstract
Identifying the molecular mechanisms that give rise to genetic variation is essential for the understanding of evolutionary processes. Previously, we have used adaptive laboratory evolution to enable biomass synthesis from CO2 in Escherichia coli. Genetic analysis of adapted clones from two independently evolving populations revealed distinct enrichment for insertion and deletion mutational events. Here, we follow these observations to show that mutations in the gene encoding for DNA topoisomerase I (topA) give rise to mutator phenotypes with characteristic mutational spectra. Using genetic assays and mutation accumulation lines, we find that point mutations in topA increase the rate of sequence deletion and duplication events. Interestingly, we observe that a single residue substitution (R168C) results in a high rate of head-to-tail (tandem) short sequence duplications, which are independent of existing sequence repeats. Finally, we show that the unique mutation spectrum of topA mutants enhances the emergence of antibiotic resistance in comparison to mismatch-repair (mutS) mutators, and leads to new resistance genotypes. Our findings highlight a potential link between the catalytic activity of topoisomerases and the fundamental question regarding the emergence of de novo tandem repeats, which are known modulators of bacterial evolution.
Affiliation(s)
- Amit Bachar
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Elad Itzhaki
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Shmuel Gleizer
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Melina Shamshoom
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Ron Milo
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
- Niv Antonovsky
  - Department of Plant and Environmental Sciences, Weizmann Institute of Science, Rehovot, 7610001, Israel
  - Laboratory of Genetically Encoded Small Molecules, The Rockefeller University, New York, NY 10065, USA
33
Cury J, Oliveira PH, de la Cruz F, Rocha EPC. Host Range and Genetic Plasticity Explain the Coexistence of Integrative and Extrachromosomal Mobile Genetic Elements. Mol Biol Evol 2020; 35:2230-2239. [PMID: 29905872 PMCID: PMC6107060 DOI: 10.1093/molbev/msy123] [Citation(s) in RCA: 38] [Impact Index Per Article: 7.6] [Indexed: 02/06/2023]
Abstract
Self-transmissible mobile genetic elements drive horizontal gene transfer between prokaryotes. Some of these elements integrate in the chromosome, whereas others replicate autonomously as plasmids. Recent works showed the existence of few differences, and occasional interconversion, between the two types of elements. Here, we enquired on why evolutionary processes have maintained the two types of mobile genetic elements by comparing integrative and conjugative elements (ICE) with extrachromosomal ones (conjugative plasmids) of the highly abundant MPFT conjugative type. We observed that plasmids encode more replicases, partition systems, and antibiotic resistance genes, whereas ICEs encode more integrases and metabolism-associated genes. ICEs and plasmids have similar average sizes, but plasmids are much more variable, have more DNA repeats, and exchange genes more frequently. On the other hand, we found that ICEs are more frequently transferred between distant taxa. We propose a model where the different genetic plasticity and amplitude of host range between elements explain the co-occurrence of integrative and extrachromosomal elements in microbial populations. In particular, the conversion from ICE to plasmid allows ICE to be more plastic, while the conversion from plasmid to ICE allows the expansion of the element's host range.
Affiliation(s)
- Jean Cury
  - Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
  - CNRS, UMR3525, Paris, France
- Pedro H Oliveira
  - Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
  - CNRS, UMR3525, Paris, France
- Fernando de la Cruz
  - Departamento de Biologia Molecular e Instituto de Biomedicina y Biotecnologia de Cantabria (IBBTEC), Universidad de Cantabria-CSIC, Santander, Spain
- Eduardo P C Rocha
  - Microbial Evolutionary Genomics, Institut Pasteur, Paris, France
  - CNRS, UMR3525, Paris, France
34
Mahfooz S, Srivastava A, Yadav MC, Tahoor A. Comparative genomics in phytopathogenic prokaryotes reveals the higher relative abundance and density of long-SSRs in the smallest prokaryotic genome. 3 Biotech 2019; 9:340. [PMID: 31478033 DOI: 10.1007/s13205-019-1872-8] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 06/04/2018] [Accepted: 08/13/2019] [Indexed: 10/26/2022]
Abstract
Frequency and distribution of long-SSRs were studied in 18 phytopathogenic prokaryotes. Higher relative abundance of the long-SSRs was observed in phytopathogenic prokaryotes when compared to non-pathogenic control. The frequency of these SSRs was positively correlated with size and GC content of the genomes of phytopathogenic prokaryotes. Interestingly, phytopathogens with higher GC content in the genome were found to possess longer repeat motifs of SSRs, whereas those with lower GC content harboured shorter repeat motifs. Higher abundance of tri- and hexa-nucleotide repeat motifs was characteristic of the actinomycetes, whereas higher abundance of mono- and tetra-nucleotide repeats was characteristic of the mollicutes. The maximum relative abundance and relative density of SSRs were found in the smallest genome, that of the host-adapted pathogen Aster yellow; however, the length of its microsatellite repeat units was the least. On the basis of the presence of SSRs in the housekeeping genes, a phylogenetic relationship between these phytopathogenic prokaryotes was deduced and compared with the phylogeny based on the 16S ribosomal RNA gene.
35
Xu M, Lawrence JG, Durand D. Selection, periodicity and potential function for Highly Iterative Palindrome-1 (HIP1) in cyanobacterial genomes. Nucleic Acids Res 2019; 46:2265-2278. [PMID: 29432573 PMCID: PMC5861425 DOI: 10.1093/nar/gky075] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 05/17/2017] [Accepted: 01/25/2018] [Indexed: 02/05/2023]
Abstract
Highly Iterated Palindrome 1 (HIP1, GCGATCGC) is hyper-abundant in most cyanobacterial genomes. In some cyanobacteria, average HIP1 abundance exceeds one motif per gene. Such high abundance suggests a significant role in cyanobacterial biology. However, 20 years of study have not revealed whether HIP1 has a function, much less what that function might be. We show that HIP1 is 15- to 300-fold over-represented in genomes analyzed. More importantly, HIP1 sites are conserved both within and between open reading frames, suggesting that their overabundance is maintained by selection rather than by continual replenishment by neutral processes, such as biased DNA repair. This evidence for selection suggests a functional role for HIP1. No evidence was found to support a functional role as a peptide or RNA motif or a role in the regulation of gene expression. Rather, we demonstrate that the distribution of HIP1 along cyanobacterial chromosomes is significantly periodic, with periods ranging from 10 to 90 kb, consistent in scale with periodicities reported for co-regulated, co-expressed and evolutionarily correlated genes. The periodicity we observe is also comparable in scale to chromosomal interaction domains previously described in other bacteria. In this context, our findings imply HIP1 functions associated with chromosome and nucleoid structure.
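To make the notion of "fold over-representation" concrete, here is a minimal Python sketch that compares the observed number of HIP1 sites in a sequence with the number expected by chance. The null model (bases drawn independently at the sequence's own base frequencies) and the function names are assumptions chosen for illustration; the abstract does not state which background model the authors used.

```python
from collections import Counter

HIP1 = "GCGATCGC"  # motif sequence as given in the abstract


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlapping matches."""
    count = 0
    start = seq.find(motif)
    while start != -1:
        count += 1
        start = seq.find(motif, start + 1)
    return count


def expected_count(seq: str, motif: str) -> float:
    """Expected motif count under an assumed null model in which each base
    is drawn independently at the sequence's own base frequencies."""
    n = len(seq)
    freqs = Counter(seq)
    p = 1.0
    for base in motif:
        p *= freqs[base] / n
    # Number of windows of motif length times per-window match probability.
    return p * (n - len(motif) + 1)


def fold_enrichment(seq: str, motif: str = HIP1) -> float:
    """Observed / expected ratio; infinite if the expectation is zero."""
    exp = expected_count(seq, motif)
    return count_overlapping(seq, motif) / exp if exp > 0 else float("inf")
```

A ratio well above 1 indicates more sites than base composition alone would predict. A higher-order background (for example, one based on dinucleotide frequencies) would be a stricter test.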
Affiliation(s)
- Minli Xu
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
- Jeffrey G Lawrence
  - Department of Biological Sciences, University of Pittsburgh, Pittsburgh, PA 15260, USA
- Dannie Durand
  - Department of Biological Sciences, Carnegie Mellon University, Pittsburgh, PA 15213, USA
  - Department of Computer Science, Carnegie Mellon University, Pittsburgh, PA 15213, USA
36
Marijon P, Chikhi R, Varré JS. Graph analysis of fragmented long-read bacterial genome assemblies. Bioinformatics 2019; 35:4239-4246. [DOI: 10.1093/bioinformatics/btz219] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.8] [Received: 10/09/2018] [Revised: 02/19/2019] [Accepted: 03/26/2019] [Indexed: 11/14/2022]
Abstract
Motivation
Long-read genome assembly tools are expected to reconstruct bacterial genomes nearly perfectly; however, they still produce fragmented assemblies in some cases. It would be beneficial to understand whether these cases are intrinsically impossible to resolve, or if assemblers are at fault, implying that genomes could be refined or even finished with little to no additional experimental cost.
Results
We propose a set of computational techniques to assist inspection of fragmented bacterial genome assemblies, through careful analysis of assembly graphs. By finding paths of overlapping raw reads between pairs of contigs, we recover potential short-range connections between contigs that were lost during the assembly process. We show that our procedure recovers 45% of missing contig adjacencies in fragmented Canu assemblies, on samples from the NCTC bacterial sequencing project. We also observe that a simple procedure based on enumerating weighted Hamiltonian cycles can suggest likely contig orderings. In our tests, the correct contig order is ranked first in half of the cases and within the top-three predictions in nearly all evaluated cases, providing a direction for finishing fragmented long-read assemblies.
Availability and implementation
https://gitlab.inria.fr/pmarijon/knot .
Supplementary information
Supplementary data are available at Bioinformatics online.
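The idea of ranking contig orderings by enumerating weighted Hamiltonian cycles can be sketched in a few lines of Python for a small number of contigs. This is an illustrative brute-force version, not the implementation from the tool linked above: the function name, the `weight` dictionary, and the convention that lower total weight means a more plausible circular order are all assumptions.

```python
from itertools import permutations


def rank_contig_orders(contigs, weight):
    """Enumerate circular contig orders and sort them by total edge weight.

    contigs: list of contig names (hypothetical identifiers).
    weight:  dict mapping frozenset({a, b}) to a cost for placing a next to b
             (assumed convention: lower cost means stronger read support).
    Returns a list of (total_cost, order) tuples, lowest cost first.
    """
    first, rest = contigs[0], contigs[1:]
    seen = set()
    ranked = []
    for perm in permutations(rest):
        order = (first,) + perm
        # A cycle and its reverse describe the same undirected cycle,
        # so keep one canonical representative of each pair.
        key = min(order, (first,) + tuple(reversed(perm)))
        if key in seen:
            continue
        seen.add(key)
        # Consecutive pairs, wrapping around to close the cycle.
        edges = zip(key, key[1:] + key[:1])
        try:
            total = sum(weight[frozenset(edge)] for edge in edges)
        except KeyError:
            continue  # skip orders that need an adjacency with no weight
        ranked.append((total, key))
    ranked.sort()
    return ranked
```

Because the number of cycles grows factorially with the number of contigs, exhaustive enumeration like this is only practical for assemblies fragmented into a handful of contigs.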
Affiliation(s)
- Pierre Marijon
  - Inria, Université de Lille, CNRS, Centrale Lille, UMR 9189 – CRIStAL, Lille F-59000, France
- Rayan Chikhi
  - Institut Pasteur, C3BI USR 3756 IP CNRS, Paris, France
- Jean-Stéphane Varré
  - Université de Lille, CNRS, Centrale Lille, Inria, UMR 9189 – CRIStAL, Lille F-59000, France
37
Prabha R, Singh DP. Cyanobacterial phylogenetic analysis based on phylogenomics approaches render evolutionary diversification and adaptation: an overview of representative orders. 3 Biotech 2019; 9:87. [PMID: 30800598 DOI: 10.1007/s13205-019-1635-6] [Citation(s) in RCA: 3] [Impact Index Per Article: 0.5] [Received: 08/01/2018] [Accepted: 02/11/2019] [Indexed: 12/12/2022]
Abstract
Phylogenetic studies based on a definite set of marker genes usually reconstruct evolutionary relationships among the prokaryotic species. Based on specific target sequences, such studies represent variations and allow identification of similarities or dissimilarities in organisms. With the advent of completely sequenced genomes and accumulation of information on whole prokaryotic genomes, phylogenetic reconstructions should be considered more reliable if they are ideally based on entire genomes to resolve phylogenetic interest. We applied phylogenomics approaches taking into account completely sequenced cyanobacterial genomes to reconstruct underlying species that represented major taxonomic classes and belonged to distinctly different habitats (freshwater, marine, soils, and rocks). We did not rely on describing the phylogeny of all representative classes of cyanobacterial species on the basis of a single ribosomal gene, the 16S rDNA gene. Instead, we combined molecular marker and phylogenomics approaches (genome alignment, gene content and gene order, composition vector and protein domain content) to accurately infer the phylogenetic relationships of species. We have shown that this approach reflects the impact of evolution on the organisms and captures connections with ecological adaptation of cyanobacteria in different habitats. Analysis revealed that the members from marine habitats occupy a different profile than those from freshwater. The impact of GC content and genomic repetitiveness on the diversification of cyanobacterial species and their possible role in adaptation was also reflected. Members occupying similar habitats cover more evolutionary distance together and also evolve various strategies for adaptation and survival either through genomic repetitiveness or preferences for genes of particular functions or modified GC content. Genomes undergo different changes for their adaptation in diverse habitats.
Affiliation(s)
- Ratna Prabha
  - (1) ICAR-National Bureau of Agriculturally Important Microorganisms, Kushmaur, Maunath Bhanjan, 275101 India
  - (2) Department of Biotechnology, Mewar University, Gangrar, Chittorgarh, Rajasthan India
- Dhananjaya P Singh
  - (1) ICAR-National Bureau of Agriculturally Important Microorganisms, Kushmaur, Maunath Bhanjan, 275101 India
|
38
|
Seitz A, Hanssen F, Nieselt K. DACCOR–Detection, characterization, and reconstruction of repetitive regions in bacterial genomes. PeerJ 2018; 6:e4742. [PMID: 29868249 PMCID: PMC5983011 DOI: 10.7717/peerj.4742] [Citation(s) in RCA: 2] [Impact Index Per Article: 0.3] [Received: 12/22/2017] [Accepted: 04/20/2018] [Indexed: 11/20/2022]
Abstract
The reconstruction of genomes using mapping-based approaches with short reads experiences difficulties when resolving repetitive regions. These repetitive regions in genomes result in low mapping qualities of the respective reads, which in turn lead to many unresolved bases. Currently, the reconstruction of these regions is often based on modified references in which the repetitive regions are masked. However, for many references, such masked genomes are not available or are based on repetitive regions of other genomes. Our idea is to identify repetitive regions in the reference genome de novo. These regions can then be used to reconstruct them separately using short read sequencing data. Afterward, the reconstructed repetitive sequence can be inserted into the reconstructed genome. We present the program detection, characterization, and reconstruction of repetitive regions, which performs these steps automatically. Our results show an increased base pair resolution of the repetitive regions in the reconstruction of Treponema pallidum samples, resulting in fewer unresolved bases.
|
39
|
Nelson M, Guhlin J, Epstein B, Tiffin P, Sadowsky MJ. The complete replicons of 16 Ensifer meliloti strains offer insights into intra- and inter-replicon gene transfer, transposon-associated loci, and repeat elements. Microb Genom 2018; 4. [PMID: 29671722 PMCID: PMC5994717 DOI: 10.1099/mgen.0.000174] [Citation(s) in RCA: 12] [Impact Index Per Article: 1.7] [Indexed: 12/12/2022]
Abstract
Ensifer meliloti (formerly Rhizobium meliloti and Sinorhizobium meliloti) is a model bacterium for understanding legume–rhizobial symbioses. The tripartite genome of E. meliloti consists of a chromosome, pSymA and pSymB, and in some instances strain-specific accessory plasmids. The majority of previous sequencing studies have relied on the use of assemblies generated from short read sequencing, which leads to gaps and assembly errors. Here we used PacBio-based, long-read assemblies and were able to assemble, de novo, complete circular replicons. In this study, we sequenced, de novo-assembled and analysed 10 E. meliloti strains. Sequence comparisons were also done with data from six previously published genomes. We identified genome differences between the replicons, including mol% G+C and gene content, nucleotide repeats, and transposon-associated loci. Additionally, genomic rearrangements both within and between replicons were identified, providing insight into evolutionary processes at the structural level. There were few cases of inter-replicon gene transfer of core genes between the main replicons. Accessory plasmids were more similar to pSymA than to either pSymB or the chromosome, with respect to gene content, transposon content and G+C content. In our population, the accessory plasmids appeared to share an open genome with pSymA, which contains many nodulation- and nitrogen fixation-related genes. This may explain previous observations that horizontal gene transfer has a greater effect on the content of pSymA than pSymB, or the chromosome, and why some rhizobia show unstable nodulation phenotypes on legume hosts.
Affiliation(s)
- Matthew Nelson
  - (1) Biotechnology Institute and Department of Soil, Water, and Climate, University of Minnesota, St. Paul, MN 55108, USA
- Joseph Guhlin
  - (2) Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
- Brendan Epstein
  - (2) Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
- Peter Tiffin
  - (2) Department of Plant and Microbial Biology, University of Minnesota, St. Paul, MN 55108, USA
- Michael J Sadowsky
  - (1) Biotechnology Institute and Department of Soil, Water, and Climate, University of Minnesota, St. Paul, MN 55108, USA
|
40
|
Acuña-Amador L, Primot A, Cadieu E, Roulet A, Barloy-Hubler F. Genomic repeats, misassembly and reannotation: a case study with long-read resequencing of Porphyromonas gingivalis reference strains. BMC Genomics 2018; 19:54. [PMID: 29338683 PMCID: PMC5771137 DOI: 10.1186/s12864-017-4429-4] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.9] [Received: 09/26/2017] [Accepted: 12/29/2017] [Indexed: 12/15/2022]
Abstract
BACKGROUND Without knowledge of their genomic sequences, it is impossible to make functional models of the bacteria that make up human and animal microbiota. Unfortunately, the vast majority of publicly available genomes are only working drafts, an incompleteness that causes numerous problems and constitutes a major obstacle to genotypic and phenotypic interpretation. In this work, we began with an example from the class Bacteroidia in the phylum Bacteroidetes, which is preponderant among human orodigestive microbiota. We successfully identify the genetic loci responsible for assembly breaks and misassemblies and demonstrate the importance and usefulness of long-read sequencing and curated reannotation. RESULTS We showed that the fragmentation in Bacteroidia draft genomes assembled from massively parallel sequencing linearly correlates with genomic repeats of the same or greater size than the reads. We also demonstrated that some of these repeats, especially the long ones, correspond to misassembled loci in three reference Porphyromonas gingivalis genomes marked as circularized (thus complete or finished). We prove that even at modest coverage (30X), long-read resequencing together with PCR contiguity verification (rrn operons and an integrative and conjugative element or ICE) can be used to identify and correct the wrongly combined or assembled regions. Finally, although time-consuming and labor-intensive, consistent manual biocuration of three P. gingivalis strains allowed us to compare and correct the existing genomic annotations, resulting in a more accurate interpretation of the genomic differences among these strains. CONCLUSIONS In this study, we demonstrate the usefulness and importance of long-read sequencing in verifying published genomes (even when complete) and generating assemblies for new bacterial strains/species with high genomic plasticity. We also show that when combined with biological validation processes and diligent biocurated annotation, this strategy helps reduce the propagation of errors in shared databases, thus limiting false conclusions based on incomplete or misleading information.
Affiliation(s)
- Luis Acuña-Amador
  - Institut de Génétique et Développement de Rennes, CNRS, UMR6290, Université de Rennes 1, Rennes, France
  - Laboratorio de Investigación en Bacteriología Anaerobia, Centro de Investigación en Enfermedades Tropicales, Facultad de Microbiología, Universidad de Costa Rica, San José, Costa Rica
- Aline Primot
  - Institut de Génétique et Développement de Rennes, CNRS, UMR6290, Université de Rennes 1, Rennes, France
- Edouard Cadieu
  - Institut de Génétique et Développement de Rennes, CNRS, UMR6290, Université de Rennes 1, Rennes, France
- Alain Roulet
  - GenoToul Genome & Transcriptome (GeT-PlaGe), INRA, US1426, Castanet-Tolosan, France
- Frédérique Barloy-Hubler
  - Institut de Génétique et Développement de Rennes, CNRS, UMR6290, Université de Rennes 1, Rennes, France
|
41
|
Bertels F, Gallie J, Rainey PB. Identification and Characterization of Domesticated Bacterial Transposases. Genome Biol Evol 2017; 9:2110-2121. [PMID: 28910967 PMCID: PMC5581495 DOI: 10.1093/gbe/evx146] [Citation(s) in RCA: 7] [Impact Index Per Article: 0.9] [Accepted: 08/02/2017] [Indexed: 12/26/2022]
Abstract
Selfish genetic elements, such as insertion sequences and transposons are found in most genomes. Transposons are usually identifiable by their high copy number within genomes. In contrast, REP-associated tyrosine transposases (RAYTs), a recently described class of bacterial transposase, are typically present at just one copy per genome. This suggests that RAYTs no longer copy themselves and thus they no longer function as a typical transposase. Motivated by this possibility we interrogated thousands of fully sequenced bacterial genomes in order to determine patterns of RAYT diversity, their distribution across chromosomes and accessory elements, and rate of duplication. RAYTs encompass exceptional diversity and are divisible into at least five distinct groups. They possess features more similar to housekeeping genes than insertion sequences, are predominantly vertically transmitted and have persisted through evolutionary time to the point where they are now found in 24% of all species for which at least one fully sequenced genome is available. Overall, the genomic distribution of RAYTs suggests that they have been coopted by host genomes to perform a function that benefits the host cell.
Affiliation(s)
- Frederic Bertels
  - New Zealand Institute for Advanced Study, Massey University at Albany, Auckland, New Zealand
  - Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Jenna Gallie
  - Department of Evolutionary Theory, Max Planck Institute for Evolutionary Biology, Plön, Germany
- Paul B Rainey
  - New Zealand Institute for Advanced Study, Massey University at Albany, Auckland, New Zealand
  - Department of Microbial Population Biology, Max Planck Institute for Evolutionary Biology, Plön, Germany
  - Laboratoire de Génétique de l'Evolution, Ecole Supérieure de Physique et de Chimie Industrielles de la Ville de Paris (ESPCI ParisTech), PSL Research University, Paris, France
|
42
|
Deptula P, Laine PK, Roberts RJ, Smolander OP, Vihinen H, Piironen V, Paulin L, Jokitalo E, Savijoki K, Auvinen P, Varmanen P. De novo assembly of genomes from long sequence reads reveals uncharted territories of Propionibacterium freudenreichii. BMC Genomics 2017; 18:790. [PMID: 29037147 PMCID: PMC5644110 DOI: 10.1186/s12864-017-4165-9] [Citation(s) in RCA: 15] [Impact Index Per Article: 1.9] [Received: 06/07/2017] [Accepted: 10/05/2017] [Indexed: 12/14/2022]
Abstract
BACKGROUND Propionibacterium freudenreichii is an industrially important bacterium granted the Generally Recognized as Safe (the GRAS) status, due to its long safe use in food bioprocesses. Despite the recognized role in the food industry and in the production of vitamin B12, as well as its documented health-promoting potential, P. freudenreichii remained poorly characterised at the genomic level. At present, only three complete genome sequences are available for the species. RESULTS We used the PacBio RS II sequencing platform to generate complete genomes of 20 P. freudenreichii strains and compared them in detail. Comparative analyses revealed both sequence conservation and genome organisational diversity among the strains. Assembly from long reads resulted in the discovery of additional circular elements: two putative conjugative plasmids and three active, lysogenic bacteriophages. It also permitted characterisation of the CRISPR-Cas systems. The use of the PacBio sequencing platform allowed identification of DNA modifications, which in turn allowed characterisation of the restriction-modification systems together with their recognition motifs. The observed genomic differences suggested strain variation in surface piliation and specific mucus binding, which were validated by experimental studies. The phenotypic characterisation displayed large diversity between the strains in ability to utilise a range of carbohydrates, to grow at unfavourable conditions and to form a biofilm. CONCLUSION The complete genome sequencing allowed detailed characterisation of the industrially important species, P. freudenreichii by facilitating the discovery of previously unknown features. The results presented here lay a solid foundation for future genetic and functional genomic investigations of this actinobacterial species.
Affiliation(s)
- Paulina Deptula
  - Department of Food and Environmental Sciences, University of Helsinki, 00014 Helsinki, Finland
- Pia K. Laine
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Helena Vihinen
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Vieno Piironen
  - Department of Food and Environmental Sciences, University of Helsinki, 00014 Helsinki, Finland
- Lars Paulin
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Eija Jokitalo
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Kirsi Savijoki
  - Department of Food and Environmental Sciences, University of Helsinki, 00014 Helsinki, Finland
- Petri Auvinen
  - Institute of Biotechnology, University of Helsinki, 00014 Helsinki, Finland
- Pekka Varmanen
  - Department of Food and Environmental Sciences, University of Helsinki, 00014 Helsinki, Finland
|
43
|
Roachford OSE, Nelson KE, Mohapatra BR. Comparative genomics of four Mycoplasma species of the human urogenital tract: Analysis of their core genomes and virulence genes. Int J Med Microbiol 2017; 307:508-520. [PMID: 28927691 DOI: 10.1016/j.ijmm.2017.09.006] [Citation(s) in RCA: 8] [Impact Index Per Article: 1.0] [Received: 05/20/2017] [Revised: 08/29/2017] [Accepted: 09/04/2017] [Indexed: 12/23/2022]
Abstract
The variation in Mycoplasma lipoproteins attributed to genome rearrangements and genetic insertions leads to phenotypic plasticity that allows for the evasion of the host's defence system and pathogenesis. This paper compared for the first time the genomes of four human urogenital Mycoplasma species (M. penetrans HF-2, M. fermentans JER, M. genitalium G37 and M. hominis PG21) to categorise the metabolic functions of the core genes and to assess the effects of tandem repeats, phage-like genetic elements and prophages on the virulence genes. The results of this comparative in silico genomic analysis revealed that the genes constituting their core genomes can be separated into three distinct categories: nuclear metabolism, protein metabolism and energy generation each making up 52%, 31% and 23%, respectively. The genomes have repeat sequences ranging from 3.7% in M. hominis PG21 to 9.5% in M. fermentans JER. Tandem repeats (mostly minisatellites) and phage-like proteins (including DNA gyrases/topoisomerases) were randomly distributed in the Mycoplasma genomes. Here, we identified a coiled-coil structure containing protein in M. penetrans HF-2 which is significantly similar to the Mem protein of M. fermentans ɸMFV1. Therefore, a Mycoplasma prophage seems to be embedded within M. penetrans HF-2 unannotated genome. To the best of our knowledge, no Mycoplasma phages or prophages have been detected in M. penetrans. This study is important not only in understanding the complex genetic factors involved in phenotypic plasticity and virulence in the relatively understudied Mycoplasma species but also in elucidating the effective arrangement of their redundant minimal genomes.
Affiliation(s)
- Orville St E Roachford
  - Department of Biological and Chemical Sciences, The University of the West Indies, Cave Hill Campus, Bridgetown BB 11000, Barbados
- Karen E Nelson
  - J. Craig Venter Institute, 9714 Medical Center Drive, Rockville, MD 20850, USA
- Bidyut R Mohapatra
  - Department of Biological and Chemical Sciences, The University of the West Indies, Cave Hill Campus, Bridgetown BB 11000, Barbados
|
44
|
Das G, Das S, Dutta S, Ghosh I. In silico identification and characterization of stress and virulence associated repeats in Salmonella. Genomics 2017; 110:23-34. [PMID: 28827093 DOI: 10.1016/j.ygeno.2017.08.002] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 02/21/2017] [Revised: 05/09/2017] [Accepted: 08/03/2017] [Indexed: 01/05/2023]
Abstract
That Salmonella strains share so much genomic similarity yet cause different diseases is something of a paradox in Salmonella biology. Repeats are one of the probes that can explain such differences. Here, a comparative genomics approach is followed to identify and characterize repeats that might play a role in adaptation and pathogenesis. Repeats are non-randomly distributed in the genomes, except in a few typhoid-causing strains. Perfect long repeats are rare compared to polymorphic ones and both are statistically consistent. Significant differences in repeat densities in stress-related genes indicate their probable participation in survival and virulence. In total, 573 and 1053 repeat loci have been identified that are exclusively associated with stress and virulence genes, respectively. In Salmonella Typhi, an octameric VNTR locus is found between the acrD and yffB genes, with more than 25 perfect copies across Salmonella Typhi but only a single copy in other serovars. This repeat can be used as a diagnostic probe for typhoid.
Affiliation(s)
- Gourab Das
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University (JNU), New Mehrauli Road, Munirka, New Delhi, Delhi 110067, India
- Surojit Das
  - National Institute of Cholera and Enteric Diseases (NICED), P-33, C.I.T. Road, Scheme XM, Beleghata, Kolkata 700010, India
- Shanta Dutta
  - National Institute of Cholera and Enteric Diseases (NICED), P-33, C.I.T. Road, Scheme XM, Beleghata, Kolkata 700010, India
- Indira Ghosh
  - School of Computational and Integrative Sciences, Jawaharlal Nehru University (JNU), New Mehrauli Road, Munirka, New Delhi, Delhi 110067, India
|
45
|
Böhnke S, Perner M. Unraveling RubisCO Form I and Form II Regulation in an Uncultured Organism from a Deep-Sea Hydrothermal Vent via Metagenomic and Mutagenesis Studies. Front Microbiol 2017; 8:1303. [PMID: 28747908 PMCID: PMC5506194 DOI: 10.3389/fmicb.2017.01303] [Citation(s) in RCA: 6] [Impact Index Per Article: 0.8] [Received: 04/12/2017] [Accepted: 06/28/2017] [Indexed: 12/04/2022]
Abstract
Ribulose-1,5-bisphosphate carboxylase/oxygenase (RubisCO) catalyzes the first major step of carbon fixation in the Calvin-Benson-Bassham (CBB) cycle. This autotrophic CO2 fixation cycle accounts for almost all the assimilated carbon on Earth. Due to the primary role that RubisCO plays in autotrophic carbon fixation, it is important to understand how its gene expression is regulated and the enzyme is activated. Since the majority of all microorganisms are currently not culturable, we used a metagenomic approach to identify genes and enzymes associated with RubisCO expression. The investigated metagenomic DNA fragment originates from the deep-sea hydrothermal vent field Nibelungen at 8°18′ S along the Mid-Atlantic Ridge. It is 13,046 bp and resembles genes from Thiomicrospira crunogena. The fragment encodes nine open reading frames (ORFs) which include two types of RubisCO, form I (CbbL/S) and form II (CbbM), two LysR transcriptional regulators (LysR1 and LysR2), two von Willebrand factor type A (CbbO-m and CbbO-1), and two AAA+ ATPases (CbbQ-m and CbbQ-1), expected to function as RubisCO activating enzymes. In silico analyses uncovered several putative LysR binding sites and promoter structures. Functions of some of these DNA motifs were experimentally confirmed. For example, according to mobility shift assays LysR1’s binding ability to the intergenic region of lysR1 and cbbL appears to be intensified when CbbL or LysR2 are present. Binding of LysR2 upstream of cbbM appears to be intensified if CbbM is present. Our study suggests that CbbQ-m and CbbO-m activate CbbL and that LysR1 and LysR2 proteins promote CbbQ-m/CbbO-m expression. CbbO-1 seems to activate CbbM and CbbM itself appears to contribute to intensifying LysR’s binding ability and thus its own transcriptional regulation. CbbM furthermore appears to impair cbbL expression. A model summarizes the findings and predicts putative interactions of the different proteins influencing RubisCO gene regulation and expression.
Affiliation(s)
- Stefanie Böhnke
  - Molecular Biology of Microbial Consortia, Biocenter Klein Flottbek, University of Hamburg, Hamburg, Germany
- Mirjam Perner
  - Molecular Biology of Microbial Consortia, Biocenter Klein Flottbek, University of Hamburg, Hamburg, Germany
|
46
|
Elevated Rate of Genome Rearrangements in Radiation-Resistant Bacteria. Genetics 2017; 205:1677-1689. [PMID: 28188144 PMCID: PMC5378121 DOI: 10.1534/genetics.116.196154] [Citation(s) in RCA: 11] [Impact Index Per Article: 1.4] [Received: 10/13/2016] [Accepted: 01/30/2017] [Indexed: 01/27/2023]
Abstract
A number of bacterial, archaeal, and eukaryotic species are known for their resistance to ionizing radiation. One of the challenges these species face is a potent environmental source of DNA double-strand breaks, potential drivers of genome structure evolution. Efficient and accurate DNA double-strand break repair systems have been demonstrated in several unrelated radiation-resistant species and are putative adaptations to the DNA damaging environment. Such adaptations are expected to compensate for the genome-destabilizing effect of environmental DNA damage and may be expected to result in a more conserved gene order in radiation-resistant species. However, here we show that rates of genome rearrangements, measured as loss of gene order conservation with time, are higher in radiation-resistant species in multiple, phylogenetically independent groups of bacteria. Comparison of indicators of selection for genome organization between radiation-resistant and phylogenetically matched, nonresistant species argues against tolerance to disruption of genome structure as a strategy for radiation resistance. Interestingly, an important mechanism affecting genome rearrangements in prokaryotes, the symmetrical inversions around the origin of DNA replication, shapes genome structure of both radiation-resistant and nonresistant species. In conclusion, the opposing effects of environmental DNA damage and DNA repair result in elevated rates of genome rearrangements in radiation-resistant bacteria.
|
47
|
Schäfers C, Blank S, Wiebusch S, Elleuche S, Antranikian G. Complete genome sequence of Thermus brockianus GE-1 reveals key enzymes of xylan/xylose metabolism. Stand Genomic Sci 2017; 12:22. [PMID: 28174620 PMCID: PMC5292009 DOI: 10.1186/s40793-017-0225-7] [Citation(s) in RCA: 4] [Impact Index Per Article: 0.5] [Received: 11/24/2016] [Accepted: 12/23/2016] [Indexed: 11/20/2022]
Abstract
Thermus brockianus strain GE-1 is a thermophilic, Gram-negative, rod-shaped and non-motile bacterium that was isolated from the Geysir geothermal area, Iceland. Like other thermophiles, Thermus species are often used as model organisms to understand the mechanism of action of extremozymes, especially focusing on their heat-activity and thermostability. Genome-specific features of T. brockianus GE-1 and their properties further help to explain processes of the adaptation of extremophiles at elevated temperatures. Here we analyze the first whole genome sequence of T. brockianus strain GE-1. Insights into the genome sequence and the methodologies that were applied during de novo assembly and annotation are given in detail. The finished genome shows a phred quality value of QV50. The complete genome size is 2.38 Mb, comprising the chromosome (2,035,182 bp), the megaplasmid pTB1 (342,792 bp) and the smaller plasmid pTB2 (10,299 bp). Gene prediction revealed 2,511 genes in total, including 2,458 protein-encoding genes, 53 RNA genes and 66 pseudogenes. A unique genomic region on megaplasmid pTB1 was identified encoding key enzymes for xylan depolymerization and xylose metabolism. This is in agreement with the growth experiments in which xylan is utilized as the sole source of carbon. Accordingly, we identified sequences encoding the xylanase Xyn10, an endoglucanase, the membrane ABC sugar transporter XylH, the xylose-binding protein XylF, the xylose isomerase XylA catalyzing the first step of xylose metabolism and the xylulokinase XylB, responsible for the second step of xylose metabolism. Our data indicate that an ancestor of T. brockianus obtained the ability to use xylose as an alternative carbon source by horizontal gene transfer.
Affiliation(s)
- Christian Schäfers
  - Institute of Technical Microbiology, Hamburg University of Technology (TUHH), Kasernenstraße 12, 21073 Hamburg, Germany
- Saskia Blank
  - Institute of Technical Microbiology, Hamburg University of Technology (TUHH), Kasernenstraße 12, 21073 Hamburg, Germany
- Sigrid Wiebusch
  - Institute of Technical Microbiology, Hamburg University of Technology (TUHH), Kasernenstraße 12, 21073 Hamburg, Germany
- Skander Elleuche
  - Institute of Technical Microbiology, Hamburg University of Technology (TUHH), Kasernenstraße 12, 21073 Hamburg, Germany
- Garabed Antranikian
  - Institute of Technical Microbiology, Hamburg University of Technology (TUHH), Kasernenstraße 12, 21073 Hamburg, Germany
|
48
|
Cattani AM, Siqueira FM, Guedes RLM, Schrank IS. Repetitive Elements in Mycoplasma hyopneumoniae Transcriptional Regulation. PLoS One 2016; 11:e0168626. [PMID: 28005945 PMCID: PMC5179023 DOI: 10.1371/journal.pone.0168626] [Citation(s) in RCA: 5] [Impact Index Per Article: 0.6] [Received: 09/22/2016] [Accepted: 12/02/2016] [Indexed: 12/15/2022]
Abstract
Transcriptional regulation, a multiple-step process, is still poorly understood in the important pig pathogen Mycoplasma hyopneumoniae. Basic motifs like promoters and terminators have already been described, but no other cis-regulatory elements have been found. DNA repeat sequences have been shown to be an interesting potential source of cis-regulatory elements. In this work, a genome-wide search for tandem and palindromic repetitive elements was performed in the intergenic regions of all coding sequences from M. hyopneumoniae strain 7448. Computational analysis demonstrated the presence of 144 tandem repeats and 1,171 palindromic elements. The DNA repeat sequences were distributed within the 5' upstream regions of 86% of transcriptional units of M. hyopneumoniae strain 7448. Comparative analysis between distinct repetitive sequences found in related mycoplasma genomes demonstrated different percentages of conservation among pathogenic and nonpathogenic strains. qPCR assays revealed differential expression among genes showing variable numbers of repetitive elements. In addition, repeats found in 206 genes already described to be differentially regulated under different culture conditions of M. hyopneumoniae strain 232 showed almost 80% conservation in relation to M. hyopneumoniae strain 7448 repeats. Altogether, these findings suggest a potential regulatory role of tandem and palindromic DNA repeats in the M. hyopneumoniae transcriptional profile.
Affiliation(s)
- Amanda Malvessi Cattani
  - Centro de Biotecnologia, Programa de Pós-Graduação em Biologia Celular e Molecular, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Rio Grande do Sul, Brazil
- Franciele Maboni Siqueira
  - Centro de Biotecnologia, Programa de Pós-Graduação em Biologia Celular e Molecular, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Rio Grande do Sul, Brazil
- Rafael Lucas Muniz Guedes
  - Laboratório de Bioinformática, Laboratório Nacional de Computação Científica (LNCC), Petrópolis, Rio de Janeiro, Brazil
- Irene Silveira Schrank
  - Centro de Biotecnologia, Programa de Pós-Graduação em Biologia Celular e Molecular, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Rio Grande do Sul, Brazil
  - Centro de Biotecnologia, Departamento de Biologia Molecular e Biotecnologia, Instituto de Biociências, Universidade Federal do Rio Grande do Sul (UFRGS), Porto Alegre, Rio Grande do Sul, Brazil
|
49
|
Hammoumi S, Vallaeys T, Santika A, Leleux P, Borzym E, Klopp C, Avarre JC. Targeted genomic enrichment and sequencing of CyHV-3 from carp tissues confirms low nucleotide diversity and mixed genotype infections. PeerJ 2016; 4:e2516. [PMID: 27703859 PMCID: PMC5045873 DOI: 10.7717/peerj.2516] [Citation(s) in RCA: 20] [Impact Index Per Article: 2.2] [Received: 05/05/2016] [Accepted: 09/01/2016] [Indexed: 12/18/2022]
Abstract
Koi herpesvirus disease (KHVD) is an emerging disease that causes mass mortality in koi and common carp, Cyprinus carpio L. Its causative agent is Cyprinid herpesvirus 3 (CyHV-3), also known as koi herpesvirus (KHV). Although data on the pathogenesis of this deadly virus is relatively abundant in the literature, still little is known about its genomic diversity and about the molecular mechanisms that lead to such a high virulence. In this context, we developed a new strategy for sequencing full-length CyHV-3 genomes directly from infected fish tissues. Total genomic DNA extracted from carp gill tissue was specifically enriched with CyHV-3 sequences through hybridization to a set of nearly 2 million overlapping probes designed to cover the entire genome length, using KHV-J sequence (GenBank accession number AP008984) as reference. Applied to 7 CyHV-3 specimens from Poland and Indonesia, this targeted genomic enrichment enabled recovery of the full genomes with >99.9% reference coverage. The enrichment rate was directly correlated to the estimated number of viral copies contained in the DNA extracts used for library preparation, which varied between ∼5000 and ∼2×10⁷. The average sequencing depth was >200 for all samples, thus allowing the search for variants with high confidence. Sequence analyses highlighted a significant proportion of intra-specimen sequence heterogeneity, suggesting the presence of mixed infections in all investigated fish. They also showed that inter-specimen genetic diversity at the genome scale was very low (>99.95% of sequence identity). By enabling full genome comparisons directly from infected fish tissues, this new method will be valuable to trace outbreaks rapidly and at a reasonable cost, and in turn to understand the transmission routes of CyHV-3.
Affiliation(s)
- Saliha Hammoumi
  - Institut des Sciences de l'Evolution de Montpellier, UMR226 IRD-CNRS-UM-EPHE, Montpellier, France
- Ayi Santika
  - Main Center for Freshwater Aquaculture Development, Sukabumi, Indonesia
- Philippe Leleux
  - Plate-forme Genotoul Bioinfo, UR875 Biométrie et Intelligence Artificielle, Institut National de la Recherche Agronomique, Castanet-Tolosan, France
- Ewa Borzym
  - Department of Fish Diseases, National Veterinary Research Institute, Pulawy, Poland
- Christophe Klopp
  - Plate-forme Genotoul Bioinfo, UR875 Biométrie et Intelligence Artificielle, Institut National de la Recherche Agronomique, Castanet-Tolosan, France
- Jean-Christophe Avarre
  - Institut des Sciences de l'Evolution de Montpellier, UMR226 IRD-CNRS-UM-EPHE, Montpellier, France
50
|
Ojala V, Mattila S, Hoikkala V, Bamford JK, Hiltunen T, Jalasvuori M. Scoping the effectiveness and evolutionary obstacles in using plasmid-dependent phages to fight antibiotic resistance. Future Microbiol 2016; 11:999-1009. [PMID: 27503765 DOI: 10.2217/fmb-2016-0038] [Citation(s) in RCA: 9] [Impact Index Per Article: 1.0] [Indexed: 11/21/2022]
Abstract
AIM: To investigate the potential evolutionary obstacles in the sustainable therapeutic use of plasmid-dependent phages to control the clinically important conjugative plasmid-mediated dissemination of antibiotic resistance genes to pathogenic bacteria. MATERIALS & METHODS: The lytic plasmid-dependent phage PRD1 and the multiresistance conferring plasmid RP4 in an Escherichia coli host were utilized to assess the genetic and phenotypic changes induced by combined phage and antibiotic selection. RESULTS & CONCLUSIONS: Resistance to PRD1 was always coupled with either completely lost or greatly reduced conjugation ability. Reversion to full conjugation efficiency was found to be rare, and it also restored the susceptibility to plasmid-dependent phages. Consequently, plasmid-dependent phages constitute an interesting candidate for development of sustainable anticonjugation/antiresistance therapeutic applications.
Affiliation(s)
- Ville Ojala
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland
- Sari Mattila
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland
- Ville Hoikkala
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland
- Jaana Kh Bamford
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland
- Teppo Hiltunen
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland
- Matti Jalasvuori
  - Department of Biological & Environmental Science, Centre of Excellence in Biological Interactions, University of Jyväskylä, Jyväskylä, Finland; Department of Food & Environmental Sciences/Microbiology & Biotechnology, University of Helsinki, PO Box 65, Helsinki 00014, Finland