1
Liao X, Li Y, Wu Y, Li X, Shang X. Deep Learning-Based Classification of CRISPR Loci Using Repeat Sequences. ACS Synth Biol 2025; 14:1813-1828. [PMID: 40261207 DOI: 10.1021/acssynbio.5c00174] [Indexed: 04/24/2025]
Abstract
With the widespread application of the CRISPR-Cas system in gene editing and related fields, along with the increasing availability of metagenomic data, the demand for detecting and classifying CRISPR-Cas systems in metagenomic data sets has grown significantly. Traditional classification methods for CRISPR-Cas systems primarily rely on identifying cas genes near CRISPR arrays. However, in cases where cas gene information is absent, such as in metagenomes or fragmented genome assemblies, traditional methods may fail. Here, we present a deep learning-based method, CRISPRclassify-CNN-Att, which classifies CRISPR loci solely based on repeat sequences. CRISPRclassify-CNN-Att utilizes convolutional neural networks (CNNs) and self-attention mechanisms to extract features from repeat sequences. It employs a stacking strategy to address the imbalance of samples across different subtypes and uses transfer learning to improve classification accuracy for subtypes with fewer samples. CRISPRclassify-CNN-Att demonstrates outstanding performance in classifying multiple subtypes, particularly those with larger sample sizes. Although CRISPR loci classification traditionally depends on cas genes, CRISPRclassify-CNN-Att offers a novel approach that serves as a significant complement to cas-based methods, enabling the classification of orphan or distant CRISPR loci. The proposed tool is freely accessible via https://github.com/Xingyu-Liao/CRISPRclassify-CNN-Att.
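As a rough illustration of what "classifying from repeat sequences alone" needs as input, the sketch below one-hot encodes a repeat into a fixed-size matrix of the kind a convolutional layer can consume. It is a minimal sketch, not the authors' preprocessing: the function name `one_hot_repeat`, the `max_len` of 48, and the zero-row treatment of padding and ambiguous bases are all assumptions made for this example.

```python
# Hypothetical helper for illustration only; not taken from CRISPRclassify-CNN-Att.
ALPHABET = "ACGT"


def one_hot_repeat(seq, max_len=48):
    """Encode a repeat as a max_len x 4 matrix of 0/1 values.

    Columns follow the order A, C, G, T. Ambiguous bases (e.g. N) and
    padding positions are encoded as all-zero rows (an assumed convention).
    """
    seq = seq.upper()[:max_len]  # truncate repeats longer than max_len
    matrix = []
    for base in seq:
        row = [0] * len(ALPHABET)
        idx = ALPHABET.find(base)
        if idx >= 0:
            row[idx] = 1
        matrix.append(row)
    # Pad short repeats with zero rows so every input has the same shape.
    matrix.extend([0] * len(ALPHABET) for _ in range(max_len - len(seq)))
    return matrix


if __name__ == "__main__":
    encoded = one_hot_repeat("GTTTTAGAGCTATGCTGTTTTG", max_len=36)
    print(len(encoded), len(encoded[0]))
```

A fixed-length encoding is used here because convolutional models generally expect inputs of uniform shape; the actual tool may handle sequence length differently.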
Affiliation(s)
- Xingyu Liao
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yanyan Li
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Yingfu Wu
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Xingyi Li
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
- Xuequn Shang
- School of Computer Science, Northwestern Polytechnical University, Xi'an, Shaanxi 710072, China
2
Xu X, Gu P. Overview of Phage Defense Systems in Bacteria and Their Applications. Int J Mol Sci 2024; 25:13316. [PMID: 39769080 PMCID: PMC11676413 DOI: 10.3390/ijms252413316] [Received: 11/08/2024] [Revised: 12/07/2024] [Accepted: 12/10/2024] [Indexed: 01/11/2025] [Open Access]
Abstract
As natural parasites of bacteria, phages have greatly contributed to bacterial evolution owing to their persistent threat. Diverse phage resistance systems have been developed in bacteria during the coevolutionary process with phages. Conversely, phage contamination has a devastating effect on microbial fermentation, resulting in fermentation failure and substantial economic loss. Accordingly, natural defense systems derived from bacteria can be employed to obtain robust phage-resistant host cells that can overcome the threats posed by bacteriophages during industrial bacterial processes. In this review, diverse phage resistance mechanisms, including the remarkable research progress and potential applications, are systematically summarized. In addition, the development prospects and challenges of phage-resistant bacteria are discussed. This review provides a useful reference for developing phage-resistant bacteria.
Affiliation(s)
- Pengfei Gu
- School of Biological Science and Technology, University of Jinan, Jinan 250022, China
3
Rostampour M, Panahi B, Masoumi Jahandizi R. The CRISPR-Cas system in Lactiplantibacillus plantarum strains: identification and characterization using a genome mining approach. Front Microbiol 2024; 15:1394756. [PMID: 39678914 PMCID: PMC11638214 DOI: 10.3389/fmicb.2024.1394756] [Received: 03/02/2024] [Accepted: 10/31/2024] [Indexed: 12/17/2024] [Open Access]
Abstract
Clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes make up the adaptive immune system of bacteria against bacteriophages. In this study, 675 sequences of Lactiplantibacillus plantarum isolates deposited in GenBank were analyzed for the presence, structural variation, phylogenetic relationships, diversity, and evolution of CRISPR-Cas systems. The analysis revealed that 143 strains harbor confirmed CRISPR-Cas systems, with subtype II-A being predominant. The diversity of targeted phages and plasmids among the predicted systems was also examined. Approximately 22% of the isolates with verified and complete CRISPR systems carried both subtypes II-A and I-E within their genomes. In subtype II-A, the repeat sequence was 36 nucleotides long on average. The number of spacers ranged from 1 to 24 in subtype II-A and from 3 to 16 in subtype I-E. Subtype II-A has nine protospacer adjacent motifs: 5'-CC-3', 5'-GAA-3', 5'-TGG-3', 5'-CTT-3', 5'-GGG-3', 5'-CAT-3', 5'-CTC-3', 5'-CCT-3', and 5'-CGG-3'. In addition, the identified systems displayed a potential for targeting Lactobacillus phages, and subtype II-A showed greater diversity than subtype I-E in the Lactobacillus phages it targeted. In conclusion, the current findings offer a perspective on the prevalence and evolution of the CRISPR-Cas system in L. plantarum, contributing novel insights to the expanding field of CRISPR-Cas systems within Lactobacillus strains.
This knowledge establishes a foundation for future applied studies focused on enhancing phage resistance in industrial fermentation, reducing contamination risks, and improving product quality. The identified targeting diversity may also foster advancements in phage therapy through the development of CRISPR-based antimicrobials.
Affiliation(s)
- Bahman Panahi
- Department of Genomics, Branch for Northwest and West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
4
Sheykholeslami N, Mirzaei H, Nami Y, Khandaghi J, Javadi A. Ecological and evolutionary dynamics of CRISPR-Cas systems in Clostridium botulinum: Insights from genome mining and comparative analysis. Infect Genet Evol 2024; 123:105638. [PMID: 39002873 DOI: 10.1016/j.meegid.2024.105638] [Received: 04/17/2024] [Revised: 06/11/2024] [Accepted: 07/07/2024] [Indexed: 07/15/2024]
Abstract
Understanding the prevalence and distribution of CRISPR-Cas systems across different strains can illuminate the ecological and evolutionary dynamics of Clostridium botulinum populations. In this study, we conducted genome mining to characterize the CRISPR-Cas systems of C. botulinum strains. Our analysis involved retrieving complete genome sequences of these strains and assessing the diversity, prevalence, and evolution of their CRISPR-Cas systems. Subsequently, we analyzed the homology of spacer sequences from the identified CRISPR arrays to characterize the range of targeted phages and plasmids. Additionally, we investigated the evolutionary trajectory of C. botulinum strains under selective pressure from foreign invasive DNA. Our findings revealed that 306 strains possessed complete CRISPR-Cas structures, comprising 58% of the studied C. botulinum strains. The identified CRISPR-Cas systems were classified into subtypes I-B, I-D, II-C, III-B, and III-D. Secondary structure prediction of consensus repeats indicated that subtype II-C, with longer stems than subtypes I-D and I-B, tended to form more stable RNA secondary structures. Moreover, protospacer motif analysis demonstrated that strains with subtype I-B CRISPR-Cas systems exhibited 5'-CGG-3', 5'-CC-3', and 5'-CAT-3' motifs in the 3' flanking regions of protospacers. Furthermore, our results showed that subtype I-D and III-D systems frequently harbored similar spacer patterns. Finally, analysis of spacer sequence homology with phage and prophage genomes highlighted the specific activities of subtypes I-B and III-B against phages and plasmids, providing valuable insights into the functional specialization within these systems.
Affiliation(s)
- Naiymeh Sheykholeslami
- Department of Food Hygiene, Faculty of Veterinary Medicine, Tabriz Medical Sciences, Islamic Azad University, Tabriz, Iran
- Hamid Mirzaei
- Department of Food Hygiene, Faculty of Veterinary Medicine, Tabriz Medical Sciences, Islamic Azad University, Tabriz, Iran; Department of Food Biotechnology, Biotechnology Research Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran
- Yousef Nami
- Department of Food Biotechnology, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
- Jalil Khandaghi
- Department of Food Biotechnology, Biotechnology Research Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran; Department of Food Science and Technology, Sarab Branch, Islamic Azad University, Sarab, Iran
- Afshin Javadi
- Department of Food Hygiene, Faculty of Veterinary Medicine, Tabriz Medical Sciences, Islamic Azad University, Tabriz, Iran; Department of Food Biotechnology, Biotechnology Research Center, Tabriz Branch, Islamic Azad University, Tabriz, Iran
5
Ghaffarian S, Panahi B. Occurrence and diversity pattern of CRISPR-Cas systems in Acetobacter genus provides insights on adaptive defense mechanisms against to invasive DNAs. Front Microbiol 2024; 15:1357156. [PMID: 39056004 PMCID: PMC11270541 DOI: 10.3389/fmicb.2024.1357156] [Received: 12/17/2023] [Accepted: 07/02/2024] [Indexed: 07/28/2024] [Open Access]
Abstract
The Acetobacter genus is primarily known for its significance in acetic acid production and its application in various industrial processes. This study aimed to shed light on the prevalence, diversity, and functional implications of CRISPR-Cas systems in the Acetobacter genus using a genome mining approach. The investigation analyzed the CRISPR-Cas architectures and components of 34 Acetobacter species, as well as the evolutionary strategies employed by these bacteria in response to phage invasion and foreign DNA. Furthermore, phylogenetic analysis based on Cas1 protein sequences was performed to gain insights into the evolutionary relationships among Acetobacter strains, with an emphasis on the potential of this protein for genotyping purposes. The results showed that 15 species had orphan CRISPR loci, while 20 species had complete CRISPR-Cas systems, resulting in an occurrence rate of 38% for complete systems in Acetobacter strains. The predicted complete CRISPR-Cas systems were categorized into I-C, I-F, I-E, and II-C subtypes, with subtype I-E being the most prevalent in Acetobacter. Additionally, spacer homology analysis revealed the dynamic interaction between Acetobacter strains and foreign invasive DNAs, emphasizing the pivotal role of CRISPR-Cas systems in defending against such invasions. Furthermore, the investigation of the secondary structures of CRISPR arrays revealed conserved patterns within subtypes despite variations in repeat sequences. The exploration of protospacer adjacent motifs (PAMs) identified distinct recognition motifs in the flanking regions of protospacers. In conclusion, this research not only contributes to the growing body of knowledge on CRISPR-Cas systems but also establishes a foundation for future studies on the adaptive defense mechanisms of Acetobacter.
The findings provide valuable insights into the intricate interplay between bacteria and phages, with implications for industrial applications and potential biotechnological advancements.
Affiliation(s)
- Sara Ghaffarian
- Department of Cellular and Molecular Biology, Faculty of Sciences, Azarbaijan Shahid Madani University, Tabriz, Iran
- Bahman Panahi
- Department of Genomics, Branch for Northwest & West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
6
Panahi B, Dehganzad B, Nami Y. CRISPR-Cas systems feature and targeting phages diversity in Lacticaseibacillus rhamnosus strains. Front Microbiol 2023; 14:1281307. [PMID: 38125580 PMCID: PMC10731254 DOI: 10.3389/fmicb.2023.1281307] [Received: 08/22/2023] [Accepted: 11/20/2023] [Indexed: 12/23/2023] [Open Access]
Abstract
One of the most important adaptive immune systems of bacteria against phages consists of clustered regularly interspaced short palindromic repeats (CRISPR) and CRISPR-associated (Cas) genes. In this investigation, a genome mining approach was employed to characterize the CRISPR-Cas systems of Lacticaseibacillus rhamnosus strains. The analysis involved retrieving complete genome sequences of L. rhamnosus strains and assessing the diversity, prevalence, and evolution of their CRISPR-Cas systems. Following this, the homology of spacer sequences from the identified CRISPR arrays was analyzed to characterize the range of target phages. The findings revealed that 106 strains possessed valid CRISPR-Cas structures (comprising CRISPR loci and Cas genes), constituting 45% of the examined L. rhamnosus strains. All identified systems belonged to subtype II-A. Analysis of the homology of spacer sequences with phage and prophage genomes revealed that strains possessing only CRISPR-Cas subtype II targeted a broader spectrum of foreign phages. In summary, this study suggests that while there is no significant diversity among the CRISPR-Cas systems identified in L. rhamnosus strains, there is notable variation in subtype II-A systems between L. rhamnosus and other lactobacilli. The diverse nature of these CRISPR-Cas systems underscores their natural activity and importance in adaptive immunity.
Affiliation(s)
- Bahman Panahi
- Department of Genomics, Branch for Northwest and West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran
- Behnaz Dehganzad
- Department of Natural Sciences, University of Tabriz, Tabriz, Iran
- Yousef Nami
- Department of Food Biotechnology, Branch for Northwest and West Region, Agricultural Biotechnology Research Institute of Iran (ABRII), Agricultural Research, Education and Extension Organization (AREEO), Tabriz, Iran